From melwittt at gmail.com Tue Oct 1 03:14:31 2019 From: melwittt at gmail.com (melanie witt) Date: Mon, 30 Sep 2019 20:14:31 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> On 9/30/19 12:08 PM, Matt Riedemann wrote: > On 9/30/2019 12:27 PM, Dan Smith wrote: >>> 2. Do console proxies need to live in the cells? This is what devstack >>> does in superconductor mode. I did some digging through nova code, and >>> it looks that way. Testing with novncproxy agrees. This suggests we >>> need to expose a unique proxy endpoint for each cell, and configure >>> all computes to use the right one via e.g. novncproxy_base_url, >>> correct? >> I'll punt this to Melanie, as she's the console expert at this point, >> but I imagine you're right. >> > > Based on the Rocky spec [1] which says: > > "instead we will resolve the cell database issue by running console > proxies per cell instead of global to a deployment, such that the cell > database is local to the console proxy" > > Yes it's per-cell. There was stuff in the Rock release notes about this > [2] and a lot of confusion around the deprecation of the > nova-consoleauth service for which Mel knows the details, but it looks > like we really should have something documented about this too, here [3] > and/or here [4]. To echo, yes, console proxies need to run per cell. This used to be mentioned in our docs and I looked and found it got removed by the following commit: https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f so, we just need to add back the bit about running console proxies per cell. -melanie > [1] > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > > [2] https://docs.openstack.org/releasenotes/nova/rocky.html > [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html > [4] https://docs.openstack.org/nova/latest/admin/remote-console-access.html > From balazs.gibizer at est.tech Tue Oct 1 07:30:57 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 07:30:57 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <1569915055.26355.1@smtp.office365.com> On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried wrote: > Nova developers and maintainers- > > Every cycle we approve some number of blueprints and then complete a > low > percentage [1] of them. Which blueprints go unfinished seems to be > completely random (notably, it appears to have nothing to do with our > declared cycle priorities). This is especially frustrating for > consumers > of a feature, who (understandably) interpret blueprint/spec approval > as > a signal that they can reasonably expect the feature to land [2]. > > The cause for non-completion usually seems to fall into one of several > broad categories: > > == Inadequate *developer* attention == > - There's not much to be done about the subset of these where the > contributor actually walks away. > > - The real problem is where the developer thinks they're ready for > reviewers to look, but reviewers don't. Even things that seem obvious > to > experienced reviewers, like failing CI or "WIP" in the commit title, > will cause patches to be completely ignored -- but unseasoned > contributors don't necessarily understand even that, let alone more > subtle issues. 
Consequently, patches will languish, with each side > expecting the other to take the next action. This is a problem of > culture: contributors don't understand nova reviewer procedures and > psychology. > > == Inadequate *reviewer* attention == > - Upstream maintainer time is limited. > > - We always seem to have low review activity until the last two or > three > weeks before feature freeze, when there's a frantic uptick and lots > gets > done. > > - But there's a cultural rift here as well. Getting maintainers to > care > about a blueprint is hard if they don't already have a stake in it. > The > "squeaky wheel" concept is not well understood by unseasoned > contributors. The best way to get reviews is to lurk in IRC and beg. > Aside from not being intuitive, this can also be difficult > logistically > (time zone pain, knowing which nicks to ping and how) as well as > interpersonally (how much begging is enough? too much? when is it > appropriate?). When I joined I was taught that instead of begging go and review open patches which a) helps the review load of dev team b) makes you known in the community. Both helps getting reviews on your patches. Does it always work? No. Do I like begging for review? No. Do I like to get repatedly pinged to review? No. So I would suggest not to declare that the only way to get review is to go and beg. > > == Multi-release efforts that we knew were going to be multi-release > == > These may often drag on far longer than they perhaps should, but I'm > not > going to try to address that here. > > ======== > > There's nothing new or surprising about the above. We've tried to > address these issues in various ways in the past, with varying degrees > of effectiveness. > > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's > limit > Ussuri to 25 blueprints [4]. If this turns out to be too few, what's > the > worst thing that happens? We finish everything, early, and wish we had > done more. If that happens, drinks are on me, and we can bump the > number > for V. I support the ide that we limit our scope. But it is pretty hard to select which 25 (or whathever amount we agree on) bp we approve out of possible ~50ish. What will be the method of selection? > > (B) Require a core to commit to "caring about" a spec before we > approve > it. The point of this "core liaison" is to act as a mentor to mitigate > the cultural issues noted above [5], and to be a first point of > contact > for reviews. I've proposed this to the spec template here [6]. I proposed this before and I still think this could help. And partially answer my question above, this could be one of the way to limit the approved bps. If each core only commits to "care about" the implementation of 2 bps, then we already have a limit for the number of approved bps. Cheers, gibi > > Thoughts? > > efried > > [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware > of > a good way to go back and mine actual data. > [2] Stuff happens, sure, and nobody expects 100%, but 60%? Come on, we > have to be able to do better than that. > [3] https://blueprints.launchpad.net/nova/train > [4] Recognizing of course that not all blueprints are created equal, > this is more an attempt at a reasonable heuristic than an actual > expectation of total size/LOC/person-hours/etc. 
The theory being that > constraining to an actual number, whatever the number may be, is > better > than not constraining at all. > [5] If you're a core, you can be your own liaison, because presumably > you don't need further cultural indoctrination or help begging for > reviews. > [6] https://review.opendev.org/685857 > From amotoki at gmail.com Tue Oct 1 07:33:24 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Tue, 1 Oct 2019 16:33:24 +0900 Subject: [all][PTG] Strawman Schedule In-Reply-To: <20190930214215.GQ10891@t440s> References: <20190929101340.GM10891@t440s> <20190930214215.GQ10891@t440s> Message-ID: Thanks Slawek for taking care of this. I am really fine with your plan discussed in the team meeting. Akihiro On Tue, Oct 1, 2019 at 6:44 AM Slawek Kaplonski wrote: > > Hi, > > After today's discussion on Neutron's meeting I think that we will "keep" full 3 > days for Neutron. I will than try to plan Neutron's sessions in such way that on > Wednesday morning we will have things which Akihiro will not be interested much. > We will sync about it later. > > On Sun, Sep 29, 2019 at 12:13:40PM +0200, Slawek Kaplonski wrote: > > Hi, > > > > I think that 2.5 days for Neutron should be fine too so we can start on > > Wednesday after the lunch. > > Or, if we should do onboarding session during PTG (I heard something like that > > but I'm not actually sure that it's true), maybe we can do it on > > Wednesday morning and than start PTG discussions after lunch when You will be > > ready Akihiro. > > What do You think about it? > > > > On Sat, Sep 28, 2019 at 02:56:24AM +0900, Akihiro Motoki wrote: > > > Hi Kendall, > > > > > > Looking at the updated version of the schedule, neutron has 2.5 days > > > but actually 3 days are assigned to neutron. > > > As horizon PTL and neutron core, hopefully neutron session starts from > > > Wednesday afternoon (and horizon has Wed morning session). > > > > > > In addition, I see "1.5 or 3.5" or "2 or 3.5" in several projects. I > > > guess they are the number of days assigned, but two numbers are very > > > different so I wonder what this means. > > > > > > Thanks, > > > Akihiro Motoki (irc: amotoki) > > > > > > On Sat, Sep 28, 2019 at 1:59 AM Kendall Nelson wrote: > > > > > > > > Hello Everyone! > > > > > > > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > > > > > > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > > > > > > > Please let me know if there are any conflicts! > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > >> > > > >> Hello Everyone! > > > >> > > > >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > > >> > > > >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > > >> > > > >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! 
> > > >> > > > >> -Kendall (diablo_rojo) > > > >> > > > >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > > > > > -- > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > -- > Slawek Kaplonski > Senior software engineer > Red Hat > > From balazs.gibizer at est.tech Tue Oct 1 08:19:45 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 08:19:45 +0000 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <1569857750.5848.0@smtp.office365.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> Message-ID: <1569917983.26355.2@smtp.office365.com> On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer wrote: > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > wrote: >> Hi, >> >> I've just proposed https://review.opendev.org/#/c/685724/ which >> reverts a change that recently went in to make the fake driver in >> oslo.messaging use jsonutils for message serialization instead of >> json.dumps. >> >> As explained in the commit message on the revert, this is >> problematic >> because the rabbit driver uses kombu's default serialization method, >> which is json.dumps. By changing the fake driver to use jsonutils >> we've made it more lenient than the most used real driver which >> opens >> us up to merging broken changes in consumers of oslo.messaging. >> >> We did have some discussion of whether we should try to override the >> kombu default and tell it to use jsonutils too, as a number of other >> drivers do. The concern with this was that the jsonutils handler for >> things like datetime objects is not tz-aware, which means if you >> send >> a datetime object over RPC and don't explicitly handle it you could >> lose important information. >> >> I'm open to being persuaded otherwise, but at the moment I'm leaning >> toward less magic happening at the RPC layer and requiring projects >> to explicitly handle types that aren't serializable by the standard >> library json module. If you have a different preference, please >> share >> it here. > > Hi, > > I might me totally wrong here and please help me understand how the > RabbitDriver works. What I did when I created the original patch that > I > looked at each drivers how they handle sending messages. The > oslo_messaging._drivers.base.BaseDriver defines the interface with a > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > implements the BaseDriver interface's send() method to call _send(). > Then _send() calls rpc_commom.serialize_msg which then calls > jsonutils.dumps. > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > from AMQPDriverBase and does not override send() or _send() so I think > the AMQPDriverBase ._send() is called that therefore jsonutils is used > during sending a message with RabbitDriver. I did some tracing in devstack to prove my point. See the result in https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 Cheers, gibi > > Cheers, > gibi > > > [1] > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > >> >> Thanks. >> >> -Ben >> > > From mark at stackhpc.com Tue Oct 1 10:00:49 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 1 Oct 2019 11:00:49 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: Thanks all for your responses. Replies to Dan inline. 
On Mon, 30 Sep 2019 at 18:27, Dan Smith wrote: > > > 1. Is there any benefit to not having a superconductor? Presumably > > it's a little more efficient in the single cell case? Also IIUC it > > only requires a single message queue so is a little simpler? > > In a multi-cell case you need it, but you're asking about the case where > there's only one (real) cell yeah? > > If the deployment is really small, then the overhead of having one is > probably measurable and undesirable. I dunno what to tell you about > where that cut-off is, unfortunately. However, once you're over a > certain number of nodes, that probably shakes out a bit. The > superconductor does things that the cell-specific ones won't have to do, > so there's about the same amount of total load, just a potentially > larger memory footprint for running extra services, which would be > measurable at small scales. For a tiny deployment there's also overhead > just in the complexity, but one of the goals of v2 has always been to > get everyone on the same architecture, so having a "small mode" and a > "large mode" brings with it its own complexity. Thanks for the explanation. We've built in a switch for single or super mode, and single mode keeps us compatible with existing deployments, so I guess we'll keep the switch. > > > 2. Do console proxies need to live in the cells? This is what devstack > > does in superconductor mode. I did some digging through nova code, and > > it looks that way. Testing with novncproxy agrees. This suggests we > > need to expose a unique proxy endpoint for each cell, and configure > > all computes to use the right one via e.g. novncproxy_base_url, > > correct? > > I'll punt this to Melanie, as she's the console expert at this point, > but I imagine you're right. > > > 3. Should I upgrade the superconductor or conductor service first? > > Superconductor first, although they all kinda have to go around the same > time. Superconductor, like the regular conductors, needs to look at the > cell database directly, so if you were to upgrade superconductor before > the cell database you'd likely have issues. I think probably the ideal > would be to upgrade the db schema everywhere (which you can do without > rolling code), then upgrade the top-level services (conductor, > scheduler, api) and then you could probably get away with doing > conductor in the cell along with computes, or whatever. If possible > rolling the cell conductors with the top-level services would be ideal. I should have included my strawman deploy and upgrade flow for context, but I'm still honing it. All DB schema changes will be done up front in both cases. In terms of ordering, the API-level services (superconductor, API scheduler) are grouped together and will be rolled first - agreeing with what you've said. I think between Ansible's tags and limiting actions to specific hosts, the code can be written to support upgrading all cell conductors together, or at the same time as (well, immediately before) the cell's computes. The thinking behind upgrading one cell at a time is to limit the blast radius if something goes wrong. You suggest it would be better to roll all cell conductors at the same time though - do you think it's safer to run with the version disparity between conductor and computes rather than super- and cell- conductors? > > > 4. Does the cell conductor need access to the API DB? > > Technically it should not be allowed to talk to the API DB for > "separation of concerns" reasons. 
However, there are a couple of > features that still rely on the cell conductor being able to upcall to > the API database, such as the late affinity check. If you can only > choose one, then I'd say configure the cell conductors to talk to the > API DB, but if there's a knob for "isolate them" it'd be better. Knobs are easy to make, and difficult to keep working in all positions :) It seems worthwhile in this case. > > > 5. What DB configuration should be used in nova.conf when running > > online data migrations? I can see some migrations that seem to need > > the API DB, and others that need a cell DB. If I just give it the API > > DB, will it use the cell mappings to get to each cell DB, or do I need > > to run it once for each cell? > > The API DB has its own set of migrations, so you obviously need API DB > connection info to make that happen. There is no fanout to all the rest > of the cells (currently), so you need to run it with a conf file > pointing to the cell, for each cell you have. The latest attempt > at making this fan out was abanoned in July with no explanation, so it > dropped off my radar at least. That makes sense. The rolling upgrade docs could be a little clearer for multi-cell deployments here. > > > 6. After an upgrade, when can we restart services to unpin the compute > > RPC version? Looking at the compute RPC API, it looks like the super > > conductor will remain pinned until all computes have been upgraded. > > For a cell conductor, it looks like I could restart it to unpin after > > upgrading all computes in that cell, correct? > > Yeah. > > > 7. Which services require policy.{yml,json}? I can see policy > > referenced in API, conductor and compute. > > That's a good question. I would have thought it was just API, so maybe > someone else can chime in here, although it's not specific to cells. Yeah, unrelated to cells, just something I wondered while digging through our nova Ansible role. Here is the line that made me think policies are required in conductors: https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. I guess this is only required for cell conductors though? > > --Dan From dtantsur at redhat.com Tue Oct 1 10:05:52 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 1 Oct 2019 12:05:52 +0200 Subject: Release Cycle Observations In-Reply-To: <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand wrote: > On 9/26/19 9:51 PM, Sean McGinnis wrote: > >> I know we'd like to have everyone CD'ing master > > > > Watch who you're lumping in with the "we" statement. ;) > > You've pinpointed what the problem is. > > Everyone but OpenStack upstream would like to stop having to upgrade > every 6 months. Yep, but the same "everyone" want to have features now or better yesterday, not in 2-3 years ;) > The only way this could be resolved would be an > OpenStack LTS release let's say every 2 years, and allowing upgrade > between them, though that's probably too much effort upstream. We have > different groups wishing for the opposite thing to happen. > > I don't see this problem going away anytime soon. > > Cheers, > > Thomas Goirand (zigo) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pierre at stackhpc.com Tue Oct 1 10:53:07 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Tue, 1 Oct 2019 12:53:07 +0200 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hi Kendall, Friday works for all who have replied so far, but I am still expecting answers from two people. Is there a room available for our Project Onboarding session that day? Probably in the morning, though I will confirm depending on availability of participants. We've never run one, so I don't know how many people to expect. Thanks, Pierre On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: > > Hi Pierre, > > Apologies for the oversight on Blazar. Would all day Friday work for your team? > > Thanks, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: > > Hi Kendall, > > I couldn't see Blazar anywhere on the schedule. We had requested time > for a Project Onboarding session. > > Additionally, there are more people travelling than initially planned, > so we may want to allocate a half day for technical discussions as > well (probably in the shared space, since we don't expect a huge > turnout). > > Would it be possible to update the schedule accordingly? > > Thanks, > Pierre > > On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: > > > Hello Everyone! > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > Please let me know if there are any conflicts! > > -Kendall (diablo_rojo) > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > From thierry at openstack.org Tue Oct 1 11:18:29 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 1 Oct 2019 13:18:29 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <0464fd02-d393-5cc1-f03d-c2638d8fdd1f@openstack.org> Eric Fried wrote: > [...] > There's nothing new or surprising about the above. We've tried to > address these issues in various ways in the past, with varying degrees > of effectiveness. > > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's limit > Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the > worst thing that happens? We finish everything, early, and wish we had > done more. If that happens, drinks are on me, and we can bump the number > for V. > > (B) Require a core to commit to "caring about" a spec before we approve > it. 
The point of this "core liaison" is to act as a mentor to mitigate > the cultural issues noted above [5], and to be a first point of contact > for reviews. I've proposed this to the spec template here [6]. > > Thoughts? Setting expectations more reasonably is key to grow a healthy long-term environment, so I completely support your efforts here. However I suspect there will always be blueprints that fail to be completed. If it were purely a question of reviewer resources, then I agree that capping the number of blueprints to the reviewer team's throughput is the right approach. But as you noted, incomplete blueprints come from a few different reasons, sometimes not related to reviewers efforts at all. So if out of 50 blueprints, say 5 are incomplete due to lack of reviewers attention, 5 due to lack of developer attention, and 15 fail due to reviewers also being developers and having to make a hard choice... Targeting 30-35 might be better (expecting 5-10 of them to fail anyway, and not due to constrained resources). The other comment I have is that I suspect all blueprints do not have the same weight, so assigning them complexity points could help avoid under/overshooting. -- Thierry Carrez (ttx) From jean-philippe at evrard.me Tue Oct 1 12:29:07 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 01 Oct 2019 14:29:07 +0200 Subject: [tc] Weekly update Message-ID: Hello friends, Here's what need attention for the OpenStack TC this week: 1. We should ensure we have two TC members focusing on next cycle goal selection process. Only Ghanshyam is dealing with this, and we must help him on the way! Any volunteers? Thanks again gmann for working on that. 2. Jimmy McArthur sent us the results of the OpenStack User survey on the ML [1]. We currently haven't analyzed the information yet. Any volunteer to analyse the information (in order to extract action items) is welcomed. It would be great if we could discuss this at our next official meeting, or at least discuss the next steps. 3. Our next meeting date will be the Thursday 10 October. I will be travelling that day, so it would be nice to have a volunteer to host the meeting. For that, our next meeting agenda needs clarifications. It would be great if you could update the agenda (please also write if your absent) on the wiki [2], so that I can send the invite to the ML. I will send the invite on Thursday. 4. We still haven't finished the conversationg about naming releases. There are a few new ideas floated around, so we should maybe drop the current process to take count of the newly proposed ideas (The large cities lists proposed by Nate, the movie quotes proposed by Thierry [9])? Alternatively, if we can't find consensus, should we just entrust the release naming process to the release team? 5. We should decide to deprecate or not the PowerVMStackers team [3] and move it as a SIG. The votes don't reflect this. Thank you everyone! 
[1]: http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [2]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [3]: https://review.opendev.org/680438 [4]: https://review.opendev.org/680985 [5]: https://review.opendev.org/681260 [6]: https://review.opendev.org/681480 [7]: https://review.opendev.org/681924 [8]: https://review.opendev.org/682380 [9]: https://review.opendev.org/684688 From tpb at dyncloud.net Tue Oct 1 12:38:50 2019 From: tpb at dyncloud.net (Tom Barron) Date: Tue, 1 Oct 2019 08:38:50 -0400 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <1569915055.26355.1@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> Message-ID: <20191001123850.f7h4wmupoo3oyzta@barron.net> On 01/10/19 07:30 +0000, Balázs Gibizer wrote: > > >On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried wrote: >> Nova developers and maintainers- >> >> Every cycle we approve some number of blueprints and then complete a >> low >> percentage [1] of them. Which blueprints go unfinished seems to be >> completely random (notably, it appears to have nothing to do with our >> declared cycle priorities). This is especially frustrating for >> consumers >> of a feature, who (understandably) interpret blueprint/spec approval >> as >> a signal that they can reasonably expect the feature to land [2]. >> >> The cause for non-completion usually seems to fall into one of several >> broad categories: >> >> == Inadequate *developer* attention == >> - There's not much to be done about the subset of these where the >> contributor actually walks away. >> >> - The real problem is where the developer thinks they're ready for >> reviewers to look, but reviewers don't. Even things that seem obvious >> to >> experienced reviewers, like failing CI or "WIP" in the commit title, >> will cause patches to be completely ignored -- but unseasoned >> contributors don't necessarily understand even that, let alone more >> subtle issues. Consequently, patches will languish, with each side >> expecting the other to take the next action. This is a problem of >> culture: contributors don't understand nova reviewer procedures and >> psychology. >> >> == Inadequate *reviewer* attention == >> - Upstream maintainer time is limited. >> >> - We always seem to have low review activity until the last two or >> three >> weeks before feature freeze, when there's a frantic uptick and lots >> gets >> done. >> >> - But there's a cultural rift here as well. Getting maintainers to >> care >> about a blueprint is hard if they don't already have a stake in it. >> The >> "squeaky wheel" concept is not well understood by unseasoned >> contributors. The best way to get reviews is to lurk in IRC and beg. >> Aside from not being intuitive, this can also be difficult >> logistically >> (time zone pain, knowing which nicks to ping and how) as well as >> interpersonally (how much begging is enough? too much? when is it >> appropriate?). > >When I joined I was taught that instead of begging go and review open >patches which a) helps the review load of dev team b) makes you known >in the community. Both helps getting reviews on your patches. Does it >always work? No. Do I like begging for review? No. Do I like to get >repatedly pinged to review? No. So I would suggest not to declare that >the only way to get review is to go and beg. 
+1 In projects I have worked on there is no need to encourage extra begging and squeaky wheel prioritization has IMO not been a healthy thing. There is no better way to get ones reviews stalled than to beg for reviews with patches that are not close to ready for review and at the same time contribute no useful reviews oneself. There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight. > >> >> == Multi-release efforts that we knew were going to be multi-release >> == >> These may often drag on far longer than they perhaps should, but I'm >> not >> going to try to address that here. >> >> ======== >> >> There's nothing new or surprising about the above. We've tried to >> address these issues in various ways in the past, with varying degrees >> of effectiveness. >> >> I'd like to try a couple more. >> >> (A) Constrain scope, drastically. We marked 25 blueprints complete in >> Train [3]. Since there has been no change to the core team, let's >> limit >> Ussuri to 25 blueprints [4]. If this turns out to be too few, what's >> the >> worst thing that happens? We finish everything, early, and wish we had >> done more. If that happens, drinks are on me, and we can bump the >> number >> for V. > >I support the ide that we limit our scope. But it is pretty hard to >select which 25 (or whathever amount we agree on) bp we approve out of >possible ~50ish. What will be the method of selection? > >> >> (B) Require a core to commit to "caring about" a spec before we >> approve >> it. The point of this "core liaison" is to act as a mentor to mitigate >> the cultural issues noted above [5], and to be a first point of >> contact >> for reviews. I've proposed this to the spec template here [6]. > >I proposed this before and I still think this could help. And partially >answer my question above, this could be one of the way to limit the >approved bps. If each core only commits to "care about" the >implementation of 2 bps, then we already have a limit for the number of >approved bps. > >Cheers, >gibi > >> >> Thoughts? >> >> efried >> >> [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware >> of >> a good way to go back and mine actual data. >> [2] Stuff happens, sure, and nobody expects 100%, but 60%? Come on, we >> have to be able to do better than that. >> [3] https://blueprints.launchpad.net/nova/train >> [4] Recognizing of course that not all blueprints are created equal, >> this is more an attempt at a reasonable heuristic than an actual >> expectation of total size/LOC/person-hours/etc. The theory being that >> constraining to an actual number, whatever the number may be, is >> better >> than not constraining at all. >> [5] If you're a core, you can be your own liaison, because presumably >> you don't need further cultural indoctrination or help begging for >> reviews. >> [6] https://review.opendev.org/685857 >> > > From a.settle at outlook.com Tue Oct 1 12:49:03 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Tue, 1 Oct 2019 12:49:03 +0000 Subject: [all] [tc] [ptls] PDF Goal Aftermath Message-ID: Hi all, Thanks to all those who worked super hard to achieve the PDF Enablement goal for Train. Things are looking great, and I've already received feedback from downstream about how useful the PDFs are! So that's fantastic. 
For the sake of the goal, the intention was to have potentially imperfect PDFs and to ensure they're building. In some cases, this does leave room for improvement. In openstack-doc, we've been approached a few times regarding moving forward when there are issues with no known workarounds. So, asking all those that have been working on/building their PDFs and are experiencing issues that can be fixed post-mortem to ensure everything is documented in our Common Problems etherpad [1]. Ensuring everything is documented, means we can work together to identify appropriate workarounds or fixes in the future. We will review all items without a workaround in the short term. Cheers, Alex [1] https://etherpad.openstack.org/p/pdf-goal-train-common-problems -- Alexandra Settle IRC: asettle From fungi at yuggoth.org Tue Oct 1 13:00:36 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 1 Oct 2019 13:00:36 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191001123850.f7h4wmupoo3oyzta@barron.net> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> Message-ID: <20191001130035.hm2alc63eab4cpek@yuggoth.org> On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: [...] > In projects I have worked on there is no need to encourage extra > begging and squeaky wheel prioritization has IMO not been a > healthy thing. > > There is no better way to get ones reviews stalled than to beg for > reviews with patches that are not close to ready for review and at > the same time contribute no useful reviews oneself. > > There is nothing wrong with pinging to get attention to a review > if it is ready and languishing, or if it solves an urgent issue, > but even in these cases a ping from someone who doesn't "cry wolf" > and who has built a reputation as a contributor carries more > weight. [...] Agreed, it drives back to Eric's comment about familiarity with the team's reviewer culture. Just saying "hey I pushed these patches can someone look" is often far less effective for a newcomer than "I reported a bug in subsystem X which is really aggravating and review 654321 fixes it if anyone has a moment to look" or "tbarron: I addressed your comments on review 654321 when you get a chance to revisit your -1, thanks!" My cardinal rules of begging: Don't mention the nicks of random people who have not been involved in the change unless you happen to actually know it's one they'll personally be interested in. Provide as much context as possible (within reason) to attract the actual interest of potential reviewers. Be polite, thank people, and don't assume your change is important to anyone nor that there's someone who has time to look at it. And most important, as you noted too, if you're waiting around then take a few minutes and go review something to pass the time! ;) -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dms at danplanet.com Tue Oct 1 13:37:42 2019 From: dms at danplanet.com (Dan Smith) Date: Tue, 01 Oct 2019 06:37:42 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: (Mark Goddard's message of "Tue, 1 Oct 2019 11:00:49 +0100") References: Message-ID: Mark Goddard writes: > The thinking behind upgrading one cell at a time is to limit the blast > radius if something goes wrong. 
You suggest it would be better to roll > all cell conductors at the same time though - do you think it's safer > to run with the version disparity between conductor and computes > rather than super- and cell- conductors? Yes, the conductors and computes are built to work at different versions. Conductors, not so much. While you can pin the conductor RPC version to *technically* make them talk, they will do things like migrate data to new formats in the cell databases and since they *are* the insulation layer against such changes, older conductors are going to be unhappy if new conductors move data underneath them before they're ready. > Here is the line that made me think policies are required in > conductors: > https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? No, actually more likely to be the superconductors I think. However, it could technically be called at the cell level so you probably need to make sure it's there. That might be something left-over from a check that moved to the API and could now be removed (or ignored). --Dan From jungleboyj at gmail.com Tue Oct 1 13:52:56 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 1 Oct 2019 08:52:56 -0500 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: On 10/1/2019 7:29 AM, Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week: > > 1. We should ensure we have two TC members focusing on next cycle goal > selection process. Only Ghanshyam is dealing with this, and we must > help him on the way! Any volunteers? Thanks again gmann for working on > that. > > 2. Jimmy McArthur sent us the results of the OpenStack User survey on > the ML [1]. We currently haven't analyzed the information yet. > Any volunteer to analyse the information (in order to extract action > items) is welcomed. It would be great if we could discuss this at our > next official meeting, or at least discuss the next steps. JP, I generally go through this for the Cinder team.  Since I will be in there I can review the comments and create an overview/action items before our next meeting. Jay > 3. Our next meeting date will be the Thursday 10 October. I will be > travelling that day, so it would be nice to have a volunteer to host > the meeting. For that, our next meeting agenda needs clarifications. > It would be great if you could update the agenda (please also write if > your absent) on the wiki [2], so that I can send the invite to the ML. > I will send the invite on Thursday. > > 4. We still haven't finished the conversationg about naming releases. > There are a few new ideas floated around, so we should maybe drop the > current process to take count of the newly proposed ideas (The large > cities lists proposed by Nate, the movie quotes proposed by Thierry > [9])? Alternatively, if we can't find consensus, should we just entrust > the release naming process to the release team? > > 5. We should decide to deprecate or not the PowerVMStackers team [3] > and move it as a SIG. The votes don't reflect this. > > Thank you everyone! 
> > [1]: > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [2]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [3]: https://review.opendev.org/680438 > [4]: https://review.opendev.org/680985 > [5]: https://review.opendev.org/681260 > [6]: https://review.opendev.org/681480 > [7]: https://review.opendev.org/681924 > [8]: https://review.opendev.org/682380 > [9]: https://review.opendev.org/684688 > > From openstack at fried.cc Tue Oct 1 14:15:22 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 09:15:22 -0500 Subject: [nova][ops] Removing Debug middleware Message-ID: <8659945a-91d3-2057-b089-92002508e188@fried.cc> Deployers- There's a Debug middleware in nova that's not used in the codebase since 2010, so we're removing it [1]. BUT Theoretically, deployments could be making use of it by injecting it into the paste pipeline (a ~3LOC edit to your local api-paste.ini). If this is you, and you're really relying on this behavior, let us know. We can either revert the change or show you how to carry it locally (which would be really easy to do). Thanks, efried [1] https://review.opendev.org/#/c/662506/ From no-reply at openstack.org Tue Oct 1 14:34:49 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Tue, 01 Oct 2019 14:34:49 -0000 Subject: networking-midonet 9.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for networking-midonet for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/networking-midonet/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/networking-midonet/src/branch/stable/train Release notes for networking-midonet can be found at: https://docs.openstack.org/releasenotes/networking-midonet/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/networking-midonet/+bugs and tag it *train-rc-potential* to bring it to the networking-midonet release crew's attention. From openstack at fried.cc Tue Oct 1 15:00:20 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 10:00:20 -0500 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <20191001130035.hm2alc63eab4cpek@yuggoth.org> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> Message-ID: <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> Thanks for the responses, all. This subthread is becoming tangential to my original purpose, so I'm renaming it. >> The best way to get reviews is to lurk in IRC and beg. > When I joined I was taught that instead of begging go and review open > patches which a) helps the review load of dev team b) makes you known > in the community. Both helps getting reviews on your patches. Does it > always work? No. Do I like begging for review? No. Do I like to get > repatedly pinged to review? No. So I would suggest not to declare that > the only way to get review is to go and beg. I recognize I was generalizing; begging isn't really "the best way" to get reviews. 
Doing reviews and becoming known (and *then* begging :) is far more effective -- but is literally impossible for many contributors. Even if they have the time (percentage of work week) to dedicate upstream, it takes massive effort and time (calendar) to get there. We can not and should not expect this of every contributor. More... On 10/1/19 8:00 AM, Jeremy Stanley wrote: > On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: > [...] >> In projects I have worked on there is no need to encourage extra >> begging and squeaky wheel prioritization has IMO not been a >> healthy thing. >> >> There is no better way to get ones reviews stalled than to beg for >> reviews with patches that are not close to ready for review and at >> the same time contribute no useful reviews oneself. >> >> There is nothing wrong with pinging to get attention to a review >> if it is ready and languishing, or if it solves an urgent issue, >> but even in these cases a ping from someone who doesn't "cry wolf" >> and who has built a reputation as a contributor carries more >> weight. > [...] > > Agreed, it drives back to Eric's comment about familiarity with the > team's reviewer culture. Just saying "hey I pushed these patches can > someone look" is often far less effective for a newcomer than "I > reported a bug in subsystem X which is really aggravating and review > 654321 fixes it if anyone has a moment to look" or "tbarron: I > addressed your comments on review 654321 when you get a chance to > revisit your -1, thanks!" > > My cardinal rules of begging: Don't mention the nicks of random > people who have not been involved in the change unless you happen to > actually know it's one they'll personally be interested in. Provide > as much context as possible (within reason) to attract the actual > interest of potential reviewers. Be polite, thank people, and don't > assume your change is important to anyone nor that there's someone > who has time to look at it. And most important, as you noted too, if > you're waiting around then take a few minutes and go review > something to pass the time! ;) > This is *precisely* the kind of culture that we cannot expect inexperienced contributors to understand. We can write it down [1], but then we have to get people to read what's written. To tie back to the original thread, this is where it would help to have a core (or experienced dev) as a mentor/liaison to be the first point of contact for questions, guidance, etc. Putting it in the spec process ensures it doesn't get missed (like a doc sitting "out there" somewhere). efried [1] though I fear that would end up being a long-winded and wandering tome, difficult to read and grok, assuming we could even agree on what it should say (frankly, there are some aspects we should be embarrassed to admit in writing) From moreira.belmiro.email.lists at gmail.com Tue Oct 1 15:12:55 2019 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 1 Oct 2019 17:12:55 +0200 Subject: [nova][kolla] questions on cells In-Reply-To: <0c65b9eb-63af-6daa-c82b-61034ca52440@gmail.com> References: <0c65b9eb-63af-6daa-c82b-61034ca52440@gmail.com> Message-ID: Hi, just to clarify, CERN runs the superconductor. Yes, affinity check is an issue. We plan work on it in the next cycle. The metadata API runs per cell. The main reason is that we still run nova-network in few cells. cheers, Belmiro On Mon, Sep 30, 2019 at 8:56 PM Matt Riedemann wrote: > On 9/30/2019 12:27 PM, Dan Smith wrote: > >> 4. 
Does the cell conductor need access to the API DB? > > Technically it should not be allowed to talk to the API DB for > > "separation of concerns" reasons. However, there are a couple of > > features that still rely on the cell conductor being able to upcall to > > the API database, such as the late affinity check. > > In case you haven't seen this yet, we have a list of operations > requiring "up-calls" from compute/cell-conductor to the API DB in the > docs here: > > > https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls > > Some have been fixed for awhile and some are still open because they are > not default configuration we normally deal with (cross_az_attach=False) > or hit in CI* runs (reschedules). > > I think the biggest/hardest problem there to solve is the late affinity > check which long-term should be solved with placement but no one is > working on that. The reschedule stuff related to getting AZ/aggregate > info is simpler but involves some RPC changes so it's not trivial and > again no one is working on fixing that. > > I think for those reasons CERN is running without a superconductor mode > and can hit the API DB from the cells. Devstack superconductor mode is > the ideal though for the separation of concerns Dan pointed out. > > *Note we do hit the reschedule issue sometimes in multi-cell jobs: > > > http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22CantStartEngineError%3A%20No%20sql_connection%20parameter%20is%20established%5C%22%20AND%20tags%3A%5C%22screen-n-cond-cell1.txt%5C%22&from=7d > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kendall at openstack.org Tue Oct 1 15:37:07 2019 From: kendall at openstack.org (Kendall Waters) Date: Tue, 1 Oct 2019 10:37:07 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Hi Pierre, Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. Best, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: > > Hi Kendall, > > Friday works for all who have replied so far, but I am still expecting > answers from two people. > > Is there a room available for our Project Onboarding session that day? > Probably in the morning, though I will confirm depending on > availability of participants. > We've never run one, so I don't know how many people to expect. > > Thanks, > Pierre > > On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: >> >> Hi Pierre, >> >> Apologies for the oversight on Blazar. Would all day Friday work for your team? >> >> Thanks, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> I couldn't see Blazar anywhere on the schedule. We had requested time >> for a Project Onboarding session. >> >> Additionally, there are more people travelling than initially planned, >> so we may want to allocate a half day for technical discussions as >> well (probably in the shared space, since we don't expect a huge >> turnout). >> >> Would it be possible to update the schedule accordingly? 
>> >> Thanks, >> Pierre >> >> On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 >> >> The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. >> >> Please let me know if there are any conflicts! >> >> -Kendall (diablo_rojo) >> >> On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. >> >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue Oct 1 15:47:07 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Tue, 1 Oct 2019 11:47:07 -0400 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: Jean-Philippe, I'd be happy to run the meeting for you. Thanks, Nate On Tue, Oct 1, 2019 at 8:34 AM Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week: > > 1. We should ensure we have two TC members focusing on next cycle goal > selection process. Only Ghanshyam is dealing with this, and we must > help him on the way! Any volunteers? Thanks again gmann for working on > that. > > 2. Jimmy McArthur sent us the results of the OpenStack User survey on > the ML [1]. We currently haven't analyzed the information yet. > Any volunteer to analyse the information (in order to extract action > items) is welcomed. It would be great if we could discuss this at our > next official meeting, or at least discuss the next steps. > > 3. Our next meeting date will be the Thursday 10 October. I will be > travelling that day, so it would be nice to have a volunteer to host > the meeting. For that, our next meeting agenda needs clarifications. > It would be great if you could update the agenda (please also write if > your absent) on the wiki [2], so that I can send the invite to the ML. > I will send the invite on Thursday. > > 4. We still haven't finished the conversationg about naming releases. > There are a few new ideas floated around, so we should maybe drop the > current process to take count of the newly proposed ideas (The large > cities lists proposed by Nate, the movie quotes proposed by Thierry > [9])? Alternatively, if we can't find consensus, should we just entrust > the release naming process to the release team? > > 5. We should decide to deprecate or not the PowerVMStackers team [3] > and move it as a SIG. The votes don't reflect this. > > Thank you everyone! 
> > [1]: > > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [2]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [3]: https://review.opendev.org/680438 > [4]: https://review.opendev.org/680985 > [5]: https://review.opendev.org/681260 > [6]: https://review.opendev.org/681480 > [7]: https://review.opendev.org/681924 > [8]: https://review.opendev.org/682380 > [9]: https://review.opendev.org/684688 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue Oct 1 15:48:15 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Tue, 1 Oct 2019 11:48:15 -0400 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: Ah, never mind, I did not notice that asettle already volunteered. Apologies! Nate On Tue, Oct 1, 2019 at 11:47 AM Nate Johnston wrote: > Jean-Philippe, > > I'd be happy to run the meeting for you. > > Thanks, > > Nate > > On Tue, Oct 1, 2019 at 8:34 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> Hello friends, >> >> Here's what need attention for the OpenStack TC this week: >> >> 1. We should ensure we have two TC members focusing on next cycle goal >> selection process. Only Ghanshyam is dealing with this, and we must >> help him on the way! Any volunteers? Thanks again gmann for working on >> that. >> >> 2. Jimmy McArthur sent us the results of the OpenStack User survey on >> the ML [1]. We currently haven't analyzed the information yet. >> Any volunteer to analyse the information (in order to extract action >> items) is welcomed. It would be great if we could discuss this at our >> next official meeting, or at least discuss the next steps. >> >> 3. Our next meeting date will be the Thursday 10 October. I will be >> travelling that day, so it would be nice to have a volunteer to host >> the meeting. For that, our next meeting agenda needs clarifications. >> It would be great if you could update the agenda (please also write if >> your absent) on the wiki [2], so that I can send the invite to the ML. >> I will send the invite on Thursday. >> >> 4. We still haven't finished the conversationg about naming releases. >> There are a few new ideas floated around, so we should maybe drop the >> current process to take count of the newly proposed ideas (The large >> cities lists proposed by Nate, the movie quotes proposed by Thierry >> [9])? Alternatively, if we can't find consensus, should we just entrust >> the release naming process to the release team? >> >> 5. We should decide to deprecate or not the PowerVMStackers team [3] >> and move it as a SIG. The votes don't reflect this. >> >> Thank you everyone! >> >> [1]: >> >> http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html >> >> [2]: >> https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting >> [3]: https://review.opendev.org/680438 >> [4]: https://review.opendev.org/680985 >> [5]: https://review.opendev.org/681260 >> [6]: https://review.opendev.org/681480 >> [7]: https://review.opendev.org/681924 >> [8]: https://review.opendev.org/682380 >> [9]: https://review.opendev.org/684688 >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balazs.gibizer at est.tech Tue Oct 1 16:19:44 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 16:19:44 +0000 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> Message-ID: <1569946782.31568.0@smtp.office365.com> On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried wrote: > Thanks for the responses, all. > > This subthread is becoming tangential to my original purpose, so I'm > renaming it. > >>> The best way to get reviews is to lurk in IRC and beg. > >> When I joined I was taught that instead of begging go and review >> open >> patches which a) helps the review load of dev team b) makes you >> known >> in the community. Both helps getting reviews on your patches. Does >> it >> always work? No. Do I like begging for review? No. Do I like to get >> repatedly pinged to review? No. So I would suggest not to declare >> that >> the only way to get review is to go and beg. > > I recognize I was generalizing; begging isn't really "the best way" to > get reviews. Doing reviews and becoming known (and *then* begging :) > is > far more effective -- but is literally impossible for many > contributors. > Even if they have the time (percentage of work week) to dedicate > upstream, it takes massive effort and time (calendar) to get there. We > can not and should not expect this of every contributor. > Sure, it is not easy for a new commer to read a random nova patch. But I think we should encourage them to do so. As that is one of the way how a newcomer will learn how nova (as software) works. I don't expect from a newcommer to point out in a nova review that I made a mistake about an obscure nova specific construct. But I think a newcommer still can give us valuable feedback about the code readability, about generic python usage, about English grammar... gibi From openstack at fried.cc Tue Oct 1 18:33:27 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 13:33:27 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <1569915055.26355.1@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> Message-ID: Thanks all for the feedback, refinements, suggestions. Please keep them coming! > If each core only commits to "care about" the > implementation of 2 bps, then we already have a limit for the number of > approved bps. I'd like to not try to prescribe this level of detail. [all blueprints are not created equal] x [all cores are not created equal] = [too many variables]. Different cores will have different amounts of time, effort, and willingness to be liaisons. > I support the ide that we limit our scope. But it is pretty hard to > select which 25 (or whathever amount we agree on) bp we approve out of > possible ~50ish. What will be the method of selection? Basically have a meeting and decide what should fall above or below the line, like you would in a corporate setting. It's not vastly different than how we already decide whether to approve a blueprint; it's just based on resource rather than technical criteria. 
(It's a hard thing to have to tell somebody their feature is denied despite having technical merit, but my main driver here is that they would rather know that up front than it be basically a coin toss whose result they don't know until feature freeze.) > So if out of 50 blueprints, say 5 are incomplete due to lack of > reviewers attention, 5 due to lack of developer attention, and 15 fail > due to reviewers also being developers and having to make a hard > choice... Targeting 30-35 might be better (expecting 5-10 of them to > fail anyway, and not due to constrained resources). Yup, your math makes sense. It pains me to phrase it this way, but it's more realistic: (A) Let's aim to complete 25 blueprints in Ussuri; so we'll approve 30, expecting 5 to fail. And the goal of this manifesto is to ensure that ~zero of the 5 incompletes are due to (A) overcommitment and (B) cultural disconnects. > The other comment I have is that I suspect all blueprints do not have > the same weight, so assigning them complexity points could help avoid > under/overshooting. Yeah, that's a legit suggestion, but I really didn't want to go there [1]. I want to try to keep this conceptually as simple as possible, at least the first time around. (I also really don't see the team trying to subvert the process by e.g. making sure we pick the 30 biggest blueprints.) efried [1] I have long-lasting scars from my experiences with "story points" and other "agile" planning techniques. From satish.txt at gmail.com Tue Oct 1 18:39:28 2019 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 1 Oct 2019 14:39:28 -0400 Subject: issues creating a second vm with numa affinity In-Reply-To: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> Message-ID: did you try to removing "hw:numa_nodes=1" ? On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros wrote: > > Dear Openstack user community, > > > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node through numa affinity with cpu, memory and nvme pci devices. 
> > > > pci passthrough whitelist > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > [notifications] > > > > [filter_scheduler] > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > available_filters = nova.scheduler.filters.all_filters > > > > [pci] > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} ] > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > Openstack flavor > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property pci_passthrough:alias='nvme:4' > > > > The first vm is successfully created > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > However the second vm fails > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > Errors in nova compute node > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback (most recent call last): > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield resources > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] block_device_info=block_device_info) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] destroy_disks_on_failure=True) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] destroy_disks_on_failure) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self.force_reraise() > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] post_xml_callback=post_xml_callback) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] guest.launch(pause=pause) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self._encoded_xml, errors='ignore') > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self.force_reraise() > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return self._domain.createWithFlags(flags) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = proxy_call(self._autowrap, f, *args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = execute(f, *args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: 
ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(c, e, tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = meth(*args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. NOTE: below is the information relevant I could think of that shows resources available after creating the second vm. > > > > [root at zeus-53 ~]# numactl -H > > available: 2 nodes (0-1) > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > node 0 size: 262029 MB > > node 0 free: 52787 MB > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > node 1 size: 262144 MB > > node 1 free: 250624 MB > > node distances: > > node 0 1 > > 0: 10 21 > > 1: 21 10 > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 is assigned to cell 1) > > > > [root at zeus-53 ~]# df -h > > Filesystem Size Used Avail Use% Mounted on > > /dev/md127 3.7T 9.1G 3.7T 1% / > > ... 
> > NOTE: vm disk files goes to root (/) partition > > > > [root at zeus-53 ~]# lsblk > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT > > sda 8:0 0 59.6G 0 disk > > ├─sda1 8:1 0 1G 0 part /boot > > └─sda2 8:2 0 16G 0 part [SWAP] > > loop0 7:0 0 100G 0 loop > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > loop1 7:1 0 2G 0 loop > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > nvme0n1 259:8 0 1.8T 0 disk > > └─nvme0n1p1 259:9 0 1.8T 0 part > > └─md127 9:127 0 3.7T 0 raid0 / > > nvme1n1 259:6 0 1.8T 0 disk > > └─nvme1n1p1 259:7 0 1.8T 0 part > > └─md127 9:127 0 3.7T 0 raid0 / > > nvme2n1 259:2 0 1.8T 0 disk > > nvme3n1 259:1 0 1.8T 0 disk > > nvme4n1 259:0 0 1.8T 0 disk > > nvme5n1 259:3 0 1.8T 0 disk > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context? > > > > Thank you very much > > NOTICE > Please consider the environment before printing this email. 
This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. From sean.mcginnis at gmx.com Tue Oct 1 18:57:07 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 1 Oct 2019 13:57:07 -0500 Subject: [all] Planned Ussuri release schedule published Message-ID: <20191001185707.GA17150@sm-workstation> Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean From Arkady.Kanevsky at dell.com Tue Oct 1 19:25:51 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 1 Oct 2019 19:25:51 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191001185707.GA17150@sm-workstation> References: <20191001185707.GA17150@sm-workstation> Message-ID: <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Why do we have requirements freeze after feature freeze? -----Original Message----- From: Sean McGinnis Sent: Tuesday, October 1, 2019 1:57 PM To: openstack-discuss at lists.openstack.org Subject: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean From gouthampravi at gmail.com Tue Oct 1 19:40:06 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Tue, 1 Oct 2019 12:40:06 -0700 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Message-ID: On Tue, Oct 1, 2019 at 12:30 PM wrote: > Why do we have requirements freeze after feature freeze? > It isn't after the feature freeze - it is alongside. As I understand it, requirements freeze is the same week as feature freeze - it's been the case with Train and past cycles too. > > -----Original Message----- > From: Sean McGinnis > Sent: Tuesday, October 1, 2019 1:57 PM > To: openstack-discuss at lists.openstack.org > Subject: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > Hey everyone, > > The proposed release schedule for Ussuri was up for a few weeks with only > cosmetic issues to address. The proposed schedule has now been merged and > published to: > > https://releases.openstack.org/ussuri/schedule.html > > Barring any new issues with the schedule being raised, this should be our > schedule for the Ussuri development cycle. The planned Ussuri release date > is May 13, 2020. > > Thanks! 
> Sean > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.rosser at rd.bbc.co.uk Tue Oct 1 20:08:42 2019 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Tue, 1 Oct 2019 21:08:42 +0100 Subject: [OSA][openstack-ansible] Stepping down from core reviewer In-Reply-To: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> References: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> Message-ID: <28cde9c7-15ea-a644-d776-cd1a063ce134@rd.bbc.co.uk> Agreed this is a sad day - you've been super cool helping me grapple with all that comes with OpenStack, and I wholeheartedly agree that the OSA community has been a special place where deployers have "got on with it" in a very user oriented way. Hopefully we can maintain that DNA you describe... your description is spot on, and thanks for being part of creating it :) On 27/09/2019 15:20, Jean-Philippe Evrard wrote: > Hello OSA friends, > > It's with great sadness that announcing I will be stepping down from OpenStack-Ansible's core role. > OSA has been the place where I grew from contributor to OpenStack core for the first time, so it will always keep a special place in my mind :) It's also where I met contributors I can now consider personal friends. It's a project I've helped grow and prosper the last 4 years. I am very happy of what we have all achieved. > > My last goals in OSA were to simplify it further and a focus on bare metal. Those efforts are either merged or advanced enough nowadays. I consider I have achieved what I wanted to: OSA is now easier than ever to manage, contribute, and deliver. > > With more experienced people leaving and new people joining, I sure hope the DNA of the project will stay the same: An always welcoming and friendly community, with a no-bullshit and not-too-serious attitude. A project focusing on operator issues and use cases, mentoring contributors to be great members of the OpenStack community. > > Again, I want to thank you for being an amazing community, and it's been great working with all of you. > I think there are still plenty of things OSA can achieve. If you want my opinion, you shouldn't hesitate to contact me. I just don't have enough time to keep up with the reviews, nor am I actively contributing enough to stay at core. > > All the best, > Jean-Philippe Evrard (evrardjp) > > From kgiusti at gmail.com Tue Oct 1 20:35:27 2019 From: kgiusti at gmail.com (Ken Giusti) Date: Tue, 1 Oct 2019 16:35:27 -0400 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <1569917983.26355.2@smtp.office365.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> <1569917983.26355.2@smtp.office365.com> Message-ID: Sorry I'm late to the party.... At the risk of stating the obvious I wouldn't put much faith in the fact that the Kafka and Amqp1 drivers use jsonutils. The use of jsonutils in these drivers is simply a cut-n-paste from the way old qpidd driver. Why jsonutils was used there... I dunno. IMHO the RabbitMQ driver is the authoritative source for correct driver implementation - the Fake driver (and the others) should use the same serialization as the rabbitmq driver if possible. 
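To make the difference concrete, here is a rough sketch (not taken from the patch under review, and assuming oslo.serialization is installed) of how the two serializers treat a datetime payload: stdlib json refuses it outright, while jsonutils converts it to a string without keeping the timezone offset, which is the tz-awareness concern raised earlier in the thread.

import datetime
import json

from oslo_serialization import jsonutils

payload = {'when': datetime.datetime(2019, 10, 1, 12, 0, 0,
                                     tzinfo=datetime.timezone.utc)}

# stdlib json (what the thread says kombu's default serializer boils
# down to) rejects datetime objects outright.
try:
    json.dumps(payload)
except TypeError as exc:
    print('json.dumps failed: %s' % exc)

# oslo jsonutils serializes it, but the resulting string drops the tz
# offset, which is the "not tz-aware" concern mentioned above.
print('jsonutils.dumps: %s' % jsonutils.dumps(payload))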
-K On Tue, Oct 1, 2019 at 4:30 AM Balázs Gibizer wrote: > > > On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer > wrote: > > > > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > > wrote: > >> Hi, > >> > >> I've just proposed https://review.opendev.org/#/c/685724/ which > >> reverts a change that recently went in to make the fake driver in > >> oslo.messaging use jsonutils for message serialization instead of > >> json.dumps. > >> > >> As explained in the commit message on the revert, this is > >> problematic > >> because the rabbit driver uses kombu's default serialization method, > >> which is json.dumps. By changing the fake driver to use jsonutils > >> we've made it more lenient than the most used real driver which > >> opens > >> us up to merging broken changes in consumers of oslo.messaging. > >> > >> We did have some discussion of whether we should try to override the > >> kombu default and tell it to use jsonutils too, as a number of other > >> drivers do. The concern with this was that the jsonutils handler for > >> things like datetime objects is not tz-aware, which means if you > >> send > >> a datetime object over RPC and don't explicitly handle it you could > >> lose important information. > >> > >> I'm open to being persuaded otherwise, but at the moment I'm leaning > >> toward less magic happening at the RPC layer and requiring projects > >> to explicitly handle types that aren't serializable by the standard > >> library json module. If you have a different preference, please > >> share > >> it here. > > > > Hi, > > > > I might me totally wrong here and please help me understand how the > > RabbitDriver works. What I did when I created the original patch that > > I > > looked at each drivers how they handle sending messages. The > > oslo_messaging._drivers.base.BaseDriver defines the interface with a > > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > > implements the BaseDriver interface's send() method to call _send(). > > Then _send() calls rpc_commom.serialize_msg which then calls > > jsonutils.dumps. > > > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > > from AMQPDriverBase and does not override send() or _send() so I think > > the AMQPDriverBase ._send() is called that therefore jsonutils is used > > during sending a message with RabbitDriver. > > I did some tracing in devstack to prove my point. See the result in > https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 > > Cheers, > gibi > > > > > Cheers, > > gibi > > > > > > [1] > > > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > > > >> > >> Thanks. > >> > >> -Ben > >> > > > > > > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsteinmuller at vexxhost.com Tue Oct 1 21:00:59 2019 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Tue, 1 Oct 2019 18:00:59 -0300 Subject: [OSA][openstack-ansible] Stepping down from core reviewer In-Reply-To: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> References: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> Message-ID: You've done an enormous contribution, evrard! Not only to me as a contributor but to the whole project! I wish you success! 
On Fri, Sep 27, 2019 at 11:26 AM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > Hello OSA friends, > > It's with great sadness that announcing I will be stepping down from > OpenStack-Ansible's core role. > OSA has been the place where I grew from contributor to OpenStack core for > the first time, so it will always keep a special place in my mind :) It's > also where I met contributors I can now consider personal friends. It's a > project I've helped grow and prosper the last 4 years. I am very happy of > what we have all achieved. > > My last goals in OSA were to simplify it further and a focus on bare > metal. Those efforts are either merged or advanced enough nowadays. I > consider I have achieved what I wanted to: OSA is now easier than ever to > manage, contribute, and deliver. > > With more experienced people leaving and new people joining, I sure hope > the DNA of the project will stay the same: An always welcoming and friendly > community, with a no-bullshit and not-too-serious attitude. A project > focusing on operator issues and use cases, mentoring contributors to be > great members of the OpenStack community. > > Again, I want to thank you for being an amazing community, and it's been > great working with all of you. > I think there are still plenty of things OSA can achieve. If you want my > opinion, you shouldn't hesitate to contact me. I just don't have enough > time to keep up with the reviews, nor am I actively contributing enough to > stay at core. > > All the best, > Jean-Philippe Evrard (evrardjp) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Oct 1 21:33:29 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 16:33:29 -0500 Subject: [qa][stable] tempest.api.volume.test_versions.VersionsTest.test_show_version fails on stable/pike In-Reply-To: <20190926070920.GA26051@sm-workstation> References: <423b48c2-ef1c-bf66-92f5-0d52007076c9@gmail.com> <5f21eadc-8ae3-934b-e354-e326aedba0b5@gmail.com> <20190926070920.GA26051@sm-workstation> Message-ID: <16d893e11b2.118dd8987130088.5272906037496078@ghanshyammann.com> ---- On Thu, 26 Sep 2019 02:09:20 -0500 Sean McGinnis wrote ---- > On Wed, Sep 25, 2019 at 10:00:30AM -0500, Matt Riedemann wrote: > > On 9/25/2019 9:51 AM, Matt Riedemann wrote: > > > Anyway, it sounds like this is another case where we're going to have to > > > pin tempest to a tag in devstack on stable/pike to continue running > > > tempest jobs against stable/pike changes, similar to what recently > > > happened with stable/ocata [3]. > > > > Here is the devstack patch to pin tempest to 21.0.0 in stable/pike: > > > > https://review.opendev.org/#/c/684769/ > > > > -- > > > > Thanks, > > > > Matt > > > > We should be seeing this in queens too. We will need to get this patch merged > there first, then into pike. We can either pin tempest, or get this fixed. > > https://review.opendev.org/#/c/684954/ > > It was a long standing issue that disabled API versions were still listed. This > can probably be backported back to ocata. I do not think the cinder backport will fix the issue. In my test patch, its v1 version which causing the issue and v1 should not be returned in GET / as per cinder pike code. 684954 is only taking care for v2 and v3 things if those are disabled. - https://zuul.opendev.org/t/openstack/build/e13e8a408f214e1b9d03b41c23955c7e/log/controller/logs/tempest_log.txt.gz#70152 Something else is causing this issue. 
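For anyone who wants to see exactly which versions a given deployment is advertising, a quick look at the version discovery document that this tempest test reads is enough. A rough sketch; the endpoint URL below is only an example, adjust host/port for your environment:

import requests

# GET on the bare volume endpoint returns the version list that the
# tempest VersionsTest reads; print what is actually advertised.
resp = requests.get('http://controller:8776/')
for version in resp.json().get('versions', []):
    print(version.get('id'), version.get('status'))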
-gmann > > Sean > > From ken1ohmichi at gmail.com Tue Oct 1 21:40:23 2019 From: ken1ohmichi at gmail.com (Kenichi Omichi) Date: Tue, 1 Oct 2019 14:40:23 -0700 Subject: [nova] Stepping down from core reviewer Message-ID: Hello, Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project. So I need to step down from the core reviewer. I spend 6 years in the project, the experience is amazing. OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country. I'd like to say thank you for everyone in the community :-) My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. Thanks Kenichi Omichi --- -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arkady.Kanevsky at dell.com Tue Oct 1 22:03:18 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 1 Oct 2019 22:03:18 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Message-ID: <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> On the plan it is one week after feature freeze From: Goutham Pacha Ravi Sent: Tuesday, October 1, 2019 2:40 PM To: Kanevsky, Arkady Cc: sean.mcginnis at gmx.com; OpenStack Discuss Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] On Tue, Oct 1, 2019 at 12:30 PM > wrote: Why do we have requirements freeze after feature freeze? It isn't after the feature freeze - it is alongside. As I understand it, requirements freeze is the same week as feature freeze - it's been the case with Train and past cycles too. -----Original Message----- From: Sean McGinnis > Sent: Tuesday, October 1, 2019 1:57 PM To: openstack-discuss at lists.openstack.org Subject: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Oct 1 22:10:22 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 17:10:22 -0500 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <1569946782.31568.0@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> <1569946782.31568.0@smtp.office365.com> Message-ID: <16d895fd668.f4f02bc6130462.7929159786020541256@ghanshyammann.com> ---- On Tue, 01 Oct 2019 11:19:44 -0500 Balázs Gibizer wrote ---- > > > On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried wrote: > > Thanks for the responses, all. > > > > This subthread is becoming tangential to my original purpose, so I'm > > renaming it. >> (A) Constrain scope, drastically. We marked 25 blueprints complete in >> Train [3]. 
Since there has been no change to the core team, let's limit >> Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the >> worst thing that happens? We finish everything, early, and wish we had >> do ne more. If that happens, drinks are on me, and we can bump the number >> for V. I like the idea here and be more practical than theoretical ways to handle such situation especially in Nova case. If the operator complains about less accepted BP then, we can ask them to invest developers in upstream which can avoid such cap. But my question is same as gibi, what will be the selection criteria (when we have a large number of ready specs)? >> (B) Require a core to commit to "caring about" a spec before we approve >> it. The point of this "core liaison" is to act as a mentor to mitigate >> the cultural issues noted above [5], and to be a first point of contact >> for reviews. I've proposed this to the spec template here [6]. +100 for this. I am sure this way we can burn more approved BP. -gmann > > > >>> The best way to get reviews is to lurk in IRC and beg. > > > >> When I joined I was taught that instead of begging go and review > >> open > >> patches which a) helps the review load of dev team b) makes you > >> known > >> in the community. Both helps getting reviews on your patches. Does > >> it > >> always work? No. Do I like begging for review? No. Do I like to get > >> repatedly pinged to review? No. So I would suggest not to declare > >> that > >> the only way to get review is to go and beg. > > > > I recognize I was generalizing; begging isn't really "the best way" to > > get reviews. Doing reviews and becoming known (and *then* begging :) > > is > > far more effective -- but is literally impossible for many > > contributors. > > Even if they have the time (percentage of work week) to dedicate > > upstream, it takes massive effort and time (calendar) to get there. We > > can not and should not expect this of every contributor. > > > > Sure, it is not easy for a new commer to read a random nova patch. But > I think we should encourage them to do so. As that is one of the way > how a newcomer will learn how nova (as software) works. I don't expect > from a newcommer to point out in a nova review that I made a mistake > about an obscure nova specific construct. But I think a newcommer still > can give us valuable feedback about the code readability, about generic > python usage, about English grammar... > > gibi > > > From openstack at fried.cc Tue Oct 1 22:11:21 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 17:11:21 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: Kenichi- Thank you for all of your contributions over the years. efried On 10/1/19 4:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. 
> > Thanks > Kenichi Omichi > > --- From gmann at ghanshyammann.com Tue Oct 1 22:46:25 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 17:46:25 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <16d8980d8fe.b046ac8e130820.667075030702112040@ghanshyammann.com> ---- On Tue, 01 Oct 2019 16:40:23 -0500 Kenichi Omichi wrote ---- > Hello, > Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project.So I need to step down from the core reviewer. > I spend 6 years in the project, the experience is amazing.OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country.I'd like to say thank you for everyone in the community :-) Thanks a lot, kenichi for your valuable contribution over the years. I have learnt a lot from you and thanks for being so helpful and humble always. You have done a lot for making Nova better and OpenStack more stable while serving as a QA developer in parallel. -gmann > > My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. > > ThanksKenichi Omichi > --- > From smooney at redhat.com Tue Oct 1 23:24:58 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 02 Oct 2019 00:24:58 +0100 Subject: issues creating a second vm with numa affinity In-Reply-To: References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> Message-ID: <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> On Tue, 2019-10-01 at 14:39 -0400, Satish Patel wrote: > did you try to removing "hw:numa_nodes=1" ? that will have no effect the vm implcitly has a numa toplogy of 1 node due to usei cpu pinning. so hw:numa_nodes=1 is identical to what will be setting hw:cpu_policy=dedicated openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property pci_passthrough:alias='nvme:4' looking at the numa info that was provided. node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > node 0 size: 262029 MB > > > > node 0 free: 52787 MB > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > it looks like you have a dual socket host with 14 cores per secket and hyper threading enabled. looking at the flaovr hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate enables pinning and allocate 1 addtional pinned for the emulator thread. since hw:cpu_treads_policy is not defien the behavior will be determined by the numa of cores requrested. by default if the flavor.vcpu is even it would default to the require policy and try to use hyper tread siblibngs if flavor.vcpu was odd it would defualt to isolate policy and try to isolate individual cores. this was originally done to prevent a class of timing based attach that can be executed if two vms were pinned differnet hyperthread on the same core. i say be defualt as you are also useing hw:emulator_threads_policy=isolate which actully means you are askign for 21 cores and im not sure of the top of my head which policy will take effect. strictly speacking the prefer policy is used but it behavior is subtle. 
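if you want to double check the sibling layout directly rather than working it out by eye, a rough stdlib-only helper (not part of nova, just an illustration) that dumps the unique hyperthread sibling groups from sysfs looks like this:

import os
import re

cpu_root = '/sys/devices/system/cpu'
groups = []
cpus = sorted((d for d in os.listdir(cpu_root) if re.fullmatch(r'cpu\d+', d)),
              key=lambda d: int(d[3:]))
for cpu in cpus:
    path = os.path.join(cpu_root, cpu, 'topology', 'thread_siblings_list')
    with open(path) as f:
        siblings = f.read().strip()
    if siblings not in groups:
        groups.append(siblings)

# each entry is one physical core, e.g. "0,28" on the host described above
for group in groups:
    print(group)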
anway form the numa info above reaange the data to show the tread siblings node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 if you have look at this long enough you will know after kernel 4.0 enumates in a prediable way that is different form the predicable way that older kernels used to enuamrete cores in. if we boot 1 vm say on node0 which is the socket 0 in this case as well with the above flaovr i would expect the free cores to look like this node 0 cpus: - - - - - - - - - - - 11 12 13 - - - - - - - - - - 38 39 40 41 node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 looking at the pci white list there is something else that you can see passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > {"address":"0000:87:00.0"} ] for all the devies the first 2 bytes are 0000 this is the pci domain. on a multi socket sytems, or at least on any 2 socket system new enouglsht to processor wth 14 cores and hypertreading you will have 2 different pci roots. 1 pci route complex per phyical processor. in a system with multiple pci root complex 1 becomes the primary pci root and is assigned the 0000 domain and the second is can be asigned a different domain adress but that depend on your kernel commandline option and the number of devices. form my experince when only a single domain is create the second numa node device start with 0000:80:xx.y or higher so {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"},{"address":"0000:09:00.0"} shoudl be on numa node 0 and shuld be assinged to the vm that leave {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} and node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 so from an openstack point of view there are enough core free to pin the second vm and there are devices free on the same numa node. node 0 free: 52787 MB node 1 free: 250624 MB 200G is being used on node 0 by the first vm and there is 250 is free on node 1. as kashyap pointed out in an earlier reply the most likely cause of the "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" error is a libvirt interaction with a kvm kernel bug that was fix in kernel 4.19 (4.19 fixes a lot of kvm bugs and enabled nestexd virt by default so you shoudl use 4.19+ if you can) kashyap submitted https://review.opendev.org/#/c/684375/ as a possible way to workaround to the kernel issue by relaxing the requirement nova places on the memory assgined to a guest that is not used for guest ram. effectivly we belive that the root case is on the host if you run "grep DMA32 /proc/zoneinfo" the DMA32 zone will only exist on 1 nuam node. e.g. sean at workstation:~$ grep DMA32 /proc/zoneinfo Node 0, zone DMA32 Node 1, zone DMA32 https://review.opendev.org/#/c/684375/ we belive would allow the second vm to booth with numa affined guest ram but non numa affined DMA memroy however that could have negative performace implication in some cases. nova connot contol how where DMA memroy is allocated by the kernel so this cannot be fully adress by nova. 
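as a side note, rather than inferring the socket from the pci bus number you can ask the kernel for the device locality directly. a quick stdlib-only sketch (using the addresses from the whitelist above; -1 means the platform did not report a node) will confirm which numa node each nvme actually hangs off:

addresses = ['0000:06:00.0', '0000:07:00.0', '0000:08:00.0', '0000:09:00.0',
             '0000:84:00.0', '0000:85:00.0', '0000:86:00.0', '0000:87:00.0']

for addr in addresses:
    # sysfs reports the numa node the pci device is attached to
    with open('/sys/bus/pci/devices/%s/numa_node' % addr) as f:
        print(addr, '-> numa node', f.read().strip())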
ideally the best way to fix this would be to some how force your kenel to allocate DMA32 zones per numa node but i am not aware of a way to do that. so to ansewr the orginal question 'What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context?' my understanding is that it mean qemu could not allcate memory form a DMA32 zone on the same numa node as the cpus and guest ram for the PCI passthough devices which would be required when is defiend. we always require strict mode when we have a vm with a numa toplogy to ensure that the guest memroy is allocated form the node we requested but if you are using pci passtouhg and do not have DMA32 zones. it is my understanding that on newewr kernels the kvm modules allows non local DMA zones to be used. with all that said it is very uncommon to have hardware that dose not have a DMA and DMA32 zone per numa node so most peopel will never have this problem. > On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros > wrote: > > > > Dear Openstack user community, > > > > > > > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node > > through numa affinity with cpu, memory and nvme pci devices. > > > > > > > > pci passthrough whitelist > > > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > > > [notifications] > > > > > > > > [filter_scheduler] > > > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, > > ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > > > available_filters = nova.scheduler.filters.all_filters > > > > > > > > [pci] > > > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > {"address":"0000:87:00.0"} ] > > > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > > > > > Openstack flavor > > > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > > pci_passthrough:alias='nvme:4' > > > > > > > > The first vm is successfully created > > > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone > > nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > > > > > However the second vm fails > > > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone > > nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > > > > > Errors in nova compute node > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa > > 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535- > > ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: > > 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback > > (most recent call last): > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield > > resources > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] block_device_info=block_device_info) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] destroy_disks_on_failure=True) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] destroy_disks_on_failure) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self.force_reraise() > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] post_xml_callback=post_xml_callback) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] guest.launch(pause=pause) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self._encoded_xml, errors='ignore') > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self.force_reraise() > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return > > self._domain.createWithFlags(flags) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = > > proxy_call(self._autowrap, f, *args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > execute(f, *args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(c, e, tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > meth(*args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == > > -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: > > internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: > > Cannot allocate memory > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. > > NOTE: below is the information relevant I could think of that shows resources available after creating the second > > vm. 
> > > > > > > > [root at zeus-53 ~]# numactl -H > > > > available: 2 nodes (0-1) > > > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > node 0 size: 262029 MB > > > > node 0 free: 52787 MB > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > node 1 size: 262144 MB > > > > node 1 free: 250624 MB > > > > node distances: > > > > node 0 1 > > > > 0: 10 21 > > > > 1: 21 10 > > > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 > > is assigned to cell 1) > > > > > > > > [root at zeus-53 ~]# df -h > > > > Filesystem Size Used Avail Use% Mounted on > > > > /dev/md127 3.7T 9.1G 3.7T 1% / > > > > ... > > > > NOTE: vm disk files goes to root (/) partition > > > > > > > > [root at zeus-53 ~]# lsblk > > > > NAME MAJ:MIN RM SIZE RO > > TYPE MOUNTPOINT > > > > sda 8:0 0 59.6G 0 > > disk > > > > ├─sda1 8:1 0 1G 0 > > part /boot > > > > └─sda2 8:2 0 16G 0 > > part [SWAP] > > > > loop0 7:0 0 100G 0 > > loop > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > loop1 7:1 0 2G 0 > > loop > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > 
> > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > nvme0n1 259:8 0 1.8T 0 > > disk > > > > └─nvme0n1p1 259:9 0 1.8T 0 > > part > > > > └─md127 9:127 0 3.7T 0 > > raid0 / > > > > nvme1n1 259:6 0 1.8T 0 > > disk > > > > └─nvme1n1p1 259:7 0 1.8T 0 > > part > > > > └─md127 9:127 0 3.7T 0 > > raid0 / > > > > nvme2n1 259:2 0 1.8T 0 > > disk > > > > nvme3n1 259:1 0 1.8T 0 > > disk > > > > nvme4n1 259:0 0 1.8T 0 > > disk > > > > nvme5n1 259:3 0 1.8T 0 > > disk > > > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > > > > > What does "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" mean in this context? > > > > > > > > Thank you very much > > > > NOTICE > > Please consider the environment before printing this email. This message and any attachments are intended for the > > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > > message in error please notify us at once by return email and then delete both messages. We accept no liability for > > the distribution of viruses or similar in electronic communications. This notice should not be removed. > >
From satish.txt at gmail.com Wed Oct 2 02:24:26 2019 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 1 Oct 2019 22:24:26 -0400 Subject: issues creating a second vm with numa affinity In-Reply-To: <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> Message-ID: Sean, good to hear from you. Amazing reply, I took some notes from it. On Tue, Oct 1, 2019 at 7:25 PM Sean Mooney wrote: > > On Tue, 2019-10-01 at 14:39 -0400, Satish Patel wrote: > > did you try removing "hw:numa_nodes=1" ? > that will have no effect > the vm implicitly has a numa topology of 1 node due to using cpu pinning. > so hw:numa_nodes=1 is identical to what you already get from setting hw:cpu_policy=dedicated > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > pci_passthrough:alias='nvme:4' > > looking at the numa info that was provided. > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > > > node 0 size: 262029 MB > > > > > > node 0 free: 52787 MB > > > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > > it looks like you have a dual socket host with 14 cores per socket and hyper threading enabled. > > looking at the flavor > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate > enables pinning and allocates 1 additional pinned core for the emulator thread. > > since hw:cpu_thread_policy is not defined, the behavior will be determined by the number of cores requested. > > by default if flavor.vcpus is even it would default to the require policy and try to use hyper thread siblings; > if flavor.vcpus was odd it would default to the isolate policy and try to isolate individual cores. this was originally > done to prevent a class of timing based attacks that can be executed if two vms were pinned to different hyperthreads on the > same core. i say by default as you are also using hw:emulator_threads_policy=isolate which actually means you are > asking for 21 cores and i'm not sure off the top of my head which policy will take effect. > strictly speaking the prefer policy is used but its behavior is subtle. > > anyway, from the numa info above, rearrange the data to show the thread siblings > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 > 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 > 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > if you have looked at this long enough you will know that after kernel 4.0 cores are enumerated in a predictable way that is different > from the predictable way that older kernels used to enumerate cores in. > > if we boot 1 vm, say on node 0, which is socket 0 in this case as well, with the above flavor i would expect the free > cores to look like this > > node 0 cpus: - - - - - - - - - - - 11 12 13 > - - - - - - - - - - 38 39 40 41 > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 > 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > looking at the pci whitelist there is something else that you can see > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > > {"address":"0000:87:00.0"} ] > > for all the devices the first 2 bytes are 0000; this is the pci domain. > > on a multi socket system, or at least on any 2 socket system new enough to have processors with 14 cores and hyperthreading, > you will have 2 different pci roots, 1 pci root complex per physical processor. in a system with multiple pci root > complexes, 1 becomes the primary pci root and is assigned the 0000 domain, and the second can be assigned a different > domain address, but that depends on your kernel commandline options and the number of devices. > from my experience, when only a single domain is created the second numa node's devices start at 0000:80:xx.y or higher > so {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"},{"address":"0000:09:00.0"} should > be on numa node 0 and should be assigned to the first vm > > that leaves > {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} > and > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 > 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > so from an openstack point of view there are enough cores free to pin the second vm and there are devices free on the > same numa node. > > node 0 free: 52787 MB > node 1 free: 250624 MB > > 200G is being used on node 0 by the first vm and there is 250G free on node 1. > > as kashyap pointed out in an earlier reply the most likely cause of the > "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" error is a libvirt interaction with a kvm > kernel bug that was fixed in kernel 4.19 (4.19 fixes a lot of kvm bugs and enables nested virt by default, so you should > use 4.19+ if you can) > kashyap submitted https://review.opendev.org/#/c/684375/ as a possible way to work around the kernel issue by relaxing > the requirement nova places on the memory assigned to a guest that is not used for guest ram.
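One way to double-check the device locality that the analysis above leans on is to ask sysfs directly on the compute host. The loop below is only a sketch using the PCI addresses quoted from the whitelist in this thread; a numa_node value of -1 means the platform does not report locality for that device.

  # show which NUMA node each whitelisted NVMe device is attached to
  for dev in 0000:06:00.0 0000:07:00.0 0000:08:00.0 0000:09:00.0 \
             0000:84:00.0 0000:85:00.0 0000:86:00.0 0000:87:00.0; do
      echo -n "$dev -> node "
      cat /sys/bus/pci/devices/$dev/numa_node
  done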
> > effectively we believe that the root cause is that on the host, if you run "grep DMA32 /proc/zoneinfo", the DMA32 zone will only > exist on 1 numa node. > > e.g. sean at workstation:~$ grep DMA32 /proc/zoneinfo > Node 0, zone DMA32 > Node 1, zone DMA32 > > https://review.opendev.org/#/c/684375/ we believe would allow the second vm to boot with numa-affined guest ram > but non-numa-affined DMA memory; however, that could have negative performance implications in some cases. > nova cannot control where DMA memory is allocated by the kernel, so this cannot be fully addressed by nova. > > ideally the best way to fix this would be to somehow force your kernel to allocate DMA32 zones per numa node > but i am not aware of a way to do that. > > so to answer the original question > 'What does "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" mean in this context?' > my understanding is that it means > qemu could not allocate memory from a DMA32 zone on the > same numa node as the cpus and guest ram for the PCI passthrough devices, which would be required > when strict memory pinning is defined. we always require strict mode when we have a vm with a numa > topology to ensure that the guest memory is allocated from the node we requested, but if you are using pci > passthrough and do not have DMA32 zones on every node, it is my understanding that on newer kernels the kvm modules > allow non-local DMA zones to be used. with all that said, it is very uncommon to have hardware that > does not have a DMA and DMA32 zone per numa node, so most people will never have this problem. > > > > On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros > > wrote: > > > > > > Dear Openstack user community, > > > > > > > > > > > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node > > > through numa affinity with cpu, memory and nvme pci devices.
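Related to the strict NUMA binding discussed above, libvirt can show directly what was requested for an instance once it is running on the compute node; this is only a minimal sketch, and the instance name used here is an assumption to be taken from the actual 'virsh list' output.

  virsh list                        # find the libvirt domain name, e.g. instance-00000001
  virsh vcpupin instance-00000001   # host cpus each vcpu is pinned to
  virsh numatune instance-00000001  # numa mode and nodeset the guest memory is bound to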
> > > > > > > > > > > > pci passthrough whitelist > > > > > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > > > > > [notifications] > > > > > > > > > > > > [filter_scheduler] > > > > > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, > > > ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > > > > > available_filters = nova.scheduler.filters.all_filters > > > > > > > > > > > > [pci] > > > > > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > > {"address":"0000:87:00.0"} ] > > > > > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > > > > > > > > > Openstack flavor > > > > > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > > > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > > > pci_passthrough:alias='nvme:4' > > > > > > > > > > > > The first vm is successfully created > > > > > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone > > > nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > > > > > > > > > However the second vm fails > > > > > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone > > > nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > > > > > > > > > Errors in nova compute node > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa > > > 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535- > > > ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: > > > 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback > > > (most recent call last): > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield > > > resources > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] block_device_info=block_device_info) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] destroy_disks_on_failure=True) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] destroy_disks_on_failure) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self.force_reraise() > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] post_xml_callback=post_xml_callback) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] guest.launch(pause=pause) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self._encoded_xml, errors='ignore') > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self.force_reraise() > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return > > > self._domain.createWithFlags(flags) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = > > > proxy_call(self._autowrap, f, *args, 
**kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > > execute(f, *args, **kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(c, e, tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > > meth(*args, **kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == > > > -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: > > > internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: > > > Cannot allocate memory > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > > > > > > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. > > > NOTE: below is the information relevant I could think of that shows resources available after creating the second > > > vm. > > > > > > > > > > > > [root at zeus-53 ~]# numactl -H > > > > > > available: 2 nodes (0-1) > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > > > node 0 size: 262029 MB > > > > > > node 0 free: 52787 MB > > > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > > > node 1 size: 262144 MB > > > > > > node 1 free: 250624 MB > > > > > > node distances: > > > > > > node 0 1 > > > > > > 0: 10 21 > > > > > > 1: 21 10 > > > > > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 > > > is assigned to cell 1) > > > > > > > > > > > > [root at zeus-53 ~]# df -h > > > > > > Filesystem Size Used Avail Use% Mounted on > > > > > > /dev/md127 3.7T 9.1G 3.7T 1% / > > > > > > ... 
> > > > > > NOTE: vm disk files goes to root (/) partition > > > > > > > > > > > > [root at zeus-53 ~]# lsblk > > > > > > NAME MAJ:MIN RM SIZE RO > > > TYPE MOUNTPOINT > > > > > > sda 8:0 0 59.6G 0 > > > disk > > > > > > ├─sda1 8:1 0 1G 0 > > > part /boot > > > > > > └─sda2 8:2 0 16G 0 > > > part [SWAP] > > > > > > loop0 7:0 0 100G 0 > > > loop > > > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > > > loop1 7:1 0 2G 0 > > > loop > > > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > > > nvme0n1 259:8 0 1.8T 0 > > > disk > > > > > > └─nvme0n1p1 259:9 0 1.8T 0 > > > part > > > > > > └─md127 9:127 0 3.7T 0 > > > raid0 / > > > > > > nvme1n1 259:6 0 1.8T 0 > > > disk > > > > > > └─nvme1n1p1 259:7 0 1.8T 0 > > > part > > > > > > └─md127 9:127 0 3.7T 0 > > > raid0 / > > > > > > nvme2n1 259:2 0 1.8T 0 > > > disk > > > > > > nvme3n1 259:1 0 1.8T 0 > > > disk > > > > > > nvme4n1 
259:0 0 1.8T 0 > > > disk > > > > > > nvme5n1 259:3 0 1.8T 0 > > > disk > > > > > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > > > > > > > > > What does "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" mean in this context? > > > > > > > > > > > > Thank you very much > > > > > > NOTICE > > > Please consider the environment before printing this email. This message and any attachments are intended for the > > > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > > > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > > > message in error please notify us at once by return email and then delete both messages. We accept no liability for > > > the distribution of viruses or similar in electronic communications. This notice should not be removed. > > > > >
From zhangbailin at inspur.com Wed Oct 2 03:21:02 2019 From: zhangbailin at inspur.com (Brin Zhang) Date: Wed, 2 Oct 2019 03:21:02 +0000 Subject: Re: [Sent via lists.openstack.org][nova] Stepping down from core reviewer In-Reply-To: References: <15310a972a58e9d50f9a255fcae249b4@sslemail.net> Message-ID: <052032cc89884b3d8c645aef4d43f308@inspur.com> Kenichi- Thank you for your contributions to nova and for your help to newcomers, and I hope to see you often in the community. brinzhang item: [Sent via lists.openstack.org][nova] Stepping down from core reviewer Hello, Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project. So I need to step down from the core reviewer. I spend 6 years in the project, the experience is amazing. OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country. I'd like to say thank you for everyone in the community :-) My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. Thanks Kenichi Omichi --- -------------- next part -------------- An HTML attachment was scrubbed... URL:
From pierre at stackhpc.com Wed Oct 2 07:29:50 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 2 Oct 2019 09:29:50 +0200 Subject: [all][PTG] Strawman Schedule In-Reply-To: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> References: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Message-ID: Hi Kendall, I got confirmation from all participants that they will be available all day on Friday. Thanks for adding us to the schedule. Best wishes, Pierre On Tue, 1 Oct 2019 at 17:37, Kendall Waters wrote: > > Hi Pierre, > > Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. > > Best, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: > > Hi Kendall, > > Friday works for all who have replied so far, but I am still expecting > answers from two people. > > Is there a room available for our Project Onboarding session that day? > Probably in the morning, though I will confirm depending on > availability of participants. > We've never run one, so I don't know how many people to expect.
> > Thanks, > Pierre > > On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: > > > Hi Pierre, > > Apologies for the oversight on Blazar. Would all day Friday work for your team? > > Thanks, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: > > Hi Kendall, > > I couldn't see Blazar anywhere on the schedule. We had requested time > for a Project Onboarding session. > > Additionally, there are more people travelling than initially planned, > so we may want to allocate a half day for technical discussions as > well (probably in the shared space, since we don't expect a huge > turnout). > > Would it be possible to update the schedule accordingly? > > Thanks, > Pierre > > On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: > > > Hello Everyone! > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > Please let me know if there are any conflicts! > > -Kendall (diablo_rojo) > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > > From pierre at stackhpc.com Wed Oct 2 07:39:12 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 2 Oct 2019 09:39:12 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: Hi everyone, I hope you don't mind me reviving this thread, to let you know I wrote an article after we successfully completed the migration of a running OpenStack deployment to Kolla: http://www.stackhpc.com/migrating-to-kolla.html Don't hesitate to contact me if you have more questions about how this type of migration can be performed. Pierre On Mon, 1 Jul 2019 at 14:02, Ignazio Cassano wrote: > > I checked them and I modified for fitting to new installation > thanks > Ignazio > > Il giorno lun 1 lug 2019 alle ore 13:36 Mohammed Naser ha scritto: >> >> You should check your cell mapping records inside Nova. They're probably not right of you moved your database and rabbit >> >> Sorry for top posting this is from a phone. >> >> On Mon., Jul. 1, 2019, 5:46 a.m. Ignazio Cassano, wrote: >>> >>> PS >>> I presume the problem is neutron, because instances on new kvm nodes remain in building state e do not aquire address. >>> Probably the netron db imported from old openstack installation has some difrrences ....probably I must check defferences from old and new neutron services configuration files. >>> Ignazio >>> >>> Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard ha scritto: >>>> >>>> It sounds like you got quite close to having this working. I'd suggest >>>> debugging this instance build failure. One difference with kolla is >>>> that we run libvirt inside a container. 
Have you stopped libvirt from >>>> running on the host? >>>> Mark >>>> >>>> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano wrote: >>>> > >>>> > Hi Mark, >>>> > let me to explain what I am trying. >>>> > I have a queens installation based on centos and pacemaker with some instances and heat stacks. >>>> > I would like to have another installation with same instances, projects, stacks ....I'd like to have same uuid for all objects (users,projects instances and so on, because it is controlled by a cloud management platform we wrote. >>>> > >>>> > I stopped controllers on old queens installation backupping the openstack database. >>>> > I installed the new kolla openstack queens on new three controllers with same addresses of the old intallation , vip as well. >>>> > One of the three controllers is also a kvm node on queens. >>>> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and mariadb. >>>> > I deleted al openstack db on mariadb container and I imported the old tables, changing the address of rabbit for pointing to the new rabbit cluster. >>>> > I restarded containers. >>>> > Changing the rabbit address on old kvm nodes, I can see the old virtual machines and I can open console on them. >>>> > I can see all networks (tenant and provider) of al installation, but when I try to create a new instance on the new kvm, it remains in buiding state. >>>> > Seems it cannot aquire an address. >>>> > Storage between old and new installation are shred on nfs NETAPP, so I can see cinder volumes. >>>> > I suppose db structure is different between a kolla installation and a manual instaltion !? >>>> > What is wrong ? >>>> > Thanks >>>> > Ignazio >>>> > >>>> > >>>> > >>>> > >>>> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard ha scritto: >>>> >> >>>> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano wrote: >>>> >> > >>>> >> > Sorry, for my question. >>>> >> > It does not need to change anything because endpoints refer to haproxy vips. >>>> >> > So if your new glance works fine you change haproxy backends for glance. >>>> >> > Regards >>>> >> > Ignazio >>>> >> >>>> >> That's correct - only the haproxy backend needs to be updated. >>>> >> >>>> >> > >>>> >> > >>>> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano ha scritto: >>>> >> >> >>>> >> >> Hello Mark, >>>> >> >> let me to verify if I understood your method. >>>> >> >> >>>> >> >> You have old controllers,haproxy,mariadb and nova computes. >>>> >> >> You installed three new controllers but kolla.ansible inventory contains old mariadb and old rabbit servers. >>>> >> >> You are deployng single service on new controllers staring with glance. >>>> >> >> When you deploy glance on new controllers, it changes the glance endpoint on old mariadb db ? >>>> >> >> Regards >>>> >> >> Ignazio >>>> >> >> >>>> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard ha scritto: >>>> >> >>> >>>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano wrote: >>>> >> >>> > >>>> >> >>> > Hello, >>>> >> >>> > Anyone have tried to migrate an existing openstack installation to kolla containers? >>>> >> >>> >>>> >> >>> Hi, >>>> >> >>> >>>> >> >>> I'm aware of two people currently working on that. Gregory Orange and >>>> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I >>>> >> >>> hope he doesn't mind me quoting him from an email to Gregory. 
>>>> >> >>> >>>> >> >>> Mark >>>> >> >>> >>>> >> >>> "I am indeed working on a similar migration using Kolla Ansible with >>>> >> >>> Kayobe, starting from a non-containerised OpenStack deployment based >>>> >> >>> on CentOS RPMs. >>>> >> >>> Existing OpenStack services are deployed across several controller >>>> >> >>> nodes and all sit behind HAProxy, including for internal endpoints. >>>> >> >>> We have additional controller nodes that we use to deploy >>>> >> >>> containerised services. If you don't have the luxury of additional >>>> >> >>> nodes, it will be more difficult as you will need to avoid processes >>>> >> >>> clashing when listening on the same port. >>>> >> >>> >>>> >> >>> The method I am using resembles your second suggestion, however I am >>>> >> >>> deploying only one containerised service at a time, in order to >>>> >> >>> validate each of them independently. >>>> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to >>>> >> >>> specific roles, and when I am happy with the resulting configuration I >>>> >> >>> update HAProxy to point to the new controllers. >>>> >> >>> >>>> >> >>> As long as the configuration matches, this should be completely >>>> >> >>> transparent for purely HTTP-based services like Glance. You need to be >>>> >> >>> more careful with services that include components listening for RPC, >>>> >> >>> such as Nova: if the new nova.conf is incorrect and you've deployed a >>>> >> >>> nova-conductor that uses it, you could get failed instances launches. >>>> >> >>> Some roles depend on others: if you are deploying the >>>> >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as >>>> >> >>> well. >>>> >> >>> >>>> >> >>> I suggest starting with migrating Glance as it doesn't have any >>>> >> >>> internal services and is easy to validate. Note that properly >>>> >> >>> migrating Keystone requires keeping existing Fernet keys around, so >>>> >> >>> any token stays valid until the time it is expected to stop working >>>> >> >>> (which is fairly complex, see >>>> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). >>>> >> >>> >>>> >> >>> While initially I was using an approach similar to your first >>>> >> >>> suggestion, it can have side effects since Kolla Ansible uses these >>>> >> >>> variables when templating configuration. As an example, most services >>>> >> >>> will only have notifications enabled if enable_ceilometer is true. >>>> >> >>> >>>> >> >>> I've added existing control plane nodes to the Kolla Ansible inventory >>>> >> >>> as separate groups, which allows me to use the existing database and >>>> >> >>> RabbitMQ for the containerised services. >>>> >> >>> For example, instead of: >>>> >> >>> >>>> >> >>> [mariadb:children] >>>> >> >>> control >>>> >> >>> >>>> >> >>> you may have: >>>> >> >>> >>>> >> >>> [mariadb:children] >>>> >> >>> oldcontrol_db >>>> >> >>> >>>> >> >>> I still have to perform the migration of these underlying services to >>>> >> >>> the new control plane, I will let you know if there is any hurdle. >>>> >> >>> >>>> >> >>> A few random things to note: >>>> >> >>> >>>> >> >>> - if run on existing control plane hosts, the baremetal role removes >>>> >> >>> some packages listed in `redhat_pkg_removals` which can trigger the >>>> >> >>> removal of OpenStack dependencies using them! I've changed this >>>> >> >>> variable to an empty list. 
>>>> >> >>> - compare your existing deployment with a Kolla Ansible one to check >>>> >> >>> for differences in endpoints, configuration files, database users, >>>> >> >>> service users, etc. For Heat, Kolla uses the domain heat_user_domain, >>>> >> >>> while your existing deployment may use another one (and this is >>>> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the "service" >>>> >> >>> project while a couple of deployments I worked with were using >>>> >> >>> "services". This shouldn't matter, except there was a bug in Kolla >>>> >> >>> which prevented it from setting the roles correctly: >>>> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest >>>> >> >>> Rocky and Queens images) >>>> >> >>> - the ml2_conf.ini generated for Neutron generates physical network >>>> >> >>> names like physnet1, physnet2… you may want to override >>>> >> >>> bridge_mappings completely. >>>> >> >>> - although sometimes it could be easier to change your existing >>>> >> >>> deployment to match Kolla Ansible settings, rather than configure >>>> >> >>> Kolla Ansible to match your deployment." >>>> >> >>> >>>> >> >>> > Thanks >>>> >> >>> > Ignazio >>>> >> >>> > From renat.akhmerov at gmail.com Wed Oct 2 07:57:24 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 2 Oct 2019 14:57:24 +0700 Subject: [requirements][mistral][amqp] Failing =?utf-8?Q?=E2=80=9Cdocs=E2=80=9D_?=job due to the upper constraint conflict for amqp In-Reply-To: References: Message-ID: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> Hi, We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 [3] https://review.opendev.org/#/c/681382 Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.seetharaman9 at gmail.com Wed Oct 2 08:25:07 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Wed, 2 Oct 2019 10:25:07 +0200 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On Tue, Oct 1, 2019 at 11:42 PM Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- > Thanks Kenichi for all your contributions. I wish you all the best for your future endeavors. Cheers, Surya. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zigo at debian.org Wed Oct 2 08:29:40 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 2 Oct 2019 10:29:40 +0200 Subject: Release Cycle Observations In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > wrote: > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > >> I know we'd like to have everyone CD'ing master > > > > Watch who you're lumping in with the "we" statement. ;) > > You've pinpointed what the problem is. > > Everyone but OpenStack upstream would like to stop having to upgrade > every 6 months. > > > Yep, but the same "everyone" want to have features now or better > yesterday, not in 2-3 years ;) This probably was the case a few years ago, when OpenStack was young. Now that it has matured, and has all the needed features, things have changed a lot. Thomas From no-reply at openstack.org Wed Oct 2 10:19:46 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Wed, 02 Oct 2019 10:19:46 -0000 Subject: glance 19.0.0.0rc1 (train) Message-ID: Hello everyone, A new release candidate for glance for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/glance/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/glance/src/branch/stable/train Release notes for glance can be found at: https://docs.openstack.org/releasenotes/glance/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/glance/+bugs and tag it *train-rc-potential* to bring it to the glance release crew's attention. From jesse at odyssey4.me Wed Oct 2 11:07:19 2019 From: jesse at odyssey4.me (Jesse Pretorius) Date: Wed, 2 Oct 2019 11:07:19 +0000 Subject: [openstack-ansible] Stepping down as core reviewer Message-ID: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> Hi everyone, While I had hoped to manage keeping up with OSA reviews and some contributions, unfortunately there is too much on my plate in my new role to allow me to give OSA sufficient time and I feel that it's important to not give any false promises. I am therefore stepping down as a core reviewer for OSA. My journey with OpenStack-Ansible started with initial contributions before it was an official OpenStack project, went on to helping lead the project to becoming an official project in the big tent, then on to becoming a successful project with diverse contributors of which I was proud to be a part. Over time I learned a heck of a lot about building and leading an Open Source community, about developing Ansible playbooks and roles at significant scale, and about building, packaging and deploying python software. It has been a very valuable experience through which I have grown personally and professionally. This community's strengths are in its leadership by operators, its readiness to assist newcomers and in striving to maintain a deployment system which is easy to understand and use (while somehow also being ridiculously flexible). As Jean-Philippe Evrard has recently expressed, this is the DNA which makes the community special. 
As you should all be aware, I am always ready to help when asked and I can also share historical context if there is a need for that so please feel free to ping me on IRC or add me to a review and I'll do my best. My journey onward is working with TripleO in the upgrades team, so you'll still find me contributing to OpenStack as a whole. I'll be hanging out in #tripleo and #openstack-dev on IRC if you're looking for me. All the best, Jesse (odyssey4me) From mdulko at redhat.com Wed Oct 2 11:18:57 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 02 Oct 2019 13:18:57 +0200 Subject: [kuryr][kuryr-libnetwork] Nominating Hongbin Lu to kuryr-libnetwork and kuryr core Message-ID: <3b89b976c17cfda617cf68b0c9308f97ae013b78.camel@redhat.com> Hi, I'd like to nominate Hongbin Lu to be core reviewer in both kuryr- libnetwork and kuryr projects. Besides saying that he's doing a great job maintaining kuryr-libnetwork I'm simply surprised he don't have +2/-2 rights there and that should definitely get fixed. As there isn't a lot of people maintaining those projects anymore, I'll just skip the voting part and add Hongbin to core teams immediately. Thanks, Michał From sfinucan at redhat.com Wed Oct 2 12:15:30 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Wed, 02 Oct 2019 13:15:30 +0100 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <9dc4ada9e1690d7da75422c5fcb3037cb28e2125.camel@redhat.com> On Tue, 2019-10-01 at 14:40 -0700, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign > country from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi Thanks for all the help over the years. You shall be missed :( Stephen From sean.mcginnis at gmx.com Wed Oct 2 12:41:48 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 07:41:48 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002124148.GA16684@sm-workstation> On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: > On the plan it is one week after feature freeze > No, Goutham is correct, it is the same week: https://releases.openstack.org/ussuri/schedule.html This is how it has been for as long as I've been aware of our release schedule. By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. 
Sean From Arkady.Kanevsky at dell.com Wed Oct 2 13:43:55 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 2 Oct 2019 13:43:55 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002124148.GA16684@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> Message-ID: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Sean, On https://releases.openstack.org/ussuri/schedule.html Feature freeze is R6 but Requirements freeze is R5. Thanks, Arkady -----Original Message----- From: Sean McGinnis Sent: Wednesday, October 2, 2019 7:42 AM To: Kanevsky, Arkady Cc: gouthampravi at gmail.com; openstack-discuss at lists.openstack.org Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: > On the plan it is one week after feature freeze > No, Goutham is correct, it is the same week: https://releases.openstack.org/ussuri/schedule.html This is how it has been for as long as I've been aware of our release schedule. By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. Sean From mriedemos at gmail.com Wed Oct 2 13:48:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 08:48:08 -0500 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 10/1/2019 5:00 AM, Mark Goddard wrote: >>> 5. What DB configuration should be used in nova.conf when running >>> online data migrations? I can see some migrations that seem to need >>> the API DB, and others that need a cell DB. If I just give it the API >>> DB, will it use the cell mappings to get to each cell DB, or do I need >>> to run it once for each cell? >> The API DB has its own set of migrations, so you obviously need API DB >> connection info to make that happen. There is no fanout to all the rest >> of the cells (currently), so you need to run it with a conf file >> pointing to the cell, for each cell you have. The latest attempt >> at making this fan out was abanoned in July with no explanation, so it >> dropped off my radar at least. > That makes sense. The rolling upgrade docs could be a little clearer > for multi-cell deployments here. > This recently merged, hopefully it helps clarify: https://review.opendev.org/#/c/671298/ >>> 6. After an upgrade, when can we restart services to unpin the compute >>> RPC version? Looking at the compute RPC API, it looks like the super >>> conductor will remain pinned until all computes have been upgraded. >>> For a cell conductor, it looks like I could restart it to unpin after >>> upgrading all computes in that cell, correct? >> Yeah. >> >>> 7. Which services require policy.{yml,json}? I can see policy >>> referenced in API, conductor and compute. >> That's a good question. I would have thought it was just API, so maybe >> someone else can chime in here, although it's not specific to cells. > Yeah, unrelated to cells, just something I wondered while digging > through our nova Ansible role. > > Here is the line that made me think policies are required in > conductors:https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? 
> That is not the conductor service, it's the API. -- Thanks, Matt From fungi at yuggoth.org Wed Oct 2 14:14:11 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 2 Oct 2019 14:14:11 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002141411.pisvn7okkmxbhx3y@yuggoth.org> On 2019-10-02 13:43:55 +0000 (+0000), Arkady.Kanevsky at dell.com wrote: > Sean, > On https://releases.openstack.org/ussuri/schedule.html > Feature freeze is R6 but > Requirements freeze is R5. [...] Could it be a local rendering or interpretation problem? When I load that same URL it tells me they're both in R5. The shaded grey band which has R5 vertically centered in the left column contains 6 ordered list entries, of which those are two. The only thing I see for the R6 week is "Final release for non-client libraries." -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Wed Oct 2 14:18:42 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Oct 2019 09:18:42 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Message-ID: <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > Sean, > On https://releases.openstack.org/ussuri/schedule.html > Feature freeze is R6 but > Requirements freeze is R5. Is your browser dropping the background color for the table cells? There are actually six bullet points in the R-5 one, but because it's vertically centered some of them may appear to be under R-6. The only thing that's in R-6 though is the final non-client library release. > Thanks, > Arkady > > -----Original Message----- > From: Sean McGinnis > Sent: Wednesday, October 2, 2019 7:42 AM > To: Kanevsky, Arkady > Cc: gouthampravi at gmail.com; openstack-discuss at lists.openstack.org > Subject: Re: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: >> On the plan it is one week after feature freeze >> > > No, Goutham is correct, it is the same week: > > https://releases.openstack.org/ussuri/schedule.html > > This is how it has been for as long as I've been aware of our release schedule. > By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. 
> > Sean > From sean.mcginnis at gmx.com Wed Oct 2 14:57:23 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 09:57:23 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> Message-ID: <20191002145723.GA27063@sm-workstation> > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > Sean, > > On https://releases.openstack.org/ussuri/schedule.html > > Feature freeze is R6 but > > Requirements freeze is R5. > > Is your browser dropping the background color for the table cells? There are > actually six bullet points in the R-5 one, but because it's vertically > centered some of them may appear to be under R-6. The only thing that's in > R-6 though is the final non-client library release. > That's what I see and how the schedule is defined. I'm assuming this has to be some sort of local rendering problem. Maybe openstackdocstheme needs to bring back table cell borders? Looks fine from my view though. Sean From cems at ebi.ac.uk Wed Oct 2 15:39:16 2019 From: cems at ebi.ac.uk (Charles) Date: Wed, 2 Oct 2019 16:39:16 +0100 Subject: OOK,Airship Message-ID: <62a00fb3-ea17-1cd1-fb9f-e4b6f3434047@ebi.ac.uk> Hi, We are interested in OOK and Openstack Helm. Has anyone any experience with Airship (now that 1.0 is out)? Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) Is this a signal that there is momentum around Openstack Helm? Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? Thoughts? Charles From Arkady.Kanevsky at dell.com Wed Oct 2 16:01:22 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 2 Oct 2019 16:01:22 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002145723.GA27063@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> Message-ID: <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> -----Original Message----- From: Sean McGinnis Sent: Wednesday, October 2, 2019 9:57 AM To: Ben Nemec Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > Sean, > > On https://releases.openstack.org/ussuri/schedule.html > > Feature freeze is R6 but > > Requirements freeze is R5. > > Is your browser dropping the background color for the table cells? > There are actually six bullet points in the R-5 one, but because it's > vertically centered some of them may appear to be under R-6. The only > thing that's in > R-6 though is the final non-client library release. 
> That's what I see and how the schedule is defined. I'm assuming this has to be some sort of local rendering problem. Maybe openstackdocstheme needs to bring back table cell borders? Looks fine from my view though. Sean -------------- next part -------------- A non-text attachment was scrubbed... Name: U-timeline.PNG Type: image/png Size: 26795 bytes Desc: U-timeline.PNG URL: From sean.mcginnis at gmx.com Wed Oct 2 16:05:35 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 11:05:35 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002160535.GA29937@sm-workstation> On Wed, Oct 02, 2019 at 04:01:22PM +0000, Arkady.Kanevsky at dell.com wrote: > > > -----Original Message----- > From: Sean McGinnis > Sent: Wednesday, October 2, 2019 9:57 AM > To: Ben Nemec > Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org > Subject: Re: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > > > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > > Sean, > > > On https://releases.openstack.org/ussuri/schedule.html > > > Feature freeze is R6 but > > > Requirements freeze is R5. > > > > Is your browser dropping the background color for the table cells? > > There are actually six bullet points in the R-5 one, but because it's > > vertically centered some of them may appear to be under R-6. The only > > thing that's in > > R-6 though is the final non-client library release. > > Looks like you fixed it? Any idea what you changed in case someone else has the same issue? From mthode at mthode.org Wed Oct 2 16:34:15 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 2 Oct 2019 11:34:15 -0500 Subject: [FFE][requirements][mistral][amqp] Failing =?utf-8?B?4oCcZG9j?= =?utf-8?B?c+KAnQ==?= job due to the upper constraint conflict for amqp In-Reply-To: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> Message-ID: <20191002163415.nu7okcn5de44txoz@mthode.org> On 19-10-02 14:57:24, Renat Akhmerov wrote: > Hi, > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > [3] https://review.opendev.org/#/c/681382 > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 to 2.5.2. It looks like a bugfix only release so I'm fine with it. 
As long as we don't need to mask 2.5.1 in global-requirements (which would cause a re-release for openstack/oslo.messaging). https://github.com/celery/py-amqp/compare/2.5.1...2.5.2 So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I approve. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From fsbiz at yahoo.com Wed Oct 2 16:41:42 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Wed, 2 Oct 2019 16:41:42 +0000 (UTC) Subject: Port creation times out for some VMs in large group In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> Message-ID: <1226029673.2675287.1570034502180@mail.yahoo.com> Thanks. This definitely helps. I am running a stable release of Queens. Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloads every time a new MAC entry is added, deleted or updated. It seems to be related to https://bugs.launchpad.net/neutron/+bug/1598078 The fix for the above bug was abandoned: https://review.opendev.org/#/c/336462/ Any further fine tuning that can be done? Thanks, Fred. On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at?
Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Oct 2 16:45:52 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 2 Oct 2019 09:45:52 -0700 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On 10/1/19 2:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. Kenichi, Thank you for all of your work in nova throughout the years. I have enjoyed working with you and I wish you all the best for the future. Hope to see you around again in nova some time down the road. :) Cheers, -melanie From ianyrchoi at gmail.com Wed Oct 2 17:10:55 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Thu, 3 Oct 2019 02:10:55 +0900 Subject: [i18n] Request to be added as Vietnamese translation group coordinators In-Reply-To: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> References: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> Message-ID: Hello, Sorry for replying here late (I was travelling by the end of last week and have been following-up many things which I couldn't take care of). Yesterday, I approved all the open requests including requests mentioned below :) With many thanks, /Ian Andreas Jaeger wrote on 9/26/2019 10:14 PM: > On 26/09/2019 13.59, Trinh Nguyen wrote: >> Hi i18n team, >> >> Dai and I would like to volunteer as the coordinators of the >> Vietnamese translation group. If you find us qualified, please let us >> know. >> > > Looking at translate.openstack.org: > > I saw that Dai asked to be a translator and approved his request as an > admin, I do not see you in Vietnamese, please apply as translator for > Vietnamese first. > > Ian, will you reach out to the current coordinator? > > Ian, a couple of language teams have open requests, could you check > those and whether the coordinators are still alive, please? 
> > Andreas From satish.txt at gmail.com Wed Oct 2 17:34:12 2019 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 2 Oct 2019 13:34:12 -0400 Subject: [openstack-ansible] Stepping down as core reviewer In-Reply-To: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> References: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> Message-ID: Jesse, Damn!!! one more sad news :( I talked to you couple of time when i was building my openstack cloud using OSA and you truly encourage me to step up and what i am running multiple big cloud using OSA :) Thank you for your support and contribution. Good luck for your future projects. On Wed, Oct 2, 2019 at 7:17 AM Jesse Pretorius wrote: > > Hi everyone, > > While I had hoped to manage keeping up with OSA reviews and some > contributions, unfortunately there is too much on my plate in my new > role to allow me to give OSA sufficient time and I feel that it's > important to not give any false promises. I am therefore stepping down > as a core reviewer for OSA. > > My journey with OpenStack-Ansible started with initial contributions > before it was an official OpenStack project, went on to helping lead > the project to becoming an official project in the big tent, then on to > becoming a successful project with diverse contributors of which I was > proud to be a part. > > Over time I learned a heck of a lot about building and leading an Open > Source community, about developing Ansible playbooks and roles at > significant scale, and about building, packaging and deploying python > software. It has been a very valuable experience through which I have > grown personally and professionally. > > This community's strengths are in its leadership by operators, its > readiness to assist newcomers and in striving to maintain a deployment > system which is easy to understand and use (while somehow also being > ridiculously flexible). As Jean-Philippe Evrard has recently expressed, > this is the DNA which makes the community special. > > As you should all be aware, I am always ready to help when asked and I > can also share historical context if there is a need for that so please > feel free to ping me on IRC or add me to a review and I'll do my best. > > My journey onward is working with TripleO in the upgrades team, so > you'll still find me contributing to OpenStack as a whole. I'll be > hanging out in #tripleo and #openstack-dev on IRC if you're looking for > me. > > All the best, > > Jesse (odyssey4me) From ignaziocassano at gmail.com Wed Oct 2 17:36:04 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 2 Oct 2019 19:36:04 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: Many tHanks Ignazio Il Mer 2 Ott 2019, 09:44 Pierre Riteau ha scritto: > Hi everyone, > > I hope you don't mind me reviving this thread, to let you know I wrote > an article after we successfully completed the migration of a running > OpenStack deployment to Kolla: > http://www.stackhpc.com/migrating-to-kolla.html > > Don't hesitate to contact me if you have more questions about how this > type of migration can be performed. > > Pierre > > On Mon, 1 Jul 2019 at 14:02, Ignazio Cassano > wrote: > > > > I checked them and I modified for fitting to new installation > > thanks > > Ignazio > > > > Il giorno lun 1 lug 2019 alle ore 13:36 Mohammed Naser < > mnaser at vexxhost.com> ha scritto: > >> > >> You should check your cell mapping records inside Nova. 
They're > probably not right of you moved your database and rabbit > >> > >> Sorry for top posting this is from a phone. > >> > >> On Mon., Jul. 1, 2019, 5:46 a.m. Ignazio Cassano, < > ignaziocassano at gmail.com> wrote: > >>> > >>> PS > >>> I presume the problem is neutron, because instances on new kvm nodes > remain in building state e do not aquire address. > >>> Probably the netron db imported from old openstack installation has > some difrrences ....probably I must check defferences from old and new > neutron services configuration files. > >>> Ignazio > >>> > >>> Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> > >>>> It sounds like you got quite close to having this working. I'd suggest > >>>> debugging this instance build failure. One difference with kolla is > >>>> that we run libvirt inside a container. Have you stopped libvirt from > >>>> running on the host? > >>>> Mark > >>>> > >>>> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> > > >>>> > Hi Mark, > >>>> > let me to explain what I am trying. > >>>> > I have a queens installation based on centos and pacemaker with > some instances and heat stacks. > >>>> > I would like to have another installation with same instances, > projects, stacks ....I'd like to have same uuid for all objects > (users,projects instances and so on, because it is controlled by a cloud > management platform we wrote. > >>>> > > >>>> > I stopped controllers on old queens installation backupping the > openstack database. > >>>> > I installed the new kolla openstack queens on new three controllers > with same addresses of the old intallation , vip as well. > >>>> > One of the three controllers is also a kvm node on queens. > >>>> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy > and mariadb. > >>>> > I deleted al openstack db on mariadb container and I imported the > old tables, changing the address of rabbit for pointing to the new rabbit > cluster. > >>>> > I restarded containers. > >>>> > Changing the rabbit address on old kvm nodes, I can see the old > virtual machines and I can open console on them. > >>>> > I can see all networks (tenant and provider) of al installation, > but when I try to create a new instance on the new kvm, it remains in > buiding state. > >>>> > Seems it cannot aquire an address. > >>>> > Storage between old and new installation are shred on nfs NETAPP, > so I can see cinder volumes. > >>>> > I suppose db structure is different between a kolla installation > and a manual instaltion !? > >>>> > What is wrong ? > >>>> > Thanks > >>>> > Ignazio > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> >> > >>>> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> >> > > >>>> >> > Sorry, for my question. > >>>> >> > It does not need to change anything because endpoints refer to > haproxy vips. > >>>> >> > So if your new glance works fine you change haproxy backends for > glance. > >>>> >> > Regards > >>>> >> > Ignazio > >>>> >> > >>>> >> That's correct - only the haproxy backend needs to be updated. > >>>> >> > >>>> >> > > >>>> >> > > >>>> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >>>> >> >> > >>>> >> >> Hello Mark, > >>>> >> >> let me to verify if I understood your method. 
> >>>> >> >> > >>>> >> >> You have old controllers,haproxy,mariadb and nova computes. > >>>> >> >> You installed three new controllers but kolla.ansible inventory > contains old mariadb and old rabbit servers. > >>>> >> >> You are deployng single service on new controllers staring with > glance. > >>>> >> >> When you deploy glance on new controllers, it changes the > glance endpoint on old mariadb db ? > >>>> >> >> Regards > >>>> >> >> Ignazio > >>>> >> >> > >>>> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> >> >>> > >>>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> >> >>> > > >>>> >> >>> > Hello, > >>>> >> >>> > Anyone have tried to migrate an existing openstack > installation to kolla containers? > >>>> >> >>> > >>>> >> >>> Hi, > >>>> >> >>> > >>>> >> >>> I'm aware of two people currently working on that. Gregory > Orange and > >>>> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, > so I > >>>> >> >>> hope he doesn't mind me quoting him from an email to Gregory. > >>>> >> >>> > >>>> >> >>> Mark > >>>> >> >>> > >>>> >> >>> "I am indeed working on a similar migration using Kolla > Ansible with > >>>> >> >>> Kayobe, starting from a non-containerised OpenStack deployment > based > >>>> >> >>> on CentOS RPMs. > >>>> >> >>> Existing OpenStack services are deployed across several > controller > >>>> >> >>> nodes and all sit behind HAProxy, including for internal > endpoints. > >>>> >> >>> We have additional controller nodes that we use to deploy > >>>> >> >>> containerised services. If you don't have the luxury of > additional > >>>> >> >>> nodes, it will be more difficult as you will need to avoid > processes > >>>> >> >>> clashing when listening on the same port. > >>>> >> >>> > >>>> >> >>> The method I am using resembles your second suggestion, > however I am > >>>> >> >>> deploying only one containerised service at a time, in order to > >>>> >> >>> validate each of them independently. > >>>> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to > >>>> >> >>> specific roles, and when I am happy with the resulting > configuration I > >>>> >> >>> update HAProxy to point to the new controllers. > >>>> >> >>> > >>>> >> >>> As long as the configuration matches, this should be completely > >>>> >> >>> transparent for purely HTTP-based services like Glance. You > need to be > >>>> >> >>> more careful with services that include components listening > for RPC, > >>>> >> >>> such as Nova: if the new nova.conf is incorrect and you've > deployed a > >>>> >> >>> nova-conductor that uses it, you could get failed instances > launches. > >>>> >> >>> Some roles depend on others: if you are deploying the > >>>> >> >>> neutron-openvswitch-agent, you need to run the openvswitch > role as > >>>> >> >>> well. > >>>> >> >>> > >>>> >> >>> I suggest starting with migrating Glance as it doesn't have any > >>>> >> >>> internal services and is easy to validate. Note that properly > >>>> >> >>> migrating Keystone requires keeping existing Fernet keys > around, so > >>>> >> >>> any token stays valid until the time it is expected to stop > working > >>>> >> >>> (which is fairly complex, see > >>>> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). 
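> >>>> >> >>> As a rough sketch of that single-service flow (the inventory path is just a placeholder for whatever you already use):
> >>>> >> >>>
> >>>> >> >>>   kolla-ansible -i <inventory> deploy --tags glance
> >>>> >> >>>
> >>>> >> >>> then validate and switch the HAProxy backend before repeating for the next service.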
> >>>> >> >>> > >>>> >> >>> While initially I was using an approach similar to your first > >>>> >> >>> suggestion, it can have side effects since Kolla Ansible uses > these > >>>> >> >>> variables when templating configuration. As an example, most > services > >>>> >> >>> will only have notifications enabled if enable_ceilometer is > true. > >>>> >> >>> > >>>> >> >>> I've added existing control plane nodes to the Kolla Ansible > inventory > >>>> >> >>> as separate groups, which allows me to use the existing > database and > >>>> >> >>> RabbitMQ for the containerised services. > >>>> >> >>> For example, instead of: > >>>> >> >>> > >>>> >> >>> [mariadb:children] > >>>> >> >>> control > >>>> >> >>> > >>>> >> >>> you may have: > >>>> >> >>> > >>>> >> >>> [mariadb:children] > >>>> >> >>> oldcontrol_db > >>>> >> >>> > >>>> >> >>> I still have to perform the migration of these underlying > services to > >>>> >> >>> the new control plane, I will let you know if there is any > hurdle. > >>>> >> >>> > >>>> >> >>> A few random things to note: > >>>> >> >>> > >>>> >> >>> - if run on existing control plane hosts, the baremetal role > removes > >>>> >> >>> some packages listed in `redhat_pkg_removals` which can > trigger the > >>>> >> >>> removal of OpenStack dependencies using them! I've changed this > >>>> >> >>> variable to an empty list. > >>>> >> >>> - compare your existing deployment with a Kolla Ansible one to > check > >>>> >> >>> for differences in endpoints, configuration files, database > users, > >>>> >> >>> service users, etc. For Heat, Kolla uses the domain > heat_user_domain, > >>>> >> >>> while your existing deployment may use another one (and this is > >>>> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the > "service" > >>>> >> >>> project while a couple of deployments I worked with were using > >>>> >> >>> "services". This shouldn't matter, except there was a bug in > Kolla > >>>> >> >>> which prevented it from setting the roles correctly: > >>>> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in > latest > >>>> >> >>> Rocky and Queens images) > >>>> >> >>> - the ml2_conf.ini generated for Neutron generates physical > network > >>>> >> >>> names like physnet1, physnet2… you may want to override > >>>> >> >>> bridge_mappings completely. > >>>> >> >>> - although sometimes it could be easier to change your existing > >>>> >> >>> deployment to match Kolla Ansible settings, rather than > configure > >>>> >> >>> Kolla Ansible to match your deployment." > >>>> >> >>> > >>>> >> >>> > Thanks > >>>> >> >>> > Ignazio > >>>> >> >>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed Oct 2 17:48:06 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 12:48:06 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On 10/1/2019 4:40 PM, Kenichi Omichi wrote: > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. 
Ken'ichi, thank you for all of your work over the years both in nova and the QA team. You played a key role in making microversions happen in the compute API and that has spread out to other projects so it's something you can be proud of. Good luck in your next position. -- Thanks, Matt From mriedemos at gmail.com Wed Oct 2 18:04:45 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 13:04:45 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> On 9/30/2019 6:09 PM, Eric Fried wrote: > Every cycle we approve some number of blueprints and then complete a low > percentage [1] of them. > > [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware of > a good way to go back and mine actual data. When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed: Ocata: http://lists.openstack.org/pipermail/openstack-dev/2017-February/111639.html Pike: http://lists.openstack.org/pipermail/openstack-dev/2017-September/121875.html Queens: http://lists.openstack.org/pipermail/openstack-dev/2018-February/127402.html Rocky: http://lists.openstack.org/pipermail/openstack-dev/2018-August/133342.html Stein: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004234.html So there are numbers in there for calculating completion percentage over the last 5 releases before Train. Of course the size of the core team and diversity of contributors over that time has changed drastically so it's not comparing apples to apples. But you said you weren't aware of data to mine so I'm giving you an axe and shovel. -- Thanks, Matt From bitskrieg at bitskrieg.net Wed Oct 2 18:30:19 2019 From: bitskrieg at bitskrieg.net (Chris Apsey) Date: Wed, 02 Oct 2019 18:30:19 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <1226029673.2675287.1570034502180@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> Message-ID: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA -------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com wrote: > Thanks. This definitely helps. > > I am running a stable release of Queens. > Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. > > I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloads > every time a new MAC entry is added, deleted or updated. > > It seems to be related to > https://bugs.launchpad.net/neutron/+bug/1598078 > > The fix for the above bug was abandoned. > [Gerrit Code Review](https://review.opendev.org/#/c/336462/) > > https://review.opendev.org/#/c/336462/ > > Gerrit Code Review > > Any further fine tuning that can be done? > > Thanks, > Fred. 
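(If it helps, one way to feed that flag to the agent-managed dnsmasq is a custom config file referenced from dhcp_agent.ini, along these lines:

[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

with that file containing a single no-ping line. The path is only an example, and as noted above it trades away dnsmasq's in-use address probe, so test it somewhere safe first.)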
> > On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: > > Albert, > > Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ > > The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. > > Either way, that should solve your problem. > > r > > Chris Apsey > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ > On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: > >> When I create 100 VMs in our prod cluster: >> >> openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest >> >> Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” >> >> If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. >> >> What config variables should I be looking at? >> >> Here are the relevant log entries from the HV: >> >> 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') >> >> 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds >> >> More logs and data: >> >> http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Wed Oct 2 18:59:57 2019 From: dms at danplanet.com (Dan Smith) Date: Wed, 02 Oct 2019 11:59:57 -0700 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> (Matt Riedemann's message of "Wed, 2 Oct 2019 13:04:45 -0500") References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> Message-ID: > So there are numbers in there for calculating completion percentage > over the last 5 releases before Train. Of course the size of the core > team and diversity of contributors over that time has changed > drastically so it's not comparing apples to apples. But you said you > weren't aware of data to mine so I'm giving you an axe and shovel. Perhaps drastic over the last five, but not over the last three, IMHO. Some change, but not enough to account for going from 59 completed in Rocky to 25 in Train. 
Not all blueprints are the same size, nor require the same amount of effort on the part of any of the parties involved. Involvement ebbs and flows with other commitments, like downstream release timelines. Comparing numbers across many releases makes some sense to me, but I would definitely not think that saying "we completed 25 in T, so we will only approve 25 in U" is reasonable. > (B) Require a core to commit to "caring about" a spec before we > approve it. The point of this "core liaison" is to act as a mentor to > mitigate the cultural issues noted above [5], and to be a first point > of contact for reviews. I've proposed this to the spec template here > [6]. As I'm sure you know, we've tried the "core sponsor" thing before. I don't really think it's a bad idea, but it does have a history of not solving the problem like you might think. Constraining cores to not committing to a ton of things may help (although you'll end up with fewer things actually approved if you do that). --Dan From fsbiz at yahoo.com Wed Oct 2 19:01:00 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Wed, 2 Oct 2019 19:01:00 +0000 (UTC) Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> Message-ID: <1127664659.2766839.1570042860356@mail.yahoo.com> Thanks.Instances are spawning but not getting addresses.We have Infoblox as the IPAM so --no-ping should be fine.Will run the tests and update. Thanks,Fred. On Wednesday, October 2, 2019, 11:34:39 AM PDT, Chris Apsey wrote: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA -------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com < fsbiz at yahoo.com> wrote: Thanks. This definitely helps. I am running a stable release of Queens.Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloadsevery time a new MAC entry is added, deleted or updated.  It seems to be related tohttps://bugs.launchpad.net/neutron/+bug/1598078 The fix for the above bug was abandoned.  Gerrit Code Review | | | | Gerrit Code Review | | | Any further fine tuning that can be done?  Thanks,Fred. On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away.  I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: When I create 100 VMs in our prod cluster:   openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest   Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.”   If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure.   What config variables should I be looking at?   Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From kendall at openstack.org Wed Oct 2 19:06:57 2019 From: kendall at openstack.org (Kendall Waters) Date: Wed, 2 Oct 2019 14:06:57 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Message-ID: <569B70C9-58F0-4860-B2A6-4F597D819FB4@openstack.org> Hi Pierre, Wonderful! You are confirmed for all day Friday. We will post an updated schedule on the website next week. Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 2, 2019, at 2:29 AM, Pierre Riteau wrote: > > Hi Kendall, > > I got confirmation from all participants that they will be available > all day on Friday. Thanks for adding us to the schedule. > > Best wishes, > Pierre > > On Tue, 1 Oct 2019 at 17:37, Kendall Waters wrote: >> >> Hi Pierre, >> >> Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. >> >> Best, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> Friday works for all who have replied so far, but I am still expecting >> answers from two people. >> >> Is there a room available for our Project Onboarding session that day? >> Probably in the morning, though I will confirm depending on >> availability of participants. >> We've never run one, so I don't know how many people to expect. >> >> Thanks, >> Pierre >> >> On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: >> >> >> Hi Pierre, >> >> Apologies for the oversight on Blazar. Would all day Friday work for your team? 
>> >> Thanks, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> I couldn't see Blazar anywhere on the schedule. We had requested time >> for a Project Onboarding session. >> >> Additionally, there are more people travelling than initially planned, >> so we may want to allocate a half day for technical discussions as >> well (probably in the shared space, since we don't expect a huge >> turnout). >> >> Would it be possible to update the schedule accordingly? >> >> Thanks, >> Pierre >> >> On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 >> >> The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. >> >> Please let me know if there are any conflicts! >> >> -Kendall (diablo_rojo) >> >> On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. >> >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Oct 2 20:09:24 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 2 Oct 2019 13:09:24 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> References: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> Message-ID: On 9/30/19 8:14 PM, melanie witt wrote: > On 9/30/19 12:08 PM, Matt Riedemann wrote: >> On 9/30/2019 12:27 PM, Dan Smith wrote: >>>> 2. Do console proxies need to live in the cells? This is what devstack >>>> does in superconductor mode. I did some digging through nova code, and >>>> it looks that way. Testing with novncproxy agrees. This suggests we >>>> need to expose a unique proxy endpoint for each cell, and configure >>>> all computes to use the right one via e.g. novncproxy_base_url, >>>> correct? >>> I'll punt this to Melanie, as she's the console expert at this point, >>> but I imagine you're right. >>> >> >> Based on the Rocky spec [1] which says: >> >> "instead we will resolve the cell database issue by running console >> proxies per cell instead of global to a deployment, such that the cell >> database is local to the console proxy" >> >> Yes it's per-cell. There was stuff in the Rock release notes about >> this [2] and a lot of confusion around the deprecation of the >> nova-consoleauth service for which Mel knows the details, but it looks >> like we really should have something documented about this too, here >> [3] and/or here [4]. > > To echo, yes, console proxies need to run per cell. 
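(Concretely, that means each cell's computes point at that cell's own proxy endpoint, e.g. something along the lines of

[vnc]
novncproxy_base_url = https://cell1-vnc.example.com:6080/vnc_auto.html

in that cell's nova.conf; the hostname and port here are placeholders for whatever the deployment actually exposes.)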
This used to be > mentioned in our docs and I looked and found it got removed by the > following commit: > > https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f > > > so, we just need to add back the bit about running console proxies per > cell. FYI I've proposed a patch to restore the doc about console proxies for review: https://review.opendev.org/686271 -melanie >> [1] >> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html >> >> [2] https://docs.openstack.org/releasenotes/nova/rocky.html >> [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html >> [4] >> https://docs.openstack.org/nova/latest/admin/remote-console-access.html >> > From openstack at fried.cc Wed Oct 2 20:32:28 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 2 Oct 2019 15:32:28 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> Message-ID: <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> > When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed Thanks Matt. I realized too late in Train that these weren't numbers I would be able to go back and collect after the fact (at least not without a great deal of manual effort) because a blueprint "disappears" from the release once we defer it. Best approximation: The specs directory for Train contains 37 approved specs. I count five completed specless blueprints in Train. So best case (assuming there were no deferred specless blueprints) that's 25/42=60%. Combining with Matt & Mel's data: Newton: 64% Ocata: 67% Pike: 72% Queens: 79% Rocky: 82% Stein: 59% Train: 60% The obvious trend is that new PTLs produce low completion percentages, and Matt would have hit 100% by V if only he hadn't quit :P But seriously... > Perhaps drastic over the last five, but not over the last three, > IMHO. Some change, but not enough to account for going from 59 > completed in Rocky to 25 in Train. Extraction of placement and departure of Jay are drastic, IMHO. But this is just the kind of thing I really wanted to avoid attempting to quantify -- see below. > I would definitely not think that saying "we > completed 25 in T, so we will only approve 25 in U" is reasonable. I agree it's an extremely primitive heuristic. It was a stab at having a cap (as opposed to *not* having a cap) without attempting to account for all the factors, an impossible ask. I'd love to discuss suggestions for other numbers, or other concrete mechanisms for saying "no" for reasons of resource rather than technical merit. My bid (as of [1]) is 30 approved, shooting for 25 completed (83%, approx the peak of the above numbers). Go. efried [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009860.html From dms at danplanet.com Wed Oct 2 20:46:23 2019 From: dms at danplanet.com (Dan Smith) Date: Wed, 02 Oct 2019 13:46:23 -0700 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> (Eric Fried's message of "Wed, 2 Oct 2019 15:32:28 -0500") References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> Message-ID: > Extraction of placement and departure of Jay are drastic, IMHO. 
But this > is just the kind of thing I really wanted to avoid attempting to > quantify -- see below. I'm pretty sure Jay wasn't doing 60% of the reviews in Nova, justifying an equivalent drop in our available throughput. Further, I thought splitting out placement was supposed to *reduce* the load on the nova core team? If anything that was a time sink that is now finished, placement is off soaring on its own merits and we have a bunch of resource back as a result, no? > I'd love to discuss suggestions for other numbers, or other concrete > mechanisms for saying "no" for reasons of resource rather than technical > merit. My bid (as of [1]) is 30 approved, shooting for 25 completed > (83%, approx the peak of the above numbers). Go. How about approved specs require a majority (or some larger-than-two number) of the cores to +2 it to indicate "yes we should do this, and yes we should do it this cycle"? Some might argue that this unfairly weight efforts that have a lot of cores interested in seeing them land, instead of the actual requisite two, but it sounds like that's what you're shooting for? --Dan From gouthampravi at gmail.com Wed Oct 2 20:58:58 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 2 Oct 2019 13:58:58 -0700 Subject: [manila] Proposal to add dviroel to the core maintainers team Message-ID: Dear Zorillas and other Stackers, I would like to formalize the conversations we've been having amongst ourselves over IRC and in-person. At the outset, we have a lot of incoming changes to review, but we have limited core maintainer attention. We haven't re-jigged our core maintainers team as often as we'd like, and that's partly to blame. We have some relatively new and enthusiastic contributors that we would love to encourage to become maintainers! We've mentored contributors 1-1, n-1 before before adding them to the maintainers team. We would like to do more of this!** In this spirit, I would like your inputs on adding Douglas Viroel (dviroel) to the core maintainers team for manila and its associated projects (manila-specs, manila-ui, python-manilaclient, manila-tempest-plugin, manila-test-image, manila-image-elements). Douglas has been an active contributor for the past two releases and has valuable review inputs in the project. While he's been around here less longer than some of us, he brings a lot of experience to the table with his background in networking and shared file systems. He has a good grasp of the codebase and is enthusiastic in adding new features and fixing bugs in the Ussuri cycle and beyond. Please give me a +/-1 for this proposal. ** If you're interested in helping us maintain Manila by being part of the manila core maintainer team, please reach out to me or any of the current maintainers, we would love to work with you and help you grow into that role! 
Thanks, Goutham Pacha Ravi (gouthamr) From mriedemos at gmail.com Wed Oct 2 21:05:29 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 16:05:29 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191001123850.f7h4wmupoo3oyzta@barron.net> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> Message-ID: <61306048-2fe4-059b-f033-81c9945e61e7@gmail.com> On 10/1/2019 7:38 AM, Tom Barron wrote: > There is no better way to get ones reviews stalled than to beg for > reviews with patches that are not close to ready for review and at the > same time contribute no useful reviews oneself. > > There is nothing wrong with pinging to get attention to a review if it > is ready and languishing, or if it solves an urgent issue, but even in > these cases a ping from someone who doesn't "cry wolf" and who has built > a reputation as a contributor carries more weight. This is, in large part, why we started doing the runways stuff a few cycles ago so that people wouldn't have to beg when they had blueprint work that was ready to be reviewed, meaning there was mergeable code, i.e. not large chunks of it still in WIP status or untested. It also created a timed queue of blueprints to focus on in a two week window. However, it's not part of everyone's daily review process nor does something being in a runway queue make more than one core care about it, so it's not perfect. Related to the sponsors idea elsewhere in this thread, I do believe that since we've expanded the entire core team to be able to approve specs, people that are +2 on a spec should be expected to be willing to help in reviewing the resulting blueprint code that comes out of it, but that doesn't always happen. I'm sure I'm guilty of that as well, but in my defense I will say I know I've approved at least more than one spec I don't personally care about but have felt pressured to approve it just to stop getting asked to review it, i.e. the squeaky wheel thing. -- Thanks, Matt From openstack at fried.cc Wed Oct 2 21:18:55 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 2 Oct 2019 16:18:55 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> Message-ID: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> > I'm pretty sure Jay wasn't doing 60% of the reviews in Nova Clearly not what I was implying. > splitting out placement was supposed to *reduce* the load on the nova > core team? In a sense, that's exactly what I'm suggesting - but it took a couple releases (those releases) to get there. Both the effort to do the extraction and the overlap between the placement and nova teams during that time frame pulled resource away from nova itself. > If anything that was a time sink that is now finished, > placement is off soaring on its own merits and we have a bunch of > resource back as a result, no? Okay, I can buy that. Care to put a number on it? > How about approved specs require a majority (or some larger-than-two > number) of the cores to +2 it to indicate "yes we should do this, and > yes we should do it this cycle"? Some might argue that this unfairly > weight efforts that have a lot of cores interested in seeing them land, > instead of the actual requisite two, but it sounds like that's what > you're shooting for? 
I think the "core sponsor" thing will have this effect: if you can't get a core to sponsor your blueprint, it's a signal that "we" don't think it should be done (this cycle). I like the >2-core idea, though the real difference would be asking for cores to consider "should we do this *in this cycle*" when they +2 a spec. Which is good and valid, but (I think) difficult to explain/track/quantify/validate. And it's asking each core to have some sense of the "big picture" (understand the scope of all/most of the candidates) which is very difficult. > since we've expanded the entire core team to be able to approve specs, > people that are +2 on a spec should be expected to be willing to help in > reviewing the resulting blueprint code that comes out of it, but that > doesn't always happen. Agree. I considered trying to enforce that spec and/or blueprint approvers are implicitly signing up to "care about" those specs/blueprints, but I assumed that would result in a drastic reduction in willingness to be an approver :P Which I suppose would serve to reduce the number of approved blueprints in the cycle... Hm.... efried . From tpb at dyncloud.net Wed Oct 2 22:34:09 2019 From: tpb at dyncloud.net (Tom Barron) Date: Wed, 2 Oct 2019 18:34:09 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <20191002223409.zy5jqp7lziiznfdx@barron.net> +1 from me! On 02/10/19 13:58 -0700, Goutham Pacha Ravi wrote: >Dear Zorillas and other Stackers, > >I would like to formalize the conversations we've been having amongst >ourselves over IRC and in-person. At the outset, we have a lot of >incoming changes to review, but we have limited core maintainer >attention. We haven't re-jigged our core maintainers team as often as >we'd like, and that's partly to blame. We have some relatively new and >enthusiastic contributors that we would love to encourage to become >maintainers! We've mentored contributors 1-1, n-1 before before adding >them to the maintainers team. We would like to do more of this!** > >In this spirit, I would like your inputs on adding Douglas Viroel >(dviroel) to the core maintainers team for manila and its associated >projects (manila-specs, manila-ui, python-manilaclient, >manila-tempest-plugin, manila-test-image, manila-image-elements). >Douglas has been an active contributor for the past two releases and >has valuable review inputs in the project. While he's been around here >less longer than some of us, he brings a lot of experience to the >table with his background in networking and shared file systems. He >has a good grasp of the codebase and is enthusiastic in adding new >features and fixing bugs in the Ussuri cycle and beyond. > >Please give me a +/-1 for this proposal. > >** If you're interested in helping us maintain Manila by being part of >the manila core maintainer team, please reach out to me or any of the >current maintainers, we would love to work with you and help you grow >into that role! 
> >Thanks, >Goutham Pacha Ravi (gouthamr) > From rodrigo.barbieri2010 at gmail.com Wed Oct 2 22:45:22 2019 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Wed, 2 Oct 2019 19:45:22 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos On Wed, Oct 2, 2019, 18:04 Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). > Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xingyang105 at gmail.com Thu Oct 3 00:27:32 2019 From: xingyang105 at gmail.com (Xing Yang) Date: Wed, 2 Oct 2019 20:27:32 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). > Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. 
While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aoren at infinidat.com Thu Oct 3 06:18:17 2019 From: aoren at infinidat.com (Amit Oren) Date: Thu, 3 Oct 2019 09:18:17 +0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 On Thu, Oct 3, 2019 at 3:31 AM Xing Yang wrote: > +1 > > On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi > wrote: > >> Dear Zorillas and other Stackers, >> >> I would like to formalize the conversations we've been having amongst >> ourselves over IRC and in-person. At the outset, we have a lot of >> incoming changes to review, but we have limited core maintainer >> attention. We haven't re-jigged our core maintainers team as often as >> we'd like, and that's partly to blame. We have some relatively new and >> enthusiastic contributors that we would love to encourage to become >> maintainers! We've mentored contributors 1-1, n-1 before before adding >> them to the maintainers team. We would like to do more of this!** >> >> In this spirit, I would like your inputs on adding Douglas Viroel >> (dviroel) to the core maintainers team for manila and its associated >> projects (manila-specs, manila-ui, python-manilaclient, >> manila-tempest-plugin, manila-test-image, manila-image-elements). >> Douglas has been an active contributor for the past two releases and >> has valuable review inputs in the project. While he's been around here >> less longer than some of us, he brings a lot of experience to the >> table with his background in networking and shared file systems. He >> has a good grasp of the codebase and is enthusiastic in adding new >> features and fixing bugs in the Ussuri cycle and beyond. >> >> Please give me a +/-1 for this proposal. >> >> ** If you're interested in helping us maintain Manila by being part of >> the manila core maintainer team, please reach out to me or any of the >> current maintainers, we would love to work with you and help you grow >> into that role! >> >> Thanks, >> Goutham Pacha Ravi (gouthamr) >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Thu Oct 3 07:35:16 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Thu, 3 Oct 2019 09:35:16 +0200 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 01.10.2019 12:00, Mark Goddard wrote: > Thanks all for your responses. Replies to Dan inline. > > On Mon, 30 Sep 2019 at 18:27, Dan Smith wrote: >> >>> 1. Is there any benefit to not having a superconductor? Presumably >>> it's a little more efficient in the single cell case? Also IIUC it >>> only requires a single message queue so is a little simpler? >> >> In a multi-cell case you need it, but you're asking about the case where >> there's only one (real) cell yeah? 
>> >> If the deployment is really small, then the overhead of having one is >> probably measurable and undesirable. I dunno what to tell you about >> where that cut-off is, unfortunately. However, once you're over a >> certain number of nodes, that probably shakes out a bit. The >> superconductor does things that the cell-specific ones won't have to do, >> so there's about the same amount of total load, just a potentially >> larger memory footprint for running extra services, which would be >> measurable at small scales. For a tiny deployment there's also overhead >> just in the complexity, but one of the goals of v2 has always been to >> get everyone on the same architecture, so having a "small mode" and a >> "large mode" brings with it its own complexity. > > Thanks for the explanation. We've built in a switch for single or > super mode, and single mode keeps us compatible with existing > deployments, so I guess we'll keep the switch. > >> >>> 2. Do console proxies need to live in the cells? This is what devstack >>> does in superconductor mode. I did some digging through nova code, and >>> it looks that way. Testing with novncproxy agrees. This suggests we >>> need to expose a unique proxy endpoint for each cell, and configure >>> all computes to use the right one via e.g. novncproxy_base_url, >>> correct? >> >> I'll punt this to Melanie, as she's the console expert at this point, >> but I imagine you're right. >> >>> 3. Should I upgrade the superconductor or conductor service first? >> >> Superconductor first, although they all kinda have to go around the same >> time. Superconductor, like the regular conductors, needs to look at the >> cell database directly, so if you were to upgrade superconductor before >> the cell database you'd likely have issues. I think probably the ideal >> would be to upgrade the db schema everywhere (which you can do without >> rolling code), then upgrade the top-level services (conductor, >> scheduler, api) and then you could probably get away with doing >> conductor in the cell along with computes, or whatever. If possible >> rolling the cell conductors with the top-level services would be ideal. > > I should have included my strawman deploy and upgrade flow for > context, but I'm still honing it. All DB schema changes will be done > up front in both cases. > > In terms of ordering, the API-level services (superconductor, API > scheduler) are grouped together and will be rolled first - agreeing > with what you've said. I think between Ansible's tags and limiting > actions to specific hosts, the code can be written to support > upgrading all cell conductors together, or at the same time as (well, > immediately before) the cell's computes. > > The thinking behind upgrading one cell at a time is to limit the blast > radius if something goes wrong. You suggest it would be better to roll > all cell conductors at the same time though - do you think it's safer > to run with the version disparity between conductor and computes > rather than super- and cell- conductors? I'd say upgrading one cell at a time may be in important consideration for EDGE (DCN) multi-cells deployments, where it may be technically impossible to roll it over all of the remote sites due to reasons. > >> >>> 4. Does the cell conductor need access to the API DB? >> >> Technically it should not be allowed to talk to the API DB for >> "separation of concerns" reasons. 
However, there are a couple of >> features that still rely on the cell conductor being able to upcall to >> the API database, such as the late affinity check. If you can only >> choose one, then I'd say configure the cell conductors to talk to the >> API DB, but if there's a knob for "isolate them" it'd be better. > > Knobs are easy to make, and difficult to keep working in all positions > :) It seems worthwhile in this case. > >> >>> 5. What DB configuration should be used in nova.conf when running >>> online data migrations? I can see some migrations that seem to need >>> the API DB, and others that need a cell DB. If I just give it the API >>> DB, will it use the cell mappings to get to each cell DB, or do I need >>> to run it once for each cell? >> >> The API DB has its own set of migrations, so you obviously need API DB >> connection info to make that happen. There is no fanout to all the rest >> of the cells (currently), so you need to run it with a conf file >> pointing to the cell, for each cell you have. The latest attempt >> at making this fan out was abanoned in July with no explanation, so it >> dropped off my radar at least. > > That makes sense. The rolling upgrade docs could be a little clearer > for multi-cell deployments here. > >> >>> 6. After an upgrade, when can we restart services to unpin the compute >>> RPC version? Looking at the compute RPC API, it looks like the super >>> conductor will remain pinned until all computes have been upgraded. >>> For a cell conductor, it looks like I could restart it to unpin after >>> upgrading all computes in that cell, correct? >> >> Yeah. >> >>> 7. Which services require policy.{yml,json}? I can see policy >>> referenced in API, conductor and compute. >> >> That's a good question. I would have thought it was just API, so maybe >> someone else can chime in here, although it's not specific to cells. > > Yeah, unrelated to cells, just something I wondered while digging > through our nova Ansible role. > > Here is the line that made me think policies are required in > conductors: https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? > >> >> --Dan > -- Best regards, Bogdan Dobrelya, Irc #bogdando From sbauza at redhat.com Thu Oct 3 07:44:25 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 3 Oct 2019 09:44:25 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: On Wed, Oct 2, 2019 at 11:24 PM Eric Fried wrote: > > I'm pretty sure Jay wasn't doing 60% of the reviews in Nova > > Clearly not what I was implying. > > > splitting out placement was supposed to *reduce* the load on the nova > > core team? > > In a sense, that's exactly what I'm suggesting - but it took a couple > releases (those releases) to get there. Both the effort to do the > extraction and the overlap between the placement and nova teams during > that time frame pulled resource away from nova itself. > > > If anything that was a time sink that is now finished, > > placement is off soaring on its own merits and we have a bunch of > > resource back as a result, no? > > Okay, I can buy that. Care to put a number on it? 
> > > How about approved specs require a majority (or some larger-than-two > > number) of the cores to +2 it to indicate "yes we should do this, and > > yes we should do it this cycle"? Some might argue that this unfairly > > weight efforts that have a lot of cores interested in seeing them land, > > instead of the actual requisite two, but it sounds like that's what > > you're shooting for? > > I think the "core sponsor" thing will have this effect: if you can't get > a core to sponsor your blueprint, it's a signal that "we" don't think it > should be done (this cycle). > > I like the >2-core idea, though the real difference would be asking for > cores to consider "should we do this *in this cycle*" when they +2 a > spec. Which is good and valid, but (I think) difficult to > explain/track/quantify/validate. And it's asking each core to have some > sense of the "big picture" (understand the scope of all/most of the > candidates) which is very difficult. > > > since we've expanded the entire core team to be able to approve specs, > > people that are +2 on a spec should be expected to be willing to help in > > reviewing the resulting blueprint code that comes out of it, but that > > doesn't always happen. > > Agree. I considered trying to enforce that spec and/or blueprint > approvers are implicitly signing up to "care about" those > specs/blueprints, but I assumed that would result in a drastic reduction > in willingness to be an approver :P > > Actually, that sounds a very reasonable suggestion from Matt. If you do care reviewing a spec, that also means you do care reviewing the implementation side. Of course, things can happen meanwhile and you can be dragged on "other stuff" (claim what you want) so you won't have time to commit on the implementation review ASAP, but your interest is still fully there. On other ways, it's a reasonable assumption to consider that cores approving a spec somehow have the responsibility to move forward with the implementation and can consequently be gently pinged for begging reviews. Which I suppose would serve to reduce the number of approved blueprints > in the cycle... Hm.... > > That's just the reflect of the reality IMHO. efried > . > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Thu Oct 3 07:45:14 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Thu, 3 Oct 2019 14:45:14 +0700 Subject: [FFE][requirements][mistral][amqp] Failing =?utf-8?Q?=E2=80=9Cdocs=E2=80=9D_?=job due to the upper constraint conflict for amqp In-Reply-To: <20191002163415.nu7okcn5de44txoz@mthode.org> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> <20191002163415.nu7okcn5de44txoz@mthode.org> Message-ID: <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> Thanks Matthew, For now we did this: https://review.opendev.org/#/c/685932/. So we just added “kombu” explicitly into our dependencies that forces to load the right version of amqp before oslo.messaging. That works. If that looks OK for you we can skip the mentioned bump. Renat Akhmerov @Nokia On 2 Oct 2019, 23:35 +0700, Matthew Thode , wrote: > On 19-10-02 14:57:24, Renat Akhmerov wrote: > > Hi, > > > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. 
> > > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > > [3] https://review.opendev.org/#/c/681382 > > > > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 > to 2.5.2. > It looks like a bugfix only release so I'm fine with it. As long as we > don't need to mask 2.5.1 in global-requirements (which would cause a > re-release for openstack/oslo.messaging). > > https://github.com/celery/py-amqp/compare/2.5.1...2.5.2 > > So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I > approve. > > -- > Matthew Thode -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Thu Oct 3 07:47:07 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 3 Oct 2019 09:47:07 +0200 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On Tue, Oct 1, 2019 at 11:45 PM Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > Your contributions were greatly appreciated over the time and thank you for all the hard work you made on polishing the API side. I can't wait for your proposals or bugs :-) Hopefully see you later. -Sylvain --- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Thu Oct 3 08:24:10 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 3 Oct 2019 09:24:10 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On Wed, 2 Oct 2019 at 14:48, Matt Riedemann wrote: > > On 10/1/2019 5:00 AM, Mark Goddard wrote: > >>> 5. What DB configuration should be used in nova.conf when running > >>> online data migrations? I can see some migrations that seem to need > >>> the API DB, and others that need a cell DB. If I just give it the API > >>> DB, will it use the cell mappings to get to each cell DB, or do I need > >>> to run it once for each cell? > >> The API DB has its own set of migrations, so you obviously need API DB > >> connection info to make that happen. There is no fanout to all the rest > >> of the cells (currently), so you need to run it with a conf file > >> pointing to the cell, for each cell you have. The latest attempt > >> at making this fan out was abanoned in July with no explanation, so it > >> dropped off my radar at least. > > That makes sense. The rolling upgrade docs could be a little clearer > > for multi-cell deployments here. > > > > This recently merged, hopefully it helps clarify: > > https://review.opendev.org/#/c/671298/ It does help a little for the schema migrations, but the point was about data migrations. > > >>> 6. After an upgrade, when can we restart services to unpin the compute > >>> RPC version? 
Looking at the compute RPC API, it looks like the super > >>> conductor will remain pinned until all computes have been upgraded. > >>> For a cell conductor, it looks like I could restart it to unpin after > >>> upgrading all computes in that cell, correct? > >> Yeah. > >> > >>> 7. Which services require policy.{yml,json}? I can see policy > >>> referenced in API, conductor and compute. > >> That's a good question. I would have thought it was just API, so maybe > >> someone else can chime in here, although it's not specific to cells. > > Yeah, unrelated to cells, just something I wondered while digging > > through our nova Ansible role. > > > > Here is the line that made me think policies are required in > > conductors:https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > > I guess this is only required for cell conductors though? > > > > That is not the conductor service, it's the API. My mistake, still learning the flow of communication. > > -- > > Thanks, > > Matt > From mark at stackhpc.com Thu Oct 3 08:28:34 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 3 Oct 2019 09:28:34 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> Message-ID: On Wed, 2 Oct 2019 at 21:11, melanie witt wrote: > > On 9/30/19 8:14 PM, melanie witt wrote: > > On 9/30/19 12:08 PM, Matt Riedemann wrote: > >> On 9/30/2019 12:27 PM, Dan Smith wrote: > >>>> 2. Do console proxies need to live in the cells? This is what devstack > >>>> does in superconductor mode. I did some digging through nova code, and > >>>> it looks that way. Testing with novncproxy agrees. This suggests we > >>>> need to expose a unique proxy endpoint for each cell, and configure > >>>> all computes to use the right one via e.g. novncproxy_base_url, > >>>> correct? > >>> I'll punt this to Melanie, as she's the console expert at this point, > >>> but I imagine you're right. > >>> > >> > >> Based on the Rocky spec [1] which says: > >> > >> "instead we will resolve the cell database issue by running console > >> proxies per cell instead of global to a deployment, such that the cell > >> database is local to the console proxy" > >> > >> Yes it's per-cell. There was stuff in the Rock release notes about > >> this [2] and a lot of confusion around the deprecation of the > >> nova-consoleauth service for which Mel knows the details, but it looks > >> like we really should have something documented about this too, here > >> [3] and/or here [4]. > > > > To echo, yes, console proxies need to run per cell. This used to be > > mentioned in our docs and I looked and found it got removed by the > > following commit: > > > > https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f > > > > > > so, we just need to add back the bit about running console proxies per > > cell. > > FYI I've proposed a patch to restore the doc about console proxies for > review: > > https://review.opendev.org/686271 Great, thanks. I know it's merged, but I added a comment. 
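To make the per-cell console proxy layout discussed in this thread concrete, here is a minimal sketch (the option names below are standard nova.conf options, but the hostnames, credentials and URLs are invented placeholders, not taken from the thread): each cell runs its own nova-novncproxy pointed at that cell's database, and the computes in that cell advertise that cell's proxy endpoint.

    # nova.conf used by the nova-novncproxy service serving cell1
    # (the proxy needs to read cell1's database to look up consoles)
    [database]
    connection = mysql+pymysql://nova:secret@cell1-db.example.com/nova_cell1

    # nova.conf on compute nodes in cell1
    [vnc]
    enabled = true
    server_listen = 0.0.0.0
    server_proxyclient_address = $my_ip
    # points at cell1's own proxy endpoint, not a deployment-global one
    novncproxy_base_url = https://cell1-novncproxy.example.com:6080/vnc_auto.html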
> > -melanie > > >> [1] > >> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > >> > >> [2] https://docs.openstack.org/releasenotes/nova/rocky.html > >> [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html > >> [4] > >> https://docs.openstack.org/nova/latest/admin/remote-console-access.html > >> > > > > From a.settle at outlook.com Thu Oct 3 09:26:29 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Thu, 3 Oct 2019 09:26:29 +0000 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey, Could you add something for docs? Or combine with i18n again if Ian doesn't mind? We don't need a lot, just a room for people to ask questions about the future of the docs team. Stephen will be there, as co-PTL. There's 0 chance of it not conflicting with nova. Please :) Thank you! Alex On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > Hello Everyone! > > In the attached picture or link [0] you will find the proposed > schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads > (PTLs, SIG leads...) mentioned in their PTG survey responses, > although there was no perfect solution that would avoid all conflicts > especially when the event is three-ish days long and we have over 40 > teams meeting. > > If there are critical conflicts we missed or other issues, please let > us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu > le.png -- Alexandra Settle IRC: asettle From kchamart at redhat.com Thu Oct 3 10:10:54 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Thu, 3 Oct 2019 12:10:54 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <20191003101054.GB26595@paraplu> On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote: > Nova developers and maintainers- [...] > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's > limit Ussuri to 25 blueprints [4]. If this turns out to be too few, > what's the worst thing that happens? We finish everything, early, and > wish we had done more. If that happens, drinks are on me, and we can > bump the number for V. I welcome scope reduction, focusing on fewer features, stability, and bug fixes than "more gadgetries and gongs". Which also means: less frenzy, less split attention, fewer mistakes, more retained concentration, and more serenity. And, yeah, any reasonable person would read '25' as _an_ educated limit, rather than some "optimal limit". If we end up with bags of "spare time", there's loads of tech-debt items, performance (it's a feature, let's recall) issues, and meaningful clean-ups waiting to be tackled. [...] -- /kashyap From thierry at openstack.org Thu Oct 3 10:24:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 3 Oct 2019 12:24:36 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai Message-ID: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Hi everyone, The summit is going to mainland China for the first time. It's a great opportunity to meet the Chinese community, make ourselves available for direct discussion, and on-board new team members. 
In order to facilitate that, the TC has been suggesting that the Foundation organizes two opportunities to "meet the project leaders" during the Summit in Shanghai: one around the Monday evening marketplace mixer, and one around the Wednesday lunch: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24417/ https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24426/meet-the-project-leaders OpenStack PTLs, TC members, core reviewers, UC members interested in meeting the local community are all welcome. We'll also have leaders from the other OSF-supported projects around. See you there! -- Thierry Carrez (ttx) From a.settle at outlook.com Thu Oct 3 10:28:04 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Thu, 3 Oct 2019 10:28:04 +0000 Subject: [all] [tc] [docs] [release] [ptls] Docs as SIG: Ownership of docs.openstack.org In-Reply-To: References: <20190819154106.GA25909@sm-workstation> <9DABCC6E-1E61-45A6-8370-4F086428B3B6@doughellmann.com> <20190819174941.GA4730@sm-workstation> <20190819175652.dkbyerlmblqkvzdk@yuggoth.org> Message-ID: Dragging this thread back up from the depths as I've updated the governance patch as of this morning: https://review.opendev.org/#/c/657 142/ On Mon, 2019-08-19 at 14:16 -0400, Doug Hellmann wrote: > > On Aug 19, 2019, at 1:56 PM, Jeremy Stanley > > wrote: > > > > On 2019-08-19 12:49:41 -0500 (-0500), Sean McGinnis wrote: > > [...] > > > there seems to be a big difference between owning the task of > > > configuring the site for the next release (which totally makes > > > sense as a release team task) and owning the entire > > > docs.openstack.org site. > > > > That's why I also requested clarification in my earlier message on > > this thread. The vast majority of the content hosted under > > https://docs.openstack.org/ is maintained in a distributed fashion > > by the various teams writing documentation in their respective > > projects. The hosting (configuration apart from .htaccess files, > > storage, DNS, and so on) is handled by Infra/OpenDev folks. If it's > > *just* the stuff inside the "www" tree in the openstack-manuals > > repo > > then that's not a lot, but it's also possible what the release team > > actually needs to touch in there could be successfully scaled back > > even more (with the caveat that I haven't looked through it in > > detail). > > -- > > Jeremy Stanley > > > The suggestion is for the release team to take over the site > generator > for docs.openstack.org (the stuff under “www” in the current > openstack-manuals git repository) and for the SIG to own anything > that looks remotely like “content”. There isn’t much of that left > anyway, > now that most of it is in the project repositories. I like this. > > Most of what is under www is series-specific templates and data files > that tell the site generator how to insert links to parts of the > project documentation in the right places (the “install guides” page > links > to /foo/$series/install/ for example). They’re very simple, very > dumb, > templates, driven with a little custom script that wraps jinja2, > feeding the right data to the right templates based on series name. > There is pretty good documentation for how to use it in the > tools [1] and release [2] sections of the docs contributor guide. 
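For readers unfamiliar with that tooling, a rough, hypothetical sketch of the "dumb templates driven by per-series data" pattern described above (this is not the actual openstack-manuals script, only an illustration of the approach using jinja2; the template, series names and project list are made up):

    # Hypothetical illustration only -- not the real openstack-manuals tooling.
    import jinja2

    TEMPLATE = jinja2.Template(
        "<h1>{{ series }} install guides</h1>\n"
        "{% for p in projects %}"
        "<a href='/{{ p }}/{{ series }}/install/'>{{ p }}</a>\n"
        "{% endfor %}"
    )

    # one small data set per release series drives the same simple template
    SERIES_DATA = {
        "train": ["nova", "neutron", "keystone"],
        "stein": ["nova", "neutron", "keystone"],
    }

    for series, projects in SERIES_DATA.items():
        print(TEMPLATE.render(series=series, projects=projects))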
> > The current site-generator definitely could be simpler, especially > if it only linked to the master docs and *those* linked to the older > versions of themselves (so /nova/latest/ had a link that pointed to > /nova/ocata/ somewhere). That would take some work, though. > > The simplest thing we could do is just make the release team > committers > on openstack-manuals, leave everything else as it is, and exercise > trust between the two groups. If we absolutely want to separate the > builds, > then we could make a new repo with just the template-driven pages > under “www”, > but that’s going to involve changing/creating several publishing > jobs. I think this is a suitable option. I would like the docs team cores to review this, and approve. But I think this is the best/simpliest option for now. -- Alexandra Settle IRC: asettle From paye600 at gmail.com Thu Oct 3 11:04:10 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Thu, 3 Oct 2019 13:04:10 +0200 Subject: [Airship-discuss] Fwd: OOK,Airship In-Reply-To: References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org> Message-ID: Thanks Ashlee! Charles, A few companies who work on development of Airship do use it, including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom and others. Many of those companies (if not all) use Airship + OpenStack Helm as well. Airship, as you have mentioned, is a collection of components for undercloud control plane, which helps to deploy nodes with OS+Docker+Kubernetes on it, configure/manage it all in GitOps way, and then help to maintain the configuration. It also allows to manage deploys and maintenance of whatever runs on top of Kubernetes cluster, would that be OpenStack Helm or other software packaged in Helm format. OpenStack Helm does not really require to be running on Airship-managed cluster. It could run standalone. Yes, you can roll out an open source production grade Airship/Openstack Helm deployment today. Good example of production grade configuration could be found in airship/treasuremap repository [0] as 'seaworthy' site definition. You are welcome to try, of course. For the questions - reach out to us on IRC #airshipit at Freenode of via Airship-discuss mailing list. [0] https://opendev.org/airship/treasuremap Best regards, -- Roman Gorshunov On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson wrote: > > Hi Charles, > > Glad to hear you’re interested! Forwarding this to the Airship ML since there may be folks on this mailing list that will have pointers who didn't see the openstack-discuss post. > > Ashlee > > > > Begin forwarded message: > > From: Charles > Subject: OOK,Airship > Date: October 2, 2019 at 5:39:16 PM GMT+2 > To: openstack-discuss at lists.openstack.org > > Hi, > > > We are interested in OOK and Openstack Helm. > > Has anyone any experience with Airship (now that 1.0 is out)? > > Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) > > Is this a signal that there is momentum around Openstack Helm? > > Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? > > > Thoughts? 
> > > Charles > > > > > _______________________________________________ > Airship-discuss mailing list > Airship-discuss at lists.airshipit.org > http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss From skaplons at redhat.com Thu Oct 3 11:29:03 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 3 Oct 2019 13:29:03 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Message-ID: <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Hi Thierry, I think it’s interesting idea. Should we somehow sign up to this even (one or both, depends on which we plan to be) to let people know that PTL of specific project will be available there? Or it’s just enough to come there when will be time for that? Also, is it expected from project leaders to be available on both terms or only one is enough? > On 3 Oct 2019, at 12:24, Thierry Carrez wrote: > > Hi everyone, > > The summit is going to mainland China for the first time. It's a great > opportunity to meet the Chinese community, make ourselves available for > direct discussion, and on-board new team members. > > In order to facilitate that, the TC has been suggesting that the > Foundation organizes two opportunities to "meet the project leaders" > during the Summit in Shanghai: one around the Monday evening marketplace > mixer, and one around the Wednesday lunch: > > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24417/ > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24426/meet-the-project-leaders > > OpenStack PTLs, TC members, core reviewers, UC members interested in > meeting the local community are all welcome. We'll also have leaders > from the other OSF-supported projects around. > > See you there! > > -- > Thierry Carrez (ttx) > — Slawek Kaplonski Senior software engineer Red Hat From cems at ebi.ac.uk Thu Oct 3 12:13:28 2019 From: cems at ebi.ac.uk (Charles) Date: Thu, 3 Oct 2019 13:13:28 +0100 Subject: [Airship-discuss] Fwd: OOK,Airship In-Reply-To: References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org> Message-ID: <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk> Hi Roman, Many thanks for the reply. I posted this on openstack-discuss because I was wondering if any users/Openstack operators out there (outside large corporations who are members of the Airship development framework) are actually running OOK in production. This could be Airship, or some other Kubernetes distribution running Openstack Helm. Our several years experience of managing Openstack so far (RHOSP/TripleO) has been bumpy due to issues with configuration maintenance /upgrades. The idea of using CI/CD and Kubernetes/Helm to manage Openstack is compelling and fits nicely into the DevOps framework here. If we were to explore this route we could 'roll our own' with a deployment say based on https://opendev.org/airship/treasuremap , or pay for and Enterprise solution that incorporates the OOK model (upcoming Mirantis and SUSE potentially). Regards Charles On 03/10/2019 12:04, Roman Gorshunov wrote: > Thanks Ashlee! > > Charles, > A few companies who work on development of Airship do use it, > including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom > and others. Many of those companies (if not all) use Airship + > OpenStack Helm as well. 
> > Airship, as you have mentioned, is a collection of components for > undercloud control plane, which helps to deploy nodes with > OS+Docker+Kubernetes on it, configure/manage it all in GitOps way, and > then help to maintain the configuration. It also allows to manage > deploys and maintenance of whatever runs on top of Kubernetes cluster, > would that be OpenStack Helm or other software packaged in Helm > format. > > OpenStack Helm does not really require to be running on > Airship-managed cluster. It could run standalone. > > Yes, you can roll out an open source production grade > Airship/Openstack Helm deployment today. Good example of production > grade configuration could be found in airship/treasuremap repository > [0] as 'seaworthy' site definition. You are welcome to try, of course. > For the questions - reach out to us on IRC #airshipit at Freenode of via > Airship-discuss mailing list. > > [0] https://opendev.org/airship/treasuremap > > Best regards, > -- > Roman Gorshunov > > On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson wrote: >> Hi Charles, >> >> Glad to hear you’re interested! Forwarding this to the Airship ML since there may be folks on this mailing list that will have pointers who didn't see the openstack-discuss post. >> >> Ashlee >> >> >> >> Begin forwarded message: >> >> From: Charles >> Subject: OOK,Airship >> Date: October 2, 2019 at 5:39:16 PM GMT+2 >> To: openstack-discuss at lists.openstack.org >> >> Hi, >> >> >> We are interested in OOK and Openstack Helm. >> >> Has anyone any experience with Airship (now that 1.0 is out)? >> >> Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) >> >> Is this a signal that there is momentum around Openstack Helm? >> >> Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? >> >> >> Thoughts? >> >> >> Charles >> >> >> >> >> _______________________________________________ >> Airship-discuss mailing list >> Airship-discuss at lists.airshipit.org >> http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss -- Charles Short Senior Cloud Engineer EMBL-EBI Hinxton 01223494205 From fungi at yuggoth.org Thu Oct 3 12:17:56 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Oct 2019 12:17:56 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191003101054.GB26595@paraplu> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <20191003101054.GB26595@paraplu> Message-ID: <20191003121756.4u6k2jh5p47rap5j@yuggoth.org> On 2019-10-03 12:10:54 +0200 (+0200), Kashyap Chamarthy wrote: > On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote: [...] > > (A) Constrain scope, drastically. We marked 25 blueprints > > complete in Train [3]. Since there has been no change to the > > core team, let's limit Ussuri to 25 blueprints [4]. If this > > turns out to be too few, what's the worst thing that happens? We > > finish everything, early, and wish we had done more. If that > > happens, drinks are on me, and we can bump the number for V. > > I welcome scope reduction, focusing on fewer features, stability, > and bug fixes than "more gadgetries and gongs". Which also means: > less frenzy, less split attention, fewer mistakes, more retained > concentration, and more serenity. And, yeah, any reasonable > person would read '25' as _an_ educated limit, rather than some > "optimal limit". [...] 
Viewing this from outside, 25 specs in a cycle already sounds like planning to get a *lot* done... that's completing an average of one Nova spec per week (even when averaged through the freeze weeks). Maybe as a goal it's undershooting a bit, but it's still a very impressive quantity to be able to consistently accomplish. Many thanks and congratulations to all the folks who work so hard to make this happen in Nova, cycle after cycle. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dangtrinhnt at gmail.com Thu Oct 3 14:28:15 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 3 Oct 2019 23:28:15 +0900 Subject: [i18n] Request to be added as Vietnamese translation group coordinators In-Reply-To: References: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> Message-ID: Thanks, Ian :) On Thu, Oct 3, 2019 at 2:11 AM Ian Y. Choi wrote: > Hello, > > Sorry for replying here late (I was travelling by the end of last week > and have been following-up many things which I couldn't take care of). > > Yesterday, I approved all the open requests including requests mentioned > below :) > > > With many thanks, > > /Ian > > Andreas Jaeger wrote on 9/26/2019 10:14 PM: > > On 26/09/2019 13.59, Trinh Nguyen wrote: > >> Hi i18n team, > >> > >> Dai and I would like to volunteer as the coordinators of the > >> Vietnamese translation group. If you find us qualified, please let us > >> know. > >> > > > > Looking at translate.openstack.org: > > > > I saw that Dai asked to be a translator and approved his request as an > > admin, I do not see you in Vietnamese, please apply as translator for > > Vietnamese first. > > > > Ian, will you reach out to the current coordinator? > > > > Ian, a couple of language teams have open requests, could you check > > those and whether the coordinators are still alive, please? > > > > Andreas > > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Thu Oct 3 14:44:17 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Thu, 3 Oct 2019 14:44:17 +0000 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <1570113853.14734.1@smtp.office365.com> On Tue, Oct 1, 2019 at 11:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign > country from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- Thank you for your hard work and good luck with your next endeavour! 
Cheers, gibi From doug at doughellmann.com Thu Oct 3 15:11:01 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 3 Oct 2019 11:11:01 -0400 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002160535.GA29937@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> <20191002160535.GA29937@sm-workstation> Message-ID: <932AE9B9-5EDB-44F9-84BA-5ADEAC384A74@doughellmann.com> > On Oct 2, 2019, at 12:05 PM, Sean McGinnis wrote: > > On Wed, Oct 02, 2019 at 04:01:22PM +0000, Arkady.Kanevsky at dell.com wrote: >> >> >> -----Original Message----- >> From: Sean McGinnis >> Sent: Wednesday, October 2, 2019 9:57 AM >> To: Ben Nemec >> Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org >> Subject: Re: [all] Planned Ussuri release schedule published >> >> >> [EXTERNAL EMAIL] >> >>> >>> On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: >>>> Sean, >>>> On https://releases.openstack.org/ussuri/schedule.html >>>> Feature freeze is R6 but >>>> Requirements freeze is R5. >>> >>> Is your browser dropping the background color for the table cells? >>> There are actually six bullet points in the R-5 one, but because it's >>> vertically centered some of them may appear to be under R-6. The only >>> thing that's in >>> R-6 though is the final non-client library release. >>> > > Looks like you fixed it? Any idea what you changed in case someone else has the > same issue? https://review.opendev.org/686420 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Oct 3 16:02:49 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 3 Oct 2019 11:02:49 -0500 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> <1569917983.26355.2@smtp.office365.com> Message-ID: <3ea5faa5-4d32-cb7e-6bf5-89892afa55b6@nemebean.com> TLDR: I've abandoned the revert. After looking at Gibi's investigation further I agree that rabbit was actually using the jsonutils version of dumps, so making the fake driver use it is consistent. Apologies for the confusion. -Ben On 10/1/19 3:35 PM, Ken Giusti wrote: > Sorry I'm late to the party.... > > At the risk of stating the obvious I wouldn't put much faith in the fact > that the Kafka and Amqp1 drivers use jsonutils.   The use of jsonutils > in these drivers is simply a cut-n-paste from the way old qpidd > driver.    Why jsonutils was used there... I dunno. > > IMHO the RabbitMQ driver is the authoritative source for correct driver > implementation - the Fake driver (and the others) should use the same > serialization as the rabbitmq driver if possible. 
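To see the difference being debated here in isolation, a small self-contained comparison (not code from oslo.messaging itself, just an illustration of how the two serializers treat a naive datetime payload; the exact string jsonutils produces may vary by version):

    import datetime
    import json

    from oslo_serialization import jsonutils

    payload = {"when": datetime.datetime(2019, 10, 3, 12, 0, 0)}

    # stdlib json refuses types it does not know about
    try:
        json.dumps(payload)
    except TypeError as exc:
        print("json.dumps failed: %s" % exc)

    # jsonutils converts the datetime to a plain string, but the result
    # carries no timezone information -- the "lenient but lossy" concern
    # raised earlier in the thread
    print(jsonutils.dumps(payload))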
> > -K > > On Tue, Oct 1, 2019 at 4:30 AM Balázs Gibizer > wrote: > > > > On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer > wrote: > > > > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > > > > wrote: > >>  Hi, > >> > >>  I've just proposed https://review.opendev.org/#/c/685724/ which > >>  reverts a change that recently went in to make the fake driver in > >>  oslo.messaging use jsonutils for message serialization instead of > >>  json.dumps. > >> > >>  As explained in the commit message on the revert, this is > >> problematic > >>  because the rabbit driver uses kombu's default serialization > method, > >>  which is json.dumps. By changing the fake driver to use jsonutils > >>  we've made it more lenient than the most used real driver which > >> opens > >>  us up to merging broken changes in consumers of oslo.messaging. > >> > >>  We did have some discussion of whether we should try to > override the > >>  kombu default and tell it to use jsonutils too, as a number of > other > >>  drivers do. The concern with this was that the jsonutils > handler for > >>  things like datetime objects is not tz-aware, which means if you > >> send > >>  a datetime object over RPC and don't explicitly handle it you could > >>  lose important information. > >> > >>  I'm open to being persuaded otherwise, but at the moment I'm > leaning > >>  toward less magic happening at the RPC layer and requiring projects > >>  to explicitly handle types that aren't serializable by the standard > >>  library json module. If you have a different preference, please > >> share > >>  it here. > > > > Hi, > > > > I might me totally wrong here and please help me understand how the > > RabbitDriver works. What I did when I created the original patch > that > > I > > looked at each drivers how they handle sending messages. The > > oslo_messaging._drivers.base.BaseDriver defines the interface with a > > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > > implements the BaseDriver interface's send() method to call _send(). > > Then _send() calls rpc_commom.serialize_msg which then calls > > jsonutils.dumps. > > > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > > from AMQPDriverBase and does not override send() or _send() so I > think > > the AMQPDriverBase ._send() is called that therefore jsonutils is > used > > during sending a message with RabbitDriver. > > I did some tracing in devstack to prove my point. See the result in > https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 > > Cheers, > gibi > > > > > Cheers, > > gibi > > > > > > [1] > > > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > > > >> > >>  Thanks. > >> > >>  -Ben > >> > > > > > > > > > -- > Ken Giusti  (kgiusti at gmail.com ) From mriedemos at gmail.com Thu Oct 3 16:16:28 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 11:16:28 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: <28312232-6a30-17de-6141-a47c2f282af9@gmail.com> On 10/2/2019 4:18 PM, Eric Fried wrote: > I like the >2-core idea, though the real difference would be asking for > cores to consider "should we do this*in this cycle*" when they +2 a > spec. 
Which is good and valid, but (I think) difficult to > explain/track/quantify/validate. And it's asking each core to have some > sense of the "big picture" (understand the scope of all/most of the > candidates) which is very difficult. Note that having that "big picture" is I think the main reason why historically, until very recently, there was a subgroup of the nova core team that was the specs core team, because what was approved in specs could have wide impacts to nova and thus knowing the big picture was important. I know that not all specs are the same complexity and we changed how the core team works for specs for good reasons, but given the years of "why aren't they the same core team? it's not fair." I wanted to point out it can be, as you said, very difficult to be a specs core for different reasons from a nova core. -- Thanks, Matt From kendall at openstack.org Thu Oct 3 16:32:19 2019 From: kendall at openstack.org (Kendall Waters) Date: Thu, 3 Oct 2019 11:32:19 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey Alex, We still have tables available on Friday. Would half a day on Friday work for the docs team? Unless Ian is okay with it, we can combine Docs with i18n in their Wednesday afternoon/Thursday morning slot. Just let me know! Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 3, 2019, at 4:26 AM, Alexandra Settle wrote: > > Hey, > > Could you add something for docs? Or combine with i18n again if Ian > doesn't mind? > > We don't need a lot, just a room for people to ask questions about the > future of the docs team. > > Stephen will be there, as co-PTL. There's 0 chance of it not > conflicting with nova. > > Please :) > > Thank you! > > Alex > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed >> schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads >> (PTLs, SIG leads...) mentioned in their PTG survey responses, >> although there was no perfect solution that would avoid all conflicts >> especially when the event is three-ish days long and we have over 40 >> teams meeting. >> >> If there are critical conflicts we missed or other issues, please let >> us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu >> le.png > -- > Alexandra Settle > > IRC: asettle -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 3 16:35:05 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 11:35:05 -0500 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 10/3/2019 3:24 AM, Mark Goddard wrote: >> This recently merged, hopefully it helps clarify: >> >> https://review.opendev.org/#/c/671298/ > It does help a little for the schema migrations, but the point was > about data migrations. > That's an excellent point. Looking at devstack [1] and grenade [2] we don't necessarily do that properly. For devstack with a fresh install it doesn't really matter but it should matter for grenade since we should be migrating both cell0 and cell1. Grenade does not run in "superconductor" mode so some of the rules might be different there, i.e. grenade's nova.conf has the database pointed at cell1 while devstack has the database config pointed at cell0. 
Either way we're not properly running the online data migrations per cell DB as far as I can tell. Maybe we just haven't had an online data migration yet that makes that important, but it's definitely wrong. I also don't see anything in the docs for the online_data_migrations command [3] to use the --config-file option to run it against the cell DB config. I can open a bug for that. The upgrade guide should also be updated to mention that like for db sync in https://review.opendev.org/#/c/671298/. [1] https://github.com/openstack/devstack/blob/1a46c898db9c16173013d95e2bc954992121077c/lib/nova#L764 [2] https://github.com/openstack/grenade/blob/bb14e02a464db2b268930bbba0152862fe0f805e/projects/60_nova/upgrade.sh#L79 [3] https://docs.openstack.org/nova/latest/cli/nova-manage.html -- Thanks, Matt From sombrafam at gmail.com Thu Oct 3 17:17:03 2019 From: sombrafam at gmail.com (Erlon Cruz) Date: Thu, 3 Oct 2019 14:17:03 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: Glad to see that! +1 Em qui, 3 de out de 2019 às 03:22, Amit Oren escreveu: > +1 > > On Thu, Oct 3, 2019 at 3:31 AM Xing Yang wrote: > >> +1 >> >> On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi >> wrote: >> >>> Dear Zorillas and other Stackers, >>> >>> I would like to formalize the conversations we've been having amongst >>> ourselves over IRC and in-person. At the outset, we have a lot of >>> incoming changes to review, but we have limited core maintainer >>> attention. We haven't re-jigged our core maintainers team as often as >>> we'd like, and that's partly to blame. We have some relatively new and >>> enthusiastic contributors that we would love to encourage to become >>> maintainers! We've mentored contributors 1-1, n-1 before before adding >>> them to the maintainers team. We would like to do more of this!** >>> >>> In this spirit, I would like your inputs on adding Douglas Viroel >>> (dviroel) to the core maintainers team for manila and its associated >>> projects (manila-specs, manila-ui, python-manilaclient, >>> manila-tempest-plugin, manila-test-image, manila-image-elements). >>> Douglas has been an active contributor for the past two releases and >>> has valuable review inputs in the project. While he's been around here >>> less longer than some of us, he brings a lot of experience to the >>> table with his background in networking and shared file systems. He >>> has a good grasp of the codebase and is enthusiastic in adding new >>> features and fixing bugs in the Ussuri cycle and beyond. >>> >>> Please give me a +/-1 for this proposal. >>> >>> ** If you're interested in helping us maintain Manila by being part of >>> the manila core maintainer team, please reach out to me or any of the >>> current maintainers, we would love to work with you and help you grow >>> into that role! >>> >>> Thanks, >>> Goutham Pacha Ravi (gouthamr) >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Thu Oct 3 16:44:05 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 03 Oct 2019 11:44:05 -0500 Subject: Proposed Forum Schedule Message-ID: <5D962555.7090508@openstack.org> Hello! I'm attaching a PDF of the proposed Shanghai Forum Schedule. I'll publish the same on the actual website later this afternoon. However, there is still time for feedback/time changes, assuming there aren't conflicts for speakers/moderators. 
This is also available for download here: https://drive.google.com/file/d/1qp0I9xnyOK3mhBitQnk2a7VuS9XClvyF/view?usp=sharing Please respond to this thread with any concerns. Cheers, Jimmy -------------- next part -------------- A non-text attachment was scrubbed... Name: Forum Mock Schedule.pdf Type: application/pdf Size: 88937 bytes Desc: not available URL: From ben at swartzlander.org Thu Oct 3 18:04:51 2019 From: ben at swartzlander.org (Ben Swartzlander) Date: Thu, 3 Oct 2019 14:04:51 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <8d939186-982d-429c-47fe-d95178ce0622@swartzlander.org> On 10/3/19 1:17 PM, Erlon Cruz wrote: > Glad to see that! +1 > > Em qui, 3 de out de 2019 às 03:22, Amit Oren > escreveu: > > +1 > > On Thu, Oct 3, 2019 at 3:31 AM Xing Yang > wrote: > > +1 > > On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi > > wrote: > > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been > having amongst > ourselves over IRC and in-person. At the outset, we have a > lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as > often as > we'd like, and that's partly to blame. We have some > relatively new and > enthusiastic contributors that we would love to encourage to > become > maintainers! We've mentored contributors 1-1, n-1 before > before adding > them to the maintainers team. We would like to do more of > this!** > > In this spirit, I would like your inputs on adding Douglas > Viroel > (dviroel) to the core maintainers team for manila and its > associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, > manila-image-elements). > Douglas has been an active contributor for the past two > releases and > has valuable review inputs in the project. While he's been > around here > less longer than some of us, he brings a lot of experience > to the > table with his background in networking and shared file > systems. He > has a good grasp of the codebase and is enthusiastic in > adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by > being part of > the manila core maintainer team, please reach out to me or > any of the > current maintainers, we would love to work with you and help > you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) +1 -Ben Swartzlander From mthode at mthode.org Thu Oct 3 18:35:14 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 3 Oct 2019 13:35:14 -0500 Subject: [FFE][requirements][mistral][amqp] Failing =?utf-8?B?4oCcZG9j?= =?utf-8?B?c+KAnQ==?= job due to the upper constraint conflict for amqp In-Reply-To: <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> <20191002163415.nu7okcn5de44txoz@mthode.org> <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> Message-ID: <20191003183514.iubdhip2bsjylcb3@mthode.org> On 19-10-03 14:45:14, Renat Akhmerov wrote: > Thanks Matthew, > > For now we did this: https://review.opendev.org/#/c/685932/. So we just added “kombu” explicitly into our dependencies that forces to load the right version of amqp before oslo.messaging. That works. If that looks OK for you we can skip the mentioned bump. 
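As a purely illustrative sketch of that kind of workaround (the version floor below is invented -- the real pin is whatever the linked review adds), the idea is simply to list kombu directly in the project's requirements so pip resolves a compatible amqp before oslo.messaging pulls one in transitively:

    # requirements.txt fragment (illustrative only; see the review above)
    kombu>=4.6.1  # hypothetical floor, declared explicitly so a compatible amqp is chosen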
> > > > Renat Akhmerov > @Nokia > On 2 Oct 2019, 23:35 +0700, Matthew Thode , wrote: > > On 19-10-02 14:57:24, Renat Akhmerov wrote: > > > Hi, > > > > > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. > > > > > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > > > > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > > > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > > > [3] https://review.opendev.org/#/c/681382 > > > > > > > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 > > to 2.5.2. > > It looks like a bugfix only release so I'm fine with it. As long as we > > don't need to mask 2.5.1 in global-requirements (which would cause a > > re-release for openstack/oslo.messaging). > > > > https://github.com/celery/py-amqp/compare/2.5.1...2.5.2 > > > > So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I > > approve. > > Looks like a good workaround. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From colleen at gazlene.net Thu Oct 3 18:38:53 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 03 Oct 2019 11:38:53 -0700 Subject: [keystone] Ussuri roadmap Message-ID: <37661d40-2a1d-487b-8cd0-910219a34d01@www.fastmail.com> Hi team, In past cycles we used Trello for tracking our goals throughout a cycle, which gave us flexibility and visibility over the cycle plans as a whole in conjunction with specs and launchpad bugs. Trello is a proprietary platform and in the last few months changed its ToS to limit the number of public boards an organization can have, and the keystone team has reached that limit. Rather than try to backup and archive our old boards or create another team or a non-team board, for Ussuri I would like to try using a board on a different platform, Taiga: https://tree.taiga.io/project/keystone-ussuri-roadmap/kanban Taiga is AGPLv3 and has no restrictions on its hosted version that I've discovered yet. Many thanks to Morgan for discovering and researching it. I've copied over our incomplete stories from the Train roadmap[1] and arranged the kanban board more or less the same way as the old Trello board, but the platform seems to be very flexible and we could change the layout and workflows in any way that makes sense. For instance, while I only enabled the kanban feature, there is also a sprints/backlog mode if we wanted to take advantage of that. I can grant administrator privileges to anyone who is interested in investigating all the configuration options (or you can create your own sandbox projects to play with). The main deficiency seems to be the lack of support for "teams" or "organizations"[2], but users can be added to the board individually. 
Action required:

* If you were a member of the old Trello keystone team and would like to be a member of this board, send me an email address that I can send an invite to
* Once you have an account and are added to the board, please have a look at the stories that are already there and assign yourself to the ones you are working on or plan to work on, and update their status or add relevant reviews as comments.

Feel free to play with the platform's features and provide feedback in this thread or at next week's team meeting. Please also let me know if you have concerns about using this platform.

Colleen

[1] https://trello.com/b/ClKW9C8x/keystone-train-roadmap
[2] https://tree.taiga.io/project/taiga/us/2129

From sean.mcginnis at gmx.com Thu Oct 3 19:32:45 2019
From: sean.mcginnis at gmx.com (Sean McGinnis)
Date: Thu, 3 Oct 2019 14:32:45 -0500
Subject: [release] Release countdown for week R-1, October 7-11
Message-ID: <20191003193245.GA29220@sm-workstation>

Development Focus
-----------------

We are on the final mile of this Train ride! (You can thank Thierry for that one ^)

Remember that the Train final release will include the latest release candidate (for cycle-with-rc deliverables) or the latest intermediary release (for cycle-with-intermediary deliverables) available.

Thursday, October 10th is the deadline for final Train release candidates as well as any last cycle-with-intermediary deliverables. We will then enter a quiet period until we tag the final release on October 16th. Teams should be prioritizing fixing release-critical bugs before that deadline.

Otherwise it's time to start planning the Ussuri development cycle, including discussing Forum and PTG session content, in preparation for the Summit in Shanghai next month.

Actions
-------

Watch for any translation patches coming through on the stable/train branch and merge them quickly.

If you discover a release-critical issue, please make sure to fix it on the master branch first, then backport the bugfix to the stable/train branch before triggering a new release.

Please drop by #openstack-release with any questions or concerns about the upcoming release!

Upcoming Deadlines & Dates
--------------------------

Final Train release: October 16
Forum+PTG at Shanghai summit: November 4

From jean-philippe at evrard.me Thu Oct 3 19:58:16 2019
From: jean-philippe at evrard.me (Jean-Philippe Evrard)
Date: Thu, 03 Oct 2019 21:58:16 +0200
Subject: [tc] monthly meeting agenda
Message-ID: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me>

Hello everyone,

Here's the agenda for our monthly TC meeting. It will happen next Thursday (10 October) at the usual time (1400 UTC) in #openstack-tc . If you can't attend, please put your name in the "Apologies for Absence" section in the wiki [1]

Our meeting chair will be Alexandra (asettle).

* Follow up on past action items
** ricolin: Follow up with SIG chairs about guidelines https://etherpad.openstack.org/p/SIGs-guideline
** ttx: contact interested parties in a new 'large scale' sig (help with mnaser, jroll reaching out to verizon media)
** Release Naming - Results of the TC poll - Next action

* New initiatives and/or report on previous initiatives
** Help gmann on the community goals following our new goal process
** mugsie: to sync with dhellmann or release-team to find the code for the proposal bot
** jroll - ttx: Feedback from the forum selection committee
-- Follow up on https://etherpad.openstack.org/p/PVG-TC-brainstorming
-- Final accepted list?
** mnaser: sync up with swift team on python3 migration Thank you everyone! Regards, JP [1]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting From openstack at fried.cc Thu Oct 3 21:56:50 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 3 Oct 2019 16:56:50 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: (B) After some very productive discussion in the nova meeting and IRC channel this morning, I have updated the nova-specs patch introducing the "Core Liaison" concept [1]. The main change is a drastic edit of the README to include a "Core Liaison FAQ". Other changes of note: * We're now going to make distinct use of the launchpad blueprint's "Definition" and "Direction" fields. As such, we can still decide to defer a blueprint whose spec is merged in the 'approved' directory. (Which really isn't different than what we were doing before; it's just that now we can do it for reasons other than "oops, this didn't get finished in time".) * The single-core-approval rule for previously approved specifications is removed. (A) Note that the idea of capping the number of specs is (mostly) unrelated, and we still haven't closed on it. I feel like we've agreed to have a targeted discussion around spec freeze time where we decide whether to defer features for resource reasons. That would be a new (and good, IMO) thing. But it's still TBD whether "30 approved for 25 completed" will apply, and/or what criteria would be used to decide what gets cut. Collected odds and ends from elsewhere in this thread: > If you do care reviewing a spec, that also means you do care reviewing > the implementation side. I agree that would be nice, and I'd like to make it happen, but separately from what's already being discussed. I added a TODO in the spec README [2]. > If we end up with bags of "spare time", there's loads of tech-debt > items, performance (it's a feature, let's recall) issues, and meaningful > clean-ups waiting to be tackled. Hear hear. > Viewing this from outside, 25 specs in a cycle already sounds like > planning to get a *lot* done... that's completing an average of one > Nova spec per week (even when averaged through the freeze weeks). > Maybe as a goal it's undershooting a bit, but it's still a very > impressive quantity to be able to consistently accomplish. Many > thanks and congratulations to all the folks who work so hard to make > this happen in Nova, cycle after cycle. That perspective literally hadn't occurred to me from here with my face mashed up against the trees [3]. Thanks fungi. > Note that having that "big picture" is I think the main reason why > historically, until very recently, there was a subgroup of the nova core > team that was the specs core team, because what was approved in specs > could have wide impacts to nova and thus knowing the big picture was > important. Good point, Matt. (Not that I think we should, or could, go back to that...) 
efried [1] https://review.opendev.org/#/c/685857 [2] https://review.opendev.org/#/c/685857/4/README.rst at 219 [3] For non-native speakers, this is a reference to the following idiom: https://www.dictionary.com/browse/can-t-see-the-forest-for-the-trees From mriedemos at gmail.com Thu Oct 3 23:22:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 18:22:33 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration Message-ID: Hello Cinderinos, I've now got a working patch that migrates legacy volume attachments to new style v3 attachments [1]. The fun stuff showing it working is in this paste [2]. We want to do this data migration in nova because we still have a lot of compatibility code since Queens for pre-v3 style attachments and we can't remove that compatibility code (ever) if we don't first make sure we provide a data migration routine for operators to roll through. So for example if this lands in Ussuri we can can enforce a nova-status upgrade check in V and rip out code in X. Without digging into the patch, this is the flow: 1. On nova-compute restart, query the nova DB for instances on the compute host with legacy volume attachments. 2. For each of those, create a new style attachment with the host connector and update the BlockDeviceMapping information in the nova DB (attachment_id and connection_info). 3. Delete the existing legacy attachment so when the server is deleted the volume status goes back to 'available' due to proper attachment reference counting in the Cinder DB. My main question is on #3. Right now I'm calling the v3 attachment delete API rather than the v2 os-terminate_connection API. Is that sufficient to cleanup the legacy attachment on the storage backend even though the connection was created via os-initialize_connection originally? Looking at the cinder code, attachment_delete hits the connection terminate code under the covers [3]. So that looks OK. The only thing I can really think of is if a host connector is not provided or tracked with the legacy attachment, is that going to cause problems? Note that I think volume drivers are already required to deal with that today anyway because of the "local delete" scenario in the compute API where the compute host that the server is on is down and thus we don't have a host connector to provide to Cinder to terminate the connection. So Cinder people, are you OK with this flow? Hello Novaheads, Do you have any issues with the above? Note the migration routine is threaded out on compute start so it doesn't block, similar to the ironic flavor data migration introduced in Pike. One question I have is if we should add a config option for this so operators can enable/disable it as needed. Note that this requires nova to be configured with a service user that has the admin role to do this stuff in cinder since we don't have a user token, similar to nova doing things with neutron ports without a user token. Testing this with devstack requires [4]. By default [cinder]/auth_type is None and not required so by default this migration routine is not going to run so maybe that is sufficient? Hello Operatorites, Do you have any issues with what's proposed above? 
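To make steps 1-3 above a bit more concrete, here is roughly what the same flow looks like when driven by hand with the cinder CLI. Treat it as a sketch only: the patch drives this through nova's internal cinder client rather than the shell, and the microversion and flag names below are from memory, so double-check them against your python-cinderclient before relying on them.

  export OS_VOLUME_API_VERSION=3.44
  # (1) create a new style attachment for the instance, passing the host connector details
  cinder attachment-create --connect True --initiator "$INITIATOR_IQN" \
      --ip "$COMPUTE_IP" --host "$COMPUTE_HOST" "$VOLUME_ID" "$SERVER_ID"
  # (2) complete it so the volume goes back to in-use against the new attachment
  cinder attachment-complete "$NEW_ATTACHMENT_ID"
  # (3) delete the legacy attachment record last, once (1) and (2) succeeded
  cinder attachment-delete "$LEGACY_ATTACHMENT_ID"
  # sanity check: only the new attachment should remain for the volume
  cinder attachment-list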
[1] https://review.opendev.org/#/c/549130/ [2] http://paste.openstack.org/show/781063/ [3] https://github.com/openstack/cinder/blob/410791580ef60ddb03104bf20766859ed9d78932/cinder/volume/manager.py#L4650 [4] https://review.opendev.org/#/c/685488/ -- Thanks, Matt From thierry at openstack.org Fri Oct 4 07:48:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 4 Oct 2019 09:48:36 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: Slawek Kaplonski wrote: > I think it’s interesting idea. Should we somehow sign up to this even (one or both, depends on which we plan to be) to let people know that PTL of specific project will be available there? Or it’s just enough to come there when will be time for that? It would be good to have a rough idea of who will be available at each opportunity. To keep it simple, I created a sign-up sheet at: https://etherpad.openstack.org/p/meet-the-project-leaders > Also, is it expected from project leaders to be available on both terms or only one is enough? You can do one or both (or none) -- no commitment. -- Thierry From akekane at redhat.com Fri Oct 4 09:28:19 2019 From: akekane at redhat.com (Abhishek Kekane) Date: Fri, 4 Oct 2019 14:58:19 +0530 Subject: [Glance][PTG]Shanghai PTG planning Message-ID: Hello Everyone, I have prepared an etherpad [1] to plan the Shanghai PTG discussion topics for glance. The etherpad contains template to add the topic for discussion. It has also references of previous PTG planning etherpads. Even if anyone is not going to attend the PTG but wants there topic needs to be discussed can add as well. Kindly add your topics. [1] https://etherpad.openstack.org/p/Glance-Ussuri-PTG-planning Thanks, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at tipit.net Fri Oct 4 12:57:32 2019 From: jimmy at tipit.net (Jimmy Mcarthur) Date: Fri, 04 Oct 2019 07:57:32 -0500 Subject: Proposed Forum Schedule Message-ID: <5D9741BC.6080909@tipit.net> The forum schedule is now live: https://www.openstack.org/summit/shanghai-2019/summit-schedule/global-search?t=forum If you'd prefer to use the spreadsheet view: https://drive.google.com/file/d/1qp0I9xnyOK3mhBitQnk2a7VuS9XClvyF Please let Kendall Nelson or myself know as soon as possible if you see any conflicts. Cheers, Jimmy From zigo at debian.org Fri Oct 4 13:35:15 2019 From: zigo at debian.org (Thomas Goirand) Date: Fri, 4 Oct 2019 15:35:15 +0200 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> Message-ID: On 9/30/19 4:45 PM, Ben Nemec wrote: > The concern with this was that the jsonutils handler for > things like datetime objects is not tz-aware, which means if you send a > datetime object over RPC and don't explicitly handle it you could lose > important information. echo Etc/UTC >/etc/timezone Problem solved... :) Thomas From corey.bryant at canonical.com Fri Oct 4 13:41:12 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 09:41:12 -0400 Subject: [charms] placement charm Message-ID: Hi All, I'd like to see if I can get some input on the current state of the Placement API split. 
For some background, the nova placement API was removed from nova in train, and it's been split into its own project. It's mostly just a basic API charm. The tricky part is the migration of tables from the nova_api database to the placement database. Code is located at: https://github.com/coreycb/charm-placement https://github.com/coreycb/charm-interface-placement https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) Test scenarios I've been testing with: 1) deploy nova-cc et al train, configure keystonev3, deploy instance 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy instance 3 There is currently an issue with the second test scenario where instance 2 creation errors because nova-scheduler can't find a valid placement candidate (not sure of the exact error atm). However if I delete instance 1 before creating instance 2 it is created successfully. It feels like a DB related issue but I'm really not sure so I'll keep digging. Thanks! Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From corey.bryant at canonical.com Fri Oct 4 13:48:13 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 09:48:13 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: One other issue is "pxc-strict-mode: disabled" for percona-cluster is required to test this. /usr/share/placement/mysql-migrate-db.sh may need some updates but I haven't dug into that yet. Thanks, Corey On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. > > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Thanks! > Corey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Oct 4 14:22:43 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 4 Oct 2019 15:22:43 +0100 (BST) Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191003101054.GB26595@paraplu> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <20191003101054.GB26595@paraplu> Message-ID: On Thu, 3 Oct 2019, Kashyap Chamarthy wrote: > I welcome scope reduction, focusing on fewer features, stability, and > bug fixes than "more gadgetries and gongs". 
Which also means: less > frenzy, less split attention, fewer mistakes, more retained > concentration, and more serenity. And, yeah, any reasonable person > would read '25' as _an_ educated limit, rather than some "optimal > limit". > > If we end up with bags of "spare time", there's loads of tech-debt > items, performance (it's a feature, let's recall) issues, and meaningful > clean-ups waiting to be tackled. Since I quoted the above text and referred back to this entire thread in it, I thought I better: a) say "here here" (or is "hear hear"?) to the above 2. link to https://anticdent.org/fix-your-debt-placement-performance-summary.html which has more to say and an example of what you can get with "retained concentration" -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From corey.bryant at canonical.com Fri Oct 4 14:54:10 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 10:54:10 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. > > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Nothing to see here. Small compute node with limited resources. So this is not an issue. Thanks! > Corey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From corey.bryant at canonical.com Fri Oct 4 15:53:35 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 11:53:35 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. 
> > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Thanks! > Corey > In case anyone needs these for testing prior to the code getting merged I've pushed placement and nova-cloud-controller charms to the charm store under my namespace. I've released them to the edge channel. https://jaas.ai/u/corey.bryant/placement/bionic/0 https://jaas.ai/u/corey.bryant/nova-cloud-controller/bionic/0 Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From waboring at hemna.com Fri Oct 4 16:03:40 2019 From: waboring at hemna.com (Walter Boring) Date: Fri, 4 Oct 2019 12:03:40 -0400 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: Message-ID: So looking into the cinder code, calling attachment_delete should be what we want to call. But. I think if we don't have a host connector passed in and the attachment record doesn't have a connector saved, then that results in the volume manager not calling the cinder driver to terminate_connection and return. This also bypasses the driver's remove_export() which is the last chance for a driver to unexport a volume. Walt On Thu, Oct 3, 2019 at 7:27 PM Matt Riedemann wrote: > Hello Cinderinos, > > I've now got a working patch that migrates legacy volume attachments to > new style v3 attachments [1]. The fun stuff showing it working is in > this paste [2]. > > We want to do this data migration in nova because we still have a lot of > compatibility code since Queens for pre-v3 style attachments and we > can't remove that compatibility code (ever) if we don't first make sure > we provide a data migration routine for operators to roll through. So > for example if this lands in Ussuri we can can enforce a nova-status > upgrade check in V and rip out code in X. > > Without digging into the patch, this is the flow: > > 1. On nova-compute restart, query the nova DB for instances on the > compute host with legacy volume attachments. > > 2. For each of those, create a new style attachment with the host > connector and update the BlockDeviceMapping information in the nova DB > (attachment_id and connection_info). > > 3. Delete the existing legacy attachment so when the server is deleted > the volume status goes back to 'available' due to proper attachment > reference counting in the Cinder DB. > > My main question is on #3. Right now I'm calling the v3 attachment > delete API rather than the v2 os-terminate_connection API. Is that > sufficient to cleanup the legacy attachment on the storage backend even > though the connection was created via os-initialize_connection > originally? 
Looking at the cinder code, attachment_delete hits the > connection terminate code under the covers [3]. So that looks OK. The > only thing I can really think of is if a host connector is not provided > or tracked with the legacy attachment, is that going to cause problems? > Note that I think volume drivers are already required to deal with that > today anyway because of the "local delete" scenario in the compute API > where the compute host that the server is on is down and thus we don't > have a host connector to provide to Cinder to terminate the connection. > > So Cinder people, are you OK with this flow? > > Hello Novaheads, > > Do you have any issues with the above? Note the migration routine is > threaded out on compute start so it doesn't block, similar to the ironic > flavor data migration introduced in Pike. > > One question I have is if we should add a config option for this so > operators can enable/disable it as needed. Note that this requires nova > to be configured with a service user that has the admin role to do this > stuff in cinder since we don't have a user token, similar to nova doing > things with neutron ports without a user token. Testing this with > devstack requires [4]. By default [cinder]/auth_type is None and not > required so by default this migration routine is not going to run so > maybe that is sufficient? > > Hello Operatorites, > > Do you have any issues with what's proposed above? > > [1] https://review.opendev.org/#/c/549130/ > [2] http://paste.openstack.org/show/781063/ > [3] > > https://github.com/openstack/cinder/blob/410791580ef60ddb03104bf20766859ed9d78932/cinder/volume/manager.py#L4650 > [4] https://review.opendev.org/#/c/685488/ > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Fri Oct 4 17:52:34 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 4 Oct 2019 12:52:34 -0500 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: > It would be good to have a rough idea of who will be available at each > opportunity. To keep it simple, I created a sign-up sheet at: > > https://etherpad.openstack.org/p/meet-the-project-leaders If a PTL will not be present, is it acceptable to send a delegate? efried . From fungi at yuggoth.org Fri Oct 4 18:07:12 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 4 Oct 2019 18:07:12 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: <20191004180712.323nlymaxedoib54@yuggoth.org> On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > It would be good to have a rough idea of who will be available > > at each opportunity. To keep it simple, I created a sign-up > > sheet at: > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > If a PTL will not be present, is it acceptable to send a delegate? The goal, as I understand it, is to reinforce to attendees in China that OpenStack project leadership is accessible and achievable, by providing opportunities for them to be able to meet and speak in-person with a representative cross-section of our community leaders. Is that something which can be delegated? 
Seems to me it might convey the opposite of what's intended, but I don't know if my impression is shared by others. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mriedemos at gmail.com Fri Oct 4 18:32:18 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Oct 2019 13:32:18 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: Message-ID: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> On 10/4/2019 11:03 AM, Walter Boring wrote: >   I think if we don't have a host connector passed in and the > attachment record doesn't have a connector saved, > then that results in the volume manager not calling the cinder driver to > terminate_connection and return. > This also bypasses the driver's remove_export() which is the last chance > for a driver to unexport a volume. Two things: 1. Yeah if the existing legacy attachment record doesn't have a connector I was worried about not properly cleaning on for that old connection, which is something I mentioned before, but also as mentioned we potentially have that case when a server is deleted and we can't get to the compute host to get the host connector, right? 2. If I were to use os-terminate_connection, I seem to have a tricky situation on the migration flow because right now I'm doing: a) create new attachment with host connector b) complete new attachment (put the volume back to in-use status) - if this fails I attempt to delete the new attachment c) delete the legacy attachment - I intentionally left this until the end to make sure (a) and (b) were successful. If I change (c) to be os-terminate_connection, will that screw up the accounting on the attachment created in (a)? If I did the terminate_connection first (before creating a new attachment), could that leave a window of time where the volume is shown as not attached/in-use? Maybe not since it's not the begin_detaching/os-detach API...I'm fuzzy on the cinder volume state machine here. Or maybe the flow would become: a) create new attachment with host connector b) terminate the connection for the legacy attachment - if this fails, delete the new attachment created in (a) c) complete the new attachment created in (a) - if this fails...? Without digging into the flow of a cold or live migration I want to say that's closer to what we do there, e.g. initialize_connection for the new host, terminate_connection for the old host, complete the new attachment. -- Thanks, Matt From gmann at ghanshyammann.com Fri Oct 4 19:30:19 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 04 Oct 2019 14:30:19 -0500 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191004180712.323nlymaxedoib54@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: <16d984064d2.bc633ba6242736.627005749645226424@ghanshyammann.com> ---- On Fri, 04 Oct 2019 13:07:12 -0500 Jeremy Stanley wrote ---- > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > > It would be good to have a rough idea of who will be available > > > at each opportunity. 
To keep it simple, I created a sign-up > > > sheet at: > > > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > > > If a PTL will not be present, is it acceptable to send a delegate? > > The goal, as I understand it, is to reinforce to attendees in China > that OpenStack project leadership is accessible and achievable, by > providing opportunities for them to be able to meet and speak > in-person with a representative cross-section of our community > leaders. Is that something which can be delegated? Seems to me it > might convey the opposite of what's intended, but I don't know if my > impression is shared by others. IMO, it should be ok to delegate to other Core of that project. the main idea here is to interact with Chinese communities and help new contributors to onboard or just convey them 'if you are interested in this project, I am here to talk to you'. I think it will be more useful sessions if we have more local Core members also along with PTLs which will solve the cultural or language barrier if any. -gmann > -- > Jeremy Stanley > From colleen at gazlene.net Sat Oct 5 00:05:27 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 04 Oct 2019 17:05:27 -0700 Subject: [keystone] Keystone Team Update - Week of 30 September 2019 Message-ID: # Keystone Team Update - Week of 30 September 2019 ## News Quiet week as we wait for the final release and start preparing for Forum and next cycle. ## Action Items Team members: see action required regarding the new roadmap tracker[1]. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009942.html ## Office Hours When there are topics to cover, the keystone team holds office hours on Tuesdays at 17:00 UTC. We won't plan to hold office hours next week. Add topics you would like to see covered during office hours to the etherpad: https://etherpad.openstack.org/p/keystone-office-hours-topics ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 7 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 33 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 1 new bugs and closed 3. Bugs opened (1) Bug #1846817 (keystone:Medium) opened by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1846817 Bugs fixed (3) Bug #968696 (keystone:High) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/968696 Bug #1630434 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1630434 Bug #1806762 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1806762 Notably, we closed #968696 *for keystone*, as we have completed the migration of our policies to understand system scope and, when [oslo_policy]/enforce_scope is set to true and deprecated policies are overridden, system-wide requests won't respond to project-scoped tokens. This does not mean the "admin"-ness problem is solved across OpenStack, as it will have to be addressed on a service-by-service basis. ## Milestone Outlook https://releases.openstack.org/train/schedule.html Next week will be the last chance to release another RC if we need one. Please help triage and address any RC-critical bugs should they come up. 
Also, the release schedule for Ussuri has been published: https://releases.openstack.org/ussuri/schedule.html

## Help with this newsletter

Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter

From rico.lin.guanyu at gmail.com Sat Oct 5 02:56:31 2019
From: rico.lin.guanyu at gmail.com (Rico Lin)
Date: Sat, 5 Oct 2019 10:56:31 +0800
Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai
In-Reply-To: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org>
References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org>
Message-ID: 

On Thu, Oct 3, 2019 at 6:29 PM Thierry Carrez wrote:
>
> OpenStack PTLs, TC members, core reviewers, UC members interested in
> meeting the local community are all welcome. We'll also have leaders
> from the other OSF-supported projects around.
>
Is it possible to include SIG chairs as well? I think it is a good opportunity for people to meet SIGs, and for SIGs to find people and project teams too.

> Thierry Carrez (ttx)
>
--
May The Force of OpenStack Be With You,
Rico Lin
irc: ricolin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jyotishri403 at gmail.com Sat Oct 5 04:41:35 2019
From: jyotishri403 at gmail.com (Jyoti Dahiwele)
Date: Sat, 5 Oct 2019 10:11:35 +0530
Subject: Neutron Dhcp-agent
Message-ID: 

Dear Team,

Please clarify how I can use neutron's dhcp-agent as a relay and use an existing DHCP server for allocation of IPs to instances.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com Sat Oct 5 17:32:30 2019
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Sat, 5 Oct 2019 19:32:30 +0200
Subject: Neutron Dhcp-agent
In-Reply-To: 
References: 
Message-ID: <7DC3F60F-41C2-4418-89A7-634D409AF40B@redhat.com>

Hi,

The Neutron DHCP agent can't be used as a DHCP relay for your network. It doesn't work like that.

> On 5 Oct 2019, at 06:41, Jyoti Dahiwele wrote:
>
> Dear Team,
>
> Please clarify how I can use neutron's dhcp-agent as a relay and use an
> existing DHCP server for allocation of IPs to instances.

— Slawek Kaplonski
Senior software engineer
Red Hat

From akalambu at cisco.com Sat Oct 5 17:34:24 2019
From: akalambu at cisco.com (Ajay Kalambur (akalambu))
Date: Sat, 5 Oct 2019 17:34:24 +0000
Subject: [openstack][heat-cfn] CFN Signaling with heat
Message-ID: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com>

Hi,
I was trying the software deployment/structured deployment feature of heat. I somehow can never get the signaling to work: I see that authentication is happening, but I don't see a POST from the VM, and as a result the stack is stuck in CREATE_IN_PROGRESS.

I see the following messages in my heat-api-cfn log, which seem to suggest authentication is successful, but the VM never seems to POST. I have included debug output from the VM and also the sample heat template I used. I don't know whether the template is correct, as I built it from some online examples.

2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS credentials..
2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials found, checking against keystone.
2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating with http://10.10.173.9:5000/v3/ec2tokens
2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS authentication successful.
2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server [req-506f22c6-4062-4a84-8e85-40317a4099ed - adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 HTTP/1.1" 200 4669 1.418045 Some debugging output from my VM: [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug /var/lib/os-collect-config/local-data not found. Skipping [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase pre-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: pre-configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase pre-configure [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/20-os-apply-config [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf [2019/10/05 05:32:47 PM] [INFO] success dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/55-heat-config [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config 0.345 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose 0.064 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet 0.134 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config 0.065 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- 
[2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase configure [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase post-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed ++ os-apply-config --key completion-handle --type raw --key-default '' + HANDLE= ++ os-apply-config --key completion-signal --type raw --key-default '' + SIGNAL= ++ os-apply-config --key instance-id --type raw --key-default '' + ID=i-0000000d + '[' -n i-0000000d ']' + '[' -n '' ']' + '[' -n '' ']' ++ os-apply-config --key deployments --type raw --key-default '' ++ jq -r 'map(select(.group == "os-apply-config") | select(.inputs[].name == "deploy_signal_id") | .id + (.inputs | map(select(.name == "deploy_signal_id")) | .[].value)) | .[]' + DEPLOYMENTS= + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed completed dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: post-configure.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed 1.206 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase post-configure [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase migration dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: migration.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase migration onfig]# cat /var/run/heat-config/heat-config [{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "other_deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": 
"http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": "f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf [DEFAULT] command = os-refresh-config collectors = ec2 collectors = cfn collectors = local [cfn] metadata_url = http://172.29.85.87:8000/v1/ stack_name = test_stack secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG access_key_id = f7874ac9898248edaae53511230534a4 path = sig-vm-1.Metadata Here is my basic sample temple heat_template_version: 2013-05-23 description: > This template demonstrates how to use OS::Heat::StructuredDeployment to override substitute get_input placeholders defined in OS::Heat::StructuredConfig config. As there is no hook on the server to act on the configuration data, these deployment resource will perform no actual configuration. 
parameters:
  flavor:
    type: string
    default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947'
  image:
    type: string
    default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe
  public_net_id:
    type: string
    default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db
  private_net_id:
    type: string
    description: Private network id
    default: 995fc046-1c58-468a-b81c-e42c06fc8966
  private_subnet_id:
    type: string
    description: Private subnet id
    default: 7598c805-3a9b-4c27-be5b-dca4d89f058c
  password:
    type: string
    description: SSH password
    default: lab123

resources:
  the_sg:
    type: OS::Neutron::SecurityGroup
    properties:
      name: the_sg
      description: Ping and SSH
      rules:
        - protocol: icmp
        - protocol: tcp
          port_range_min: 22
          port_range_max: 22

  config:
    type: OS::Heat::StructuredConfig
    properties:
      config:
        config_value_foo: {get_input: foo}
        config_value_bar: {get_input: bar}

  deployment:
    type: OS::Heat::StructuredDeployment
    properties:
      signal_transport: CFN_SIGNAL
      config:
        get_resource: config
      server:
        get_resource: sig-vm-1
      input_values:
        foo: fooooo
        bar: baaaaa

  other_deployment:
    type: OS::Heat::StructuredDeployment
    properties:
      signal_transport: CFN_SIGNAL
      config:
        get_resource: config
      server:
        get_resource: sig-vm-1
      input_values:
        foo: fu
        bar: barmy

  server1_port0:
    type: OS::Neutron::Port
    properties:
      network_id: { get_param: private_net_id }
      security_groups:
        - default
      fixed_ips:
        - subnet_id: { get_param: private_subnet_id }

  server1_public:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network_id: { get_param: public_net_id }
      port_id: { get_resource: server1_port0 }

  sig-vm-1:
    type: OS::Nova::Server
    properties:
      name: sig-vm-1
      image: { get_param: image }
      flavor: { get_param: flavor }
      networks:
        - port: { get_resource: server1_port0 }
      user_data_format: SOFTWARE_CONFIG
      user_data:
        get_resource: cloud_config

  cloud_config:
    type: OS::Heat::CloudConfig
    properties:
      cloud_config:
        password: { get_param: password }
        chpasswd: { expire: False }
        ssh_pwauth: True
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From zigo at debian.org Sun Oct 6 09:51:11 2019
From: zigo at debian.org (Thomas Goirand)
Date: Sun, 6 Oct 2019 11:51:11 +0200
Subject: Neutron Dhcp-agent
In-Reply-To: 
References: 
Message-ID: <1fc4dc01-b50a-36d9-fb46-9ee412762930@debian.org>

On 10/5/19 6:41 AM, Jyoti Dahiwele wrote:
> Dear Team,
>
> Please clarify how I can use neutron's dhcp-agent as a relay and use an
> existing DHCP server for allocation of IPs to instances.

What Neutron does is set up a dnsmasq instance for each of your subnets, and set up L2 and L3 connectivity in the namespace of this subnet, where the dnsmasq runs. Subnets can be moved (manually) from one DHCP agent to another.

Cheers,

Thomas

From zigo at debian.org Sun Oct 6 09:58:13 2019
From: zigo at debian.org (Thomas Goirand)
Date: Sun, 6 Oct 2019 11:58:13 +0200
Subject: ANNOUNCE: Train packages repository for Debian Buster is now available and tested
Message-ID: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org>

Hi,

It's been a few days already: there are some fully working (and tested) Debian repositories backported to Buster for Train. The URLs use the usual scheme:

deb http://buster-train.debian.net/debian/ buster-train-backports main
deb-src http://buster-train.debian.net/debian/ buster-train-backports main
deb http://buster-train.debian.net/debian/ buster-train-backports-nochange main
deb-src http://buster-train.debian.net/debian/ buster-train-backports-nochange main

Early last week, I was able to test this, doing my first deployment, and starting my first VM on it.
I haven't run tempest on this yet, though my manual tests went well (ie: floating IP, ssh to instance, mounting a cinder volume over Ceph and LVM, etc.). Please do test this, and report any eventual issue. If everything goes as planned, I'll be at the Debian cloud sprint in Boston the week of the release, discussing the Debian official images for the cloud. So I will only be able to upload the final versions of projects for Train only then after (probably, during the week-end, so it's available on Monday). Cheers, Thomas Goirand (zigo) From marcin.juszkiewicz at linaro.org Mon Oct 7 06:10:50 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Mon, 7 Oct 2019 08:10:50 +0200 Subject: ANNOUNCE: Train packages repository for Debian Buster is now available and tested In-Reply-To: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org> References: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org> Message-ID: <2e57ae64-7aa2-c8ec-28fc-1869ffbbc386@linaro.org> W dniu 06.10.2019 o 11:58, Thomas Goirand pisze: > Hi, > > It's been a few days already, there's some fully working (and tested) > Debian repositories backported to Buster for train. The URLs are using > the usual scheme: > > deb http://buster-train.debian.net/debian/ buster-train-backports main > deb-src http://buster-train.debian.net/debian/ buster-train-backports main > deb http://buster-train.debian.net/debian/ > buster-train-backports-nochange main > deb-src http://buster-train.debian.net/debian/ > buster-train-backports-nochange main > > Early last week, I was able to test this, doing my first deployment, and > starting my first VM on it. I haven't run tempest on this yet, though my > manual tests went well (ie: floating IP, ssh to instance, mounting a > cinder volume over Ceph and LVM, etc.). > > Please do test this, and report any eventual issue. If everything goes > as planned, I'll be at the Debian cloud sprint in Boston the week of the > release, discussing the Debian official images for the cloud. So I will > only be able to upload the final versions of projects for Train only > then after (probably, during the week-end, so it's available on Monday). We use them in Kolla project. All images builds fine. Not tested deployment yet. From tbechtold at suse.com Mon Oct 7 07:30:30 2019 From: tbechtold at suse.com (Thomas Bechtold) Date: Mon, 7 Oct 2019 09:30:30 +0200 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> +1 from me, too. On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). 
> Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > From bcafarel at redhat.com Mon Oct 7 08:21:35 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 7 Oct 2019 10:21:35 +0200 Subject: [neutron] Bug deputy report (week starting on 2019-09-30) Message-ID: Hello Neutrinos, train is almost ready to leave the station, and it is time for a new bug deputy rotation cycle! I was on duty last week, triaging bugs up to 1846703 included A quiet week, with most bugs having potential fixes or good discussions. First one listed could benefit from another pair of eyes Undecided: * neutron-openvswitch-agent and IPv6 - https://bugs.launchpad.net/neutron/+bug/1846494 Can not use an IPv6 address for OpenFlow connections listening address (of_listen_address) High: * Pyroute2 can return dictionary keys in bytes instead of strings - https://bugs.launchpad.net/neutron/+bug/1846360 Fix in progress: https://review.opendev.org/686206 * [mysql8] Unknown column 'public' in 'firewall_rules_v2' - https://bugs.launchpad.net/neutron/+bug/1846606 neutron-fwaas db creation failing with mysql 8 Fix in progress: https://review.opendev.org/686753 Medium: * Designate integration not fully multi region safe - https://bugs.launchpad.net/neutron/+bug/1845891 Fix released: https://review.opendev.org/684854 RFE: * routed network for hypervisor - https://bugs.launchpad.net/neutron/+bug/1846285 Proposition to have routed networks separation at hypervisor level directly, apparently already running in-house at bug reporter's Wishlist: * Avoid neutron to return error 500 when deleting port if designate is down - https://bugs.launchpad.net/neutron/+bug/1846703 Another bug for Designate support, port create and delete operations do not react the same when designate is down Some discussions also in https://review.opendev.org/685644 Opinion: * ovs VXLAN over IPv6 conflicts with linux native VXLAN over IPv4 using standard port - https://bugs.launchpad.net/neutron/+bug/1846507 Configuration issue in kolla-ansible CI, ovs-agent and CI configuration competing for IPv6 binding address. Neutron listed for possible insights on the issue Invalid: * ha router appear double vip - https://bugs.launchpad.net/neutron/+bug/1845900 Kolla issue with HA controllers when stopping L3 agent - keepalived processes are in same container and are killed at same time as agent, added Kolla to affected projects * packet loss during active L3 HA agent restart - https://bugs.launchpad.net/neutron/+bug/1846198 Similar issue for openstack-ansible, it kills all processes in control group (including keepalived processes) when restarting the systemd unit - added OSA as affected project Thanks! Passing the deputy role to slaweq -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Mon Oct 7 08:38:05 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 10:38:05 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: Eric Fried wrote: >> It would be good to have a rough idea of who will be available at each >> opportunity. To keep it simple, I created a sign-up sheet at: >> >> https://etherpad.openstack.org/p/meet-the-project-leaders > > If a PTL will not be present, is it acceptable to send a delegate? Sure! The goal is to provide an opportunity for the Chinese community to meet project team members, not to make it an exclusive event. Anyone's welcome. + we should use those opportunities to promote the on-boarding sessions which will happen later in the week. -- Thierry From thierry at openstack.org Mon Oct 7 08:39:13 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 10:39:13 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Message-ID: Rico Lin wrote: > On Thu, Oct 3, 2019 at 6:29 PM Thierry Carrez > wrote: > > > > > OpenStack PTLs, TC members, core reviewers, UC members interested in > > meeting the local community are all welcome. We'll also have leaders > > from the other OSF-supported projects around. > > > Is it possible to include SIG chairs as well? > I think it is a good opportunity for people to meet SIGs and SIGs to > find people and project teams too. Yes, of course (see my other response for rationale). -- Thierry Carrez (ttx) From a.settle at outlook.com Mon Oct 7 08:40:01 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Mon, 7 Oct 2019 08:40:01 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191004180712.323nlymaxedoib54@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > > It would be good to have a rough idea of who will be available > > > at each opportunity. To keep it simple, I created a sign-up > > > sheet at: > > > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > > > If a PTL will not be present, is it acceptable to send a delegate? > > The goal, as I understand it, is to reinforce to attendees in China > that OpenStack project leadership is accessible and achievable, by > providing opportunities for them to be able to meet and speak > in-person with a representative cross-section of our community > leaders. Is that something which can be delegated? Seems to me it > might convey the opposite of what's intended, but I don't know if my > impression is shared by others. That was indeed my intention with the initial idea proposal. Conceptually, the meetup would be to break down any preconceived notions that individuals may have. Of course, that isn't to say that there aren't many leaders in the community that don't hold an _official_ position. It's mostly to put a face to a name, to create open communication channels. I'd say it's up the the team's discretion as to whether or not they'd like to delegate the presence at this meetup. 
This meetup is not compulsory for anyone, so if you can't go, and can't delegate, that is also fine. -- Alexandra Settle IRC: asettle From mark at stackhpc.com Mon Oct 7 08:55:18 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 7 Oct 2019 09:55:18 +0100 Subject: [kolla] Feature freeze Message-ID: Hello Koalas, We are now in feature freeze for the Train release. Cores, please do not approve feature patches on the master branch until we have created the stable/train branch. We will allow some exceptions which must be approved by the core team. Currently, we have nova cells support and IPv6-only mode. Please apply for feature freeze exceptions either on openstack-discuss or during the weekly IRC meeting. The deadline for merging features with exceptions is Friday 18th October. Please now focus on bug fixing and testing. Thanks, Mark From bluejay.ahn at gmail.com Mon Oct 7 10:36:50 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Mon, 7 Oct 2019 19:36:50 +0900 Subject: [Airship-discuss] Fwd: OOK,Airship In-Reply-To: <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk> References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org> <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk> Message-ID: Hi Charles, As briefly mentioned in the previous email, SKT is running OOK in several productions: SKT's LTE/5G NSA infrastructure for a certain VNF (Virtualized Network Function), Private Cloud, Cloud infrastructure for VDI. SKT started navigating OOK in late 2016 exactly because of "bumpy experience due to issues with configuration maintenance /upgrade". We got very lucky to work with AT&T from the beginning both on openstack-helm and airship-armada. SKT now has a slightly different technology set from Airship, we have ansible+ironic+kubeadm+airship-armada+openstack-helm. You can see all the code and information from the following link. We opened our codebase in July (we call it "taco: skt all container openstack). - https://github.com/openinfradev - https://github.com/openinfradev/tacoplay In addtion, we have a concrete plan to develop "2nd generation of ook" that will be very similar to what Airship 2.0 look like. We will work with Airship community on this route. I hope it help your research on ook option. You can always ask me any question on this topic. I will be happy to help you. FYI, here is a presentation about what we did. - https://www.openstack.org/videos/summits/berlin-2018/you-can-start-small-and-grow-sk-telecoms-use-case-on-armada Thanks! Thanks. 2019년 10월 3일 (목) 오후 9:14, Charles 님이 작성: > Hi Roman, > > > Many thanks for the reply. > > I posted this on openstack-discuss because I was wondering if any > users/Openstack operators out there (outside large corporations who are > members of the Airship development framework) are actually running OOK > in production. This could be Airship, or some other Kubernetes > distribution running Openstack Helm. > > Our several years experience of managing Openstack so far > (RHOSP/TripleO) has been bumpy due to issues with configuration > maintenance /upgrades. The idea of using CI/CD and Kubernetes/Helm to > manage Openstack is compelling and fits nicely into the DevOps framework > here. If we were to explore this route we could 'roll our own' with a > deployment say based on https://opendev.org/airship/treasuremap , or pay > for and Enterprise solution that incorporates the OOK model (upcoming > Mirantis and SUSE potentially). > > Regards > > Charles > > > > > > On 03/10/2019 12:04, Roman Gorshunov wrote: > > Thanks Ashlee! 
> > > > Charles, > > A few companies who work on development of Airship do use it, > > including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom > > and others. Many of those companies (if not all) use Airship + > > OpenStack Helm as well. > > > > Airship, as you have mentioned, is a collection of components for > > undercloud control plane, which helps to deploy nodes with > > OS+Docker+Kubernetes on it, configure/manage it all in GitOps way, and > > then help to maintain the configuration. It also allows to manage > > deploys and maintenance of whatever runs on top of Kubernetes cluster, > > would that be OpenStack Helm or other software packaged in Helm > > format. > > > > OpenStack Helm does not really require to be running on > > Airship-managed cluster. It could run standalone. > > > > Yes, you can roll out an open source production grade > > Airship/Openstack Helm deployment today. Good example of production > > grade configuration could be found in airship/treasuremap repository > > [0] as 'seaworthy' site definition. You are welcome to try, of course. > > For the questions - reach out to us on IRC #airshipit at Freenode of via > > Airship-discuss mailing list. > > > > [0] https://opendev.org/airship/treasuremap > > > > Best regards, > > -- > > Roman Gorshunov > > > > On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson > wrote: > >> Hi Charles, > >> > >> Glad to hear you’re interested! Forwarding this to the Airship ML since > there may be folks on this mailing list that will have pointers who didn't > see the openstack-discuss post. > >> > >> Ashlee > >> > >> > >> > >> Begin forwarded message: > >> > >> From: Charles > >> Subject: OOK,Airship > >> Date: October 2, 2019 at 5:39:16 PM GMT+2 > >> To: openstack-discuss at lists.openstack.org > >> > >> Hi, > >> > >> > >> We are interested in OOK and Openstack Helm. > >> > >> Has anyone any experience with Airship (now that 1.0 is out)? > >> > >> Noticed that a few Enterprise distributions are looking at managing the > Openstack control plane with Kubernetes and have been testing Airship with > a view to rolling it out (Mirantis,SUSE) > >> > >> Is this a signal that there is momentum around Openstack Helm? > >> > >> Is it possible to roll out an open source production grade > Airship/Openstack Helm deployment today, or is it too early? > >> > >> > >> Thoughts? > >> > >> > >> Charles > >> > >> > >> > >> > >> _______________________________________________ > >> Airship-discuss mailing list > >> Airship-discuss at lists.airshipit.org > >> http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss > > -- > Charles Short > Senior Cloud Engineer > EMBL-EBI > Hinxton > 01223494205 > > > _______________________________________________ > Airship-discuss mailing list > Airship-discuss at lists.airshipit.org > http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From no-reply at openstack.org Mon Oct 7 12:01:48 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 12:01:48 -0000 Subject: octavia 5.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for octavia for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/octavia/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. 
You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/octavia/src/branch/stable/train Release notes for octavia can be found at: https://docs.openstack.org/releasenotes/octavia/ If you find an issue that could be considered release-critical, please file it at: https://storyboard.openstack.org/#!/project/908 and tag it *train-rc-potential* to bring it to the octavia release crew's attention. From no-reply at openstack.org Mon Oct 7 12:03:51 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 12:03:51 -0000 Subject: storlets 4.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for storlets for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/storlets/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/storlets/src/branch/stable/train Release notes for storlets can be found at: https://docs.openstack.org/releasenotes/storlets/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/storlets/+bugs and tag it *train-rc-potential* to bring it to the storlets release crew's attention. From fungi at yuggoth.org Mon Oct 7 13:36:42 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Oct 2019 13:36:42 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: <20191007133641.f4q2ylxckr362pop@yuggoth.org> On 2019-10-07 08:40:01 +0000 (+0000), Alexandra Settle wrote: > On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: [...] > > > If a PTL will not be present, is it acceptable to send a > > > delegate? > > > > The goal, as I understand it, is to reinforce to attendees in > > China that OpenStack project leadership is accessible and > > achievable, by providing opportunities for them to be able to > > meet and speak in-person with a representative cross-section of > > our community leaders. Is that something which can be delegated? > > Seems to me it might convey the opposite of what's intended, but > > I don't know if my impression is shared by others. > > That was indeed my intention with the initial idea proposal. > Conceptually, the meetup would be to break down any preconceived > notions that individuals may have. > > Of course, that isn't to say that there aren't many leaders in the > community that don't hold an _official_ position. It's mostly to > put a face to a name, to create open communication channels. I'd > say it's up the the team's discretion as to whether or not they'd > like to delegate the presence at this meetup. Of course, I should have clarified. I think providing folks the opportunity to meet and speak with a Nova core reviewer is great. It's definitely a type of leadership we prize highly in our community and want to encourage more of. 
Being "the person who showed up on behalf of the Nova PTL because they're not present" doesn't really make the Nova PTL position any more approachable on the other hand. If anything, it seems to me that it might reinforce the impression it's a distant and unachievable position. > This meetup is not compulsory for anyone, so if you can't go, and > can't delegate, that is also fine. Yep, I think having a variety of different sorts of community leaders present is what's needed, it doesn't have to (and realistically, probably can't anyway?) involve every one of the ~hundred teams, SIGs, and other organized groups within the community. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From corey.bryant at canonical.com Mon Oct 7 13:58:04 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Mon, 7 Oct 2019 09:58:04 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:48 AM Corey Bryant wrote: > One other issue is "pxc-strict-mode: disabled" for percona-cluster is > required to test this. /usr/share/placement/mysql-migrate-db.sh may need > some updates but I haven't dug into that yet. > > I have a review up for this issue now at: https://review.opendev.org/#/c/687056/ Thanks, Corey > On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant > wrote: > >> Hi All, >> >> I'd like to see if I can get some input on the current state of the >> Placement API split. >> >> For some background, the nova placement API was removed from nova in >> train, and it's been split into its own project. It's mostly just a basic >> API charm. The tricky part is the migration of tables from the nova_api >> database to the placement database. >> >> Code is located at: >> https://github.com/coreycb/charm-placement >> https://github.com/coreycb/charm-interface-placement >> >> https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) >> >> Test scenarios I've been testing with: >> 1) deploy nova-cc et al train, configure keystonev3, deploy instance >> 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, >> deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy >> instance 3 >> >> There is currently an issue with the second test scenario where instance >> 2 creation errors because nova-scheduler can't find a valid placement >> candidate (not sure of the exact error atm). However if I delete instance 1 >> before creating instance 2 it is created successfully. It feels like a DB >> related issue but I'm really not sure so I'll keep digging. >> >> Thanks! >> Corey >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Oct 7 14:24:59 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 7 Oct 2019 16:24:59 +0200 Subject: [oslo] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late Message-ID: Hi, I request a late feature freeze exception (FFE) for https://review.opendev.org/#/c/686598/ and https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will fix an issue that can be blocking for users so it's can be really valuable for operators, if we release it ASAP. They would be delighted if it were included in Train. Please let me know if you have any concerns or questions. Thank you for your consideration. 
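As a quick illustration for reviewers (a minimal sketch only, not the actual oslo.utils patch; the sample output strings and the regex below are my own assumptions): QEMU 4.1 switched the human-readable "virtual size" line to IEC units, so code parsing the "qemu-img info" output needs to cope with both forms. Keying on the parenthesized byte count is one way to handle both:

import re

# Older qemu-img releases print, e.g.:  virtual size: 1.0G (1073741824 bytes)
# QEMU 4.1 and later print, e.g.:       virtual size: 1 GiB (1073741824 bytes)
# Relying on the parenthesized byte count works for both forms.
VIRT_SIZE_RE = re.compile(r"virtual size:.*\((\d+) bytes\)")

def virtual_size_bytes(qemu_img_info_output):
    match = VIRT_SIZE_RE.search(qemu_img_info_output)
    if match is None:
        raise ValueError("could not parse virtual size")
    return int(match.group(1))

print(virtual_size_bytes("virtual size: 1.0G (1073741824 bytes)"))   # pre-4.1 style
print(virtual_size_bytes("virtual size: 1 GiB (1073741824 bytes)"))  # 4.1+ style

The linked review is of course the authoritative fix; the snippet above is only meant to show why parsers of the human-readable output need updating for QEMU 4.1.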
Hervé -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From no-reply at openstack.org Mon Oct 7 14:40:21 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 14:40:21 -0000 Subject: cinder 15.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for cinder for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/cinder/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/cinder/src/branch/stable/train Release notes for cinder can be found at: https://docs.openstack.org/releasenotes/cinder/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/cinder/+bugs and tag it *train-rc-potential* to bring it to the cinder release crew's attention. From luka.peschke at objectif-libre.com Mon Oct 7 14:53:28 2019 From: luka.peschke at objectif-libre.com (Luka Peschke) Date: Mon, 07 Oct 2019 16:53:28 +0200 Subject: [cloudkitty] 07/10 IRC meeting recap Message-ID: <1b49a519ea12fb979e4cc688506a5e7c@objectif-libre.com> Hello everybody, This is the recap for today's IRC meeting of the cloudkitty team. The agenda can be found at [1] and the logs can be found at [2]. cloudkitty 11.0.0 and python-cloudkittyclient 3.1.0 =================================================== Cloudkitty 11.0.0 has been released on september 25th. If no critical bug is reported, it will be final release for the train cycle. The release notes for the train cycle can be found at [3]. New meeting schedule ==================== As discussed, the cloudkitty IRC meeting will now happen on the 1st and 3rd monday of each month at 14h00 UTC. This time has been chosen because cloudkitty's main contributors are split between montreal and france. Of course, if anyone from an incompatible timezone would like to take part in the meetings, we can re-adjust the schedule. From now on, we'll provide a recap to the ML after each meeting. New features / specs / projects =============================== First, I'd like to welcome our two new contributors, Quentin Anglade (qanglade) and Julien Pinchelimouroux (julien-pinchelim). Quentin has been working on porting some v1 API endpoints to v2, more specifically the ones used for rating module configuration (/v1/rating/modules). These endpoints will be included in the Ussuri version. 
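For anyone who has not used these endpoints, here is a rough idea of what is being ported (illustrative only: the host, port and token below are placeholders, and the final v2 path is not settled yet):

import requests

CLOUDKITTY_API = "http://controller:8889"        # placeholder endpoint
HEADERS = {"X-Auth-Token": "<keystone token>"}   # placeholder token

# Current v1 endpoint for rating module configuration
resp = requests.get(CLOUDKITTY_API + "/v1/rating/modules", headers=HEADERS)
print(resp.json())

# Once the port lands, the equivalent should live under the v2 API, e.g.
# (exact path to be confirmed when the patches merge):
# requests.get(CLOUDKITTY_API + "/v2/rating/modules", headers=HEADERS)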
Julien is working on a standalone dashboard for cloudkitty. It will be compatible with the standalone mode, but will also support keystone integration. It should provide a more modern and easier to use interface than the cloudkitty-dashboard horizon plugin. It will require the v2 API to work. I've been busy with some improvements to v2 API performance, in particular regarding driver loading. The spec can be found at [4]. Tempest plugin ============== Justin (jferrieu) has been working on the tempest plugin. Now that the Elasticsearch v2 storage driver is supported in devstack, we plan to add a lot more tests and some complete scenarios. Some of Justin's work on differenciating v1 and v2 API tempest tests can be found at [5]. The next meeting will happen on October 21st at 14h00 UTC. Cheers, -- Luka Peschke (peschk_l) [1] https://etherpad.openstack.org/p/cloudkitty-meeting-topics [2] http://eavesdrop.openstack.org/meetings/cloudkitty/2019/cloudkitty.2019-10-07-14.00.log.html [3] https://docs.openstack.org/releasenotes/cloudkitty/train.html [4] https://review.opendev.org/#/c/686391/ [5] https://review.opendev.org/#/c/686210/ From openstack at nemebean.com Mon Oct 7 15:32:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 10:32:35 -0500 Subject: [oslo][release][requirements] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late In-Reply-To: References: Message-ID: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> Tagging with release and requirements as they need to sign off on this. On 10/7/19 9:24 AM, Herve Beraud wrote: > Hi, > > I request a late feature freeze exception (FFE) for > https://review.opendev.org/#/c/686598/ and > https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 > -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will > fix an issue that can be blocking for users so it's can be really > valuable for operators, if we release it ASAP. They would be delighted > if it were included in Train. I guess I'll reiterate my question from the review: Does this need to be in the initial Train release or can we backport it immediately after? Since qemu 4.1.0 released during the Train cycle I would argue that it's fair to backport patches to support it (I'm less sure about the stein patch, but that's a separate topic). If there are consumers of OpenStack who will take the initial Train release and not any subsequent bugfix releases then that would suggest we need to do this now, but I can't imagine anyone locks themselves into the .0 release of a piece of software and refuses to take any bug fixes after that. I'm open to being persuaded otherwise though. > > Please let me know if you have any concerns or questions. Thank you for > your consideration. 
> > Hervé > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From openstack at nemebean.com Mon Oct 7 15:44:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 10:44:04 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older Message-ID: Hi, This is related to the FFE for train, but I wanted to discuss it separately because I think the circumstances are a bit different. Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear to me that backporting bug fixes for it is valid. The original author of the patch actually wants it for Rocky, which is basically in the same situation as Stein. I should note he's willing to carry the patch downstream if necessary. On the one hand, it sounds like this is something at least one operator wants, but on the other I'm not sure the stable policy supports backporting patches to support a version of a dependency that didn't exist when the release was initially cut. I'm soliciting opinions on how to proceed here. Reference: https://review.opendev.org/#/c/686532 Thanks. -Ben From mthode at mthode.org Mon Oct 7 15:49:56 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 10:49:56 -0500 Subject: [oslo][release][requirements] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late In-Reply-To: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> References: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> Message-ID: <20191007154956.lukimg63dti4kdt5@mthode.org> On 19-10-07 10:32:35, Ben Nemec wrote: > Tagging with release and requirements as they need to sign off on this. > > On 10/7/19 9:24 AM, Herve Beraud wrote: > > Hi, > > > > I request a late feature freeze exception (FFE) for > > https://review.opendev.org/#/c/686598/ and https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 > > -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will > > fix an issue that can be blocking for users so it's can be really > > valuable for operators, if we release it ASAP. They would be delighted > > if it were included in Train. > > I guess I'll reiterate my question from the review: Does this need to be in > the initial Train release or can we backport it immediately after? Since > qemu 4.1.0 released during the Train cycle I would argue that it's fair to > backport patches to support it (I'm less sure about the stein patch, but > that's a separate topic). 
If there are consumers of OpenStack who will take > the initial Train release and not any subsequent bugfix releases then that > would suggest we need to do this now, but I can't imagine anyone locks > themselves into the .0 release of a piece of software and refuses to take > any bug fixes after that. I'm open to being persuaded otherwise though. > > > > > Please let me know if you have any concerns or questions. Thank you for > > your consideration. > > > > Hervé > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > Given that this is a backwards compatible change I think it's fine. https://github.com/openstack/oslo.utils/compare/3.41.1...89bccdee95f81ddb54b427d6af172bb987fd7545 the above link shows that this is the only commit (that's code related) as well so no issues here. The only thing we'll need to make sure of is to cherry-pick the requirements update into master (like was just done with the tempest release). -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From thierry at openstack.org Mon Oct 7 16:02:38 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 18:02:38 +0200 Subject: [Release-job-failures] Tag of openstack/cinder for ref refs/tags/15.0.0.0rc2 failed In-Reply-To: References: Message-ID: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org> zuul at openstack.org wrote: > Build failed. > > - publish-openstack-releasenotes-python3 https://zuul.opendev.org/t/openstack/build/965908bbf69141c393d4728f7de07f7d : POST_FAILURE in 29m 45s Looks like a transient failure Collect sphinx build html: ssh: connect to host 162.242.237.111 port 22: Connection timed out rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] Collect artifacts: ssh: connect to host 162.242.237.111 port 22: Connection timed out rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] Release notes should be picked up at the next RC or the final, so no need to retry/reenqueue? -- Thierry Carrez (ttx) From mthode at mthode.org Mon Oct 7 16:07:03 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 11:07:03 -0500 Subject: [all][requirements] requirements branched train - cycle-trailing are on notice that master is now ussuri Message-ID: <20191007160703.lzyan2777owo4cbw@mthode.org> Just a friendly ping that the train keeps rolling. Master is now ussuri. 
https://releases.openstack.org/constraints/upper/ussuri should work soon for those that want to switch their install_command in tox.ini within master earlier in the cycle (rather than having things pile up). I'll email again once it's working. cycle-trailing projects are on notice that if they track master, the dependencies may change from what they are currently working on (train). If you have any questions please let me know (in the #openstack-requirements channel preferably). P.S. FFE season is over -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From fungi at yuggoth.org Mon Oct 7 16:31:19 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Oct 2019 16:31:19 +0000 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: Message-ID: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: [...] > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > to me that backporting bug fixes for it is valid. The original > author of the patch actually wants it for Rocky [...] Neither the changes nor the bug report indicate what the motivation is for supporting newer Qemu with (much) older OpenStack. Is there some platform which has this Qemu behavior on which folks are trying to run Rocky? Or is it a homegrown build combining these dependency versions from disparate time periods? Or maybe some other reason I'm not imagining? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gouthampravi at gmail.com Mon Oct 7 17:00:41 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 7 Oct 2019 10:00:41 -0700 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191007133641.f4q2ylxckr362pop@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> <20191007133641.f4q2ylxckr362pop@yuggoth.org> Message-ID: On Mon, Oct 7, 2019 at 6:40 AM Jeremy Stanley wrote: > On 2019-10-07 08:40:01 +0000 (+0000), Alexandra Settle wrote: > > On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > > > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > [...] > > > > If a PTL will not be present, is it acceptable to send a > > > > delegate? > > > > > > The goal, as I understand it, is to reinforce to attendees in > > > China that OpenStack project leadership is accessible and > > > achievable, by providing opportunities for them to be able to > > > meet and speak in-person with a representative cross-section of > > > our community leaders. Is that something which can be delegated? > > > Seems to me it might convey the opposite of what's intended, but > > > I don't know if my impression is shared by others. > > > > That was indeed my intention with the initial idea proposal. > > Conceptually, the meetup would be to break down any preconceived > > notions that individuals may have. > > > > Of course, that isn't to say that there aren't many leaders in the > > community that don't hold an _official_ position. It's mostly to > > put a face to a name, to create open communication channels. 
I'd > > say it's up the the team's discretion as to whether or not they'd > > like to delegate the presence at this meetup. > > Of course, I should have clarified. I think providing folks the > opportunity to meet and speak with a Nova core reviewer is great. > It's definitely a type of leadership we prize highly in our > community and want to encourage more of. Being "the person who > showed up on behalf of the Nova PTL because they're not present" > doesn't really make the Nova PTL position any more approachable on > the other hand. If anything, it seems to me that it might reinforce > the impression it's a distant and unachievable position. > Sure hope it doesn't. I support the concept behind this and would love to be there, but cannot, because I'm unable to travel to Shanghai. Many of the PTL-driven tasks at the event for Manila have been delegated to project maintainers that are attending. I would like to find a suitable lead for this too, along with encouraging all core reviewers that are in attendance to be part of these events: the mixer and the lunch. > > > This meetup is not compulsory for anyone, so if you can't go, and > > can't delegate, that is also fine. > > Yep, I think having a variety of different sorts of community > leaders present is what's needed, it doesn't have to (and > realistically, probably can't anyway?) involve every one of the > ~hundred teams, SIGs, and other organized groups within the > community. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Mon Oct 7 17:09:27 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 7 Oct 2019 10:09:27 -0700 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey Alex, So since the TC stuff is Friday we managed to shuffle things around and now docs has the afternoon on Thursday. We will get the final schedule up on the website soon. -Kendall (diablo_rojo) On Thu, Oct 3, 2019 at 9:32 AM Kendall Waters wrote: > Hey Alex, > > We still have tables available on Friday. Would half a day on Friday work > for the docs team? Unless Ian is okay with it, we can combine Docs with > i18n in their Wednesday afternoon/Thursday morning slot. Just let me know! > > Cheers, > Kendall > > > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Oct 3, 2019, at 4:26 AM, Alexandra Settle wrote: > > Hey, > > Could you add something for docs? Or combine with i18n again if Ian > doesn't mind? > > We don't need a lot, just a room for people to ask questions about the > future of the docs team. > > Stephen will be there, as co-PTL. There's 0 chance of it not > conflicting with nova. > > Please :) > > Thank you! > > Alex > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed > schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads > (PTLs, SIG leads...) mentioned in their PTG survey responses, > although there was no perfect solution that would avoid all conflicts > especially when the event is three-ish days long and we have over 40 > teams meeting. > > If there are critical conflicts we missed or other issues, please let > us know, by October 6th at 7:00 UTC! 
> > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu > le.png > > -- > Alexandra Settle > IRC: asettle > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Mon Oct 7 18:02:32 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Mon, 7 Oct 2019 13:02:32 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals Message-ID: Hi Stackers, I want to request the inclusion of the support for Neutron Routed Networks in the Nova goals for the Ussuri cycle. As many of you might know, Routed networks is a feature in Neutron that enables the creation of very large virtual networks that avoids the performance penalties of large L2 broadcast domains ( https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks). This functionality can be very helpful for large deployers who have the need to have one or a few large virtual networks shared by all their users and has been available in Neutron since very soon after the Barcelona Summit in 2016. But it is really useless until there is code in Nova that can schedule VMs to compute hosts based on the segments topology of the routed networks. Without it, VMs can land in compute hosts where their traffic cannot be routed by the underlying network infrastructure. I would like the Nova team to consider the following when making a decision about this request: 1. Work for Routed Networks was approved as a priority for the Ocata cycle, although it wasn't concluded: https://specs.openstack.org/openstack/nova-specs/priorities/ocata-priorities.html#network-aware-scheduling and https://specs.openstack.org/openstack/nova-specs/specs/pike/index.html 2. The are several large deployers that need this feature. Verizon Media, my employer, is one of them. Others that come to mind include GoDaddy and Cern. And I am sure there are others. 3. There is a WIP patch to implement the functionality some of the functionality: https://review.opendev.org/#/c/656885. We, at Verizon Media, are proposing to take over this work and finish its implementation by the end of U. What we are requesting is Nova core reviewers bandwidth to help us merge the code I will be attending the PTG in Shanghai and will make myself available to discuss this further in person any day and any time. Hopefully, we can get this feature lined up very soon Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Oct 7 18:16:39 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Oct 2019 19:16:39 +0100 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: On Mon, 2019-10-07 at 16:31 +0000, Jeremy Stanley wrote: > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > [...] > > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > > to me that backporting bug fixes for it is valid. The original > > author of the patch actually wants it for Rocky > > [...] > > Neither the changes nor the bug report indicate what the motivation > is for supporting newer Qemu with (much) older OpenStack. Is there > some platform which has this Qemu behavior on which folks are trying > to run Rocky? Or is it a homegrown build combining these dependency > versions from disparate time periods? 
Or maybe some other reason I'm > not imagining? i suspect the motivation is the fact that distos like RHEL often bump qemu and libvirt versions in minor releases. so if you deploy Queens on say rhel 7.5 orignally but you upgraged it to rhel 7.7 over time you would end up running with a qemu/libvirt that may not have existed when queens was released. when qemu has broken its public api in the past and that change in behavior has been addressed in later openstack release disto have often had to backport that fix to an openstack that was release before that depency existed. this depends on the distro. canonical for example package qemu and ovs in the ubuntu cloud archive for each given release i belive so you can go form 18.04.0 to 18.04.1 and know it wont break your openstack install but on rhel QEMU and kvm are owned by a sperate team and layered prodcut like openstack consume the output of that team which follow the RHEL release cycle not the openstack one. so i expect this to vary per distro. when a change is backportable upstream that is obviosly perferable. i dont actully think this need to be fixed in Train GA if a oslo release is done promptly that can be consumed instead. i expect this to get backported downs stream anyway so if we can avoid multiple distros doing that and backport it upstream give it backward compatibale it think that would be preferable. just my 2 cents From smooney at redhat.com Mon Oct 7 18:45:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Oct 2019 19:45:54 +0100 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: References: Message-ID: <51911abf0b59bf482719d25a9b7c370931db981d.camel@redhat.com> On Mon, 2019-10-07 at 13:02 -0500, Miguel Lavalle wrote: > Hi Stackers, > > I want to request the inclusion of the support for Neutron Routed Networks > in the Nova goals for the Ussuri cycle. +1 > As many of you might know, Routed > networks is a feature in Neutron that enables the creation of very large > virtual networks that avoids the performance penalties of large L2 > broadcast domains ( > https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks). > This functionality can be very helpful for large deployers who have the > need to have one or a few large virtual networks shared by all their users > and has been available in Neutron since very soon after the Barcelona > Summit in 2016. But it is really useless until there is code in Nova that > can schedule VMs to compute hosts based on the segments topology of the > routed networks. Without it, VMs can land in compute hosts where their > traffic cannot be routed by the underlying network infrastructure. I would > like the Nova team to consider the following when making a decision about > this request: > > 1. Work for Routed Networks was approved as a priority for the Ocata > cycle, although it wasn't concluded: > https://specs.openstack.org/openstack/nova-specs/priorities/ocata-priorities.html#network-aware-scheduling > and > https://specs.openstack.org/openstack/nova-specs/specs/pike/index.html > 2. The are several large deployers that need this feature. Verizon > Media, my employer, is one of them. Others that come to mind include > GoDaddy and Cern. And I am sure there are others. > 3. There is a WIP patch to implement the functionality some of the > functionality: https://review.opendev.org/#/c/656885. 
We, at Verizon > Media, are proposing to take over this work and finish its implementation > by the end of U. What we are requesting is Nova core reviewers bandwidth to > help us merge the code > for context of other and to qualify for myself, the main work to "support this in nova" is related to the schduler and placement. if i remember correctly we prviosly discussed the idea of modeling the subnet/segment affinity between compute hosts and routed ip subents as placement aggreates that woudl be create by neutron. the available number of ips in each routed subnet would be modelled as an inventory of ips in a shareing resouce provider. during spawn when nova retrieves the port info from the precreated neturon port, neutron would pass a resources request for an ip and a aggreage using the existing resouce requests mechanism that was introduced for bandwith aware schduleing. nova then jsut need to merge the aggreage and ip requst with the other request form the port,flavor,image when it queries placment to ensure that the returned hosts are connected to the correct routed segment. > I will be attending the PTG in Shanghai and will make myself available to > discuss this further in person any day and any time. Hopefully, we can get > this feature lined up very soon i wont be at the PTG but i do think this i quite valuable as today the only way to use routed networks today is to ip_allocation=defer which means you cannot chose the ports ip ahead of time. also because we dont schduler based on compute host to segment affinity today it is not safe to live migrate, cold migrate, resize or shelve instance with routed networks today as it coudl fail due to the segment being unreachable on the selected host. if we finish move operation for ports with resouce requests which we need for port with minium bandwith then it will fix all of the above for routed networks too. > > Best regards > > Miguel From rosmaita.fossdev at gmail.com Mon Oct 7 19:42:13 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 7 Oct 2019 15:42:13 -0400 Subject: [Release-job-failures] Tag of openstack/cinder for ref refs/tags/15.0.0.0rc2 failed In-Reply-To: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org> References: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org> Message-ID: On 10/7/19 12:02 PM, Thierry Carrez wrote: > zuul at openstack.org wrote: >> Build failed. >> >> - publish-openstack-releasenotes-python3 >> https://zuul.opendev.org/t/openstack/build/965908bbf69141c393d4728f7de07f7d >> : POST_FAILURE in 29m 45s > > Looks like a transient failure > > Collect sphinx build html: > ssh: connect to host 162.242.237.111 port 22: Connection timed out > rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] > rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] > > Collect artifacts: > ssh: connect to host 162.242.237.111 port 22: Connection timed out > rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] > rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] > > Release notes should be picked up at the next RC or the final, so no > need to retry/reenqueue? > That sounds OK to me. 
From openstack at nemebean.com Mon Oct 7 19:43:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 14:43:04 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: On 10/7/19 11:31 AM, Jeremy Stanley wrote: > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > [...] >> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >> to me that backporting bug fixes for it is valid. The original >> author of the patch actually wants it for Rocky > [...] > > Neither the changes nor the bug report indicate what the motivation > is for supporting newer Qemu with (much) older OpenStack. Is there > some platform which has this Qemu behavior on which folks are trying > to run Rocky? Or is it a homegrown build combining these dependency > versions from disparate time periods? Or maybe some other reason I'm > not imagining? > In addition to the downstream reasons Sean mentioned, Mark (the original author of the patch) responded to my question on the train backport with this: """ Today, I need it in Rocky. But, I'm find to do local patching. Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu 4.1.0 is that this is the first release of Qemu to include proper support for migration of L1 guests that have L2 guests (nVMX / nested KVM). So, I expect it is pretty important to whoever realizes this, and whoever needs this. """ So basically a desire to use a feature of the newer qemu with older openstack, which is why I'm questioning whether this fits our stable policy. My inclination is to say it's a fairly simple, backward-compatible patch that will make users' lives easier, but I also feel like doing a backport to enable a feature, even if the actual patch is a "bugfix", is violating the spirit of the stable policy. From kennelson11 at gmail.com Mon Oct 7 19:53:04 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 7 Oct 2019 12:53:04 -0700 Subject: [all] Final PTG Schedule Message-ID: Hello Everyone! After a few weeks of shuffling and changes, we have a final schedule! It can be seen here on the 'Schedule' tab[1]. -Kendall (diablo_rojo) [1] https://www.openstack.org/PTG -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Mon Oct 7 20:00:27 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 15:00:27 -0500 Subject: [all][requirements] requirements branched train - cycle-trailing are on notice that master is now ussuri In-Reply-To: <20191007160703.lzyan2777owo4cbw@mthode.org> References: <20191007160703.lzyan2777owo4cbw@mthode.org> Message-ID: <20191007200027.cvdt754ftnncdz4u@mthode.org> On 19-10-07 11:07:03, Matthew Thode wrote: > Just a friendly ping that the train keeps rolling. > > Master is now ussuri. > https://releases.openstack.org/constraints/upper/ussuri should work soon > for those that want to switch their install_command in tox.ini within > master earlier in the cycle (rather than having things pile up). I'll > email again once it's working. > > cycle-trailing projects are on notice that if they track master, the > dependencies may change from what they are currently working on (train). > > If you have any questions please let me know (in the > #openstack-requirements channel preferably). > > P.S. 
FFE season is over > https://releases.openstack.org/constraints/upper/ussuri now works -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From smooney at redhat.com Mon Oct 7 20:08:19 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Oct 2019 21:08:19 +0100 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: > > On 10/7/19 11:31 AM, Jeremy Stanley wrote: > > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > > [...] > > > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > > > to me that backporting bug fixes for it is valid. The original > > > author of the patch actually wants it for Rocky > > > > [...] > > > > Neither the changes nor the bug report indicate what the motivation > > is for supporting newer Qemu with (much) older OpenStack. Is there > > some platform which has this Qemu behavior on which folks are trying > > to run Rocky? Or is it a homegrown build combining these dependency > > versions from disparate time periods? Or maybe some other reason I'm > > not imagining? > > > > In addition to the downstream reasons Sean mentioned, Mark (the original > author of the patch) responded to my question on the train backport with > this: > > """ > Today, I need it in Rocky. But, I'm find to do local patching. > > Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu > 4.1.0 is that this is the first release of Qemu to include proper > support for migration of L1 guests that have L2 guests (nVMX / nested > KVM). So, I expect it is pretty important to whoever realizes this, and > whoever needs this. > """ > > So basically a desire to use a feature of the newer qemu with older > openstack, which is why I'm questioning whether this fits our stable > policy. My inclination is to say it's a fairly simple, > backward-compatible patch that will make users' lives easier, but I also > feel like doing a backport to enable a feature, even if the actual patch > is a "bugfix", is violating the spirit of the stable policy. in many distros the older qemus allow migration of the l1 guest eventhouhg it is unsafe to do so and either work by luck or the vm will curput its memroy and likely crash. the context of the qemu issue is for years people though that live migration with nested virt worked, then it was disabeld upstream and many distos reverted that as it would break there users where they got lucky and it worked, and in 4.1 it was fixed. this does not add or remvoe any functionality in openstack nova will try to live migarte if you tell it too regardless of the qemu it has it just will fail if the live migration check was complied in. similarly if all your images did not have fractional sizes you could use 4.1.0 with older oslo releases and it would be fine. i.e. you could get lucky and for your specific usecase this might not be needed but it would be nice not do depend on luck. anyway i woudl expect any disto the chooses to support qemu 4.1.0 to backport this as required. 
im not sure this problematic to require a late oslo version bump before train ga but i would hope it can be fixed on stable/train > From mthode at mthode.org Mon Oct 7 20:15:53 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 15:15:53 -0500 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) Message-ID: <20191007201553.xvaeejp2meoyw3ea@mthode.org> Salt has been harsh to deal with. Upstream adding and maintaining caps has caused it to be held back. This time it's pyyaml, I'm not going to hold back the version of pyyaml for one import of salt. In any case, heat-agents uses salt in one location and may not even be using the one we define via constraints in any case. File: heat-config-salt/install.d/50-heat-config-hook-salt Installs salt from package then runs heat-config-salt/install.d/hook-salt.py In heat-config-salt/install.d/hook-salt.py is defined the only import of salt I can find and likely uses the package version as it's installed after tox sets things up. Is the heat team ok with this? -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From sean.mcginnis at gmx.com Mon Oct 7 20:19:22 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 7 Oct 2019 15:19:22 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: References: Message-ID: <20191007201922.GA7126@sm-workstation> On Mon, Oct 07, 2019 at 01:02:32PM -0500, Miguel Lavalle wrote: > Hi Stackers, > > I want to request the inclusion of the support for Neutron Routed Networks > in the Nova goals for the Ussuri cycle. As many of you might know, Routed > networks is a feature in Neutron that enables the creation of very large > virtual networks that avoids the performance penalties of large L2 > broadcast domains ( > https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks). > This functionality can be very helpful for large deployers who have the > need to have one or a few large virtual networks shared by all their users > and has been available in Neutron since very soon after the Barcelona > Summit in 2016. But it is really useless until there is code in Nova that > can schedule VMs to compute hosts based on the segments topology of the > routed networks. Without it, VMs can land in compute hosts where their > traffic cannot be routed by the underlying network infrastructure. I would > like the Nova team to consider the following when making a decision about > this request: > Is there a community-wide effort with this, or is this really just asking that Nova prioritize this work? The cycle goals (typically) have been used for things that we need the majority of the community to focus on in order to complete. If this is just something between Neutron and Nova, I don't think it really fits as a cycle goal. I do think it would be a good thing to try to complete in Ussuri though. Just maybe not as a community goal. 
Sean From fsbiz at yahoo.com Mon Oct 7 20:33:20 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Mon, 7 Oct 2019 20:33:20 +0000 (UTC) Subject: Port creation times out for some VMs in large group In-Reply-To: <1127664659.2766839.1570042860356@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> <1127664659.2766839.1570042860356@mail.yahoo.com> Message-ID: <1645654897.4940251.1570480400983@mail.yahoo.com> Thanks. Yes, it helps breathe some CPU cycles. This was traced to afixed bug. https://bugs.launchpad.net/neutron/+bug/1760047 which was applied to Queens in April 2019. https://review.opendev.org/#/c/649580/ Unfortunately, the patch simply makes the code more elegant by removing the semaphores.But it does not really fix the real issue that is dhcp-client serializes all the port update messages and eachmessage is processed too slowly resulting in PXE boot timeouts. The issue still remains open. thanks,Fred. On Wednesday, October 2, 2019, 11:34:39 AM PDT, Chris Apsey wrote: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA -------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com < fsbiz at yahoo.com> wrote: Thanks. This definitely helps. I am running a stable release of Queens.Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloadsevery time a new MAC entry is added, deleted or updated.  It seems to be related tohttps://bugs.launchpad.net/neutron/+bug/1598078 The fix for the above bug was abandoned.  Gerrit Code Review | | | | Gerrit Code Review | | | Any further fine tuning that can be done?  Thanks,Fred. On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away.  I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: When I create 100 VMs in our prod cluster:   openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest   Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.”   If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure.   What config variables should I be looking at?   
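For anyone else hitting this, the two mitigations suggested earlier in the thread (rootwrap daemon mode and dnsmasq's no-ping) boil down to a couple of agent-side settings. A rough sketch, using the standard neutron option names but example file paths:

# /etc/neutron/dhcp_agent.ini
[DEFAULT]
# Hand dnsmasq an extra config file so options such as no-ping can be set.
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

[AGENT]
# Run rootwrap as a long-lived daemon instead of forking sudo + rootwrap
# for every ip/dnsmasq command the agent issues.
root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

# /etc/neutron/dnsmasq-neutron.conf
# Skip the ICMP probe dnsmasq normally performs before offering a lease;
# only safe when neutron is the only thing allocating addresses on the net.
no-ping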
Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Mon Oct 7 20:36:40 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 15:36:40 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> Message-ID: On 10/7/19 3:08 PM, Sean Mooney wrote: > On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: >> >> On 10/7/19 11:31 AM, Jeremy Stanley wrote: >>> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: >>> [...] >>>> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >>>> to me that backporting bug fixes for it is valid. The original >>>> author of the patch actually wants it for Rocky >>> >>> [...] >>> >>> Neither the changes nor the bug report indicate what the motivation >>> is for supporting newer Qemu with (much) older OpenStack. Is there >>> some platform which has this Qemu behavior on which folks are trying >>> to run Rocky? Or is it a homegrown build combining these dependency >>> versions from disparate time periods? Or maybe some other reason I'm >>> not imagining? >>> >> >> In addition to the downstream reasons Sean mentioned, Mark (the original >> author of the patch) responded to my question on the train backport with >> this: >> >> """ >> Today, I need it in Rocky. But, I'm find to do local patching. >> >> Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu >> 4.1.0 is that this is the first release of Qemu to include proper >> support for migration of L1 guests that have L2 guests (nVMX / nested >> KVM). So, I expect it is pretty important to whoever realizes this, and >> whoever needs this. >> """ >> >> So basically a desire to use a feature of the newer qemu with older >> openstack, which is why I'm questioning whether this fits our stable >> policy. My inclination is to say it's a fairly simple, >> backward-compatible patch that will make users' lives easier, but I also >> feel like doing a backport to enable a feature, even if the actual patch >> is a "bugfix", is violating the spirit of the stable policy. > in many distros the older qemus allow migration of the l1 guest eventhouhg it is > unsafe to do so and either work by luck or the vm will curput its memroy and likely > crash. 
the context of the qemu issue is for years people though that live migration with > nested virt worked, then it was disabeld upstream and many distos reverted that as it would > break there users where they got lucky and it worked, and in 4.1 it was fixed. > > this does not add or remvoe any functionality in openstack nova will try to live migarte if you > tell it too regardless of the qemu it has it just will fail if the live migration check was complied in. > > > similarly if all your images did not have fractional sizes you could use 4.1.0 with older > oslo releases and it would be fine. i.e. you could get lucky and for your specific usecase this > might not be needed but it would be nice not do depend on luck. > > anyway i woudl expect any disto the chooses to support qemu 4.1.0 to backport this as required. > im not sure this problematic to require a late oslo version bump before train ga but i would hope > it can be fixed on stable/train Note that this discussion is separate from the train patch. I agree we should do that backport, and actually we already have. That discussion was just about timing of the release. This thread is because the fix was also proposed to stable/stein. It merged before I had a chance to start this discussion, and I'm wondering if we need to revert it. From mriedemos at gmail.com Mon Oct 7 22:18:23 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 7 Oct 2019 17:18:23 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: <20191007201922.GA7126@sm-workstation> References: <20191007201922.GA7126@sm-workstation> Message-ID: On 10/7/2019 3:19 PM, Sean McGinnis wrote: > Is there a community-wide effort with this, or is this really just asking that > Nova prioritize this work? > > The cycle goals (typically) have been used for things that we need the majority > of the community to focus on in order to complete. If this is just something > between Neutron and Nova, I don't think it really fits as a cycle goal. > > I do think it would be a good thing to try to complete in Ussuri though. Just > maybe not as a community goal. Miguel isn't talking about cycle wide goals. There are some proposed process changes for nova in Ussuri [1] along with constraining the amount of feature work approved for the release. I think Miguel is just asking that routed networks support is included in that bucket and I'm sure the answer is, like for anything, "it depends". From a wider governance perspective, if people interested in developing this feature were looking for an officially blessed thing, this would be a pop-up team. [1] https://review.opendev.org/#/c/685857/ -- Thanks, Matt From openstack at fried.cc Mon Oct 7 22:28:42 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 7 Oct 2019 17:28:42 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: References: <20191007201922.GA7126@sm-workstation> Message-ID: <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> > Miguel isn't talking about cycle wide goals. There are some proposed > process changes for nova in Ussuri [1] along with constraining the > amount of feature work approved for the release. I think Miguel is just > asking that routed networks support is included in that bucket and I'm > sure the answer is, like for anything, "it depends". Agreed. What hasn't changed is that to get to the table it will need a blueprint [1] (which I don't see yet [2]) and spec [3] (likewise [4]). 
efried [1] https://blueprints.launchpad.net/nova/ussuri/+addspec [2] https://blueprints.launchpad.net/nova/ussuri [3] http://specs.openstack.org/openstack/nova-specs/readme.html [4] https://review.opendev.org/#/q/project:openstack/nova-specs+status:open From fsbiz at yahoo.com Mon Oct 7 22:45:15 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Mon, 7 Oct 2019 22:45:15 +0000 (UTC) Subject: [neutron]: Latest Queens release: dhcp-client takes too long processing messages and falls behind. In-Reply-To: <556991713.4938348.1570478933095@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> <1127664659.2766839.1570042860356@mail.yahoo.com> <556991713.4938348.1570478933095@mail.yahoo.com> Message-ID: <1671513214.4995921.1570488315508@mail.yahoo.com> Hi neutron team, We've been troubleshooting an issue with neutron's dhcp-client for sometime now. We were previously on neutron 12.0.5 and observed that upon reloading 5-8 baremetals simultaneously almost always led to a few baremetals failing PXE boot during provisioning and/or cleaning. This was traced to afixed bug. https://bugs.launchpad.net/neutron/+bug/1760047 which was applied to Queens in April 2019. https://review.opendev.org/#/c/649580/ We patched the above fix but found out the problem was not resolved. The fix gets rid of the semaphores by serializing the multiple messages into aPriority Queue. The Priority Queue then drains the messages serially one by one making sure notto yield during the processing of each message.  All in all this just seems like a more elegant way ofgetting rid of the semaphores but does not really  fix the issue at hand. Below are the logs from dhcp-agent in neutron release 12.0.5.  As can be seen the semaphore locks all threads for almost 6 seconds.While the below has been fixed using https://review.opendev.org/#/c/649580/ the underlying problem has not been fixed.  The semaphore has been removed but instead the message is being serialized and does not yieldresulting in PXE boot failures on the baremetal nodes. Any pointers would be appreciated. 
thanks,Fred 2019-10-03 18:07:37.454 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Acquired semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212 2019-10-03 18:07:37.455 318956DEBUG neutron.agent.dhcp.agent [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Calling driver for network:077aa2d1-605c-48ec-842d-7dd6767bfd01 action: reload_allocations call_driver/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py:135 2019-10-03 18:07:37.456 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.77', '9c:71:3a:cb:7c:43', '01:9c:71:3a:cb:7c:43']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:38.101 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:38.717 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb', 'ff:3a:cb:7b:fb:00:04:8a:ef:2f:58:b4:20:45:03:80:27:0f:15:84:a4:70:7b']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:39.631 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Building host file: /var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:695 2019-10-03 18:07:39.632 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Donebuilding host file/var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:734 2019-10-03 18:07:39.633 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', 'addr', 'show','ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:40.263 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['kill', '-HUP', '319109']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:40.843 318956DEBUG 
neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Reloading allocations for network: 077aa2d1-605c-48ec-842d-7dd6767bfd01reload_allocations/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:524 2019-10-03 18:07:40.843 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', '-4', 'route', 'list','dev', 'ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:41.462 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Running command (rootwrap daemon):['ip', 'netns', 'exec', 'qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip','-6', 'route', 'list', 'dev', 'ns-8387b854-d1'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:42.101 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Releasing semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:228   -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Tue Oct 8 00:15:56 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Mon, 7 Oct 2019 17:15:56 -0700 Subject: [dev][taskflow] Accepting any decision In-Reply-To: References: Message-ID: Hi Raja, You can have a decider that goes down one of two paths and then continues with common tasks. See the Octavia code starting at Line 228 to Line 289 here: https://github.com/openstack/octavia/blob/master/octavia/controller/worker/v2/flows/amphora_flows.py#L228 We "decide" if we can use a pre-booted VM and if not, we boot one. Then once we have a VM by either path, we finish configuring it. Michael On Fri, Sep 27, 2019 at 6:49 AM Jiří Rája wrote: > > Hi, > I wrote the code in the attachment and I would like to ask if it's possible to execute next task (step3) even if one decider returns True (link from step 1 to step 3) and one returns False (link from step 2 to step 3). If it is possible could someone alter the code? Or is there any other way to do it? And if the task wouldn't have to wait for all of the links, it would be great. Thank you! > > All the best, > Rája From soulxu at gmail.com Tue Oct 8 04:46:04 2019 From: soulxu at gmail.com (Alex Xu) Date: Tue, 8 Oct 2019 12:46:04 +0800 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: Kenichi, thanks for your contribution, I also learned a lot from you. All the best for your future endeavors! Kenichi Omichi 于2019年10月2日周三 上午5:47写道: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. 
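Going back to the taskflow question above, a minimal self-contained sketch of the pattern Michael describes (two mutually exclusive branches chosen by opposite deciders, then common work that runs either way; the task names and the condition are invented for illustration and this is not the Octavia flow itself):

from taskflow import deciders
from taskflow import engines
from taskflow import task
from taskflow.patterns import graph_flow
from taskflow.patterns import linear_flow


class Check(task.Task):
    def execute(self):
        # e.g. "is a pre-booted VM available?"
        return True


class FastPath(task.Task):
    def execute(self):
        print('fast path taken')


class SlowPath(task.Task):
    def execute(self):
        print('slow path taken')


class CommonFinish(task.Task):
    def execute(self):
        print('common follow-up work, whichever branch ran')


check = Check('check')
fast = FastPath('fast')
slow = SlowPath('slow')

branch = graph_flow.Flow('pick-a-path')
branch.add(check, fast, slow)
# Each decider sees the result(s) of its edge's predecessor; returning False
# tells the engine to ignore that branch instead of running it. Depth.ATOM
# limits the "ignore" to that branch task so the common work added after the
# subflow still runs.
branch.link(check, fast,
            decider=lambda history: list(history.values())[0],
            decider_depth=deciders.Depth.ATOM)
branch.link(check, slow,
            decider=lambda history: not list(history.values())[0],
            decider_depth=deciders.Depth.ATOM)

flow = linear_flow.Flow('example').add(branch, CommonFinish('finish'))
engines.run(flow)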
> I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Tue Oct 8 08:05:20 2019 From: zigo at debian.org (Thomas Goirand) Date: Tue, 8 Oct 2019 10:05:20 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: Message-ID: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> On 10/7/19 4:40 PM, no-reply at openstack.org wrote: > Hello everyone, > > A new release candidate for cinder for the end of the Train > cycle is available! You can find the source code tarball at: > > https://tarballs.openstack.org/cinder/ Hi, For the 2nd time, could we *please* re-add the tag: [release-announce] when announcing for a release? I don't mind if it's sent to -discuss instead of the announce this, but this breaks mail filters... Cheers, Thomas Goirand (zigo) From thierry at openstack.org Tue Oct 8 08:47:22 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 8 Oct 2019 10:47:22 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID: Thomas Goirand wrote: > For the 2nd time, could we *please* re-add the tag: [release-announce] > when announcing for a release? I don't mind if it's sent to -discuss > instead of the announce this, but this breaks mail filters... There was no clear decision last time we brought this up (and nobody proposed patches to fix it). I think we'll just move RC announcements to release-announce. Let me see if I can push a patch for this today (there may be a few more RCs sent like this one in the mean time). -- Thierry Carrez (ttx) From smooney at redhat.com Tue Oct 8 09:25:01 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Oct 2019 10:25:01 +0100 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> References: <20191007201922.GA7126@sm-workstation> <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> Message-ID: On Mon, 2019-10-07 at 17:28 -0500, Eric Fried wrote: > > Miguel isn't talking about cycle wide goals. There are some proposed > > process changes for nova in Ussuri [1] along with constraining the > > amount of feature work approved for the release. I think Miguel is just > > asking that routed networks support is included in that bucket and I'm > > sure the answer is, like for anything, "it depends". > > Agreed. What hasn't changed is that to get to the table it will need a > blueprint [1] (which I don't see yet [2]) and spec [3] (likewise [4]). for this specific effort while it would not be a community wide goal this effort might benefit form a pop-up team of nova, placement and neutron developers to Shepard it along. i have to admit while we discussed this at some length at the PTG i did not follow the neutron development to see if they had got to the point of modelling subnets/segments as placement aggregates and sharing resource providers of ips. we have definitely made progress on the nova side thanks to gibi on move operations for ports with resource requests. having a fourm to bring the 3 project together may help finally get this over the line. that said i am not sure what remains to be done on the neutron side and what nova needs to do. 
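For readers who have not followed the bandwidth-scheduling work being referenced: neutron can attach a resource_request to a port, which nova turns into placement resource/trait queries at scheduling time. Today that only exists for QoS minimum bandwidth, roughly of this shape (the per-segment IP variant discussed in this thread is a possible extension, not something neutron emits today):

    "resource_request": {
        "resources": {
            "NET_BW_EGR_KILOBIT_PER_SEC": 1000,
            "NET_BW_IGR_KILOBIT_PER_SEC": 1000
        },
        "required": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"]
    }

Neutron's routed-networks code already reports per-segment IPv4 address inventories and segment aggregates to placement, so the idea being discussed is that a port on a routed network could carry a similar request pointing at the right segment, avoiding any nova-side host-aggregate configuration.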
I speculated about the gaps in my previous response based on the design we discussed in the past. The current WIP patch was uploaded by Matt https://review.opendev.org/#/c/656885 so I think he understands the nova process better than most; that said, if Miguel and Matt are tied up with things I can try and help with the paperwork. Matt, you have not been active on that patch since May; is this something you have time/intend to work on for Ussuri? I'm not necessarily signing up to work on this at this point, but it is a feature I think we should add, and given I have not finalised what work I intend to do in U I might be able to help. @matt, one point on your last comment to that patch that does perplex me somewhat was the assertion/implication that configuration of nova host aggregates would be required. Part of the goal as I understood it was to require no configuration on the nova side at all, i.e. instead of having a config option for a prefilter to update the request spec by transforming the subnets into placement aggregates, we would build on the port requests feature we used for bandwidth-based scheduling so that neutron can provide a resource request for an IP and aggregate per port. We could discuss this in a spec, but the reason I bring it up is that the current patch looks like it would be problematic if you have a cloud with multiple network backends, say SR-IOV and Calico, as it is a global config rather than a backend-specific behavior that builds on the generic per-port resource requests. Anyway, that is an implementation detail/design choice that we can discuss elsewhere; I just wanted to point it out. > > efried > > [1] https://blueprints.launchpad.net/nova/ussuri/+addspec > [2] https://blueprints.launchpad.net/nova/ussuri > [3] http://specs.openstack.org/openstack/nova-specs/readme.html > [4] https://review.opendev.org/#/q/project:openstack/nova-specs+status:open > From zigo at debian.org Tue Oct 8 09:58:03 2019 From: zigo at debian.org (Thomas Goirand) Date: Tue, 8 Oct 2019 11:58:03 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID: On 10/8/19 10:47 AM, Thierry Carrez wrote: > Thomas Goirand wrote: >> For the 2nd time, could we *please* re-add the tag: [release-announce] >> when announcing for a release? I don't mind if it's sent to -discuss >> instead of the announce this, but this breaks mail filters... > > There was no clear decision last time we brought this up (and nobody > proposed patches to fix it). > > I think we'll just move RC announcements to release-announce. Let me see > if I can push a patch for this today (there may be a few more RCs sent > like this one in the mean time). Thierry, Hopefully, I'm not too moronic here... :) I'm not trying to push any decision of changing any habits. Just trying to not forget one artifact. There's no need for any decision to add the [release announce] tag! :) I'm not sure where to propose the patch (I've searched for it), otherwise I would have done it. Thomas From thierry at openstack.org Tue Oct 8 12:00:26 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 8 Oct 2019 14:00:26 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID: Thomas Goirand wrote: > On 10/8/19 10:47 AM, Thierry Carrez wrote: >> Thomas Goirand wrote: >>> For the 2nd time, could we *please* re-add the tag: [release-announce] >>> when announcing for a release?
I don't mind if it's sent to -discuss >>> instead of the announce this, but this breaks mail filters... >> >> There was no clear decision last time we brought this up (and nobody >> proposed patches to fix it). >> >> I think we'll just move RC announcements to release-announce. Let me see >> if I can push a patch for this today (there may be a few more RCs sent >> like this one in the mean time). > > Thierry, > > Hopefully, I'm not too moronic here... :) > I'm not trying to push any decision of changing any habits. Just trying > to not forget one artifact. There's no need for any decision to add the > [release announce] tag! :) Actually the [release-announce] prefix is added by the mailing-list itself, so if we add it to the subject line we'd also have to change ML settings so that it's not added twice... > I'm not sure where to propose the patch (I've searched for it), > otherwise I would have done it. That should do it: https://review.opendev.org/687275 -- Thierry Carrez (ttx) From jim at jimrollenhagen.com Tue Oct 8 12:12:57 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Tue, 8 Oct 2019 08:12:57 -0400 Subject: [tc] monthly meeting agenda In-Reply-To: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> Message-ID: On Thu, Oct 3, 2019 at 3:59 PM Jean-Philippe Evrard wrote: > Hello everyone, > > Here's the agenda for our monthly TC meeting. It will happen next > Thursday (10 October) at the usual time (1400 UTC) in #openstack-tc . > > If you can't attend, please put your name in the "Apologies for > Absence" section in the wiki [1] > > Our meeting chair will be Alexandra (asettle). > > * Follow up on past action items > ** ricolin: Follow up with SIG chairs about guidelines > https://etherpad.openstack.org/p/SIGs-guideline > ** ttx: contact interested parties in a new 'large scale' sig (help > with mnaser, jroll reaching out to verizon media) > ** Release Naming - Results of the TC poll - Next action > > * New initiatives and/or report on previous initiatives > ** Help gmann on the community goals following our new goal process > ** mugsie: to sync with dhellmann or release-team to find the code for > the proposal bot > ** jroll - ttx: Feedback from the forum selection committee -- Follow > up on https://etherpad.openstack.org/p/PVG-TC-brainstorming -- Final > accepted list? > To follow up on this asynchronously: Final schedule is here: https://www.openstack.org/summit/shanghai-2019/summit-schedule/global-search?t=forum I made notes on the etherpad about which were accepted or not. Of course, we can still discuss in the meeting. :) // jim > ** mnaser: sync up with swift team on python3 migration > > Thank you everyone! > > Regards, > JP > > [1]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Oct 8 13:03:34 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 8 Oct 2019 09:03:34 -0400 Subject: [cinder] Train release status Message-ID: <81b0820f-8694-c827-d82e-e2e1562f9a83@gmail.com> You may have noticed that RC-2 was released yesterday. We aren't planning to do an RC-3 unless a critical bugfix is approved for backport. Here's the timeline: - Now through Friday 11 October: RC-3, etc. 
are cut as necessary - The "final RC" is whatever RC-n exists on 11 October - The coordinated release date is 16 October * Any bugfixes caught after 11 Oct can be merged into stable/train, but it is up to the release team whether they can be included in the release. Please do some exploratory testing on the 15.0.0.0rc2 tag (which right now is the HEAD of stable/train). If you find a critical bug, please file it in Launchpad and tag it 'train-rc-potential'. Also add it to the etherpad: https://etherpad.openstack.org/p/cinder-train-backport-potential and make some noise in #openstack-cinder so we are all aware of it. cheers, brian From jean-philippe at evrard.me Tue Oct 8 14:04:27 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Oct 2019 16:04:27 +0200 Subject: [tc] monthly meeting agenda In-Reply-To: References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> Message-ID: <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> On Tue, 2019-10-08 at 08:12 -0400, Jim Rollenhagen wrote: > I made notes on the etherpad about which were accepted or not. > Of course, we can still discuss in the meeting. :) Thanks! Maybe we could only discuss about what to do for our rejected sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? Regards, JP From mihalis68 at gmail.com Tue Oct 8 15:18:42 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 8 Oct 2019 11:18:42 -0400 Subject: [ops] ops meetups team meeting 2019-10-8 Message-ID: Minutes from todays meeting are here: 10:56 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.html 10:56 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.txt 10:56 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.log.html The Ops Community attending the upcoming Summit in Shanghai will have one Forum session (ops war stories). On day 4 we will also have a 3 hours session for further ops related topic discussion. Details still to be arranged. Chris - on behalf of the openstack ops meetups team -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From fsbiz at yahoo.com Tue Oct 8 16:14:22 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:14:22 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> Message-ID: <459655647.5382428.1570551262388@mail.yahoo.com> Hi folks, We have a rather large flat network consisting of over 300 ironic baremetal nodesand are constantly having the baremetals timing out during their PXE boot due tothe dhcp agent not able to respond in time. Looking for inputs on successful DHCP scaling techniques that would help mitigate this. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ruslanas at lpic.lt Tue Oct 8 16:24:20 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Tue, 8 Oct 2019 18:24:20 +0200 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <459655647.5382428.1570551262388@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: Hi, I am just curious, how much dhcp agents do you have on a network? Is controller monolithic? How much controllers do you have? 
On Tue, 8 Oct 2019, 18:17 fsbiz at yahoo.com, wrote: > Hi folks, > > We have a rather large flat network consisting of over 300 ironic > baremetal nodes > and are constantly having the baremetals timing out during their PXE boot > due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help > mitigate this. > > thanks, > Fred. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Oct 8 16:28:45 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Oct 2019 09:28:45 -0700 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <459655647.5382428.1570551262388@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: While not necessarily direct scaling of that subnet, you may want to look at ironic.conf's [neutron]port_setup_delay option. The default value is zero seconds, but increasing that value will cause the process to pause a little longer to give time for the neutron agent configuration to update, as the agent may not even know about the configuration as there are multiple steps with-in neutron, by the time the baremetal machine tries to PXE boot. We're hoping that in the U cycle, we'll finally have things in place where neutron tells ironic that the port setup is done and that the machine can be powered-on, but not all the code made it during Train. -Julia On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > Hi folks, > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > and are constantly having the baremetals timing out during their PXE boot due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > thanks, > Fred. From fsbiz at yahoo.com Tue Oct 8 16:52:09 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:52:09 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: <1923689261.5373083.1570553529145@mail.yahoo.com> We have 3 controller nodes each running a DHCP agent (so 3 DHCP agents in all). Fred. On Tuesday, October 8, 2019, 09:24:34 AM PDT, Ruslanas Gžibovskis wrote: Hi, I am just curious, how much dhcp agents do you have on a network?Is controller monolithic?How much controllers do you have? On Tue, 8 Oct 2019, 18:17 fsbiz at yahoo.com, wrote: Hi folks, We have a rather large flat network consisting of over 300 ironic baremetal nodesand are constantly having the baremetals timing out during their PXE boot due tothe dhcp agent not able to respond in time. Looking for inputs on successful DHCP scaling techniques that would help mitigate this. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Tue Oct 8 16:54:40 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Tue, 8 Oct 2019 10:54:40 -0600 Subject: [tripleo] owls at ptg Message-ID: Greetings, A number of folks from TripleO will be at the OpenDev PTG. If you would like to discuss anything and collaborate please list your topic on this etherpad [1] Thank you! [1] https://etherpad.openstack.org/p/tripleo-ussuri-topics -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fsbiz at yahoo.com Tue Oct 8 16:55:06 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:55:06 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: <1716201708.5392618.1570553706947@mail.yahoo.com> Thanks Julia.   We have set the port_setup_delay to 30. # Delay value to wait for Neutron agents to setup sufficient# DHCP configuration for port. (integer value)# Minimum value: 0port_setup_delay = 30 >We're hoping that in the U >cycle, we'll finally have things in place where neutron tells ironic >that the port setup is done and that the machine can be powered-on, >but not all the code made it during Train. This would be perfect. Fred. On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: While not necessarily direct scaling of that subnet, you may want to look at ironic.conf's [neutron]port_setup_delay option. The default value is zero seconds, but increasing that value will cause the process to pause a little longer to give time for the neutron agent configuration to update, as the agent may not even know about the configuration as there are multiple steps with-in neutron, by the time the baremetal machine tries to PXE boot. We're hoping that in the U cycle, we'll finally have things in place where neutron tells ironic that the port setup is done and that the machine can be powered-on, but not all the code made it during Train. -Julia On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > Hi folks, > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > and are constantly having the baremetals timing out during their PXE boot due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > thanks, > Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Tue Oct 8 16:57:19 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 9 Oct 2019 00:57:19 +0800 Subject: [tc] monthly meeting agenda In-Reply-To: <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> Message-ID: I added two more topics in agenda suggestion today which might worth discuss about. * define goal select process schedule * Maintain issue with Telemetery On Tue, Oct 8, 2019 at 10:10 PM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > > Thanks! Maybe we could only discuss about what to do for our rejected > sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? That sounds like a good idea. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Oct 8 17:05:24 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Oct 2019 10:05:24 -0700 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <1716201708.5392618.1570553706947@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> <1716201708.5392618.1570553706947@mail.yahoo.com> Message-ID: One other thing that comes to mind at 30 seconds is spanning-tree port forwarding delay. 
PXE boot often thinks once carrier is up, that it can try and send/receive packets, however switches may still block traffic waiting for spanning-tree packets. Just from a limiting possible issues, it might be a good thing to double check network side to make sure "portfast" is the operating mode for the physical ports attached to that flat network. What this would look like is the machine appears to DHCP, but the packets would never actually reach the DHCP server. -Julia On Tue, Oct 8, 2019 at 9:55 AM fsbiz at yahoo.com wrote: > > Thanks Julia. We have set the port_setup_delay to 30. > > > # Delay value to wait for Neutron agents to setup sufficient > # DHCP configuration for port. (integer value) > # Minimum value: 0 > port_setup_delay = 30 > > >We're hoping that in the U > >cycle, we'll finally have things in place where neutron tells ironic > >that the port setup is done and that the machine can be powered-on, > >but not all the code made it during Train. > > This would be perfect. > > Fred. > > > > > On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: > > > While not necessarily direct scaling of that subnet, you may want to > look at ironic.conf's [neutron]port_setup_delay option. The default > value is zero seconds, but increasing that value will cause the > process to pause a little longer to give time for the neutron agent > configuration to update, as the agent may not even know about the > configuration as there are multiple steps with-in neutron, by the time > the baremetal machine tries to PXE boot. We're hoping that in the U > cycle, we'll finally have things in place where neutron tells ironic > that the port setup is done and that the machine can be powered-on, > but not all the code made it during Train. > > -Julia > > On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > > > Hi folks, > > > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > > and are constantly having the baremetals timing out during their PXE boot due to > > the dhcp agent not able to respond in time. > > > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > > > thanks, > > Fred. > From fsbiz at yahoo.com Tue Oct 8 18:34:54 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 18:34:54 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> <1716201708.5392618.1570553706947@mail.yahoo.com> Message-ID: <667768633.5458229.1570559694224@mail.yahoo.com> Thanks Julia.  Yes, portfast is enabled on the ports of the TOR switch. Regards,Fred. On Tuesday, October 8, 2019, 10:09:35 AM PDT, Julia Kreger wrote: One other thing that comes to mind at 30 seconds is spanning-tree port forwarding delay. PXE boot often thinks once carrier is up, that it can try and send/receive packets, however switches may still block traffic waiting for spanning-tree packets.  Just from a limiting possible issues, it might be a good thing to double check network side to make sure "portfast" is the operating mode for the physical ports attached to that flat network. What this would look like is the machine appears to DHCP, but the packets would never actually reach the DHCP server. -Julia On Tue, Oct 8, 2019 at 9:55 AM fsbiz at yahoo.com wrote: > > Thanks Julia.  We have set the port_setup_delay to 30. 
> > > # Delay value to wait for Neutron agents to setup sufficient > # DHCP configuration for port. (integer value) > # Minimum value: 0 > port_setup_delay = 30 > > >We're hoping that in the U > >cycle, we'll finally have things in place where neutron tells ironic > >that the port setup is done and that the machine can be powered-on, > >but not all the code made it during Train. > > This would be perfect. > > Fred. > > > > > On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: > > > While not necessarily direct scaling of that subnet, you may want to > look at ironic.conf's [neutron]port_setup_delay option. The default > value is zero seconds, but increasing that value will cause the > process to pause a little longer to give time for the neutron agent > configuration to update, as the agent may not even know about the > configuration as there are multiple steps with-in neutron, by the time > the baremetal machine tries to PXE boot. We're hoping that in the U > cycle, we'll finally have things in place where neutron tells ironic > that the port setup is done and that the machine can be powered-on, > but not all the code made it during Train. > > -Julia > > On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > > > Hi folks, > > > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > > and are constantly having the baremetals timing out during their PXE boot due to > > the dhcp agent not able to respond in time. > > > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > > > thanks, > > Fred. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Tue Oct 8 18:38:23 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Tue, 8 Oct 2019 11:38:23 -0700 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Thank you all for responding; I've added Douglas to https://review.opendev.org/#/admin/groups/213,members. Thank you Douglas for your hard work - welcome, and glad to have you on board! On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold wrote: > +1 from me, too. > > On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: > > Dear Zorillas and other Stackers, > > > > I would like to formalize the conversations we've been having amongst > > ourselves over IRC and in-person. At the outset, we have a lot of > > incoming changes to review, but we have limited core maintainer > > attention. We haven't re-jigged our core maintainers team as often as > > we'd like, and that's partly to blame. We have some relatively new and > > enthusiastic contributors that we would love to encourage to become > > maintainers! We've mentored contributors 1-1, n-1 before before adding > > them to the maintainers team. We would like to do more of this!** > > > > In this spirit, I would like your inputs on adding Douglas Viroel > > (dviroel) to the core maintainers team for manila and its associated > > projects (manila-specs, manila-ui, python-manilaclient, > > manila-tempest-plugin, manila-test-image, manila-image-elements). > > Douglas has been an active contributor for the past two releases and > > has valuable review inputs in the project. While he's been around here > > less longer than some of us, he brings a lot of experience to the > > table with his background in networking and shared file systems. 
He > > has a good grasp of the codebase and is enthusiastic in adding new > > features and fixing bugs in the Ussuri cycle and beyond. > > > > Please give me a +/-1 for this proposal. > > > > ** If you're interested in helping us maintain Manila by being part of > > the manila core maintainer team, please reach out to me or any of the > > current maintainers, we would love to work with you and help you grow > > into that role! > > > > Thanks, > > Goutham Pacha Ravi (gouthamr) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucioseki at gmail.com Tue Oct 8 18:47:54 2019 From: lucioseki at gmail.com (Lucio Seki) Date: Tue, 8 Oct 2019 15:47:54 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Congratulations, Douglas! On Tue, Oct 8, 2019 at 3:41 PM Goutham Pacha Ravi wrote: > Thank you all for responding; I've added Douglas to > https://review.opendev.org/#/admin/groups/213,members. > Thank you Douglas for your hard work - welcome, and glad to have you on > board! > > On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold > wrote: > >> +1 from me, too. >> >> On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: >> > Dear Zorillas and other Stackers, >> > >> > I would like to formalize the conversations we've been having amongst >> > ourselves over IRC and in-person. At the outset, we have a lot of >> > incoming changes to review, but we have limited core maintainer >> > attention. We haven't re-jigged our core maintainers team as often as >> > we'd like, and that's partly to blame. We have some relatively new and >> > enthusiastic contributors that we would love to encourage to become >> > maintainers! We've mentored contributors 1-1, n-1 before before adding >> > them to the maintainers team. We would like to do more of this!** >> > >> > In this spirit, I would like your inputs on adding Douglas Viroel >> > (dviroel) to the core maintainers team for manila and its associated >> > projects (manila-specs, manila-ui, python-manilaclient, >> > manila-tempest-plugin, manila-test-image, manila-image-elements). >> > Douglas has been an active contributor for the past two releases and >> > has valuable review inputs in the project. While he's been around here >> > less longer than some of us, he brings a lot of experience to the >> > table with his background in networking and shared file systems. He >> > has a good grasp of the codebase and is enthusiastic in adding new >> > features and fixing bugs in the Ussuri cycle and beyond. >> > >> > Please give me a +/-1 for this proposal. >> > >> > ** If you're interested in helping us maintain Manila by being part of >> > the manila core maintainer team, please reach out to me or any of the >> > current maintainers, we would love to work with you and help you grow >> > into that role! >> > >> > Thanks, >> > Goutham Pacha Ravi (gouthamr) >> > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rodrigo.barbieri2010 at gmail.com Tue Oct 8 18:57:21 2019 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Tue, 8 Oct 2019 15:57:21 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Congratulations! On Tue, Oct 8, 2019 at 3:55 PM Lucio Seki wrote: > Congratulations, Douglas! 
> > On Tue, Oct 8, 2019 at 3:41 PM Goutham Pacha Ravi > wrote: > >> Thank you all for responding; I've added Douglas to >> https://review.opendev.org/#/admin/groups/213,members. >> Thank you Douglas for your hard work - welcome, and glad to have you on >> board! >> >> On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold >> wrote: >> >>> +1 from me, too. >>> >>> On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: >>> > Dear Zorillas and other Stackers, >>> > >>> > I would like to formalize the conversations we've been having amongst >>> > ourselves over IRC and in-person. At the outset, we have a lot of >>> > incoming changes to review, but we have limited core maintainer >>> > attention. We haven't re-jigged our core maintainers team as often as >>> > we'd like, and that's partly to blame. We have some relatively new and >>> > enthusiastic contributors that we would love to encourage to become >>> > maintainers! We've mentored contributors 1-1, n-1 before before adding >>> > them to the maintainers team. We would like to do more of this!** >>> > >>> > In this spirit, I would like your inputs on adding Douglas Viroel >>> > (dviroel) to the core maintainers team for manila and its associated >>> > projects (manila-specs, manila-ui, python-manilaclient, >>> > manila-tempest-plugin, manila-test-image, manila-image-elements). >>> > Douglas has been an active contributor for the past two releases and >>> > has valuable review inputs in the project. While he's been around here >>> > less longer than some of us, he brings a lot of experience to the >>> > table with his background in networking and shared file systems. He >>> > has a good grasp of the codebase and is enthusiastic in adding new >>> > features and fixing bugs in the Ussuri cycle and beyond. >>> > >>> > Please give me a +/-1 for this proposal. >>> > >>> > ** If you're interested in helping us maintain Manila by being part of >>> > the manila core maintainer team, please reach out to me or any of the >>> > current maintainers, we would love to work with you and help you grow >>> > into that role! >>> > >>> > Thanks, >>> > Goutham Pacha Ravi (gouthamr) >>> > >>> > >>> >> -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfidente at redhat.com Tue Oct 8 20:28:29 2019 From: gfidente at redhat.com (Giulio Fidente) Date: Tue, 8 Oct 2019 22:28:29 +0200 Subject: [tripleo] owls at ptg In-Reply-To: References: Message-ID: On 10/8/19 6:54 PM, Wesley Hayutin wrote: > Greetings, > > A number of folks from TripleO will be at the OpenDev PTG.  If you would > like to discuss anything and collaborate please list your topic on this > etherpad [1] hi Wes, thanks for starting this thread. I think the Edge topic is quite big and interesting. For example, I have seen in the etherpad a proposal to discuss the nodes lifecycle management; I'd like to lead myself a session about Edge as well to review the status and the plans to support storage at the Edge ... 
there is quite a lot to be said regarding all most common storage components cinder, glance, manila, swift and regarding ceph support (at the Edge) Another topic about which I'd like to learn about is mistral workflows deprecation; I don't think I'd be the best person to drive this conversation though, would be nice if somebody else could pick it up I think it would also be interesting to try generalize the ffu process and review the existing process/structures created to support upgrade and ffu but not sure if we have upgrade/ffu experts in shanghai? Hopefully we can find at least half day to get people together > [1] https://etherpad.openstack.org/p/tripleo-ussuri-topics -- Giulio Fidente GPG KEY: 08D733BA From zbitter at redhat.com Tue Oct 8 21:22:34 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 8 Oct 2019 17:22:34 -0400 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) In-Reply-To: <20191007201553.xvaeejp2meoyw3ea@mthode.org> References: <20191007201553.xvaeejp2meoyw3ea@mthode.org> Message-ID: <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> On 7/10/19 4:15 PM, Matthew Thode wrote: > Salt has been harsh to deal with. :( > Upstream adding and maintaining caps has caused it to be held back. > This time it's pyyaml, I'm not going to hold back the version of pyyaml > for one import of salt. > > In any case, heat-agents uses salt in one location and may not even be > using the one we define via constraints in any case. > > File: heat-config-salt/install.d/50-heat-config-hook-salt > > Installs salt from package then runs > heat-config-salt/install.d/hook-salt.py This is true, in that the repo itself appears in the form of a set of disk-image-builder elements. (Though it may not necessarily be used this way - for example in RDO we package the actual agents as RPMs, ignoring the d-i-b elements.) > In heat-config-salt/install.d/hook-salt.py is defined the only import of > salt I can find and likely uses the package version as it's installed > after tox sets things up. However, that module is tested in the unit tests (which is why salt appears in test-requirements.txt). > Is the heat team ok with this? We discussed this a little at the time that I added it to global constraints: https://review.opendev.org/604386 The issue for us is that we'd like to be able to use a lower-constraints job. There's a certain library (*cough*paunch) that keeps releasing new major versions, so it's very helpful to have tests to verify when we rewrite for the new API whether or not we have to bump the minimum version. The rest of the requirements tooling seems useful as well, and given that the team obviously maintains other repos in OpenStack we know how to use it, and it gives some confidence that we're providing the right guidance to distros wanting to package this. That said, nothing in heat-agents necessarily needs to be co-installable with OpenStack - the agents run on guest machines. So if it's not tied to the global-requirements any more then that may not be the worst thing. But IIRC when we last discussed this there was no recommended way for a project to run in that kind of configuration. If somebody with more knowledge of the requirements tooling were able to help out with suggestions then I'd be more than happy to implement them. cheers, Zane. 
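For reference, the conventional shape of such a lower-constraints job (heat-agents would need to adapt it if it stops tracking global-requirements) is a tox environment that installs the declared minimums:

# tox.ini
[testenv:lower-constraints]
deps =
  -c{toxinidir}/lower-constraints.txt
  -r{toxinidir}/test-requirements.txt
  -r{toxinidir}/requirements.txt
commands = stestr run {posargs}

# lower-constraints.txt pins every dependency at the minimum version declared
# in requirements.txt, so a rewrite needed for a new major version of a
# library (the paunch case above) shows up as a failing job rather than as a
# surprise for packagers.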
From daniel at preussker.net Tue Oct 8 09:20:08 2019 From: daniel at preussker.net (Daniel 'f0o' Preussker) Date: Tue, 8 Oct 2019 11:20:08 +0200 Subject: [OSSA-2019-005] Octavia Amphora-Agent not requiring Client-Certificate (CVE-2019-17134) Message-ID: ===================================================================== OSSA-2019-005: Octavia Amphora-Agent not requiring Client-Certificate ===================================================================== :Date: October 07, 2019 :CVE: CVE-2019-17134 Affects ~~~~~~~ - Octavia: >=0.10.0 <2.1.2, >=3.0.0 <3.2.0, >=4.0.0 <4.1.0 Description ~~~~~~~~~~~ Daniel Preussker reported a vulnerability in amphora-agent, running within Octavia Amphora Instances, which allows unauthenticated access from the management network. This leads to information disclosure and also allows changes to the configuration of the Amphora via simple HTTP requests, because the cmd/agent.py gunicorn cert_reqs option is incorrectly set to True instead of ssl.CERT_REQUIRED. Patches ~~~~~~~ - https://review.opendev.org/686547 (Ocata) - https://review.opendev.org/686546 (Pike) - https://review.opendev.org/686545 (Queens) - https://review.opendev.org/686544 (Rocky) - https://review.opendev.org/686543 (Stein) - https://review.opendev.org/686541 (Train) Credits ~~~~~~~ - Daniel Preussker (CVE-2019-17134) References ~~~~~~~~~~ - https://storyboard.openstack.org/#!/story/2006660 - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17134 Notes ~~~~~ - The stable/ocata and stable/pike branches are under extended maintenance and will receive no new point releases, but patches for them are provided as a courtesy. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From akalambu at cisco.com Tue Oct 8 19:55:59 2019 From: akalambu at cisco.com (Ajay Kalambur (akalambu)) Date: Tue, 8 Oct 2019 19:55:59 +0000 Subject: [openstack][heat-cfn] CFN Signaling with heat In-Reply-To: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> References: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> Message-ID: It would be great if someone has an example template where CFN SIGNAL works, so we can see what's going on. From: "Ajay Kalambur (akalambu)" Date: Saturday, October 5, 2019 at 10:34 AM To: "openstack-discuss at lists.openstack.org" Subject: [openstack][heat-cfn] CFN Signaling with heat Hi I was trying the Software Deployment/Structured Deployment feature of heat. I somehow can never get the signaling to work: I see that authentication is happening, but I don't see a POST from the VM, and as a result the stack is stuck in CREATE_IN_PROGRESS. I see this message in my heat-api-cfn log, which seems to suggest authentication is successful, but the VM does not seem to POST. I have included debug output from the VM and also the sample heat template I used. I don't know if the template is correct, as I referred to some online examples to build it. 2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS credentials.. 2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials found, checking against keystone. 2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating with http://10.10.173.9:5000/v3/ec2tokens 2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS authentication successful.
2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server [req-506f22c6-4062-4a84-8e85-40317a4099ed - adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 HTTP/1.1" 200 4669 1.418045 Some debugging output from my VM: [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug /var/lib/os-collect-config/local-data not found. Skipping [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase pre-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: pre-configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase pre-configure [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/20-os-apply-config [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf [2019/10/05 05:32:47 PM] [INFO] success dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/55-heat-config [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config 0.345 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose 0.064 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet 0.134 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config 0.065 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- 
[2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase configure [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase post-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed ++ os-apply-config --key completion-handle --type raw --key-default '' + HANDLE= ++ os-apply-config --key completion-signal --type raw --key-default '' + SIGNAL= ++ os-apply-config --key instance-id --type raw --key-default '' + ID=i-0000000d + '[' -n i-0000000d ']' + '[' -n '' ']' + '[' -n '' ']' ++ os-apply-config --key deployments --type raw --key-default '' ++ jq -r 'map(select(.group == "os-apply-config") | select(.inputs[].name == "deploy_signal_id") | .id + (.inputs | map(select(.name == "deploy_signal_id")) | .[].value)) | .[]' + DEPLOYMENTS= + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed completed dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: post-configure.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed 1.206 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase post-configure [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase migration dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: migration.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase migration onfig]# cat /var/run/heat-config/heat-config [{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "other_deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": 
"http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": "f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf [DEFAULT] command = os-refresh-config collectors = ec2 collectors = cfn collectors = local [cfn] metadata_url = http://172.29.85.87:8000/v1/ stack_name = test_stack secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG access_key_id = f7874ac9898248edaae53511230534a4 path = sig-vm-1.Metadata Here is my basic sample temple heat_template_version: 2013-05-23 description: > This template demonstrates how to use OS::Heat::StructuredDeployment to override substitute get_input placeholders defined in OS::Heat::StructuredConfig config. As there is no hook on the server to act on the configuration data, these deployment resource will perform no actual configuration. 
parameters: flavor: type: string default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947' image: type: string default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe public_net_id: type: string default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db private_net_id: type: string description: Private network id default: 995fc046-1c58-468a-b81c-e42c06fc8966 private_subnet_id: type: string description: Private subnet id default: 7598c805-3a9b-4c27-be5b-dca4d89f058c password: type: string description: SSH password default: lab123 resources: the_sg: type: OS::Neutron::SecurityGroup properties: name: the_sg description: Ping and SSH rules: - protocol: icmp - protocol: tcp port_range_min: 22 port_range_max: 22 config: type: OS::Heat::StructuredConfig properties: config: config_value_foo: {get_input: foo} config_value_bar: {get_input: bar} deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fooooo bar: baaaaa other_deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fu bar: barmy server1_port0: type: OS::Neutron::Port properties: network_id: { get_param: private_net_id } security_groups: - default fixed_ips: - subnet_id: { get_param: private_subnet_id } server1_public: type: OS::Neutron::FloatingIP properties: floating_network_id: { get_param: public_net_id } port_id: { get_resource: server1_port0 } sig-vm-1: type: OS::Nova::Server properties: name: sig-vm-1 image: { get_param: image } flavor: { get_param: flavor } networks: - port: { get_resource: server1_port0 } user_data_format: SOFTWARE_CONFIG user_data: get_resource: cloud_config cloud_config: type: OS::Heat::CloudConfig properties: cloud_config: password: { get_param: password } chpasswd: { expire: False } ssh_pwauth: True -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Wed Oct 9 00:02:40 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 8 Oct 2019 19:02:40 -0500 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) In-Reply-To: <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> References: <20191007201553.xvaeejp2meoyw3ea@mthode.org> <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> Message-ID: <20191009000240.a4joopyehlx7pdk6@mthode.org> On 19-10-08 17:22:34, Zane Bitter wrote: > On 7/10/19 4:15 PM, Matthew Thode wrote: > > Salt has been harsh to deal with. > > :( > > > Upstream adding and maintaining caps has caused it to be held back. > > This time it's pyyaml, I'm not going to hold back the version of pyyaml > > for one import of salt. > > > > In any case, heat-agents uses salt in one location and may not even be > > using the one we define via constraints in any case. > > > > File: heat-config-salt/install.d/50-heat-config-hook-salt > > > > Installs salt from package then runs > > heat-config-salt/install.d/hook-salt.py > > This is true, in that the repo itself appears in the form of a set of > disk-image-builder elements. (Though it may not necessarily be used this way > - for example in RDO we package the actual agents as RPMs, ignoring the > d-i-b elements.) > > > In heat-config-salt/install.d/hook-salt.py is defined the only import of > > salt I can find and likely uses the package version as it's installed > > after tox sets things up. 
> > However, that module is tested in the unit tests (which is why salt appears > in test-requirements.txt). > > > Is the heat team ok with this? > > We discussed this a little at the time that I added it to global > constraints: https://review.opendev.org/604386 > > The issue for us is that we'd like to be able to use a lower-constraints > job. There's a certain library (*cough*paunch) that keeps releasing new > major versions, so it's very helpful to have tests to verify when we rewrite > for the new API whether or not we have to bump the minimum version. The rest > of the requirements tooling seems useful as well, and given that the team > obviously maintains other repos in OpenStack we know how to use it, and it > gives some confidence that we're providing the right guidance to distros > wanting to package this. > > That said, nothing in heat-agents necessarily needs to be co-installable > with OpenStack - the agents run on guest machines. So if it's not tied to > the global-requirements any more then that may not be the worst thing. But > IIRC when we last discussed this there was no recommended way for a project > to run in that kind of configuration. If somebody with more knowledge of the > requirements tooling were able to help out with suggestions then I'd be more > than happy to implement them. Ya, I remember that conversation :D If it helps I think we can sidestep this as the tests are currenly not using salt managed by requirements. Also, salt will not be updated while they are capping pyyaml. IIRC there was a way to tell the package installer a version you wanted. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From zbitter at redhat.com Wed Oct 9 01:29:27 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 8 Oct 2019 21:29:27 -0400 Subject: [openstack][heat-cfn] CFN Signaling with heat In-Reply-To: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> References: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> Message-ID: <053c6d35-6834-8e09-2cd9-d90f030b2833@redhat.com> I'm not an expert on stuff that happens on the guest, but it looks like this is the problem: > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None You're using the default group that has no handler for it configured. It looks like 55_heat_config bails out before attempting to signal a response in this case. (That seems crazy to me, but here we are.) Try configuring a group (like 'script') that actually does something. Also why not use HEAT_SIGNAL as the transport? It's 2019 ;) cheers, Zane. On 5/10/19 1:34 PM, Ajay Kalambur (akalambu) wrote: > Hi > > I was trying the Software Deployment/Structured deployment of heat. > > I somehow can never get the signaling to work I see that authentication > is happening but I don’t see a POST from the VM as a result stack is > stuck in CREATE_IN_PROGRESS > > I see this message in my heat api cfn log which seems to suggest > authentication is successful but it does not seem to POST. Have included > debug output from VM and also the sample heat template I used. Don’t > know if the template is correct as I referred some online examples to > build it > > 2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS > credentials.. > > 2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials > found, checking against keystone. 
> > 2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating > with http://10.10.173.9:5000/v3/ec2tokens > > 2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS > authentication successful. > > 2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server > [req-506f22c6-4062-4a84-8e85-40317a4099ed - > adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - > 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] > 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET > /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 > HTTP/1.1" 200 4669 1.418045 > > Some debugging output from my VM: > > [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug > > /var/lib/os-collect-config/local-data not found. Skipping > > [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase > pre-configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Target: pre-configure.d > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase > pre-configure > > [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase > configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/20-os-apply-config > > [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config > > [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf > > [2019/10/05 05:32:47 PM] [INFO] success > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 20-os-apply-config completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 50-heat-config-docker-compose > completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 50-heat-config-kubelet completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/55-heat-config > > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None > > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 55-heat-config completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Target: configure.d > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Script >                        Seconds > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > ---------------------------------------  
---------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 20-os-apply-config                            0.345 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 50-heat-config-docker-compose                 0.064 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 50-heat-config-kubelet                        0.134 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 55-heat-config                                0.065 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase > configure > > [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase > post-configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed > > ++ os-apply-config --key completion-handle --type raw --key-default '' > > + HANDLE= > > ++ os-apply-config --key completion-signal --type raw --key-default '' > > + SIGNAL= > > ++ os-apply-config --key instance-id --type raw --key-default '' > > + ID=i-0000000d > > + '[' -n i-0000000d ']' > > + '[' -n '' ']' > > + '[' -n '' ']' > > ++ os-apply-config --key deployments --type raw --key-default '' > > ++ jq -r 'map(select(.group == "os-apply-config") | > >               select(.inputs[].name == "deploy_signal_id") | > >               .id + (.inputs | map(select(.name == "deploy_signal_id")) > | .[].value)) | > >               .[]' > > + DEPLOYMENTS= > > + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed > > + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 99-refresh-completed completed > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 Target: post-configure.d > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > 99-refresh-completed                          1.206 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase > post-configure > > [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase > migration > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 Target: migration.d > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase > migration > > onfig]# cat /var/run/heat-config/heat-config > > 
[{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": > "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": > "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", > "description": "ID of the server being deployed to"}, {"type": "String", > "name": "deploy_action", "value": "CREATE", "description": "Name of the > current action being deployed"}, {"type": "String", "name": > "deploy_stack_id", "value": > "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of > the stack this deployment belongs to"}, {"type": "String", "name": > "deploy_resource_name", "value": "other_deployment", "description": > "Name of this deployment resource in the stack"}, {"type": "String", > "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": > "How the server should signal to heat with the deployment output > values."}, {"type": "String", "name": "deploy_signal_id", "value": > "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", > "description": "ID of signal to use for signaling output values"}, > {"type": "String", "name": "deploy_signal_verb", "value": "POST", > "description": "HTTP verb to use for signaling outputvalues"}], "group": > "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": > [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": > {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": > "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", > "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", > "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", > "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of > the server being deployed to"}, {"type": "String", "name": > "deploy_action", "value": "CREATE", "description": "Name of the current > action being deployed"}, {"type": "String", "name": "deploy_stack_id", > "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", > "description": "ID of the stack this deployment belongs to"}, {"type": > "String", "name": "deploy_resource_name", "value": "deployment", > "description": "Name of this deployment resource in the stack"}, > {"type": "String", "name": "deploy_signal_transport", "value": > "CFN_SIGNAL", "description": "How the server should signal to heat with > the deployment output values."}, {"type": "String", "name": > "deploy_signal_id", "value": > "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", > "description": "ID of signal to use for signaling output values"}, > {"type": "String", "name": "deploy_signal_verb", "value": "POST", > "description": "HTTP verb to use for signaling outputvalues"}], "group": > "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": > [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": > {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": > 
"f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# > > [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf > > [DEFAULT] > > command = os-refresh-config > > collectors = ec2 > > collectors = cfn > > collectors = local > > [cfn] > > metadata_url = http://172.29.85.87:8000/v1/ > > stack_name = test_stack > > secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG > > access_key_id = f7874ac9898248edaae53511230534a4 > > path = sig-vm-1.Metadata > > *Here is my basic sample temple* > > heat_template_version: 2013-05-23 > > description: > > >   This template demonstrates how to use OS::Heat::StructuredDeployment > >   to override substitute get_input placeholders defined in > >   OS::Heat::StructuredConfig config. > >   As there is no hook on the server to act on the configuration data, > >   these deployment resource will perform no actual configuration. > > parameters: > >   flavor: > >     type: string > >     default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947' > >   image: > >     type: string > >     default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe > >   public_net_id: > >     type: string > >     default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db > >   private_net_id: > >     type: string > >     description: Private network id > >     default: 995fc046-1c58-468a-b81c-e42c06fc8966 > >   private_subnet_id: > >     type: string > >     description: Private subnet id > >     default: 7598c805-3a9b-4c27-be5b-dca4d89f058c > >   password: > >     type: string > >     description: SSH password > >     default: lab123 > > resources: > >   the_sg: > >     type: OS::Neutron::SecurityGroup > >     properties: > >       name: the_sg > >       description: Ping and SSH > >       rules: > >       - protocol: icmp > >       - protocol: tcp > >         port_range_min: 22 > >         port_range_max: 22 > >   config: > >     type: OS::Heat::StructuredConfig > >     properties: > >       config: > >        config_value_foo: {get_input: foo} > >        config_value_bar: {get_input: bar} > >   deployment: > >     type: OS::Heat::StructuredDeployment > >     properties: > >       signal_transport: CFN_SIGNAL > >       config: > >         get_resource: config > >       server: > >         get_resource: sig-vm-1 > >       input_values: > >         foo: fooooo > >         bar: baaaaa > >   other_deployment: > >     type: OS::Heat::StructuredDeployment > >     properties: > >       signal_transport: CFN_SIGNAL > >       config: > >         get_resource: config > >       server: > >         get_resource: sig-vm-1 > >       input_values: > >         foo: fu > >         bar: barmy > >   server1_port0: > >     type: OS::Neutron::Port > >     properties: > >       network_id: { get_param: private_net_id } > >       security_groups: > >         - default > >       fixed_ips: > >         - subnet_id: { get_param: private_subnet_id } > >   server1_public: > >     type: OS::Neutron::FloatingIP > >     properties: > >       floating_network_id: { get_param: public_net_id } > >       port_id: { get_resource: server1_port0 } > >   sig-vm-1: > >     type: OS::Nova::Server > >     properties: > >       name: sig-vm-1 > >       image: { get_param: image } > >       flavor: { get_param: flavor } > >       networks: > >         - port: { get_resource: server1_port0 } > >       user_data_format: SOFTWARE_CONFIG > >       user_data: > >         get_resource: cloud_config > >   cloud_config: > >     type: OS::Heat::CloudConfig > >     properties: > >       cloud_config: > >         password: { get_param: password } 
> >         chpasswd: { expire: False } > >         ssh_pwauth: True > From li.canwei2 at zte.com.cn Wed Oct 9 03:55:28 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 9 Oct 2019 11:55:28 +0800 (CST) Subject: =?UTF-8?B?W1dhdGNoZXJdIHRlYW0gbWVldGluZyBhdCAwODowMCBVVEMgdG9kYXk=?= Message-ID: <201910091155285054778@zte.com.cn> Hi, Watcher team will have a meeting at 08:00 UTC today in the #openstack-meeting-alt channel. The agenda is available on https://wiki.openstack.org/wiki/Watcher_Meeting_Agenda feel free to add any additional items. Thanks! Canwei Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From frode.nordahl at canonical.com Wed Oct 9 06:05:56 2019 From: frode.nordahl at canonical.com (Frode Nordahl) Date: Wed, 9 Oct 2019 08:05:56 +0200 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 3:46 PM Corey Bryant wrote: > Hi All, > Hey Corey, Great to see the charm coming along! Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > 1) Since the interface is new I would love to see it based on the ``Endpoint`` class instead of the aging ``RelationBase`` class. Also the interface code needs unit tests. We have multiple examples of interface implementations with both in place you can get inspiration from [0]. Also consider having both a ``connected`` and ``available`` state, the available state could be set on the first relation-changed event. This increases the probability of your charm detecting a live charm in the other end of the relation, both states are also required to use the ``charms.openstack`` required relation gating code. 2) In the reactive handler you do a bespoke import of the charm class module just to activate the code, this is no longer necessary as there has been implemented a module that does automatic search and import of the class for you. Please use that instead. [1] import charms_openstack.bus import charms_openstack.charm as charm charms_openstack.bus.discover() 0: https://github.com/search?q=org%3Aopenstack+%22from+charms.reactive+import+Endpoint%22&type=Code 1: https://github.com/search?q=org%3Aopenstack+charms_openstack.bus&type=Code -- Frode Nordahl -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Wed Oct 9 08:02:54 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 09 Oct 2019 10:02:54 +0200 Subject: [tc] Weekly update Message-ID: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> Hello friends, Here's what need attention for the OpenStack TC this week. 1. You should probably prepare our next meeting, happening on Thursday. Alexandra is preparing the topics and warming up the gifs already. 2. We still need someone to step up for the OpenStack User survey on the ML [2] 3. We have plenty of patches which haven't received a vote. 4. We only have two goals for Ussuri [3]. Having more goals makes it easier to select the goals amongst the suggested ones :) If you can socialize about those, that would be awesome. Thank you everyone! 
JP & Rico [1]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2]: http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [3]: https://etherpad.openstack.org/p/PVG-u-series-goals From dtantsur at redhat.com Wed Oct 9 10:04:11 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 9 Oct 2019 12:04:11 +0200 Subject: Release Cycle Observations In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On Wed, Oct 2, 2019 at 10:31 AM Thomas Goirand wrote: > On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > > wrote: > > > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > > >> I know we'd like to have everyone CD'ing master > > > > > > Watch who you're lumping in with the "we" statement. ;) > > > > You've pinpointed what the problem is. > > > > Everyone but OpenStack upstream would like to stop having to upgrade > > every 6 months. > > > > > > Yep, but the same "everyone" want to have features now or better > > yesterday, not in 2-3 years ;) > > This probably was the case a few years ago, when OpenStack was young. > Now that it has matured, and has all the needed features, things have > changed a lot. > This is still the case often enough in my world. IPv6 comes to mind as an example. > > Thomas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.settle at outlook.com Wed Oct 9 10:13:47 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Wed, 9 Oct 2019 10:13:47 +0000 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Thanks so much! On Mon, 2019-10-07 at 10:09 -0700, Kendall Nelson wrote: > Hey Alex, > > So since the TC stuff is Friday we managed to shuffle things around > and now docs has the afternoon on Thursday. > > We will get the final schedule up on the website soon. > > -Kendall (diablo_rojo) > > On Thu, Oct 3, 2019 at 9:32 AM Kendall Waters > wrote: > > Hey Alex, > > > > We still have tables available on Friday. Would half a day on > > Friday work for the docs team? Unless Ian is okay with it, we can > > combine Docs with i18n in their Wednesday afternoon/Thursday > > morning slot. Just let me know! > > > > Cheers, > > Kendall > > > > > > > > Kendall Waters > > OpenStack Marketing & Events > > kendall at openstack.org > > > > > > > > > On Oct 3, 2019, at 4:26 AM, Alexandra Settle > > m> wrote: > > > > > > Hey, > > > > > > Could you add something for docs? Or combine with i18n again if > > > Ian > > > doesn't mind? > > > > > > We don't need a lot, just a room for people to ask questions > > > about the > > > future of the docs team. > > > > > > Stephen will be there, as co-PTL. There's 0 chance of it not > > > conflicting with nova. > > > > > > Please :) > > > > > > Thank you! > > > > > > Alex > > > > > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > > > > Hello Everyone! > > > > > > > > In the attached picture or link [0] you will find the proposed > > > > schedule for the various tracks at the Shanghai PTG in > > > > November. > > > > > > > > We did our best to avoid the key conflicts that the track leads > > > > (PTLs, SIG leads...) mentioned in their PTG survey responses, > > > > although there was no perfect solution that would avoid all > > > > conflicts > > > > especially when the event is three-ish days long and we have > > > > over 40 > > > > teams meeting. 
> > > > > > > > If there are critical conflicts we missed or other issues, > > > > please let > > > > us know, by October 6th at 7:00 UTC! > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_ > > > > schedu > > > > le.png > > > -- > > > Alexandra Settle > > > IRC: asettle -- Alexandra Settle IRC: asettle From smooney at redhat.com Wed Oct 9 10:40:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 09 Oct 2019 11:40:54 +0100 Subject: Release Cycle Observations In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On Wed, 2019-10-09 at 12:04 +0200, Dmitry Tantsur wrote: > On Wed, Oct 2, 2019 at 10:31 AM Thomas Goirand wrote: > > > On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > > > > > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > > > wrote: > > > > > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > > > >> I know we'd like to have everyone CD'ing master > > > > > > > > Watch who you're lumping in with the "we" statement. ;) > > > > > > You've pinpointed what the problem is. > > > > > > Everyone but OpenStack upstream would like to stop having to upgrade > > > every 6 months. I'm not sure that is true. I think if upgrades were as easy as a yum update or apt upgrade, people would not mind a 6-month or shorter upgrade cycle, but even though tooling has improved we are a long way from upgrades being trivial. > > > > > > > > > Yep, but the same "everyone" want to have features now or better > > > yesterday, not in 2-3 years ;) Yes, and this is a double-edged sword in more ways than one. We have a large proportion of our customer base that is only now upgrading to Queens from Newton, so they are already running a 2-3 year out-of-date OpenStack, and when they upgrade they would also like all the features that were only added in Train backported to Queens, which is our current LTS downstream. Our internal data on deployments more or less shows that most non-LTS releases downstream are ignored by larger customers, creating pressure to backport features that we can't reasonably do given our current tooling and desire to not create a large fork. > > > > This probably was the case a few years ago, when OpenStack was young. > > Now that it has matured, and has all the needed features, things have > > changed a lot. > > I don't think it has. I think many of the needed features are now available in master, although looking at our downstream backlog there are also a lot of features that are not available. The issue is that because upgrading has been so painful for many for so long, they are not willing in many cases to go to the latest release. Maybe in another 2 years' time this statement will be more correct, as the majority of clouds will be running Stein+ (I hope). > > This is still the case often enough in my world. IPv6 comes to mind as an > example. > > > > > > Thomas > > > > From jungleboyj at gmail.com Wed Oct 9 13:28:16 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 9 Oct 2019 08:28:16 -0500 Subject: [tc] Weekly update In-Reply-To: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> Message-ID: <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> JP, I thought I had responded to the ML about helping with the OpenStack user survey. Maybe I only thought about responding and didn't actually do it. :-) Anyway, I am willing to take a look at it and put together a summary.
What is the time frame the TC is looking for on this? Thanks! Jay On 10/9/2019 3:02 AM, Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week. > > 1. You should probably prepare our next meeting, happening on Thursday. > Alexandra is preparing the topics and warming up the gifs already. > 2. We still need someone to step up for the OpenStack User survey on > the ML [2] > 3. We have plenty of patches which haven't received a vote. > 4. We only have two goals for Ussuri [3]. Having more goals makes it > easier to select the goals amongst the suggested ones :) If you can > socialize about those, that would be awesome. > > Thank you everyone! > JP & Rico > > [1]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [2]: > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [3]: https://etherpad.openstack.org/p/PVG-u-series-goals > > From jungleboyj at gmail.com Wed Oct 9 13:31:10 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 9 Oct 2019 08:31:10 -0500 Subject: [PTLs] [TC] OpenStack User Survey - PTL & TC Feedback Message-ID: <032d7c6a-6b73-4ff9-b061-468aed7b546e@gmail.com> Jimmy, I will take a User Survey feedback for the TC and let you know if we need additional information on anything. Thanks! Jay From sean.mcginnis at gmx.com Wed Oct 9 13:44:11 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 9 Oct 2019 08:44:11 -0500 Subject: [ptl][release] Last call for RC updates Message-ID: <20191009134411.GA9816@sm-workstation> Hey everyone, This is just a reminder about tomorrow's deadline for a final RC for Train. There are several projects that have changes merged since cutting the stable/train branch. Not all of these changes need to be included in the initial Train coordinated release, but it would be good if there are translations and bug fixes merged to get them into a final RC while there's still time. After tomorrow's (Oct 10) deadline, we will only want to release something if it's absolutely critical. We will enter a quiet period from tomorrow until the coordinated release date next week to give time for packagers to complete their work and to make sure things are stable. Next week the last RC releases will then be retagged as the final release. Again, not all changes need to be included if they are not critical bugfixes or translations at this point. Stable releases can be done at any point after the official release date. Thanks for your help as we reach the end of the Train. Sean From jean-philippe at evrard.me Wed Oct 9 15:39:34 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 09 Oct 2019 17:39:34 +0200 Subject: [tc] Weekly update In-Reply-To: <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> Message-ID: On Wed, 2019-10-09 at 08:28 -0500, Jay Bryant wrote: > Anyway, I am willing to take a look at it and put together a > summary. > What is the time frame the TC is looking for on this? I think it's an important exercise. However, I don't think there is a strict timeline. As long as we learn from it, I would say that we are good. Maybe we could discuss the teachings of it during the summit? The next meeting (tomorrow) seems a little bit ambitious to me... 
Regards, JP From sean.mcginnis at gmx.com Wed Oct 9 15:46:29 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 9 Oct 2019 10:46:29 -0500 Subject: [ptl][release] Last call for RC updates In-Reply-To: <20191009134411.GA9816@sm-workstation> References: <20191009134411.GA9816@sm-workstation> Message-ID: <20191009154629.GA26100@sm-workstation> On Wed, Oct 09, 2019 at 08:44:11AM -0500, Sean McGinnis wrote: > Hey everyone, > > This is just a reminder about tomorrow's deadline for a final RC for Train. > > There are several projects that have changes merged since cutting the > stable/train branch. Not all of these changes need to be included in the > initial Train coordinated release, but it would be good if there are > translations and bug fixes merged to get them into a final RC while there's > still time. > To try to help some teams, I have proposed RC2 releases for those deliverables that looked like they had relevant things that would be good to pick up for the final Train release. They can be found under the train-rc2 topic: https://review.opendev.org/#/q/topic:train-rc2+(status:open+OR+status:merged) Again, not all changes are necessary to be included, so we will only process these if we get an explicit ack from the PTL or release liaison that the team actually wants these extra RC releases. Feel free to +1 if you would like us to proceed, or -1 if you do not want the RC or just need a little more time to get anything else merged before tomorrow's deadline. If the latter, please take over the patch and update with the new commit hash that should be used for the release. Thanks! Sean From jungleboyj at gmail.com Wed Oct 9 15:54:59 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 9 Oct 2019 10:54:59 -0500 Subject: [tc] Weekly update In-Reply-To: References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> Message-ID: On 10/9/2019 10:39 AM, Jean-Philippe Evrard wrote: > On Wed, 2019-10-09 at 08:28 -0500, Jay Bryant wrote: >> Anyway, I am willing to take a look at it and put together a >> summary. >> What is the time frame the TC is looking for on this? > I think it's an important exercise. > However, I don't think there is a strict timeline. As long as we learn > from it, I would say that we are good. > > Maybe we could discuss the teachings of it during the summit? The next > meeting (tomorrow) seems a little bit ambitious to me... > > Regards, > JP > JP, Awesome.  I will try to skim through what is out there before the meeting and at least have a proposal on how to proceed from tomorrow's meeting to the summit. Thanks! Jay From kendall at openstack.org Wed Oct 9 17:06:46 2019 From: kendall at openstack.org (Kendall Waters) Date: Wed, 9 Oct 2019 12:06:46 -0500 Subject: Important Shanghai PTG Information Message-ID: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Hello Everyone! As I’m sure you already know, the Shanghai PTG is going to be a very different event from PTGs in the past so we wanted to spell out the differences so you can be better prepared. Registration & Badges Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. Since there is a single registration for the event, there is also one badge for both events. You will pick it up when you check in for the Summit and keep it until the end of the PTG. The Space Rooms The space we are contracted to have for the PTG will be laid out differently. 
We only have a couple dedicated rooms which are allocated to those groups with the largest numbers of people. The rest of the teams will be in a single larger room together. To help people gather teams in an organized fashion, we will be naming the arrangements of tables after OpenStack releases (Austin, Bexar, Cactus, etc). Food & Beverage Rules Unfortunately, the venue does not allow ANY food or drink in any of the rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in the beautiful pre-function space outside of the Blue Hall. Moving Furniture You are allowed to! Yay! If the table arrangements your project/team/group lead requested don’t work for you, feel free to move the furniture around. That being said, try to keep the tables marked with their names so that others can find them during their time slots. There will also be extra chairs stacked in the corner if your team needs them. Hours This venue is particularly strict about the hours we are allowed to be there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the evening. Its reasonably likely that if you try to come early or stay late, security will talk to you. So please be kind and respectfully leave if they ask you to. Resources Power While we have been working with the venue to accomodate our power needs, we won’t have as many power strips as we have had in the past. For this reason, we want to remind everyone to charge all their devices every night and share the power strips we do have during the day. Sharing is caring! Flipcharts While we won’t have projection available, we will have some flipcharts around. Each dedicated room will have one flipchart and the big main room will have a few to share. Please feel free to grab one when you need it, but put it back when you are finished so that others can use it if they need. Again, sharing is caring! :) Onboarding A lot of the usual PTG attendees won’t be able to attend this event, but we will also have a lot of new faces. With this in mind, we have decided to add project onboarding to the PTG so that the new contributors can get up to speed with the projects meeting that week. The teams gathering that will be doing onboarding will have that denoted on the print and digital schedule on site. They have also been encouraged to promote when they will be doing their onboarding via the PTGBot and on the mailing lists. If you have any questions, please let us know! Cheers, The Kendalls (wendallkaters & diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucioseki at gmail.com Wed Oct 9 17:49:24 2019 From: lucioseki at gmail.com (Lucio Seki) Date: Wed, 9 Oct 2019 14:49:24 -0300 Subject: Important Shanghai PTG Information In-Reply-To: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Hi Kendall, thanks for the info. > While we won’t have projection available Will be there projection for summit speakers? Lucio On Wed, Oct 9, 2019 at 2:11 PM Kendall Waters wrote: > Hello Everyone! > > As I’m sure you already know, the Shanghai PTG is going to be a very > different event from PTGs in the past so we wanted to spell out the > differences so you can be better prepared. > > Registration & Badges > > Registration for the PTG is included in the cost of the Summit. It is a > single registration for both events. Since there is a single registration > for the event, there is also one badge for both events. 
You will pick it up > when you check in for the Summit and keep it until the end of the PTG. > > The Space > > Rooms > > The space we are contracted to have for the PTG will be laid out > differently. We only have a couple dedicated rooms which are allocated to > those groups with the largest numbers of people. The rest of the teams will > be in a single larger room together. To help people gather teams in an > organized fashion, we will be naming the arrangements of tables after > OpenStack releases (Austin, Bexar, Cactus, etc). > > Food & Beverage Rules > > Unfortunately, the venue does not allow ANY food or drink in any of the > rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in > the beautiful pre-function space outside of the Blue Hall. > > Moving Furniture > > You are allowed to! Yay! If the table arrangements your project/team/group > lead requested don’t work for you, feel free to move the furniture around. > That being said, try to keep the tables marked with their names so that > others can find them during their time slots. There will also be extra > chairs stacked in the corner if your team needs them. > > Hours > > This venue is particularly strict about the hours we are allowed to be > there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the > evening. Its reasonably likely that if you try to come early or stay late, > security will talk to you. So please be kind and respectfully leave if they > ask you to. > > Resources > > Power > > While we have been working with the venue to accomodate our power needs, > we won’t have as many power strips as we have had in the past. For this > reason, we want to remind everyone to charge all their devices every night > and share the power strips we do have during the day. Sharing is caring! > > Flipcharts > > While we won’t have projection available, we will have some flipcharts > around. Each dedicated room will have one flipchart and the big main room > will have a few to share. Please feel free to grab one when you need it, > but put it back when you are finished so that others can use it if they > need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, but > we will also have a lot of new faces. With this in mind, we have decided to > add project onboarding to the PTG so that the new contributors can get up > to speed with the projects meeting that week. The teams gathering that will > be doing onboarding will have that denoted on the print and digital > schedule on site. They have also been encouraged to promote when they will > be doing their onboarding via the PTGBot and on the mailing lists. > > If you have any questions, please let us know! > > Cheers, > The Kendalls > (wendallkaters & diablo_rojo) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Wed Oct 9 18:06:31 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 9 Oct 2019 11:06:31 -0700 Subject: [PTL] PTG Team Photos Message-ID: Hello Everyone! We are excited to see you in a few weeks at the PTG and wanted to share that we will be taking team photos again! Here is an ethercalc signup for the available time slots [1]. We will be providing time on Thursday Morning/Afternoon and Friday morning to come as a team to get your photo taken. Slots are only ten minutes so its *important that everyone be on time*! 
The location is TBD at this point, but it will likely be in the prefunction space near registration. Thanks, -Kendall Nelson (diablo_rojo) [1] https://ethercalc.openstack.org/lnupu1sx6ljl -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Wed Oct 9 19:05:09 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 9 Oct 2019 21:05:09 +0200 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) Message-ID: Hello Tackers! Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] Eduardo (thanks!) did some debugging to discover that you started requiring internal Glance configuration for Tacker to make it use the local filesystem via the filestore backend (internally in Tacker, not via the deployed Glance) [2] This makes us, Koalas, wonder how to approach a proper production deployment of Tacker. Tacker docs have not been updated regarding this new feature and following them may result in broken Tacker deployment (as we have now). We are especially interested in how to deal with multinode Tacker deployment. Do these new paths require any synchronization? [1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 [2] https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 Kind regards, Radek -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Oct 9 22:58:58 2019 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 9 Oct 2019 18:58:58 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: This thread deserves an update: - tripleo-ansible has now a paunch module, calling openstack/paunch as a library. https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/ansible_plugins/modules/paunch.py And is called here for paunch apply: https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/common/deploy-steps-tasks.yaml#L232-L254 In theory, we could deprecate "paunch apply" now as we don't need it anymore. I was working on porting "paunch cleanup" but it's still WIP. - I've been working on a new Ansible role which could totally replace Paunch, called "tripleo-container-manage", which has been enough for me to deploy an Undercloud: https://review.opendev.org/#/c/686196. It's being tested here: https://review.opendev.org/#/c/687651/ and as you can see the undercloud was successfully deployed without Paunch. Note that some container parameters haven't been ported and upgrade untested (this is a prototype). The second approach is a serious prototype I would like to continue further but before I would like some feedback. As for the feedback received in the previous answers, people would like to keep a "print-cmd" like, which makes total sense. I was thinking we could write a proper check mode for the podman_container module, which could output the podman commands that are run by the module. We could also extract the container management tasks to its own playbook so an operator who would usually run: $ paunch debug (...) --action print-cmd replaced by: $ ansible-playbook --check -i inventory.yaml containers.yaml A few benefits of this new role: - leverage ansible modules (we plan to upstream podman_container module) - could be easier to maintain and contribute (python vs ansible) - could potentially be faster. 
I want to investigate usage of async actions/polls in the role. Challenges: - no unit tests like in paunch, will need good testing with Molecule - we need to invest a lot in testing it, Paunch has a lot of edge cases that we carried over the cycles to manage containers. More feedback is very welcome and anyone interested to contribute please let me know. On Tue, Sep 17, 2019 at 5:03 AM Bogdan Dobrelya wrote: > On 16.09.2019 18:07, Emilien Macchi wrote: > > On Mon, Sep 16, 2019 at 11:47 AM Rabi Mishra > > wrote: > > > > I'm not sure if podman as container tool would move in that > > direction, as it's meant to be a command line tool. If we really > > want to reduce the overhead of so many layers in TripleO and podman > > is the container tool for us (I'll ignore the k8s related > > discussions for the time being), I would think the logic of > > translating the JSON configs to podman calls should be be in ansible > > (we can even write a TripleO specific podman module). > > > > > > I think we're both in strong agreement and say "let's convert paunch > > into ansible module". > > I support the idea of calling paunch code as is from an ansible module. > Although I'm strongly opposed against re-implementing the paunch code > itself as ansible modules. That only brings maintenance burden (harder > will be much to backport fixes into Queens and Train) and more place for > potential regressions, without any functional improvements. > > > And make the module robust enough for our needs. Then we could replace > > paunch by calling the podman module directly. > > -- > > Emilien Macchi > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Oct 10 00:53:01 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 00:53:01 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> Message-ID: We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? Message with logs got moderated so logs are here: http://paste.openstack.org/show/782622/ From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. 
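(For reference, a minimal sketch of the rootwrap daemon-mode settings being discussed in this thread, assuming the usual packaged paths; the section placement and the sudoers file name are assumptions, so adjust them to your distribution:

    # neutron.conf / linuxbridge_agent.ini, [agent] section
    root_helper = sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

    # e.g. /etc/sudoers.d/neutron_sudoers, so the agent user can spawn the daemon
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf *
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

The second sudoers entry is the one the rootwrap daemon needs in addition to the plain rootwrap rule.)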
r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali74.ebrahimpour at gmail.com Wed Oct 9 07:49:42 2019 From: ali74.ebrahimpour at gmail.com (Ali Ebrahimpour) Date: Wed, 9 Oct 2019 11:19:42 +0330 Subject: monitoring openstack Message-ID: hi guys i want to install monitoring in my horizon Ui and i'm confused in setting up ceilometer or gnocchi or aodh or monasca in my project because all of them where deprecated. i setup openstack with ansible and i want to monitor the usage of cpu and ram and etc in my dashboard and i also want to know how much resources each customer used for one hour and day. Thanks in advance for your precise guidance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Oct 10 00:20:24 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 00:20:24 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> Message-ID: We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. 
In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? [1] Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 2. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 3. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 4. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Start request repeated too quickly. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Failed to start Openstack Neutron Linux Bridge Agent. [2] 2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] Logging enabled! 
2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 13.0.4 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'physnet1': 'eno1'} 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {} 2019-10-09 17:05:24.522 5803 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/plugins/ml2/linuxbridge_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpmdyxcD/privsep.sock'] 2019-10-09 17:05:25.071 5803 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2019-10-09 17:05:25.022 5828 INFO oslo.privsep.daemon [-] privsep daemon starting 2019-10-09 17:05:25.025 5828 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/none 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep daemon running as pid 5828 2019-10-09 17:05:25.125 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Agent initialized successfully, now running... 2019-10-09 17:05:25.193 5803 ERROR neutron.agent.linux.utils [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Error starting thread.: Exception: Failed to spawn rootwrap process. 
stderr: sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Traceback (most recent call last): 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 794, in run_service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service service.start() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 86, in start 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.setup_rpc() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 153, in setup_rpc 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.context, self.sg_plugin_rpc, defer_refresh_firewall=True) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 58, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.init_firewall(defer_refresh_firewall, integration_bridge) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 83, in init_firewall 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.firewall = firewall_class() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 88, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service zone_per_port=self.CONNTRACK_ZONE_PER_PORT) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 58, in get_conntrack 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute, namespace, zone_per_port) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 75, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._populate_initial_zone_map() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 182, in _populate_initial_zone_map 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service rules = self.get_rules_for_table_func('raw') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_manager.py", line 477, in get_rules_for_table 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return self.execute(args, run_as_root=True).split('\n') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 122, in execute 
2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute_rootwrap_daemon(cmd, process_input, addl_env)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service LOG.error("Rootwrap error running command: %s", cmd) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.force_reraise() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service six.reraise(self.type_, self.value, self.tb) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return client.execute(cmd, process_input) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 148, in execute 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._ensure_initialized() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 115, in _ensure_initialized 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._initialize() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 85, in _initialize 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service (stderr,)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service stderr: 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.197 5803 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Stopping Linux bridge agent agent. 
2019-10-09 17:05:25.198 5803 CRITICAL neutron [-] Unhandled error: AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron Traceback (most recent call last): 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/bin/neutron-linuxbridge-agent", line 10, in 2019-10-09 17:05:25.198 5803 ERROR neutron sys.exit(main()) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py", line 21, in main 2019-10-09 17:05:25.198 5803 ERROR neutron agent_main.main() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1051, in main 2019-10-09 17:05:25.198 5803 ERROR neutron launcher.wait() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 392, in wait 2019-10-09 17:05:25.198 5803 ERROR neutron status, signo = self._wait_for_exit_or_signal() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 377, in _wait_for_exit_or_signal 2019-10-09 17:05:25.198 5803 ERROR neutron self.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 292, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.services.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 760, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron service.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 117, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.set_rpc_timeout(self.quitting_rpc_timeout) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 476, in set_rpc_timeout 2019-10-09 17:05:25.198 5803 ERROR neutron self.state_rpc): 2019-10-09 17:05:25.198 5803 ERROR neutron AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Thu Oct 10 01:40:17 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Thu, 10 Oct 2019 01:40:17 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? [1] Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 2. 
Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 3. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 4. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Start request repeated too quickly. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Failed to start Openstack Neutron Linux Bridge Agent. [2] 2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] Logging enabled! 
2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 13.0.4 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'physnet1': 'eno1'} 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {} 2019-10-09 17:05:24.522 5803 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/plugins/ml2/linuxbridge_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpmdyxcD/privsep.sock'] 2019-10-09 17:05:25.071 5803 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2019-10-09 17:05:25.022 5828 INFO oslo.privsep.daemon [-] privsep daemon starting 2019-10-09 17:05:25.025 5828 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/none 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep daemon running as pid 5828 2019-10-09 17:05:25.125 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Agent initialized successfully, now running... 2019-10-09 17:05:25.193 5803 ERROR neutron.agent.linux.utils [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Error starting thread.: Exception: Failed to spawn rootwrap process. 
stderr: sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Traceback (most recent call last): 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 794, in run_service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service service.start() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 86, in start 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.setup_rpc() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 153, in setup_rpc 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.context, self.sg_plugin_rpc, defer_refresh_firewall=True) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 58, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.init_firewall(defer_refresh_firewall, integration_bridge) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 83, in init_firewall 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.firewall = firewall_class() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 88, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service zone_per_port=self.CONNTRACK_ZONE_PER_PORT) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 58, in get_conntrack 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute, namespace, zone_per_port) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 75, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._populate_initial_zone_map() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 182, in _populate_initial_zone_map 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service rules = self.get_rules_for_table_func('raw') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_manager.py", line 477, in get_rules_for_table 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return self.execute(args, run_as_root=True).split('\n') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 122, in execute 
2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute_rootwrap_daemon(cmd, process_input, addl_env)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service LOG.error("Rootwrap error running command: %s", cmd) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.force_reraise() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service six.reraise(self.type_, self.value, self.tb) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return client.execute(cmd, process_input) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 148, in execute 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._ensure_initialized() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 115, in _ensure_initialized 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._initialize() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 85, in _initialize 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service (stderr,)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service stderr: 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.197 5803 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Stopping Linux bridge agent agent. 
2019-10-09 17:05:25.198 5803 CRITICAL neutron [-] Unhandled error: AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron Traceback (most recent call last): 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/bin/neutron-linuxbridge-agent", line 10, in 2019-10-09 17:05:25.198 5803 ERROR neutron sys.exit(main()) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py", line 21, in main 2019-10-09 17:05:25.198 5803 ERROR neutron agent_main.main() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1051, in main 2019-10-09 17:05:25.198 5803 ERROR neutron launcher.wait() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 392, in wait 2019-10-09 17:05:25.198 5803 ERROR neutron status, signo = self._wait_for_exit_or_signal() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 377, in _wait_for_exit_or_signal 2019-10-09 17:05:25.198 5803 ERROR neutron self.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 292, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.services.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 760, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron service.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 117, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.set_rpc_timeout(self.quitting_rpc_timeout) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 476, in set_rpc_timeout 2019-10-09 17:05:25.198 5803 ERROR neutron self.state_rpc): 2019-10-09 17:05:25.198 5803 ERROR neutron AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/
From anlin.kong at gmail.com Thu Oct 10 06:09:28 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Thu, 10 Oct 2019 19:09:28 +1300 Subject: [trove] Core team change Message-ID: Hi, As the Ussuri dev cycle begins, it's time to make some changes to the Trove core team. Unfortunately, apart from myself there is no one actively contributing (including coding and reviewing) to Trove, according to https://www.stackalytics.com/report/contribution/trove/90. Some of the reasons are probably the significant changes the Trove community went through in the past and the security concerns. However, as a member of a public cloud provider that has deployed Trove, I've been trying my best to bring Trove back on track over the last several dev cycles.
Although it's very hard to make this decision, I am going to remove all the current members from the core team according to https://docs.openstack.org/project-team-guide/open-development.html, but everyone is encouraged to help review changes and join the core team again with enough valuable reviews and contributions. Thanks for all your contributions in the past. - Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Oct 10 07:22:33 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 10 Oct 2019 09:22:33 +0200 Subject: [ptl][release] Last call for RC updates In-Reply-To: <20191009154629.GA26100@sm-workstation> References: <20191009134411.GA9816@sm-workstation> <20191009154629.GA26100@sm-workstation> Message-ID: Hi, > On 9 Oct 2019, at 17:46, Sean McGinnis wrote: > > On Wed, Oct 09, 2019 at 08:44:11AM -0500, Sean McGinnis wrote: >> Hey everyone, >> >> This is just a reminder about tomorrow's deadline for a final RC for Train. >> >> There are several projects that have changes merged since cutting the >> stable/train branch. Not all of these changes need to be included in the >> initial Train coordinated release, but it would be good if there are >> translations and bug fixes merged to get them into a final RC while there's >> still time. >> > > To try to help some teams, I have proposed RC2 releases for those deliverables > that looked like they had relevant things that would be good to pick up for the > final Train release. They can be found under the train-rc2 topic: > > https://review.opendev.org/#/q/topic:train-rc2+(status:open+OR+status:merged) Thx a lot for doing this Sean :) > > Again, not all changes are necessary to be included, so we will only process > these if we get an explicit ack from the PTL or release liaison that the team > actually wants these extra RC releases. > > Feel free to +1 if you would like us to proceed, or -1 if you do not want the > RC or just need a little more time to get anything else merged before > tomorrow's deadline. If the latter, please take over the patch and update with > the new commit hash that should be used for the release. > > Thanks! > Sean > — Slawek Kaplonski Senior software engineer Red Hat From geguileo at redhat.com Thu Oct 10 10:00:50 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Thu, 10 Oct 2019 12:00:50 +0200 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> Message-ID: <20191010100050.hn546tikeihaho7e@localhost> On 04/10, Matt Riedemann wrote: > On 10/4/2019 11:03 AM, Walter Boring wrote: > >   I think if we don't have a host connector passed in and the > > attachment record doesn't have a connector saved, > > then that results in the volume manager not calling the cinder driver to > > terminate_connection and return. > > This also bypasses the driver's remove_export() which is the last chance > > for a driver to unexport a volume. > > Two things: > > 1. Yeah if the existing legacy attachment record doesn't have a connector I > was worried about not properly cleaning on for that old connection, which is > something I mentioned before, but also as mentioned we potentially have that > case when a server is deleted and we can't get to the compute host to get > the host connector, right? > Hi, Not really... 
In that case we still have the BDM info in the DB, so we can just make the 3 Cinder REST API calls ourselves (begin_detaching, terminate_connection and detach) to have the volume unmapped, the export removed, and the volume return to available as usual, without needing to go to the storage array manually. > 2. If I were to use os-terminate_connection, I seem to have a tricky > situation on the migration flow because right now I'm doing: > > a) create new attachment with host connector > b) complete new attachment (put the volume back to in-use status) > - if this fails I attempt to delete the new attachment > c) delete the legacy attachment - I intentionally left this until the end to > make sure (a) and (b) were successful. > > If I change (c) to be os-terminate_connection, will that screw up the > accounting on the attachment created in (a)? > > If I did the terminate_connection first (before creating a new attachment), > could that leave a window of time where the volume is shown as not > attached/in-use? Maybe not since it's not the begin_detaching/os-detach > API...I'm fuzzy on the cinder volume state machine here. > > Or maybe the flow would become: > > a) create new attachment with host connector This is a good idea in itself, but it's not taking into account weird behaviors that some Cinder drivers may have when you call them twice to initialize the connection on the same host. Some drivers end up creating a different mapping for the volume instead of returning the existing one; we've had bugs like this before, and that's why Nova made a change in its live instance migration code to not call intialize_connection on the source host to get the connection_info for detaching. > b) terminate the connection for the legacy attachment > - if this fails, delete the new attachment created in (a) > c) complete the new attachment created in (a) > - if this fails...? > > Without digging into the flow of a cold or live migration I want to say > that's closer to what we do there, e.g. initialize_connection for the new > host, terminate_connection for the old host, complete the new attachment. > I think any workaround we try to find has a good chance of resulting in a good number of bugs. In my opinion our options are: 1- Completely detach and re-attach the volume 2- Write new code in Cinder The new code can be either a new action or we can just add a microversion to attachment create to also accept "connection_info", and when we provide connection_info on the call the method confirms that it's a "migration" (the volume is 'in-use' and doesn't have any attachments) and it doesn't bother to call the cinder-volume to export and map the volume again and simply saves this information in the DB. I know the solution it's not "clean/nice/elegant", and I'd rather go with option 1, but that would be terrible user experience, so I'll settle for a solution that doesn't add much code to Cinder, is simple for Nova, and is less likely to result in bugs. What do you think? Regards, Gorka. PS: In this week's meeting we briefly discussed this topic and agreed to continue the conversation here and retake it on next week's meeting. 
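(For reference, those three REST calls map onto python-cinderclient roughly as in the sketch below. The auth values, volume id, attachment id and connector dict are placeholders, and this is only an illustration of the sequence, not tested code.)

from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

# Placeholders: fill in real auth values and ids for your deployment.
auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='secret', project_name='admin',
                   user_domain_name='Default', project_domain_name='Default')
cinder = cinder_client.Client('3', session=session.Session(auth=auth))

volume_id = 'VOLUME_UUID'
attachment_id = 'LEGACY_ATTACHMENT_UUID'  # from the old attachment record
connector = {'host': 'compute1',          # host connector saved in the BDM, if present
             'initiator': 'iqn.1994-05.com.redhat:deadbeef'}

# 1) flip the volume to 'detaching'
cinder.volumes.begin_detaching(volume_id)
# 2) ask the backend to unmap/unexport the volume for that connector
cinder.volumes.terminate_connection(volume_id, connector)
# 3) clear the attachment so the volume returns to 'available'
cinder.volumes.detach(volume_id, attachment_uuid=attachment_id)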
> -- > > Thanks, > > Matt > From rico.lin.guanyu at gmail.com Thu Oct 10 10:23:45 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Thu, 10 Oct 2019 18:23:45 +0800 Subject: [tc] monthly meeting agenda In-Reply-To: References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> Message-ID: Hi all I just add topic `overall review for TC summit and PTG plans` to agenda since this is the last meeting we have before Summit and we should take some time to confirm it. On Wed, Oct 9, 2019 at 12:57 AM Rico Lin wrote: > I added two more topics in agenda suggestion today which might worth > discuss about. > * define goal select process schedule > * Maintain issue with Telemetery > > On Tue, Oct 8, 2019 at 10:10 PM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > > > > > Thanks! Maybe we could only discuss about what to do for our rejected > > sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? > > That sounds like a good idea. > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Thu Oct 10 12:19:04 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 10 Oct 2019 14:19:04 +0200 Subject: [tc] Time off for JP! Message-ID: Hello, I will have limited access to internet and emails until the 23rd of October. For all urgent matters, you can contact Rico Lin, our vice-chair. Talk to you all later! Regards, JP From thierry at openstack.org Thu Oct 10 15:40:17 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 10 Oct 2019 17:40:17 +0200 Subject: [ptg] Auto-generated etherpad links ! Message-ID: <86f9ea36-5c38-ef64-aa7c-dd5849143c5d@openstack.org> Hi everyone, The PTGbot grew a new feature over the summer. It now dynamically generates the list of PTG track etherpads. You can find that list at: http://ptg.openstack.org/etherpads.html If you haven't created your etherpad already, just follow the link there to create your etherpad. If you have created your track etherpad already under a different name, you can overload the automatically-generated name using the PTGbot. Just join the #openstack-ptg channel and (as a Freenode authenticated user) send the following command: #TRACKNAME etherpad Example: #keystone etherpad https://etherpad.openstack.org/p/awesome-keystone-pad That will update the link on that page automatically. Hoping to see you in Shanghai! -- Thierry Carrez (ttx) From Albert.Braden at synopsys.com Thu Oct 10 18:04:41 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 18:04:41 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. 
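(Side note for anyone reproducing this: the agent spawns the daemon through the oslo.rootwrap client, which is the code path in the tracebacks earlier in this thread. A quick sanity check is to drive that same client directly as the neutron user on the agent node. A rough sketch, with the command and config paths copied from this thread; treat it as an assumption-laden diagnostic, not an official tool:)

from oslo_rootwrap import client

# Same daemon command configured as root_helper_daemon in neutron.conf [agent].
ROOT_HELPER_DAEMON_CMD = ['sudo', '/usr/bin/neutron-rootwrap-daemon',
                          '/etc/neutron/rootwrap.conf']

def check_rootwrap_daemon():
    c = client.Client(ROOT_HELPER_DAEMON_CMD)
    # Run a read-only command the rootwrap filters allow; execute() returns
    # (returncode, stdout, stderr) and raises "Failed to spawn rootwrap
    # process" if the daemon cannot be started, just like the agent does.
    returncode, out, err = c.execute(['iptables-save', '-t', 'raw'])
    print('returncode=%s' % returncode)
    print(out)
    print(err)

if __name__ == '__main__':
    check_rootwrap_daemon()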
From: Erik Olof Gunnar Andersson Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Thu Oct 10 18:07:37 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 10 Oct 2019 13:07:37 -0500 Subject: [nova] No meeting today Message-ID: Hi all. I'm going to be on PTO, it sounds like several other key members will be absent, the US-time meetings have been very sparsely attended lately, and there are a few things on the agenda for which we should really have a quorum, so I'm canceling today's meeting. https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting Thanks, efried . From eandersson at blizzard.com Thu Oct 10 18:08:02 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Thu, 10 Oct 2019 18:08:02 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: Yea - if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. 
You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 10 18:21:31 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 Oct 2019 13:21:31 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <20191010100050.hn546tikeihaho7e@localhost> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> Message-ID: On 10/10/2019 5:00 AM, Gorka Eguileor wrote: >> 1. Yeah if the existing legacy attachment record doesn't have a connector I >> was worried about not properly cleaning on for that old connection, which is >> something I mentioned before, but also as mentioned we potentially have that >> case when a server is deleted and we can't get to the compute host to get >> the host connector, right? >> > Hi, > > Not really... In that case we still have the BDM info in the DB, so we > can just make the 3 Cinder REST API calls ourselves (begin_detaching, > terminate_connection and detach) to have the volume unmapped, the export > removed, and the volume return to available as usual, without needing to > go to the storage array manually. I'm not sure what you mean. Yes we have the BDM in nova but if it's really old it won't have the host connector stashed away in the connection_info dict and we won't be able to pass that to the terminate_connection API: https://github.com/openstack/nova/blob/19.0.0/nova/compute/api.py#L2186 Are you talking about something else? I realize ^ is very edge case since we've been storing the connector in the BDM.connection_info since I think at least Liberty or Mitaka. > > >> 2. If I were to use os-terminate_connection, I seem to have a tricky >> situation on the migration flow because right now I'm doing: >> >> a) create new attachment with host connector >> b) complete new attachment (put the volume back to in-use status) >> - if this fails I attempt to delete the new attachment >> c) delete the legacy attachment - I intentionally left this until the end to >> make sure (a) and (b) were successful. >> >> If I change (c) to be os-terminate_connection, will that screw up the >> accounting on the attachment created in (a)? >> >> If I did the terminate_connection first (before creating a new attachment), >> could that leave a window of time where the volume is shown as not >> attached/in-use? Maybe not since it's not the begin_detaching/os-detach >> API...I'm fuzzy on the cinder volume state machine here. 
>> >> Or maybe the flow would become: >> >> a) create new attachment with host connector > This is a good idea in itself, but it's not taking into account weird > behaviors that some Cinder drivers may have when you call them twice to > initialize the connection on the same host. Some drivers end up > creating a different mapping for the volume instead of returning the > existing one; we've had bugs like this before, and that's why Nova made > a change in its live instance migration code to not call > intialize_connection on the source host to get the connection_info for > detaching. Huh...I thought attachments in cinder were a dime a dozen and you could create/delete them as needed, or that was the idea behind the new v3 attachments stuff. It seems to at least be what I remember John Griffith always saying we should be able to do. Also if you can't refresh the connection info on the same host then a change like this: https://review.opendev.org/#/c/579004/ Which does just that - refreshes the connection info doing reboot and start instance operations - would break on those volume drivers if I'm following you. > > >> b) terminate the connection for the legacy attachment >> - if this fails, delete the new attachment created in (a) >> c) complete the new attachment created in (a) >> - if this fails...? >> >> Without digging into the flow of a cold or live migration I want to say >> that's closer to what we do there, e.g. initialize_connection for the new >> host, terminate_connection for the old host, complete the new attachment. >> > I think any workaround we try to find has a good chance of resulting in > a good number of bugs. > > In my opinion our options are: > > 1- Completely detach and re-attach the volume I'd really like to avoid this if possible because it could screw up running applications and the migration operation itself is threaded out to not hold up the restart of the compute service. But maybe that's also true of what I've got written up today though it's closer to what we do during resize/cold migrate (though those of course involve downtime for the guest). > 2- Write new code in Cinder > > The new code can be either a new action or we can just add a > microversion to attachment create to also accept "connection_info", and > when we provide connection_info on the call the method confirms that > it's a "migration" (the volume is 'in-use' and doesn't have any > attachments) and it doesn't bother to call the cinder-volume to export > and map the volume again and simply saves this information in the DB. If the volume is in-use it would have attachments, so I'm not following you there. Even if the volume is attached the "legacy" way from a nova perspective, using os-initialize_connection, there is a volume attachment record in the cinder DB (I confirmed this in my devstack testing and the notes are in my patch). It's also precisely the problem I'm trying to solve which is without deleting the old legacy attachment, when you delete the server the volume is detached but still shows up in cinder as in-use because of the orphaned attachment. > > I know the solution it's not "clean/nice/elegant", and I'd rather go > with option 1, but that would be terrible user experience, so I'll > settle for a solution that doesn't add much code to Cinder, is simple > for Nova, and is less likely to result in bugs. > > What do you think? > > Regards, > Gorka. 
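(To make the (a)-(c) flow above concrete in client terms, here is a rough python-cinderclient sketch. The auth values, ids and connector are placeholders, the microversion handling is simplified, and this is not the actual nova code, just the sequence of API calls being discussed.)

from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

auth = v3.Password(auth_url='http://controller:5000/v3',
                   username='admin', password='secret', project_name='admin',
                   user_domain_name='Default', project_domain_name='Default')
# '3.44' is the microversion that added attachment complete.
cinder = cinder_client.Client('3.44', session=session.Session(auth=auth))

volume_id = 'VOLUME_UUID'
instance_uuid = 'SERVER_UUID'
legacy_attachment_id = 'OLD_ATTACHMENT_UUID'
connector = {'host': 'compute1',
             'initiator': 'iqn.1994-05.com.redhat:deadbeef'}

# a) create a new-style attachment with the host connector
new_att = cinder.attachments.create(volume_id, connector, instance_uuid)
# depending on the client version this is a dict or a Resource object
new_att_id = new_att['id'] if isinstance(new_att, dict) else new_att.id

# b) complete it so the volume goes back to in-use
cinder.attachments.complete(new_att_id)

# c) finally drop the legacy attachment record
cinder.attachments.delete(legacy_attachment_id)

# Error handling (e.g. deleting the new attachment if (b) fails) is left out.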
> > PS: In this week's meeting we briefly discussed this topic and agreed to > continue the conversation here and retake it on next week's meeting. > Thanks for discussing it and getting back to me. -- Thanks, Matt From ildiko.vancsa at gmail.com Thu Oct 10 18:26:38 2019 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Thu, 10 Oct 2019 20:26:38 +0200 Subject: [keystone][edge][k8s] Keystone - StarlingX integration feedback Message-ID: Hi, I wanted to point you to a thread that’s just started on the edge-computing mailing list: http://lists.openstack.org/pipermail/edge-computing/2019-October/000642.html The mail contains information about a use case that StarlingX has to use Keystone integrated with Kubernetes which I believe is valuable information to the Keystone team to see if there are any items to discuss further/fix/implement. Thanks, Ildikó From Albert.Braden at synopsys.com Thu Oct 10 19:13:02 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 19:13:02 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone... for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren't erroring at this time. I changed neutron's shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea - if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. 
Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Oct 10 19:20:37 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 14:20:37 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> Message-ID: <20191010192037.GA1037@sm-workstation> > > > > > > a) create new attachment with host connector > > This is a good idea in itself, but it's not taking into account weird > > behaviors that some Cinder drivers may have when you call them twice to > > initialize the connection on the same host. Some drivers end up > > creating a different mapping for the volume instead of returning the > > existing one; we've had bugs like this before, and that's why Nova made > > a change in its live instance migration code to not call > > intialize_connection on the source host to get the connection_info for > > detaching. > > Huh...I thought attachments in cinder were a dime a dozen and you could > create/delete them as needed, or that was the idea behind the new v3 > attachments stuff. It seems to at least be what I remember John Griffith > always saying we should be able to do. 
> > Also if you can't refresh the connection info on the same host then a change > like this: > > https://review.opendev.org/#/c/579004/ > > Which does just that - refreshes the connection info doing reboot and start > instance operations - would break on those volume drivers if I'm following > you. > Creating attachements, using the new attachments API, is a pretty low overhead thing. The issue is/was with the way Nova was calling initialize_connection expecting it to be an idempotent operation. I think we've identified most drivers that had an issue with this. It wasn't a documented assumption on the Cinder side, so I remember when we first realized that was what Nova was doing, we found a lot of Cinder backends that had issues with this. With initialize_connection, at least how it was intended, it is telling the backend to create a new connection between the storage and the host. So every time initialize_connection was called, most backends would make configuration changes on the storage backend to expose the volume to the requested host. Depending on how that backend worked, this could result in multiple separate (and different) connection settings for how the host can access the volume. Most drivers are now aware of this (mis?)use of the call and will first check if there is an existing configuration and just return those settings if it's found. There's no real way to test and enforce that easily, so making sure all drivers including newly added ones behave that way has been up to cores remembering to look for it during code reviews. But you can create as many attachment objects in the database as you want. Sean From corey.bryant at canonical.com Thu Oct 10 20:03:58 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Thu, 10 Oct 2019 16:03:58 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Wed, Oct 9, 2019 at 2:06 AM Frode Nordahl wrote: > On Fri, Oct 4, 2019 at 3:46 PM Corey Bryant > wrote: > >> Hi All, >> > > Hey Corey, > > Great to see the charm coming along! > > Code is located at: >> https://github.com/coreycb/charm-placement >> https://github.com/coreycb/charm-interface-placement >> >> https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) >> > > 1) Since the interface is new I would love to see it based on the > ``Endpoint`` class instead of the aging ``RelationBase`` class. Also the > interface code needs unit tests. We have multiple examples of interface > implementations with both in place you can get inspiration from [0]. > > Also consider having both a ``connected`` and ``available`` state, the > available state could be set on the first relation-changed event. This > increases the probability of your charm detecting a live charm in the other > end of the relation, both states are also required to use the > ``charms.openstack`` required relation gating code. > > 2) In the reactive handler you do a bespoke import of the charm class > module just to activate the code, this is no longer necessary as there has > been implemented a module that does automatic search and import of the > class for you. Please use that instead. [1] > > > import charms_openstack.bus > import charms_openstack.charm as charm > > charms_openstack.bus.discover() > > > 0: > https://github.com/search?q=org%3Aopenstack+%22from+charms.reactive+import+Endpoint%22&type=Code > 1: > https://github.com/search?q=org%3Aopenstack+charms_openstack.bus&type=Code > > -- > Frode Nordahl > Hey Frode, Thanks very much for the input. 
I have these up in gerrit now with the changes you mentioned so I think we can move further reviews to gerrit: https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 10 20:16:46 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 Oct 2019 15:16:46 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <20191010192037.GA1037@sm-workstation> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> <20191010192037.GA1037@sm-workstation> Message-ID: <0d024d78-3f54-e633-bda8-fee57e1c9999@gmail.com> On 10/10/2019 2:20 PM, Sean McGinnis wrote: > Most drivers are now aware of this (mis?)use of the call and will first check > if there is an existing configuration and just return those settings if it's > found. There's no real way to test and enforce that easily, so making sure all > drivers including newly added ones behave that way has been up to cores > remembering to look for it during code reviews. It's unrelated to what I'm trying to solve, but could a cinder tempest plugin test be added which hits the initialize_connection API multiple times without changing host connector and assert the driver is doing the correct thing, whatever that is? Maybe it's just asserting that the connection_info returned from the first call is identical to subsequent calls if the host connector dict input doesn't change? > > But you can create as many attachment objects in the database as you want. But you have to remember to delete them otherwise the volume doesn't leave in-use status even if the volume is detached from the server, so there is attachment counting that needs to happen somewhere (I know cinder does it, but part of that is also on the client side - nova in this case). With the legacy attach flow nova would being_detaching, terminate_connection and then call os-detach and I suppose os-detach could puke if the client hadn't done the attachment cleanup properly, i.e. hadn't called terminate_connection. With the v3 attachments flow we don't have that, we just create attachment, update it with host connector and then complete it. On detach we just delete the attachment and if it's the last one the volume is no longer in-use. I'm not advocating adding another os-detach-like API for the v3 attachment flow, just saying it's an issue if the client isn't aware of all that. -- Thanks, Matt From Albert.Braden at synopsys.com Thu Oct 10 20:45:38 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 20:45:38 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? 
From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. 
In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From najoy at cisco.com Thu Oct 10 23:26:25 2019 From: najoy at cisco.com (Naveen Joy (najoy)) Date: Thu, 10 Oct 2019 23:26:25 +0000 Subject: Networking-vpp 19.08.1 for VPP 19.08.1 is now available Message-ID: Hello All, We'd like to invite you to try out Networking-vpp 19.08.1. As many of you may already know, VPP is a fast user space forwarder based on the DPDK toolkit. VPP uses vector packet processing algorithms to minimize the CPU time spent on each packet to maximize throughput. Networking-vpp is a ML2 mechanism driver that controls VPP on your control and compute hosts to provide fast L2 forwarding under Neutron. This latest version of Networking-vpp is updated to work with VPP 19.08.1 In this release, we've worked on the below updates: - We've added ERSPAN support for Tap-as-a-Service (TaaS). Since ERSPAN provides remote port mirroring, you can now mirror your OpenStack traffic to a destination outside of OpenStack or to a remote OpenStack VM. This is a customized version of OpenStack TaaS. We will be working with the community to push our custom TaaS extensions upstream. In the meantime, you can access our modified TaaS code here[2]. For further info on installation and usage, you can read the TaaS-README[3]. - We've updated the code to be compatible with VPP 19.08 & 19.08.1 API changes. - We've updated the unit test framework to support python3.5 onwards. - We've added security-group support for Trunk subports. We've added support for neutron trunk_details extension. - We've fixed some bugs in our Trunk and L3 plugins that caused a race condition during port binding. - We've migrated our repo from OpenStack to OpenDev. - A recent change in nova caused live migration to fail for instances with NUMA characteristics. We've found that this is a limitation in nova and not VPP/Networking-vpp. It is still possible to use live migration with VPP/Networking-vpp. Please refer to the README[1] for further details. - We've been doing the usual round of bug fixes and updates. The code will work with VPP 19.08.1 and has been updated to keep up with Neutron Rocky and Stein. The README [1] explains how you can try out VPP with Networking-vpp using Devstack: the Devstack plugin will deploy the mechanism driver and VPP and should give you a working system with a minimum of hassle. We will be continuing our development for VPP's 20.01 release. We welcome anyone who would like to come help us. -- Naveen, Ian & Jerome [1] https://opendev.org/x/networking-vpp/src/branch/master/README.rst [2] https://github.com/jbeuque/tap-as-a-service [3] https://opendev.org/x/networking-vpp/src/branch/master/README_taas.txt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sean.mcginnis at gmx.com Thu Oct 10 23:34:42 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 18:34:42 -0500 Subject: [release] Release countdown for week R-0, October 14-18 Message-ID: <20191010233442.GA24173@sm-workstation> Development Focus ----------------- We will be releasing the coordinated OpenStack Train release next week, on Wednesday October 16th. Thanks to everyone involved in the Train cycle! We are now in pre-release freeze, so no new deliverable will be created until final release, unless a release-critical regression is spotted. Otherwise, teams attending the PTG in Shanghai should start to plan what they will be discussing there, by creating and filling team etherpads. You can find the list of etherpads at: http://ptg.openstack.org/etherpads.html General Information ------------------- On release day, the release team will produce final versions of deliverables following the cycle-with-rc release model, by re-tagging the commit used for the last RC. A patch doing just that will be proposed. PTLs and release liaisons should watch for that final release patch from the release team. While not required, we would appreciate having an ack from each team before we approve it on the 16th. Upcoming Deadlines & Dates -------------------------- Final Train release: October 16 Forum+PTG at Shanghai summit: November 4 From sean.mcginnis at gmx.com Thu Oct 10 23:38:58 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 18:38:58 -0500 Subject: [release] Release countdown for week R-0, October 14-18 In-Reply-To: <20191010233442.GA24173@sm-workstation> References: <20191010233442.GA24173@sm-workstation> Message-ID: <20191010233858.GB24173@sm-workstation> > > General Information > ------------------- > > On release day, the release team will produce final versions of deliverables > following the cycle-with-rc release model, by re-tagging the commit used for > the last RC. > > A patch doing just that will be proposed. PTLs and release liaisons should > watch for that final release patch from the release team. While not required, > we would appreciate having an ack from each team before we approve it on the > 16th. > And that patch has been proposed. PTL's, please ack this patch when you have a moment: https://review.opendev.org/#/c/687991/ From kendall at openstack.org Thu Oct 10 23:40:02 2019 From: kendall at openstack.org (Kendall Waters) Date: Thu, 10 Oct 2019 18:40:02 -0500 Subject: Important Shanghai PTG Information In-Reply-To: References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Hi Lucio, Great question! Yes, there will be projection in all Summit breakout rooms. Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 9, 2019, at 12:49 PM, Lucio Seki wrote: > > Hi Kendall, thanks for the info. > > > While we won’t have projection available > Will be there projection for summit speakers? > > Lucio > > On Wed, Oct 9, 2019 at 2:11 PM Kendall Waters > wrote: > Hello Everyone! > > As I’m sure you already know, the Shanghai PTG is going to be a very different event from PTGs in the past so we wanted to spell out the differences so you can be better prepared. > > Registration & Badges > > Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. Since there is a single registration for the event, there is also one badge for both events. 
You will pick it up when you check in for the Summit and keep it until the end of the PTG. > > The Space > > Rooms > > The space we are contracted to have for the PTG will be laid out differently. We only have a couple dedicated rooms which are allocated to those groups with the largest numbers of people. The rest of the teams will be in a single larger room together. To help people gather teams in an organized fashion, we will be naming the arrangements of tables after OpenStack releases (Austin, Bexar, Cactus, etc). > > Food & Beverage Rules > > Unfortunately, the venue does not allow ANY food or drink in any of the rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in the beautiful pre-function space outside of the Blue Hall. > > Moving Furniture > > You are allowed to! Yay! If the table arrangements your project/team/group lead requested don’t work for you, feel free to move the furniture around. That being said, try to keep the tables marked with their names so that others can find them during their time slots. There will also be extra chairs stacked in the corner if your team needs them. > > Hours > > This venue is particularly strict about the hours we are allowed to be there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the evening. Its reasonably likely that if you try to come early or stay late, security will talk to you. So please be kind and respectfully leave if they ask you to. > > Resources > > Power > > While we have been working with the venue to accomodate our power needs, we won’t have as many power strips as we have had in the past. For this reason, we want to remind everyone to charge all their devices every night and share the power strips we do have during the day. Sharing is caring! > > Flipcharts > > While we won’t have projection available, we will have some flipcharts around. Each dedicated room will have one flipchart and the big main room will have a few to share. Please feel free to grab one when you need it, but put it back when you are finished so that others can use it if they need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, but we will also have a lot of new faces. With this in mind, we have decided to add project onboarding to the PTG so that the new contributors can get up to speed with the projects meeting that week. The teams gathering that will be doing onboarding will have that denoted on the print and digital schedule on site. They have also been encouraged to promote when they will be doing their onboarding via the PTGBot and on the mailing lists. > > If you have any questions, please let us know! > > Cheers, > The Kendalls > (wendallkaters & diablo_rojo) > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Fri Oct 11 01:18:30 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:18:30 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , Message-ID: Maybe double check that your rootwrap config is up to date? 
/etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. 
From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Fri Oct 11 01:21:26 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:21:26 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. 
I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmendra.kushwaha at india.nec.com Fri Oct 11 07:31:12 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Fri, 11 Oct 2019 07:31:12 +0000 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) In-Reply-To: References: Message-ID: Hi Radosław, Sorry for inconvenience. We added support for vnf package with limited scope [1] in train cycle, and have ongoing activity for U cycle, so we didn't published proper doc for this feature. But yes, we will add doc for current dependent changes. I have just pushed a manual installation doc changes in [2]. We needs vnf_package_csar_path(i.e. /var/lib/tacker/vnfpackages/) path to keep extracted data locally for further actions, and filesystem_store_datadir(i.e. /var/lib/tacker/csar_files) for glance store. In case of multi node deployment, we recommend to configure filesystem_store_datadir option on shared storage to make sure the availability from other nodes. [1]: https://github.com/openstack/tacker/blob/master/releasenotes/notes/bp-tosca-csar-mgmt-driver-6dbf9e847c8fe77a.yaml [2]: https://review.opendev.org/#/c/688045/ Thanks & Regards Dharmendra Kushwaha ________________________________________ From: Radosław Piliszek Sent: Thursday, October 10, 2019 12:35 AM To: openstack-discuss Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) Hello Tackers! Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] Eduardo (thanks!) did some debugging to discover that you started requiring internal Glance configuration for Tacker to make it use the local filesystem via the filestore backend (internally in Tacker, not via the deployed Glance) [2] This makes us, Koalas, wonder how to approach a proper production deployment of Tacker. Tacker docs have not been updated regarding this new feature and following them may result in broken Tacker deployment (as we have now). We are especially interested in how to deal with multinode Tacker deployment. Do these new paths require any synchronization? 
[1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 [2] https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 Kind regards, Radek ________________________________ The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or NECTI or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of NECTI or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. From ionut at fleio.com Fri Oct 11 11:06:27 2019 From: ionut at fleio.com (Ionut Biru) Date: Fri, 11 Oct 2019 14:06:27 +0300 Subject: [nova] rescue instances with volumes Message-ID: Hello guys, How do you guys rescue instances that are booted from volumes or have volumes attached to them? If i use nova rescue on instances booted from volume, api returns that it's not compatible and if i rescue an instance that has a volume attached, the drive is not available in the OS. -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Oct 11 11:53:17 2019 From: eblock at nde.ag (Eugen Block) Date: Fri, 11 Oct 2019 11:53:17 +0000 Subject: [nova] rescue instances with volumes In-Reply-To: Message-ID: <20191011115317.Horde.t_YCohCT_o3lHR8iJ0RKQ3y@webmail.nde.ag> Hi, with nova rescue you only get access to the root disk if it's an ephemeral disk. If the instance is booted from volume you can shutdown the instance, reset the volume state to "available" and attach-status to "detached" with CLI (because you can't actually detach the root volume), then you should be able to attach that volume to a different instance. Other volumes of the affected instance can be detached and re-attached with Horizon or CLI to another instance if you need them all for the rescue mode. But with this workaround you can't use the "nova rescue" command, so you have to revert all those attachments to the original state manually. Regards, Eugen Zitat von Ionut Biru : > Hello guys, > > How do you guys rescue instances that are booted from volumes or have > volumes attached to them? > > If i use nova rescue on instances booted from volume, api returns that it's > not compatible and if i rescue an instance that has a volume attached, the > drive is not available in the OS. > > -- > Ionut Biru - https://fleio.com From satish.txt at gmail.com Fri Oct 11 12:13:25 2019 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 11 Oct 2019 08:13:25 -0400 Subject: monitoring openstack In-Reply-To: References: Message-ID: <9DD5FA05-11A8-4DE7-8C09-F46BB0E3CC32@gmail.com> You only need ceilometer and gnocchi (aodh for alerting) Also look into grafana + gnocchi for beautiful graphing. Sent from my iPhone > On Oct 9, 2019, at 3:49 AM, Ali Ebrahimpour < ali74.ebrahimpour at gmail.com> wrote: > > hi guys > i want to install monitoring in my horizon Ui and i'm confused in setting up ceilometer or gnocchi or aodh or monasca in my project because all of them where deprecated. 
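Coming back to Eugen's boot-from-volume rescue workaround above, a rough CLI sketch of it could look like this. The IDs are placeholders, cinder reset-state is an admin-only operation that only rewrites database state (use it with care), and as Eugen warns, every change has to be reverted by hand afterwards.

    # stop the broken instance so nothing touches the root volume
    openstack server stop <broken-server-id>

    # pretend the root volume is free so it can be attached elsewhere
    cinder reset-state --state available --attach-status detached <root-volume-id>

    # attach it to a helper instance and repair it from there
    openstack server add volume <rescue-server-id> <root-volume-id>

    # when finished: detach, restore the original state, restart
    openstack server remove volume <rescue-server-id> <root-volume-id>
    cinder reset-state --state in-use --attach-status attached <root-volume-id>
    openstack server start <broken-server-id>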
i setup openstack with ansible and i want to monitor the usage of cpu and ram and etc in my dashboard and i also want to know how much resources each customer used for one hour and day. > Thanks in advance for your precise guidance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From angeiv.zhang at gmail.com Fri Oct 11 12:22:55 2019 From: angeiv.zhang at gmail.com (Xing Zhang) Date: Fri, 11 Oct 2019 20:22:55 +0800 Subject: [neutron][security group][IPv6] IPv6 ICMPv6 port security in security group Message-ID: Hi all, When using neutron on CentOS 7 with OVSHybridIptablesFirewallDriver, create a vm with IPv4/IPv6 dual stack port, then remove all security group, we can get response with ping dhcp or router using IPv6 address in vm, while IPv4 can't. IPv6 works different with IPv4 in some cases and some useful function must work with ICMPv6 like NDP, NS, NA. Checking these two links below, neutron only drop IPv6 RA from vm, and allow all ICMPv6 ICMPv6 Type 128 Echo Request and Type 129 Echo Reply are allowed by default. Should we try to restrict ICMPv6 some types or there are some considerations and just follow ITEF 4890? IETF 4890 [section 4.3.2. Traffic That Normally Should Not Be Dropped] mentioned that: As discussed in Section 3.2 , the risks from port scanning in an IPv6 network are much less severe, and it is not necessary to filter IPv6 Echo Request messages. [section 3.2. Probing] However, the very large address space of IPv6 makes probing a less effective weapon as compared with IPv4 provided that addresses are not allocated in an easily guessable fashion. https://github.com/openstack/neutron/commit/a8a9d225d8496c044db7057552394afd6c950a8e https://www.ietf.org/rfc/rfc4890.txt Commands are: neutron port-update --no-security-groups 0307f016-0cc8-468b-bf3e-36ebe50e13ac ping6 from vm to dhcp ip6tables rules in compute node: PS: seems rules for type 131/135/143 are included in the rule # ip6tables-save | grep 08a0812a -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback full rules are at Ref #3 REF #1 ml2_config.ini [securitygroup] firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver Ref #2 Chain neutron-openvswi-o08a0812a-9 (2 references) pkts bytes target prot opt in out source destination 0 0 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 131 /* Allow IPv6 ICMP traffic. */ 1 72 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 135 /* Allow IPv6 ICMP traffic. */ 2 152 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 143 /* Allow IPv6 ICMP traffic. */ 5 344 neutron-openvswi-s08a0812a-9 all * * ::/0 ::/0 0 0 DROP icmpv6 * * ::/0 ::/0 ipv6-icmptype 134 /* Drop IPv6 Router Advts from VM Instance. */ 5 344 RETURN icmpv6 * * ::/0 ::/0 /* Allow IPv6 ICMP traffic. 
*/ 0 0 RETURN udp * * ::/0 ::/0 udp spt:546 dpt:547 /* Allow DHCP client traffic. */ 0 0 DROP udp * * ::/0 ::/0 udp spt:547 dpt:546 /* Prevent DHCP Spoofing by VM. */ 0 0 RETURN all * * ::/0 ::/0 state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */ 0 0 DROP all * * ::/0 ::/0 state INVALID /* Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack. */ 0 0 neutron-openvswi-sg-fallback all * * ::/0 ::/0 /* Send unmatched traffic to the fallback chain. */ Ref #3 # ip6tables-save | grep 08a0812a -A neutron-openvswi-PREROUTING -m physdev --physdev-in qvb08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 -A neutron-openvswi-PREROUTING -i qvb08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 -A neutron-openvswi-PREROUTING -m physdev --physdev-in tap08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 :neutron-openvswi-i08a0812a-9 - [0:0] :neutron-openvswi-o08a0812a-9 - [0:0] :neutron-openvswi-s08a0812a-9 - [0:0] -A neutron-openvswi-FORWARD -m physdev --physdev-out tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-FORWARD -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-INPUT -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM to the security group chain." -j neutron-openvswi-o08a0812a-9 -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 130 -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 136 -j RETURN -A neutron-openvswi-i08a0812a-9 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -j RETURN -A neutron-openvswi-i08a0812a-9 -d 20ff::c/128 -p udp -m udp --sport 547 --dport 546 -j RETURN -A neutron-openvswi-i08a0812a-9 -d fe80::/64 -p udp -m udp --sport 547 --dport 546 -j RETURN -A neutron-openvswi-i08a0812a-9 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-i08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -j neutron-openvswi-s08a0812a-9 -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -m comment --comment "Drop IPv6 Router Advts from VM Instance." 
-j DROP -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 546 --dport 547 -m comment --comment "Allow DHCP client traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 547 --dport 546 -m comment --comment "Prevent DHCP Spoofing by VM." -j DROP -A neutron-openvswi-o08a0812a-9 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-o08a0812a-9 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-s08a0812a-9 -s 20ff::c/128 -m mac --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined IP/MAC pairs." -j RETURN -A neutron-openvswi-s08a0812a-9 -s fe80::f816:3eff:fe7c:d8c0/128 -m mac --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined IP/MAC pairs." -j RETURN -A neutron-openvswi-s08a0812a-9 -m comment --comment "Drop traffic without an IP/MAC allow rule." -j DROP -A neutron-openvswi-sg-chain -m physdev --physdev-out tap08a0812a-9e --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-i08a0812a-9 -A neutron-openvswi-sg-chain -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-o08a0812a-9 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sshnaidm at redhat.com Fri Oct 11 13:29:01 2019 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Fri, 11 Oct 2019 16:29:01 +0300 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: > As for the feedback received in the previous answers, people would like to > keep a "print-cmd" like, which makes total sense. > I was thinking we could write a proper check mode for the podman_container > module, which could output the podman commands that are run by the module. > We could also extract the container management tasks to its own playbook > so an operator who would usually run: > $ paunch debug (...) --action print-cmd > > replaced by: > $ ansible-playbook --check -i inventory.yaml containers.yaml > > It's totally doable. Currently the module prints podman commands to syslog exactly as they are executed. We can definitely support check mode with it. > Challenges: > - no unit tests like in paunch, will need good testing with Molecule > The podman ansible modules are written in python, so i think we can still use some unittests in addition to molecule tests. > -- > Emilien Macchi > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Fri Oct 11 14:55:26 2019 From: james.slagle at gmail.com (James Slagle) Date: Fri, 11 Oct 2019 10:55:26 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Wed, Oct 9, 2019 at 7:05 PM Emilien Macchi wrote: > > This thread deserves an update: > > - tripleo-ansible has now a paunch module, calling openstack/paunch as a library. 
> https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/ansible_plugins/modules/paunch.py > > And is called here for paunch apply: > https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/common/deploy-steps-tasks.yaml#L232-L254 > > In theory, we could deprecate "paunch apply" now as we don't need it anymore. I was working on porting "paunch cleanup" but it's still WIP. > > - I've been working on a new Ansible role which could totally replace Paunch, called "tripleo-container-manage", which has been enough for me to deploy an Undercloud: https://review.opendev.org/#/c/686196. It's being tested here: https://review.opendev.org/#/c/687651/ and as you can see the undercloud was successfully deployed without Paunch. Note that some container parameters haven't been ported and upgrade untested (this is a prototype). > > The second approach is a serious prototype I would like to continue further but before I would like some feedback. > As for the feedback received in the previous answers, people would like to keep a "print-cmd" like, which makes total sense. > I was thinking we could write a proper check mode for the podman_container module, which could output the podman commands that are run by the module. > We could also extract the container management tasks to its own playbook so an operator who would usually run: > $ paunch debug (...) --action print-cmd > > replaced by: > $ ansible-playbook --check -i inventory.yaml containers.yaml > > A few benefits of this new role: > - leverage ansible modules (we plan to upstream podman_container module) > - could be easier to maintain and contribute (python vs ansible) > - could potentially be faster. I want to investigate usage of async actions/polls in the role. > > Challenges: > - no unit tests like in paunch, will need good testing with Molecule > - we need to invest a lot in testing it, Paunch has a lot of edge cases that we carried over the cycles to manage containers. > > More feedback is very welcome and anyone interested to contribute please let me know. Nice work! I like the approach with the new ansible role. I do think there will be a balance between what makes sense to keep in a python module vs an ansible task. If/then branching logic and conditional tasks based on previous results is of course all possible with ansible tasks, but it starts to become complex and difficult to manage. A higher level language (python) is much better at that. Personally, I prefer to view ansible as just an execution engine and would look to keep the actual application and business logic in proper reusable/testable code modules (python). Finding that right balance is likely something we can figure out in review feedback, ad-hoc discussions, etc. An idea for a future improvement I would like to see as we move in this direction is to switch from reading the container startup configs from a single file per step (/var/lib/tripleo-config/container-startup-config-step_{{ step }}.json), to using a directory per step instead. It would look something like: /var/lib/tripleo-config/container-startup-config/step1 /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json etc. That way each service template can be converted to a proper ansible role in tripleo-ansible that just drops its config into the right directory on the managed node. When the tripleo-container-manage role is then executed, it will operate on those files. 
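As a back-of-the-envelope illustration of that per-service split (a sketch only, not the actual role; the step directory and file names are the hypothetical ones listed above), the role could simply iterate over whatever the service roles dropped into the step directory:

    step_dir=/var/lib/tripleo-config/container-startup-config/step1
    for cfg in "${step_dir}"/*.json; do
        svc=$(basename "${cfg}" .json)
        echo "applying container startup config for ${svc} from ${cfg}"
        # the podman work generated from ${cfg} would run here, so a
        # failure can be tied back to this one service's file
    done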
This would also make it much more clear what container caused a failure, since we could log the results individually instead of just getting back the union of all logs per step. I think you're patches already address this to some degree since you are looping over the contents of the single file. The other feedback I would offer is perhaps continue to think about keeping the container implementation pluggable in some fashion. Right now you have a tasks/podman.yaml. What might it look like if we wanted to have a tasks/kubernetes.yaml in the future, and how would that be enabled? Thanks -- -- James Slagle -- From emilien at redhat.com Fri Oct 11 15:08:18 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 11 Oct 2019 11:08:18 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Fri, Oct 11, 2019 at 10:55 AM James Slagle wrote: [snip] > Nice work! I like the approach with the new ansible role. > > I do think there will be a balance between what makes sense to keep in > a python module vs an ansible task. If/then branching logic and > conditional tasks based on previous results is of course all possible > with ansible tasks, but it starts to become complex and difficult to > manage. A higher level language (python) is much better at that. > Personally, I prefer to view ansible as just an execution engine and > would look to keep the actual application and business logic in proper > reusable/testable code modules (python). Finding that right balance is > likely something we can figure out in review feedback, ad-hoc > discussions, etc. > Ack & agreed on my side. An idea for a future improvement I would like to see as we move in > this direction is to switch from reading the container startup configs > from a single file per step > (/var/lib/tripleo-config/container-startup-config-step_{{ step > }}.json), to using a directory per step instead. It would look > something like: > > /var/lib/tripleo-config/container-startup-config/step1 > > /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json > > /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json > etc. > > That way each service template can be converted to a proper ansible > role in tripleo-ansible that just drops its config into the right > directory on the managed node. When the tripleo-container-manage role > is then executed, it will operate on those files. This would also make > it much more clear what container caused a failure, since we could log > the results individually instead of just getting back the union of all > logs per step. I think you're patches already address this to some > degree since you are looping over the contents of the single file. > This is an excellent idea. One of the feedback I've got from the Upgrade folks is the need to be able to easily upgrade one service, and the current structure doesn't easily allow it. Your proposal is I think exactly addressing it; and indeed it'll help when migrating container config into their individual roles in tripleo-ansible. I'll add that to the backlog. The other feedback I would offer is perhaps continue to think about > keeping the container implementation pluggable in some fashion. Right > now you have a tasks/podman.yaml. What might it look like if we wanted > to have a tasks/kubernetes.yaml in the future, and how would that be > enabled? > Yes, that's what I had in mind when starting the role. The podman.yaml is for Podman logic. 
We will probably have docker.yaml if we want to support Docker for FFU from Queens to Train. And we can easily add a playbook "kubernetes.yaml" which will read the container config data, generate k8s YAML and then consume it via https://docs.ansible.com/ansible/latest/modules/k8s_module.html . Really there is no limit if we can make it really pluggable. Thanks for the input and the great feedback, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From lshort at redhat.com Fri Oct 11 16:32:35 2019 From: lshort at redhat.com (Luke Short) Date: Fri, 11 Oct 2019 12:32:35 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: Hey folks, +1 to the direction we're going with this. Like Emilien said, the skies the limit when using a flexible automation framework like Ansible. We're definitely going to need Molecule tests for the role and unit/integration tests for the podman_container module itself. I left a comment in the podman_container feature request in Ansible to let the broader community know that we're working towards stabilizing that module. Hopefully that will get more contributors to help us fast track upstreaming it. It doesn't seem like Paunch is really used outside of TripleO so having it in Ansible, which has wider adoption, seems really ideal. As for backports, I think it's fair to say that Paunch for the most part "just works." When it does break it's a pain to fix. Which is even more reason to move away from it. Sincerely, Luke Short, RHCE Software Engineer, OpenStack Deployment Framework Red Hat, Inc. On Fri, Oct 11, 2019 at 11:13 AM Emilien Macchi wrote: > On Fri, Oct 11, 2019 at 10:55 AM James Slagle > wrote: > [snip] > >> Nice work! I like the approach with the new ansible role. >> >> I do think there will be a balance between what makes sense to keep in >> a python module vs an ansible task. If/then branching logic and >> conditional tasks based on previous results is of course all possible >> with ansible tasks, but it starts to become complex and difficult to >> manage. A higher level language (python) is much better at that. >> Personally, I prefer to view ansible as just an execution engine and >> would look to keep the actual application and business logic in proper >> reusable/testable code modules (python). Finding that right balance is >> likely something we can figure out in review feedback, ad-hoc >> discussions, etc. >> > > Ack & agreed on my side. > > An idea for a future improvement I would like to see as we move in >> this direction is to switch from reading the container startup configs >> from a single file per step >> (/var/lib/tripleo-config/container-startup-config-step_{{ step >> }}.json), to using a directory per step instead. It would look >> something like: >> >> /var/lib/tripleo-config/container-startup-config/step1 >> >> /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json >> >> /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json >> etc. >> >> That way each service template can be converted to a proper ansible >> role in tripleo-ansible that just drops its config into the right >> directory on the managed node. When the tripleo-container-manage role >> is then executed, it will operate on those files. 
This would also make >> it much more clear what container caused a failure, since we could log >> the results individually instead of just getting back the union of all >> logs per step. I think you're patches already address this to some >> degree since you are looping over the contents of the single file. >> > > This is an excellent idea. One of the feedback I've got from the Upgrade > folks is the need to be able to easily upgrade one service, and the current > structure doesn't easily allow it. Your proposal is I think exactly > addressing it; and indeed it'll help when migrating container config into > their individual roles in tripleo-ansible. > I'll add that to the backlog. > > The other feedback I would offer is perhaps continue to think about >> keeping the container implementation pluggable in some fashion. Right >> now you have a tasks/podman.yaml. What might it look like if we wanted >> to have a tasks/kubernetes.yaml in the future, and how would that be >> enabled? >> > > Yes, that's what I had in mind when starting the role. The podman.yaml is > for Podman logic. > We will probably have docker.yaml if we want to support Docker for FFU > from Queens to Train. > And we can easily add a playbook "kubernetes.yaml" which will read the > container config data, generate k8s YAML and then consume it via > https://docs.ansible.com/ansible/latest/modules/k8s_module.html . Really > there is no limit if we can make it really pluggable. > > > Thanks for the input and the great feedback, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From info at dantalion.nl Fri Oct 11 06:52:39 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 11 Oct 2019 08:52:39 +0200 Subject: [watcher] Thesis on improving Watcher and collaborating with OpenStack community Message-ID: Hello everyone, I am a Dutch student at the Amsterdam University of Applied Sciences (AUAS) and have recently finished my thesis. My thesis was written on improvements that were made to OpenStack Watcher between February and Juli of 2019. Specifically, many of these improvements were written to aid CERN in deploying Watcher. In addition, the thesis describes methods of collaboration and engaging in communities as well as evaluating strengths and weaknesses of communties. Since the thesis primarily resolves around OpenStack I would like to share it with the community as well. Please find the thesis attached to this email. Any feedback, remarks, future advice or other responses are appreciated. Kind regards, Corne Lukken (Dantali0n) -------------- next part -------------- A non-text attachment was scrubbed... Name: EffectsOpenStackWatcherDeploymentR.pdf Type: application/pdf Size: 1309768 bytes Desc: not available URL: From pkliczew at redhat.com Fri Oct 11 08:22:21 2019 From: pkliczew at redhat.com (Piotr Kliczewski) Date: Fri, 11 Oct 2019 10:22:21 +0200 Subject: [Openstack] FOSDEM 2020 Virtualization & IaaS Devroom CfP Message-ID: We are excited to announce that the call for proposals is now open for the Virtualization & IaaS devroom at the upcoming FOSDEM 2020, to be hosted on February 1st 2020. This year will mark FOSDEM’s 20th anniversary as one of the longest-running free and open source software developer events, attracting thousands of developers and users from all over the world. FOSDEM will be held once again in Brussels, Belgium, on February 1st & 2nd, 2020. 
This devroom is a collaborative effort, and is organized by dedicated folks from projects such as OpenStack, Xen Project, oVirt, QEMU, KVM, and Foreman. We would like to invite all those who are involved in these fields to submit your proposals by December 1st, 2019. About the Devroom The Virtualization & IaaS devroom will feature session topics such as open source hypervisors and virtual machine managers such as Xen Project, KVM, bhyve, and VirtualBox, and Infrastructure-as-a-Service projects such as KubeVirt, Apache CloudStack, OpenStack, oVirt, QEMU and OpenNebula. This devroom will host presentations that focus on topics of shared interest, such as KVM; libvirt; shared storage; virtualized networking; cloud security; clustering and high availability; interfacing with multiple hypervisors; hyperconverged deployments; and scaling across hundreds or thousands of servers. Presentations in this devroom will be aimed at developers working on these platforms who are looking to collaborate and improve shared infrastructure or solve common problems. We seek topics that encourage dialog between projects and continued work post-FOSDEM. Important Dates Submission deadline: 1 December 2019 Acceptance notifications: 10 December 2019 Final schedule announcement: 15th December 2019 Devroom: 1st February 2020 Submit Your Proposal All submissions must be made via the Pentabarf event planning site[1]. If you have not used Pentabarf before, you will need to create an account. If you submitted proposals for FOSDEM in previous years, you can use your existing account. After creating the account, select Create Event to start the submission process. Make sure to select Virtualization and IaaS devroom from the Track list. Please fill out all the required fields, and provide a meaningful abstract and description of your proposed session. Submission Guidelines We expect more proposals than we can possibly accept, so it is vitally important that you submit your proposal on or before the deadline. Late submissions are unlikely to be considered. All presentation slots are 30 minutes, with 20 minutes planned for presentations, and 10 minutes for Q&A. All presentations will be recorded and made available under Creative Commons licenses. In the Submission notes field, please indicate that you agree that your presentation will be licensed under the CC-By-SA-4.0 or CC-By-4.0 license and that you agree to have your presentation recorded. For example: "If my presentation is accepted for FOSDEM, I hereby agree to license all recordings, slides, and other associated materials under the Creative Commons Attribution Share-Alike 4.0 International License. Sincerely, ." In the Submission notes field, please also confirm that if your talk is accepted, you will be able to attend FOSDEM and deliver your presentation. We will not consider proposals from prospective speakers who are unsure whether they will be able to secure funds for travel and lodging to attend FOSDEM. (Sadly, we are not able to offer travel funding for prospective speakers.) Submission Guidelines Mentored presentations will have 25-minute slots, where 20 minutes will include the presentation and 5 minutes will be reserved for questions. The number of newcomer session slots is limited, so we will probably not be able to accept all applications. You must submit your talk and abstract to apply for the mentoring program, our mentors are volunteering their time and will happily provide feedback but won't write your presentation for you! 
If you are experiencing problems with Pentabarf, the proposal submission interface, or have other questions, you can email our devroom mailing list[2] and we will try to help you. How to Apply In addition to agreeing to video recording and confirming that you can attend FOSDEM in case your session is accepted, please write "speaker mentoring program application" in the "Submission notes" field, and list any prior speaking experience or other relevant information for your application. Code of Conduct Following the release of the updated code of conduct for FOSDEM, we'd like to remind all speakers and attendees that all of the presentations and discussions in our devroom are held under the guidelines set in the CoC and we expect attendees, speakers, and volunteers to follow the CoC at all times. If you submit a proposal and it is accepted, you will be required to confirm that you accept the FOSDEM CoC. If you have any questions about the CoC or wish to have one of the devroom organizers review your presentation slides or any other content for CoC compliance, please email us and we will do our best to assist you. Call for Volunteers We are also looking for volunteers to help run the devroom. We need assistance watching time for the speakers, and helping with video for the devroom. Please contact devroom mailing list [2] for more information. Questions? If you have any questions about this devroom, please send your questions to our devroom mailing list. You can also subscribe to the list to receive updates about important dates, session announcements, and to connect with other attendees. See you all at FOSDEM! [1] https://penta.fosdem.org/submission/FOSDEM20 [2] iaas-virt-devroom at lists.fosdem.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Fri Oct 11 01:16:17 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Thu, 10 Oct 2019 21:16:17 -0400 Subject: [kolla] Support for removing previously enabled services.. Message-ID: Hey everyone, I'm pretty sure I know the answer but are there any support within Kolla itself to disable Services that we're previously enabled. For example, I was testing the Skydive Agent/Analyzer combo till I realized that it was using about 90-100% of the CPUs or computes and controllers. [image: image.png] Re-running Kolla with reconfigure but with Service set to "No" didn't remove the containers. I had to remove the containers after the reconfigure finished. This is Kolla 8.0.1 with a Stein install. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 82637 bytes Desc: not available URL: From eandersson at blizzard.com Fri Oct 11 01:19:56 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:19:56 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. 
root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. 
Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Fri Oct 11 20:44:03 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Oct 2019 13:44:03 -0700 Subject: [keystone][edge][k8s] Keystone - StarlingX integration feedback In-Reply-To: References: Message-ID: On Thu, Oct 10, 2019, at 11:26, Ildiko Vancsa wrote: > Hi, > > I wanted to point you to a thread that’s just started on the > edge-computing mailing list: > http://lists.openstack.org/pipermail/edge-computing/2019-October/000642.html > > The mail contains information about a use case that StarlingX has to > use Keystone integrated with Kubernetes which I believe is valuable > information to the Keystone team to see if there are any items to > discuss further/fix/implement. > > Thanks, > Ildikó > > Thanks for highlighting this, I've responded on the other mailing list. Colleen From Albert.Braden at synopsys.com Fri Oct 11 21:03:40 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 11 Oct 2019 21:03:40 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: It appears that the extra * was the issue. After removing it I can run the rootwrap daemon without errors. 
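For anyone hitting the same symptoms, the combination this thread converges on can be sketched as follows. The paths match the distro-package layout shown above (adjust to your own), the root_helper options normally live in the [agent] section, and the key detail is that the rootwrap-daemon sudoers entry takes no trailing wildcard, presumably because the daemon is launched with exactly those two arguments and nothing more.

    # neutron.conf (or the relevant agent ini), [agent] section
    [agent]
    root_helper = sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

    # /etc/sudoers.d/neutron_sudoers
    Defaults:neutron !requiretty
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf *
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf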
I'm not 100% sure because the issue took 2 weeks to show up after the initial config change, but this seems to have fixed the problem. From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:21 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden > Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We're not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden > Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone... for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren't erroring at this time. I changed neutron's shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From haleyb.dev at gmail.com Fri Oct 11 21:21:09 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 11 Oct 2019 17:21:09 -0400 Subject: [neutron][security group][IPv6] IPv6 ICMPv6 port security in security group In-Reply-To: References: Message-ID: <4c668285-d214-3099-a9f0-3842bc659639@gmail.com> On 10/11/19 8:22 AM, Xing Zhang wrote: > Hi all, > > When using neutron on CentOS 7 with OVSHybridIptablesFirewallDriver, > create a vm with IPv4/IPv6 dual stack port, > then remove all security group, we can get response with ping dhcp or > router using IPv6 address in vm, while IPv4 can't. > IPv6 works different with IPv4 in some cases and some useful function > must work with ICMPv6 like NDP, NS, NA. > > Checking these two links below, neutron only drop IPv6 RA from vm, and > allow all ICMPv6 > ICMPv6 Type 128 Echo Request and Type 129 Echo Reply are allowed by default. > Should we try to restrict ICMPv6 some types or there are some > considerations and just follow ITEF 4890? The iptables rules you listed below are for egress traffic, and by default the firewall driver only drops things that could allow one instance to interfere with operation of another, for example, sending DHCP replies or IPv6 router advertisements. Only privileged neutron ports (router and dhcp) can do that. I believe the reason we were so permissive on allowing all ICMPv6 out is to not interfere with NS/NA/RS packets by accident, looking back we probably could have written more specific rules here. The OVS firewall driver actually does add more specific rules for outbound NS/NA/RS, and has been the current default for neutron for a couple of cycles. Regarding dropping other outbound IPv6 traffic, I don't think we should filter anything else by default, it would be a not-backwards-compatible change that would cause a lot of confusion. -Brian > IETF 4890 [section 4.3.2. Traffic That Normally Should Not Be Dropped] > mentioned that: > > As discussed in > Section 3.2 , the risks from port scanning in an IPv6 network are much > less severe, and it is not necessary to filter IPv6 Echo Request > messages. > > [section 3.2. Probing] > > However, the very large address space of IPv6 makes probing a less > effective weapon as compared with IPv4 provided that addresses are > not allocated in an easily guessable fashion. > > > https://github.com/openstack/neutron/commit/a8a9d225d8496c044db7057552394afd6c950a8e > > > https://www.ietf.org/rfc/rfc4890.txt > > > > Commands are: > neutron port-update --no-security-groups > 0307f016-0cc8-468b-bf3e-36ebe50e13ac > > ping6 from vm to dhcp > > ip6tables rules in compute node: > PS: seems rules for type 131/135/143 are included in the rule > > # ip6tables-save | grep 08a0812a > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow > IPv6 ICMP traffic." 
-j RETURN > -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > > full rules are at Ref #3 > > > > > REF #1 > ml2_config.ini > [securitygroup] > firewall_driver = > neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver > > Ref #2 > Chain neutron-openvswi-o08a0812a-9 (2 references) >  pkts bytes target     prot opt in     out     source > destination >     0     0 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 131 /* Allow IPv6 ICMP traffic. */ >     1    72 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 135 /* Allow IPv6 ICMP traffic. */ >     2   152 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 143 /* Allow IPv6 ICMP traffic. */ >     5   344 neutron-openvswi-s08a0812a-9  all      *      *       ::/0 >                 ::/0 >     0     0 DROP       icmpv6    *      *       ::/0 > ::/0                 ipv6-icmptype 134 /* Drop IPv6 Router Advts from VM > Instance. */ >     5   344 RETURN     icmpv6    *      *       ::/0 > ::/0                 /* Allow IPv6 ICMP traffic. */ >     0     0 RETURN     udp      *      *       ::/0 > ::/0                 udp spt:546 dpt:547 /* Allow DHCP client traffic. */ >     0     0 DROP       udp      *      *       ::/0 > ::/0                 udp spt:547 dpt:546 /* Prevent DHCP Spoofing by VM. */ >     0     0 RETURN     all      *      *       ::/0 > ::/0                 state RELATED,ESTABLISHED /* Direct packets > associated with a known session to the RETURN chain. */ >     0     0 DROP       all      *      *       ::/0 > ::/0                 state INVALID /* Drop packets that appear related > to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in > conntrack. */ >     0     0 neutron-openvswi-sg-fallback  all      *      *       ::/0 >                 ::/0                 /* Send unmatched traffic to the > fallback chain. */ > > Ref #3 > # ip6tables-save | grep 08a0812a > > -A neutron-openvswi-PREROUTING -m physdev --physdev-in qvb08a0812a-9e -m > comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT > --zone 4104 > -A neutron-openvswi-PREROUTING -i qvb08a0812a-9e -m comment --comment > "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 > -A neutron-openvswi-PREROUTING -m physdev --physdev-in tap08a0812a-9e -m > comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT > --zone 4104 > :neutron-openvswi-i08a0812a-9 - [0:0] > :neutron-openvswi-o08a0812a-9 - [0:0] > :neutron-openvswi-s08a0812a-9 - [0:0] > -A neutron-openvswi-FORWARD -m physdev --physdev-out tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-FORWARD -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-INPUT -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct incoming traffic from > VM to the security group chain." 
-j neutron-openvswi-o08a0812a-9 > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 130 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 135 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 136 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -d 20ff::c/128 -p udp -m udp --sport 547 > --dport 546 -j RETURN > -A neutron-openvswi-i08a0812a-9 -d fe80::/64 -p udp -m udp --sport 547 > --dport 546 -j RETURN > -A neutron-openvswi-i08a0812a-9 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection > (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-i08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -j neutron-openvswi-s08a0812a-9 > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow > IPv6 ICMP traffic." -j RETURN > -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 546 --dport 547 -m > comment --comment "Allow DHCP client traffic." -j RETURN > -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 547 --dport 546 -m > comment --comment "Prevent DHCP Spoofing by VM." -j DROP > -A neutron-openvswi-o08a0812a-9 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-o08a0812a-9 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection > (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-s08a0812a-9 -s 20ff::c/128 -m mac --mac-source > FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined > IP/MAC pairs." -j RETURN > -A neutron-openvswi-s08a0812a-9 -s fe80::f816:3eff:fe7c:d8c0/128 -m mac > --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from > defined IP/MAC pairs." -j RETURN > -A neutron-openvswi-s08a0812a-9 -m comment --comment "Drop traffic > without an IP/MAC allow rule." -j DROP > -A neutron-openvswi-sg-chain -m physdev --physdev-out tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Jump to the VM specific > chain." -j neutron-openvswi-i08a0812a-9 > -A neutron-openvswi-sg-chain -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Jump to the VM specific > chain." 
-j neutron-openvswi-o08a0812a-9 From colleen at gazlene.net Sat Oct 12 00:12:39 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Oct 2019 17:12:39 -0700 Subject: [keystone] Keystone Team Update - Week of 7 October 2019 Message-ID: <546861eb-62a6-4c53-9e84-d6b2e285a4e6@www.fastmail.com> # Keystone Team Update - Week of 7 October 2019 ## News ### RC2 We ended up cutting a second RC in order to remove the policy.v3cloudsample.json file[1] and to include placeholder schema migrations[2]. That RC should become the Train release next week[3]. [1] https://review.opendev.org/687639 [2] https://review.opendev.org/687775 [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010074.html ## Office Hours When there are topics to cover, the keystone team holds office hours on Tuesdays at 17:00 UTC. There won't be office hours next week. Add topics you would like to see covered during office hours to the etherpad: https://etherpad.openstack.org/p/keystone-office-hours-topics ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 17 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 32 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Milestone Outlook https://releases.openstack.org/train/schedule.html Next week is the Train release! ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From radoslaw.piliszek at gmail.com Sat Oct 12 07:40:00 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 12 Oct 2019 09:40:00 +0200 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) In-Reply-To: References: Message-ID: Hi Dharmendra, thanks for the insights. We will see what we can do. In the worst case we will leave it to the operator to provide the shared filesystem (by documenting the need). Are you planning to move to using glance-api? It would solve the locality problem. Kind regards, Radek pt., 11 paź 2019 o 09:31 Dharmendra Kushwaha < dharmendra.kushwaha at india.nec.com> napisał(a): > Hi Radosław, > > Sorry for inconvenience. > We added support for vnf package with limited scope [1] in train cycle, > and have ongoing activity for U cycle, so we didn't published proper doc > for this feature. But yes, we will add doc for current dependent changes. I > have just pushed a manual installation doc changes in [2]. > We needs vnf_package_csar_path(i.e. /var/lib/tacker/vnfpackages/) path to > keep extracted data locally for further actions, and > filesystem_store_datadir(i.e. /var/lib/tacker/csar_files) for glance store. > In case of multi node deployment, we recommend to configure > filesystem_store_datadir option on shared storage to make sure the > availability from other nodes. > > [1]: > https://github.com/openstack/tacker/blob/master/releasenotes/notes/bp-tosca-csar-mgmt-driver-6dbf9e847c8fe77a.yaml > [2]: https://review.opendev.org/#/c/688045/ > > Thanks & Regards > Dharmendra Kushwaha > ________________________________________ > From: Radosław Piliszek > Sent: Thursday, October 10, 2019 12:35 AM > To: openstack-discuss > Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR > packages issues) > > Hello Tackers! > > Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] > Eduardo (thanks!) 
did some debugging to discover that you started > requiring internal Glance configuration for Tacker to make it use the local > filesystem via the filestore backend (internally in Tacker, not via the > deployed Glance) [2] > This makes us, Koalas, wonder how to approach a proper production > deployment of Tacker. > Tacker docs have not been updated regarding this new feature and following > them may result in broken Tacker deployment (as we have now). > We are especially interested in how to deal with multinode Tacker > deployment. Do these new paths require any synchronization? > > [1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 > [2] > https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 > > Kind regards, > Radek > > ________________________________ > The contents of this e-mail and any attachment(s) are confidential and > intended for the named recipient(s) only. It shall not attach any liability > on the originator or NECTI or its affiliates. Any views or opinions > presented in this email are solely those of the author and may not > necessarily reflect the opinions of NECTI or its affiliates. Any form of > reproduction, dissemination, copying, disclosure, modification, > distribution and / or publication of this message without the prior written > consent of the author of this e-mail is strictly prohibited. If you have > received this email in error please delete it and notify the sender > immediately. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Sat Oct 12 13:27:11 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 12 Oct 2019 15:27:11 +0200 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: Hi Laurent, Unfortunately Kolla Ansible does not provide this functionality at the moment. On the other hand, we would welcome such functionality gladly. It needs some discussion regarding how it would work to suit operators' needs. The interesting part is the real clean-up - e.g. removing leftovers, databases, rabbitmq objects... PS: bacon rlz Kind regards, Radek On Fri, Oct 11, 2019, 19:12 Laurent Dumont wrote: > Hey everyone, > > I'm pretty sure I know the answer but are there any support within Kolla > itself to disable Services that we're previously enabled. > > For example, I was testing the Skydive Agent/Analyzer combo till I > realized that it was using about 90-100% of the CPUs or computes and > controllers. > > Re-running Kolla with reconfigure but with Service set to "No" didn't > remove the containers. I had to remove the containers after the reconfigure > finished. > > This is Kolla 8.0.1 with a Stein install. > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sun Oct 13 01:01:57 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sat, 12 Oct 2019 21:01:57 -0400 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: Hey! Not a problem - it's a big rabbit hole and I get why it's really not easy to implement. It's easy to clean up containers but as you mentioned, all the rest of the bits and pieces is a tough fit. Laurent On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > Hi Laurent, > > Unfortunately Kolla Ansible does not provide this functionality at the > moment. 
> On the other hand, we would welcome such functionality gladly.
> It needs some discussion regarding how it would work to suit operators'
> needs.
> The interesting part is the real clean-up - e.g. removing leftovers,
> databases, rabbitmq objects...
>
> PS: bacon rlz
>
> Kind regards,
> Radek
>
>
> On Fri, Oct 11, 2019, 19:12 Laurent Dumont 
> wrote:
>
>> Hey everyone,
>>
>> I'm pretty sure I know the answer but are there any support within Kolla
>> itself to disable Services that we're previously enabled.
>>
>> For example, I was testing the Skydive Agent/Analyzer combo till I
>> realized that it was using about 90-100% of the CPUs or computes and
>> controllers.
>>
>> Re-running Kolla with reconfigure but with Service set to "No" didn't
>> remove the containers. I had to remove the containers after the reconfigure
>> finished.
>>
>> This is Kolla 8.0.1 with a Stein install.
>>
>> Thanks!
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pramchan at yahoo.com  Sun Oct 13 01:45:48 2019
From: pramchan at yahoo.com (prakash RAMCHANDRAN)
Date: Sun, 13 Oct 2019 01:45:48 +0000 (UTC)
Subject: [Indian OpenStack User Group} Need Volunteer Mentors for Bangalore Friday Oct 18 10AM-4 PM
References: <2038858173.780724.1570931148265.ref@mail.yahoo.com>
Message-ID: <2038858173.780724.1570931148265@mail.yahoo.com>

Hi all,

We have 150+ students and 3 classrooms, and we need technical mentors who can deliver the following content. Information about the training:

OpenStack Docs: OpenStack Upstream Institute Training Content
https://docs.openstack.org/upstream-training/upstream-training-content.html

If you are in Bangalore (India) and your company allows you to fulfill your Corporate Social Responsibility (CSR), or you are a motivated professional ready to help engineering students, please respond to the highlighted link and RSVP so that the event coordinators can reach you to seek your help. Alternatively, you can contact madhuri.rai07 or ganesh.hiregaoudar or digambarpat, all AT gamailDOTcom.

Once again, we appreciate all the help from tech folks from Bangalore, India, who have been instrumental in supporting OpenStack for the last decade.

Thanks
Prakash Ramchandran
Event Coordinator
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From smooney at redhat.com  Sun Oct 13 10:22:34 2019
From: smooney at redhat.com (Sean Mooney)
Date: Sun, 13 Oct 2019 11:22:34 +0100
Subject: [kolla] Support for removing previously enabled services..
In-Reply-To: 
References: 
Message-ID: 

On Sat, 2019-10-12 at 21:01 -0400, Laurent Dumont wrote:
> Hey!
>
> Not a problem - it's a big rabbit hole and I get why it's really not easy
> to implement. It's easy to clean up containers but as you mentioned, all
> the rest of the bits and pieces is a tough fit.
I did add a tiny step in that direction years ago, mainly for dev use:
https://github.com/openstack/kolla-ansible/commit/2ffb35ee5308ece3717263d38163e5fd9b29a3ae
Basically the tools/cleanup-containers script takes a regex of the containers to clean up as its
first argument, e.g. tools/cleanup-containers "neutron|openvswitch"

That is totally a hack, but it was so useful for dev.

I believe there is already a request to limit kolla-ansible --destroy by tags.
Destroy should in theory also clean up the DBs in addition to removing the containers, but given
that it currently just removes everything or nothing, it is not that useful for an operator
wanting to remove a deployed service.
My hack used to be a tiny ansible script that copied that tool to all hosts and then invoked
it with the relevant regex.

Anyway, if you just want to do this for dev or on a small number of hosts it might help, but
keep in mind that it will not clean up the configs or DBs, and it only works if no VMs are running on the host
>
> Laurent
>
> On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek <
> radoslaw.piliszek at gmail.com> wrote:
>
> > Hi Laurent,
> >
> > Unfortunately Kolla Ansible does not provide this functionality at the
> > moment.
> > On the other hand, we would welcome such functionality gladly.
> > It needs some discussion regarding how it would work to suit operators'
> > needs.
> > The interesting part is the real clean-up - e.g. removing leftovers,
> > databases, rabbitmq objects...
> >
> > PS: bacon rlz
> >
> > Kind regards,
> > Radek
> >
> >
> > On Fri, Oct 11, 2019, 19:12 Laurent Dumont 
> > wrote:
> >
> > > Hey everyone,
> > >
> > > I'm pretty sure I know the answer but are there any support within Kolla
> > > itself to disable Services that we're previously enabled.
> > >
> > > For example, I was testing the Skydive Agent/Analyzer combo till I
> > > realized that it was using about 90-100% of the CPUs or computes and
> > > controllers.
> > >
> > > Re-running Kolla with reconfigure but with Service set to "No" didn't
> > > remove the containers. I had to remove the containers after the reconfigure
> > > finished.
> > >
> > > This is Kolla 8.0.1 with a Stein install.
> > >
> > > Thanks!
> > >

From gmann at ghanshyammann.com  Sun Oct 13 15:10:34 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Sun, 13 Oct 2019 10:10:34 -0500
Subject: [goals][IPv6-Only Deployments and Testing] Update
Message-ID: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com>

Hello Everyone,

Below is the update on the IPv6 goal. All the projects have the ipv6 job patch proposed now. The next step is to review them as per the guidelines mentioned below, or to help in debugging the failures.

As stable/train is already cut for all the projects, we will keep merging the remaining projects listed below in the Ussuri release. If your project is listed below, check the project patch and help to review it or debug the failure.

Summary:

The projects waiting for the IPv6 job patch to merge:
If the patch is failing, help me to debug it; otherwise review and merge.

* Barbican
* Tricircle
* Vitrage
* Zaqar
* Glance
* Monasca
* Neutron stadium projects (added a more generic job for all. need debugging as few tests failing- https://review.opendev.org/#/c/686043/)
* Qinling
* Sahara
* Searchlight
* Senlin
* Tacker
* Ec2-Api
* Freezer
* Heat
* Ironic
* Karbor
* kuryr-kubernetes (not yet ready for IPv6. as per IRC chat with dulek, IPv6 support is planned for ussuri cycle - https://review.opendev.org/#/c/682531/)
* Magnum
* Masakari
* Mistral
* Octavia (johnsom is working on this)

Storyboard:
=========
- https://storyboard.openstack.org/#!/story/2005477

IPv6 missing support found:
=====================
1. https://review.opendev.org/#/c/673397/
2. https://review.opendev.org/#/c/673449/
3. https://review.opendev.org/#/c/677524/
There are a few more but they need to be tracked.

How you can help:
==============
- Each project needs to look for and review the ipv6 job patch.
- Verify it works fine on ipv6 and that no ipv4 is used in conf etc. (see the sketch after this list)
- Any other specific scenario needs to be added as part of project IPv6 verification.
- Help on debugging and fixing the bug if the IPv6 job is failing.
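As a rough aid for the "no ipv4 is used in conf" item above, a stdlib-only sketch along these lines can flag IPv4 literals in endpoint or connection URLs. This is only an illustration, not part of the goal tooling; the sample values below are invented, and a real check would read them from the deployed configuration files or the Keystone catalog.

    import ipaddress
    from urllib.parse import urlsplit

    # Invented sample values; a real check would pull these from the
    # deployed configuration or the Keystone service catalog.
    urls = [
        'http://[fd00:dead:beef::10]:9292/',
        'mysql+pymysql://nova:secret@10.0.0.5/nova',
        'rabbit://openstack:secret@[fd00:dead:beef::20]:5672//',
    ]

    for url in urls:
        host = urlsplit(url).hostname
        if host is None:
            continue
        try:
            family = ipaddress.ip_address(host).version
        except ValueError:
            # A hostname rather than a literal address; DNS decides the family.
            continue
        if family == 4:
            print('IPv4 literal %s found in %s' % (host, url))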
Everything related to this goal can be found under this topic: Topic: https://review.opendev.org/#/q/topic:ipv6-only-deployment-and-testing+(status:open+OR+status:merged) How to define and run new IPv6 Job on project side: ======================================= - I prepared a wiki page to describe this section - https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing Review suggestion: ============== - Main goal of these jobs will be whether your service is able to listen on IPv6 and can communicate to any other services either OpenStack or DB or rabbitmq etc on IPv6 or not. So check your proposed job with that point of view. If anything missing, comment on patch. - One example was - I missed to configure novnc address to IPv6- https://review.opendev.org/#/c/672493/ - base script as part of 'devstack-tempest-ipv6' will do basic checks for endpoints on IPv6 and some devstack var setting. But if your project needs more specific verification then it can be added in project side job as post-run playbooks as described in wiki page[1]. [1] https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing From anlin.kong at gmail.com Mon Oct 14 04:24:00 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 14 Oct 2019 17:24:00 +1300 Subject: [Trove] [Qinling] PTL on vacation Message-ID: Hi all, I will be away from 15 Oct to 15 Nov. - Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... URL: From josephine.seifert at secustack.com Mon Oct 14 06:21:01 2019 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Mon, 14 Oct 2019 08:21:01 +0200 Subject: [image-encryption] No meeting today Message-ID: <7d1b35dc-7a7e-76cd-c45a-1419cfa74920@secustack.com> Hi, unfortunately neither Markus (mhen) nor me can hold the meeting today. We will have our next meeting next monday. greetings Josephine (Luzi) From skaplons at redhat.com Mon Oct 14 08:11:46 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 10:11:46 +0200 Subject: [goals][IPv6-Only Deployments and Testing] Update In-Reply-To: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com> References: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com> Message-ID: Hi, > On 13 Oct 2019, at 17:10, Ghanshyam Mann wrote: > > Hello Everyone, > > Below is the updated on IPv6 goal. All the projects have the ipv6 job patch proposed now. Next step is to review then as per mentioned guidelines below or help in debugging the failure. > > As stable/train is already cut for all the projects, we will keep merging the remaining projects listed below in Ussuri release. If your project is listed below, check the project patch and help in review/debug failure. > > Summary: > > The projects waiting for IPv6 job patch to merge: > If patch is failing, help me to debug that otherwise review and merge. > > * Barbican > * Tricircle > * Vitrage > * Zaqar > * Glance > * Monasca > * Neutron stadium projects (added a more generic job for all. need debugging as few tests failing- https://review.opendev.org/#/c/686043/) I will investigate and update this patch this week. > * Qinling > * Sahara > * Searchlight > * Senlin > * Tacker > * Ec2-Api > * Freezer > * Heat > * Ironic > * Karbor > * kuryr-kubernetes (not yet ready for IPv6. 
as per IRC chat with dulek, IPv6 support is planned for ussuri cycle - https://review.opendev.org/#/c/682531/) > * Magnum > * Masakari > * Mistral > * Octavia (johnsom is working on this) > > Storyboard: > ========= > - https://storyboard.openstack.org/#!/story/2005477 > > IPv6 missing support found: > ===================== > 1. https://review.opendev.org/#/c/673397/ > 2. https://review.opendev.org/#/c/673449/ > 3. https://review.opendev.org/#/c/677524/ > There are few more but need to be tracked. > > How you can help: > ============== > - Each project needs to look for and review the ipv6 job patch. > - Verify it works fine on ipv6 and no ipv4 used in conf etc > - Any other specific scenario needs to be added as part of project IPv6 verification. > - Help on debugging and fix the bug in IPv6 job is failing. > > Everything related to this goal can be found under this topic: > Topic: https://review.opendev.org/#/q/topic:ipv6-only-deployment-and-testing+(status:open+OR+status:merged) > > How to define and run new IPv6 Job on project side: > ======================================= > - I prepared a wiki page to describe this section - https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing > > Review suggestion: > ============== > - Main goal of these jobs will be whether your service is able to listen on IPv6 and can communicate to any > other services either OpenStack or DB or rabbitmq etc on IPv6 or not. So check your proposed job with > that point of view. If anything missing, comment on patch. > - One example was - I missed to configure novnc address to IPv6- https://review.opendev.org/#/c/672493/ > - base script as part of 'devstack-tempest-ipv6' will do basic checks for endpoints on IPv6 and some devstack var > setting. But if your project needs more specific verification then it can be added in project side job as post-run > playbooks as described in wiki page[1]. > > [1] https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing > > — Slawek Kaplonski Senior software engineer Red Hat From mark at stackhpc.com Mon Oct 14 08:51:32 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 14 Oct 2019 09:51:32 +0100 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: On Sun, 13 Oct 2019 at 11:24, Sean Mooney wrote: > > On Sat, 2019-10-12 at 21:01 -0400, Laurent Dumont wrote: > > Hey! > > > > Not a problem - it's a big rabbit hole and I get why it's really not easy > > to implement. It's easy to clean up containers but as you mentioned, all > > the rest of the bits and pieces is a tough fit. True, although simply removing the containers and load balancer configuration would be a good start. > > i did add a tiny step in that direction years ago for mainly dev use > https://github.com/openstack/kolla-ansible/commit/2ffb35ee5308ece3717263d38163e5fd9b29a3ae > basically the tools/cleanup-containers script takes a regex of the continers to clean up as its > first argument. > e.g. tools/cleanup-containers "neutron|openvswitch" > > that is totally a hack but it was so useful for dev. > > > i belive there is already a request to limit kolla-ansible --destroy by tags Yes - https://review.opendev.org/504592. It will need some work to get it merged. > > destroy should in theory be cleanup dbs in addtion to remvoing the contianers but give > that currently it just remvoed everything of nothing it not that useful for operator > wanting to remvoe a deployed service. 
> > my hack when used to be an tiny ansible script that copied that tool to all host then invoked > it the relevent regex. > > anywya if you jsut wasnt to do this for dev or on a small number of hosts it might help but > keep in mind that it will not clean up the configs or dbs and it only works if no vms are runnign on the host > > > > > > Laurent > > > > On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek < > > radoslaw.piliszek at gmail.com> wrote: > > > > > Hi Laurent, > > > > > > Unfortunately Kolla Ansible does not provide this functionality at the > > > moment. > > > On the other hand, we would welcome such functionality gladly. > > > It needs some discussion regarding how it would work to suit operators' > > > needs. > > > The interesting part is the real clean-up - e.g. removing leftovers, > > > databases, rabbitmq objects... > > > > > > PS: bacon rlz > > > > > > Kind regards, > > > Radek > > > > > > > > > On Fri, Oct 11, 2019, 19:12 Laurent Dumont > > > wrote: > > > > > > > Hey everyone, > > > > > > > > I'm pretty sure I know the answer but are there any support within Kolla > > > > itself to disable Services that we're previously enabled. > > > > > > > > For example, I was testing the Skydive Agent/Analyzer combo till I > > > > realized that it was using about 90-100% of the CPUs or computes and > > > > controllers. > > > > > > > > Re-running Kolla with reconfigure but with Service set to "No" didn't > > > > remove the containers. I had to remove the containers after the reconfigure > > > > finished. > > > > > > > > This is Kolla 8.0.1 with a Stein install. > > > > > > > > Thanks! > > > > > > From thierry at openstack.org Mon Oct 14 10:14:54 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 14 Oct 2019 12:14:54 +0200 Subject: [tc] Status update on naming our next releases Message-ID: <4cf0c4be-5b22-2565-6449-367670ba577d@openstack.org> Hi, Following last week TC meeting, I took the action to summarize the situation and way forward of naming releases. Naming our U release was a bit of a painful process, mostly due to the subjectivity of the process, combined with a difficult combination of letter vs. naming criteria. It triggered proposals to change the naming process and/or criteria, as we expect similar difficulties with the rest of the alphabet. We did a first round of proposals that covered the V-Z part of the alphabet. A Condorcet poll was run to select the best options. The two best ones were then run in a new Condorcet poll against "keep things the same", and the result was a draw. It was a good indication that the TC membership found none of the proposals on the table was significantly better than keeping things the same, and therefore no change was effected. That said, it does not mean we should stop proposing new models, as the current system is flawed: its subjectivity combined with a popularity contest creates problems, and its criteria (strongly tied to event locations) will not work well in the future as we work to reduce the number of global events we run while increasing the number of local events. The way forward is as follows: proposals can still be made, but they should address "any foreseeable future". That means they need to explain how they will name the V-Z releases, but also how they will roll over past Z. TC members should rollcall-vote +1 on those proposals if they think they are better than keeping things the same. They can rollcall-vote -1 on the proposals for which they think keeping things the same would be better. 
If one proposal gets a majority of votes (seven +1s), then after the usual grace period of 3 calendar days, it should be approved (unless a competing proposals gathers *more* positive votes in the mean time). There is no deadline for proposing, even after one such proposal is approved. Those things can always be changed in the future. However I personally don't think we should change naming systems too often, because they are only fun if they become some sort of tradition. We currently have three proposals up: Cities with 100,000+ inhabitants, TC-only poll: https://review.opendev.org/#/c/677745/ Vancouver, then words present in movie quotes about "release": https://review.opendev.org/#/c/684688/ Vancouver, then cities with 100,000+ inhabitants, community poll: https://review.opendev.org/#/c/687764/ Cheers, -- Thierry Carrez (ttx) From skaplons at redhat.com Mon Oct 14 10:47:28 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 12:47:28 +0200 Subject: [neutron] Bug deputy report - week of October 7th Message-ID: Hi, I was on bug deputy last week. It was really quiet week. Below is summary of new bugs reported: Critical test_update_firewall_calls_get_dvr_hosts_for_router failure on rocky - https://bugs.launchpad.net/neutron/+bug/1847019 - patch is already proposed https://review.opendev.org/#/c/687085/ Medium Loading neutron-lib internationalized file - https://bugs.launchpad.net/neutron/+bug/1847586 - fix proposed already https://review.opendev.org/687861 Low '--sql' option of neutron-db-manage does not work - https://bugs.launchpad.net/neutron/+bug/1847210 Undecided l3-agent stops processing router updates - https://bugs.launchpad.net/neutron/+bug/1847203 - I asked some additional questions there but would be also good if some L3 experts could take a look into that, Incomplete cannot reuse floating IP as port's fixed-ip - https://bugs.launchpad.net/neutron/+bug/1847763 - I think it’s for Calico project but lets wait until reporter clarify that RFEs [RFE] create option in neutron.conf to disable designate+neutron consistency - https://bugs.launchpad.net/neutron/+bug/1847068 Others [RPC] digging RPC timeout for client and server - https://bugs.launchpad.net/oslo.messaging/+bug/1847747 - More oslo_messaging issue IMO, neutron added only as impacted project. — Slawek Kaplonski Senior software engineer Red Hat From rfolco at redhat.com Mon Oct 14 13:23:02 2019 From: rfolco at redhat.com (Rafael Folco) Date: Mon, 14 Oct 2019 10:23:02 -0300 Subject: [tripleo] TripleO CI Summary: Sprint 37 Message-ID: Greetings, The TripleO CI team has just completed Sprint 37 / Unified Sprint 16 (Sep 19 thru Oct 09). The following is a summary of completed work during this sprint cycle: - Started Train release branching prep work and bootstrapped a centos8 nodepool node. - Designed and implemented tests for verifying changes in the promotion server. - Added multi-arch support w/ manifests to container push in the promoter code. - Designed a test strategy for building and running jobs in zuul on ceph-ansible and podman repositories against pull requests on Github. The planned work for the next sprint [1] are: - Complete the manifest implementation with a test strategy for not breaking promotion workflow. - Improve tests for verifying a full promotion workflow running on the staging environment. - Implement CI jobs in zuul to build and run tests against ceph-ansible and podman pull requests in github. - Close-out Train release branching preparation work. 
- Address required changes for building a CentOS8 node for upcoming distro release support across TripleO CI jobs. The Ruck and Rover for this sprint are Rafael Folco (rfolco) and Marios Andreou (marios). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes are being tracked in etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-17 [2] https://etherpad.openstack.org/p/ruckroversprint17 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Oct 14 14:00:56 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 14 Oct 2019 15:00:56 +0100 Subject: [nova][all] Roadmap for dropping Python 2 support Message-ID: <4f66a849e1d26f56ef9272e69f43460a6a6a9614.camel@redhat.com> The time has come. Train is almost out the door and we're already well into Ussuri planning and work. We agreed some time ago that this cycle was the right one to drop support for Python 2.7 [1] and to that effect I've proposed a patch to do just this in nova [2]. However, I have noticed a large number of the third party CIs are failing on this patch. This is because they are still testing with Python 2 and the patch marks nova as only supporting Python 3. As you can see in that patch, the effort to switch things over is not that significant, and I'd ask that any owners of third party CIs prioritise work to switch these things over in the next few weeks leading up to the PTG so we can merge this change as soon as possible. Please reach out on IRC if you have any concerns or questions. Cheers, Stephen [1] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html#python2-deprecation-timeline [2] https://review.opendev.org/#/c/687954/ From mriedemos at gmail.com Mon Oct 14 15:14:55 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 14 Oct 2019 10:14:55 -0500 Subject: [watcher][nova] Thesis on improving Watcher and collaborating with OpenStack community In-Reply-To: References: Message-ID: <400783eb-8ca7-e10f-5481-0940caa53dca@gmail.com> On 10/11/2019 1:52 AM, info at dantalion.nl wrote: > Hello everyone, > > I am a Dutch student at the Amsterdam University of Applied Sciences > (AUAS) and have recently finished my thesis. My thesis was written on > improvements that were made to OpenStack Watcher between February and > Juli of 2019. Specifically, many of these improvements were written to > aid CERN in deploying Watcher. In addition, the thesis describes methods > of collaboration and engaging in communities as well as evaluating > strengths and weaknesses of communties. > > Since the thesis primarily resolves around OpenStack I would like to > share it with the community as well. Please find the thesis attached to > this email. > > Any feedback, remarks, future advice or other responses are appreciated. > > Kind regards, > Corne Lukken (Dantali0n) > Thanks for sharing these results Corne, looks great. I've added the [nova] tag to the subject line to sub-thread this just for awareness to nova developers. Section 8 should be interesting for nova developers to see how simple client-side optimizations can be done to improve performance when working with the compute API. The particularly interesting one to me is the regression fix to not use limit=-1 in novaclient when listing servers but specify a hard limit to avoid extra API calls to list servers. 
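As a rough illustration of that difference (a sketch only, not code from the thesis or from Watcher; it assumes python-novaclient's servers.list() keyword arguments, where limit=-1 means "page through everything"):

    import os

    from keystoneauth1 import loading
    from keystoneauth1 import session
    from novaclient import client

    # Credentials come from the usual OS_* environment variables.
    loader = loading.get_plugin_loader('password')
    auth = loader.load_from_options(
        auth_url=os.environ['OS_AUTH_URL'],
        username=os.environ['OS_USERNAME'],
        password=os.environ['OS_PASSWORD'],
        project_name=os.environ['OS_PROJECT_NAME'],
        user_domain_name=os.environ.get('OS_USER_DOMAIN_NAME', 'Default'),
        project_domain_name=os.environ.get('OS_PROJECT_DOMAIN_NAME', 'Default'))
    nova = client.Client('2.1', session=session.Session(auth=auth))

    # limit=-1 asks novaclient to page through everything, issuing
    # follow-up GET /servers/detail calls until the API stops
    # returning results.
    all_servers = nova.servers.list(limit=-1)

    # A hard limit returns at most that many servers in one pass, which
    # is enough when the caller knows an upper bound on the instances
    # it needs and avoids the extra paging round trips.
    bounded_servers = nova.servers.list(limit=1000)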
Personally I enjoyed this little short-term side project working with the Watcher team on improving the performance of Watcher's nova data model builder code. Thanks to the Watcher team for welcoming my contributions. -- Thanks, Matt From openstack at nemebean.com Mon Oct 14 15:30:50 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 14 Oct 2019 10:30:50 -0500 Subject: [oslo] New courtesy ping list for Ussuri In-Reply-To: <473f0fcb-c8c2-e8ae-812e-15575e898d66@nemebean.com> References: <473f0fcb-c8c2-e8ae-812e-15575e898d66@nemebean.com> Message-ID: Final notice! As I mentioned in the meeting today, next week I'll move to using the new ping list. I think most people have already re-upped for next cycle, but if you haven't yet and want to, now is the time. Of course, you can always add yourself to the ping list any time. It is a wiki, after all. :-) On 9/23/19 2:59 PM, Ben Nemec wrote: > As we discussed at the beginning of the cycle, I'll be clearing the > current ping list in the next few weeks. This is to prevent courtesy > pinging people who are no longer active on the project. If you wish to > continue receiving courtesy pings at the start of the Oslo meeting > please add yourself to the new list on the agenda template [0]. Note > that the new list is above the template, called "Courtesy ping list for > Ussuri". If you add yourself again to the end of the existing list I'll > assume you want to be left on though. :-) > > Thanks. > > -Ben > > 0: https://wiki.openstack.org/wiki/Meetings/Oslo#Agenda_Template From openstack at nemebean.com Mon Oct 14 15:52:56 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 14 Oct 2019 10:52:56 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> Message-ID: <6ad1f914-c43e-5ae8-57fc-51d3e000b953@nemebean.com> Okay, circling back to wrap this topic up. It sounds like this is a pretty big win in terms of avoiding random failures either from trying to migrate a VM with nested guests on older qemu or using newer qemu with older OpenStack. Since it's a pretty simple patch and it allows our stable branches to behave more sanely, I'm inclined to go with the backport. If anyone strongly objects, please let me know ASAP before we release it. On 10/7/19 3:36 PM, Ben Nemec wrote: > > > On 10/7/19 3:08 PM, Sean Mooney wrote: >> On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: >>> >>> On 10/7/19 11:31 AM, Jeremy Stanley wrote: >>>> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: >>>> [...] >>>>> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >>>>> to me that backporting bug fixes for it is valid. The original >>>>> author of the patch actually wants it for Rocky >>>> >>>> [...] >>>> >>>> Neither the changes nor the bug report indicate what the motivation >>>> is for supporting newer Qemu with (much) older OpenStack. Is there >>>> some platform which has this Qemu behavior on which folks are trying >>>> to run Rocky? Or is it a homegrown build combining these dependency >>>> versions from disparate time periods? Or maybe some other reason I'm >>>> not imagining? >>>> >>> >>> In addition to the downstream reasons Sean mentioned, Mark (the original >>> author of the patch) responded to my question on the train backport with >>> this: >>> >>> """ >>> Today, I need it in Rocky. But, I'm find to do local patching. >>> >>> Anybody who needs Qemu 4.1.0 likely needs it. 
A key feature in Qemu >>> 4.1.0 is that this is the first release of Qemu to include proper >>> support for migration of L1 guests that have L2 guests (nVMX / nested >>> KVM). So, I expect it is pretty important to whoever realizes this, and >>> whoever needs this. >>> """ >>> >>> So basically a desire to use a feature of the newer qemu with older >>> openstack, which is why I'm questioning whether this fits our stable >>> policy. My inclination is to say it's a fairly simple, >>> backward-compatible patch that will make users' lives easier, but I also >>> feel like doing a backport to enable a feature, even if the actual patch >>> is a "bugfix", is violating the spirit of the stable policy. >> in many distros the older qemus allow migration of the l1 guest >> eventhouhg it is >> unsafe to do so and either work by luck or the vm will curput its >> memroy and likely >> crash.  the context of the qemu issue is for years people though that >> live migration with >> nested virt worked, then it was disabeld upstream and many distos >> reverted that as it would >> break there users where they got lucky and it worked, and in 4.1 it >> was fixed. >> >> this does not add or remvoe any functionality in openstack nova will >> try to live migarte if you >> tell it too regardless of the qemu it has it just will fail if the >> live migration check was complied in. >> >> >> similarly if all your images did not have fractional sizes you could >> use 4.1.0 with older >> oslo releases and it would be fine. i.e. you could get lucky and for >> your specific usecase this >> might not be needed but it would be nice not do depend on luck. >> >> anyway i woudl expect any disto the chooses to support qemu 4.1.0 to >> backport this as required. >> im not sure this problematic to require a late oslo version bump >> before train ga but i would hope >> it can be fixed on stable/train > > Note that this discussion is separate from the train patch. I agree we > should do that backport, and actually we already have. That discussion > was just about timing of the release. > > This thread is because the fix was also proposed to stable/stein. It > merged before I had a chance to start this discussion, and I'm wondering > if we need to revert it. > From rico.lin.guanyu at gmail.com Mon Oct 14 16:44:29 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Tue, 15 Oct 2019 00:44:29 +0800 Subject: [all][tc] What happened in OpenStack Governance recently Message-ID: Hello everyone, Here are a few things that happened recently: - *We've got the last volley of new PTLs!* Lucian Petrut for winstackers and Nicholas Bock for Designate. Congratulations! *- Rico Lin is now the Vice-chair of the TC.* - As mentioned in [1], *we will have two `Meet the project leaders` events* during the Shanghai summit. It will be nice if OpenStack PTLs, SIG Chairs, TC members, core reviewers, UC members, interested in join. You can signup in [2] to let others know you're coming. And if you think you might be part of this or you would like to meet any of them. Yes! you should come! -* We're open for goal idea*, so add it in [3] if you have any. We also looking for V cycle goal as well. And here's some backlog [4] if you're interested to be a champion of it. - At this point you should already know, *the newest OpenStack User Survey results are already out *[8]. So analysis it and make decisions accordingly (assume those questions are essential for team dicisions). - *The Shanghai PTG schedule is finalized*. 
Please check [7] for more detail on each forum. We hope we can have users, operators, and developers all together to collaborate and make more successful and valuable outcome from each forum. So please join! - Big thanks to our release team, *Train final releases for cycle-with-rc projects is right around the corner* [9]. - Our official meeting happened on October 10, as announced on the ML. We decided the following: - As one of meeting actions, a ML `[tc] Status update on naming our next releases` [6] is out (Thanks to Thierry). If you would like to give your review/feedback before proposals (mentioned in that mail) gets a majority of TC votes, now is the time. As always your review and feedback matter. - Summit and PTG are near, currently, Summit presentations and Forum sessions [7] are released. At this point, teams (which plan to join PTG) should start planning for their PTG topics and formats. So please help with teams to be prepared. For TC PTG, please propose topics before October 17, so we will have two weeks to discuss, finalize, and prepare. - V cycle goal discussion will be more asynchronously, more ML for U and V cycle goal process will be out soon. - Thanks for the effort of swift team, current python 3 support for swift is good. This makes OpenStack more ready for python 3 first. - There will be a forum [5] for large scale SIG, so if you're interested, please join (we hope to have more large scale users join to provide hands and feedbacks) so we can make OpenStack a better place for large scale. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009947.html [2] https://etherpad.openstack.org/p/meet-the-project-leaders [3] https://etherpad.openstack.org/p/PVG-u-series-goals [4] https://etherpad.openstack.org/p/community-goals [5] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24405/facilitating-running-openstack-at-scale-join-the-large-scale-sig [6] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010106.html [7] https://www.openstack.org/summit/shanghai-2019/summit-schedule#track_groups=90 [8] http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [9] https://review.opendev.org/#/c/687991 Regards, JP & Rico -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Oct 14 17:24:13 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 14 Oct 2019 12:24:13 -0500 Subject: [all][qa][forum] Etherpad for Users / Operators adoption of QA tools / plugins sessions at Shanghai Summit Message-ID: <16dcb4c89b5.d4afc3b579599.8480309289121917289@ghanshyammann.com> Hello Everyone, I've created the etherpad for the QA feedback sessions[1] which is scheduled on Monday, November 4, 1:20pm-2:00pm. I have added a few basic feedback questions there and If you have any additional items to add or modify, please feel free to do that. In case, you are not able to attend this session, you can still write your feedback with your irc/name contact. 
Etherpad: https://etherpad.openstack.org/p/PVG-forum-qa-ops-user-feedback [1] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24401/users-operators-adoption-of-qa-tools-plugins -gmann From gmann at ghanshyammann.com Mon Oct 14 17:24:24 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 14 Oct 2019 12:24:24 -0500 Subject: [qa][ptg] Ussuri PTG Planning for QA Message-ID: <16dcb4cb379.cb1085b779605.6576175916005057549@ghanshyammann.com> Hello Everyone, This is the etherpad[1] to collect the Ussuri cycle PTG topic ideas for QA. Please start adding your item/topic you want to discuss in PTG. Even you are not making to PTG physically, still, add your topic which you want us to discuss or give a thought. Anyone is welcome to add the cross-project testing topics they want to discuss related to QA. [1] https://etherpad.openstack.org/p/shanghai-ptg-qa -gmann From rico.lin.guanyu at gmail.com Mon Oct 14 17:27:23 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Tue, 15 Oct 2019 01:27:23 +0800 Subject: [tc] Weekly update Message-ID: Hello friends, Here's what needs attention for the OpenStack TC this week. 1. We have our meeting last Thursday [1], so please work on actions [2]. And notice that some actions might require TCs to votes/helps on (like in [4]). 2. We have three potential goals for Ussuri [7] now. Please provide any if you found any suitable goal idea for Ussuri (or V) cycle. 3. For TC PTG, please propose topics before October 17, so we will have two weeks to discuss, finalize, and prepare. 4. Rico will help with chair's responsibility during JP's time off [5]. 5. Some recently started mailing list with [tc] or [all] tags: - [all][tc] What happened in OpenStack Governance recently [3] - [tc] Status update on naming our next releases [4] - [tc] Time off for JP! [5] - [nova][all] Roadmap for dropping Python 2 support [6] Thank you everyone! [1] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.log.html [2] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.txt [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010113.html [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010106.html [5] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010061.html [6] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html [7] https://etherpad.openstack.org/p/PVG-u-series-goals Regards, JP & Rico -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali74.ebrahimpour at gmail.com Mon Oct 14 05:49:01 2019 From: ali74.ebrahimpour at gmail.com (Ali Ebrahimpour) Date: Mon, 14 Oct 2019 09:19:01 +0330 Subject: monitoring Message-ID: hi guys i want to install monitoring in my horizon Ui and i'm confused in setting up ceilometer or gnocchi or aodh or monasca in my project because all of them where deprecated. i setup openstack with ansible and i want to monitor the usage of cpu and ram and etc in my dashboard and i also want to know how much resources each customer used for one hour and day. Thanks in advance for your precise guidance. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Mon Oct 14 21:26:17 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 23:26:17 +0200 Subject: [neutron] Team dinner in Shanghai Message-ID: <76CDA132-87D8-4FAD-A993-76D65E879F5E@redhat.com> Hi neutrinos, We are planning to organise some team dinner during PTG in Shanghai. If You are interested to go for such dinner, please write it in etherpad [1] together with days which works the best for You. [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Mon Oct 14 21:46:44 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 23:46:44 +0200 Subject: [neutron][ptg] Ussuri planning etherpad for Neutron Message-ID: <74600817-8AD4-4593-8862-4BCB024C9B57@redhat.com> Hi, At [1] there is Ussuri PTG planning etherpad. Please add Your topics to it, even if You are not planning to be there and You want team to discuss about it. I plan to start preparing agenda during the week of October 28th so would be great if You could add Your topics to it before this date. But of course any “last minute” topics are always welcome :) [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning — Slawek Kaplonski Senior software engineer Red Hat From allison at openstack.org Mon Oct 14 21:59:29 2019 From: allison at openstack.org (Allison Price) Date: Mon, 14 Oct 2019 16:59:29 -0500 Subject: Writing a Train blog post? Message-ID: Hi everyone, Are you planning on writing a blog post about OpenStack Train ahead of (or after) the release this Wednesday, October 16? If so, please respond and let me know so we can include it in Train promotional activities, including the Open Infrastructure Community Newsletter. Thanks! Allison Allison Price OpenStack Foundation allison at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Oct 14 22:52:31 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 14 Oct 2019 17:52:31 -0500 Subject: [tc][all] Community-wide goal Ussuri and V cycle forum collaboration idea Message-ID: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> Hello Everyone, During the TC meeting on 10th Oct, we discussed the community-wide goal planning for Ussuri as well as the V cycle[1] but there was no consensus on V cycle planning so I am bringing this to ML for further thoughts. Ussuri cycle goal planning is all good here, That has been already started[2] and we have a dedicated forum session [3] for the same to discuss the goals in more detail. Question is for V cycle goal planning, whether we should discuss the V cycle goal in Ussuri goal fourm sessoin[3] or it is too early to kick off V cycle goal at least until we finalize U cycle goal first. I would like to list the below two options to proceed further (at least to decide if we need to change the existing U cycle goal forum sessions title). 1. Merge the Forum session for both cycle goal discussion (divide both in two half). This need forum session title and description change. 2. Keep forum session for U cycle goal only and start the V cycle over ML asynchronously. This will help to avoid any confusion or mixing the both cycle goal discussions. Thoughts? 
[1] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.log.html#l-211
[2] https://etherpad.openstack.org/p/PVG-u-series-goals
[3] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24398/ussuri-cycle-community-wide-goals-discussion

-gmann

From pojadhav at redhat.com  Tue Oct 15 06:15:20 2019
From: pojadhav at redhat.com (Pooja Jadhav)
Date: Tue, 15 Oct 2019 11:45:20 +0530
Subject: Request to update email of Launchpad Account
Message-ID: 

Hi Team,

I am Pooja Jadhav. I have a Launchpad account registered with pooja.jadhav at nttdata.com, but I have now left the company, so I can no longer access that email address and therefore cannot use the "forgot password" flow either.

So I am asking the community to suggest an alternative way to solve this problem, so that I can switch the account to my personal email ID or my current corporate ID.

Looking forward to your reply.

Thanks & Regards,
Pooja Jadhav
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevinzs2048 at gmail.com  Tue Oct 15 08:16:32 2019
From: kevinzs2048 at gmail.com (Shuai Zhao)
Date: Tue, 15 Oct 2019 16:16:32 +0800
Subject: [neutron][Kolla] Failed to get DHCP offer packet at qvo/qvb in compute node
Message-ID: 

Hi Neutron,
I've deployed the Rocky-rc2 version on Debian Buster (compute node), kernel Linux 4.19.

Now the issue:
The VM running on the host (Debian Buster) cannot get an IP when booting. I used tcpdump to capture the packets on tap, qbr, qvb and qvo.
*The DHCP broadcast packet could be dumped at tap and qbr, but not at qvo/qvb.* So DHCP failed. All the firewall rules are automatically generated by neutron.

The firewall rules have never been changed.
-j DROP -A neutron-openvswi-oba5cd56c-4 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-oba5cd56c-4 -j RETURN -A neutron-openvswi-oba5cd56c-4 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-oba5cd56c-4 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-oba5cd56c-4 Pls help to give some advices about that. Thanks a lot! -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinzs2048 at gmail.com Tue Oct 15 08:38:10 2019 From: kevinzs2048 at gmail.com (Shuai Zhao) Date: Tue, 15 Oct 2019 16:38:10 +0800 Subject: [neutron][Kolla] Failed to get DHCP offer packet at qvo/qvb in compute node In-Reply-To: References: Message-ID: Sorry missed ingress rules: (neutron-openvswitch-agent)[root at uk-dc-tx2-01 /]# *iptables -S | grep neutron-openvswi-iba5cd56c-4* -N neutron-openvswi-iba5cd56c-4 -A neutron-openvswi-iba5cd56c-4 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-iba5cd56c-4 -d 192.168.200.6/32 -p udp -m udp --sport 67 --dport 68 -j RETURN -A neutron-openvswi-iba5cd56c-4 -d 255.255.255.255/32 -p udp -m udp --sport 67 --dport 68 -j RETURN -A neutron-openvswi-iba5cd56c-4 -p tcp -m tcp -m multiport --dports 1:65535 -j RETURN -A neutron-openvswi-iba5cd56c-4 -p icmp -j RETURN -A neutron-openvswi-iba5cd56c-4 -p tcp -m tcp --dport 22 -j RETURN -A neutron-openvswi-iba5cd56c-4 -m set --match-set NIPv40cd3823f-af20-4015-b9f4- src -j RETURN -A neutron-openvswi-iba5cd56c-4 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-iba5cd56c-4 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-sg-chain -m physdev --physdev-out tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-iba5cd56c-4 And *ml2_conf.ini*: [ml2] type_drivers = flat,vlan,vxlan tenant_network_types = vxlan mechanism_drivers = openvswitch,l2population extension_drivers = port_security [ml2_type_vlan] network_vlan_ranges = [ml2_type_flat] flat_networks = physnet1 [ml2_type_vxlan] vni_ranges = 1:1000 vxlan_group = 239.1.1.1 [securitygroup] firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver [agent] tunnel_types = vxlan l2_population = true arp_responder = true [ovs] datapath_type = system ovsdb_connection = tcp:127.0.0.1:6640 local_ip = 10.22.20.4 On Tue, Oct 15, 2019 at 4:16 PM Shuai Zhao wrote: > Hi Neutron, > I've deployed Rocky-rc2 version on Debian Buster(compute node), kernel > Linux 4.19 > > Now the issue: > The VM running on the Host(Debian Buster) could not get IP when Booting. I > use tcpdump to get the packet on tap, qbr, qvb and qvo. > *The DHCP broadcast packet could be dumped at tap and qbr, but not at > qvo/qvb.* So the DHCP failed. All the firewall policy is neutron > automatic generated. > > The firewall policy is never changed. 
> (neutron-openvswitch-agent)[root@** /]# iptables -S | grep tapba5cd56c-46 > -A neutron-openvswi-FORWARD -m physdev --physdev-out tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-FORWARD -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM > to the security group chain." -j neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-sg-chain -m physdev --physdev-out tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-iba5cd56c-4 > -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-oba5cd56c-4 > > (neutron-openvswitch-agent)[root@*** /]#* iptables -S | grep > neutron-openvswi-oba5cd56c-4* > -N neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM > to the security group chain." -j neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-oba5cd56c-4 -s 0.0.0.0/32 -d 255.255.255.255/32 -p > udp -m udp --sport 68 --dport 67 -m comment --comment "Allow DHCP client > traffic." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -j neutron-openvswi-sba5cd56c-4 > -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 68 --dport 67 -m > comment --comment "Allow DHCP client traffic." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 67 --dport 68 -m > comment --comment "Prevent DHCP Spoofing by VM." -j DROP > -A neutron-openvswi-oba5cd56c-4 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -j RETURN > -A neutron-openvswi-oba5cd56c-4 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection (e.g. > TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-oba5cd56c-4 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-oba5cd56c-4 > > Pls help to give some advices about that. > Thanks a lot! > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Tue Oct 15 09:13:39 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 15 Oct 2019 11:13:39 +0200 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? Message-ID: (adding TripleO because of potential effect) Hi all, I'm working on making ironic-python-agent images smaller than they currently are. The proposed patches already reduce the default image (as built by IPA-builder) size from around 420 MiB to around 380 MiB. My next idea is to get rid of RPM and YUM databases (in case of a CentOS/RHEL image). 
They amount for nearly 100 MiB of the uncompressed image: $ du -sh var/lib/rpm 91M var/lib/rpm $ du -sh var/lib/yum 6.6M var/lib/yum How important for anyone is the ability to install/inspect packages inside a ramdisk? Dmitry -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Tue Oct 15 10:18:04 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 15 Oct 2019 11:18:04 +0100 Subject: [nova][ptg] Ussuri planning etherpad for nova Message-ID: <8258f512417de0b2cc70740f7aff1b1309bd3ec6.camel@redhat.com> With the PTG only a few weeks away, it's about time we started figuring out what we need to discuss there. The Ussuri PTG planning etherpad can be found at [1]. If you have something you want to discuss at the PTG then please include it there, even if you're not going to be there in person. I'm going to be away from this Friday until the summit but if we could get the bones of this in place this week, it should leave us (well, others in nova) enough time to organize the usual jumble of topics into something approaching an agenda. Stephen [1] https://etherpad.openstack.org/p/nova-shanghai-ptg From sfinucan at redhat.com Tue Oct 15 10:26:06 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 15 Oct 2019 11:26:06 +0100 Subject: [nova][ptg] Team dinner in Shanghai Message-ID: We haven't done one of these in a while (formally, anyway), so I think it would be a good idea to take advantage of the lack of fussy eaters [1] present and organise a team dinner. For anyone that hasn't joined one of these before, it's a good opportunity for people that regularly work on nova to spend some time with their fellow nova contributors IRL (in real life) and for newer contributors to m̶e̶e̶t̶ ̶t̶h̶e̶i̶r̶ ̶h̶e̶r̶o̶e̶s̶ get to know the people reviewing their code. If you are interested, please state so in the Etherpad [2] along with days that work for you and I'll try organize a suitable venue. Stephen [1] Sorry Dan, Jay :P [2] https://etherpad.openstack.org/p/nova-shanghai-ptg From thierry at openstack.org Tue Oct 15 12:41:46 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 15 Oct 2019 14:41:46 +0200 Subject: Request to update email of Launchpad Account In-Reply-To: References: Message-ID: <4ffbb61c-cd8c-4bfb-3ac4-33ab9213d591@openstack.org> Pooja Jadhav wrote: > Hi Team, > > I am Pooja Jadhav. I have a Launchpad account with > pooja.jadhav at nttdata.com but now I > have left the company so I am not able to access this email any more, > Hence I am not entitled to do forget password as well. > > So I am requesting community to suggest me alternative way to solve this > problem through which  I can put my Personal Email Id/ Current Corporate > Id. Hi Pooja, Launchpad is run by Canonical, so unfortunately the OpenStack community can't really help you. You should ask your question on #launchpad on Freenode IRC, the launchpad-users mailing-list[1] or on Launchpad itself[2]. To access those last two you'll likely have to create a new Launchpad account, but they might be able to merge them afterwards. 
[1] https://launchpad.net/~launchpad-users [2] https://answers.launchpad.net/launchpad/+addquestion -- Thierry Carrez (ttx) From hberaud at redhat.com Tue Oct 15 12:48:13 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 15 Oct 2019 14:48:13 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: I proposed some patches through heat templates and puppet-cinder to remove lock files older than 1 week and avoid file system growing. This is a solution based on a cron job, to fix that on stable branches, in a second time I'll help the fasteners project to fix the root cause by reviewing and testing the proposed patch (lock based on file offset). In next versions I hope we will use a patched fasteners and so we could drop the cron based solution. Please can you give /reviews/feedbacks: - https://review.opendev.org/688413 - https://review.opendev.org/688414 - https://review.opendev.org/688415 Thanks Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo a écrit : > On 2019/09/28 1:44, Ben Nemec wrote: > > > > > > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >> Hi Eric, > >> > >> On 2019/09/20 23:10, Eric Harney wrote: > >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >>>> Hi, > >>>> > >>>> I'm using Queens cinder with the following setting. > >>>> > >>>> --------------------------------- > >>>> [coordination] > >>>> backend_url = file://$state_path > >>>> --------------------------------- > >>>> > >>>> As a result, the files like the following were remained under the > state path after some operations.[1] > >>>> > >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >>>> > >>>> In my understanding, these are lock-files created for synchronization > by tooz. > >>>> But, these lock-files were not deleted after finishing operations. > >>>> Is this behaviour correct? > >>>> > >>>> [1] > >>>> e.g. Delete volume, Delete snapshot > >>> > >>> This is a known bug that's described here: > >>> > >>> https://github.com/harlowja/fasteners/issues/26 > >>> > >>> (The fasteners library is used by tooz, which is used by Cinder for > managing these lock files.) > >>> > >>> There's an old Cinder bug for it here: > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >>> > >>> but that's marked as "Won't Fix" because Cinder needs it to be fixed > in the underlying libraries. > >> Thank you for your explanation. > >> I understood the state. > >> > >> But, I have one more question. > >> Can I think this bug doesn't affect synchronization? > > > > It does not. In fact, it's important to not remove lock files while a > service is running or you can end up with synchronization issues. > > > > To clean up the leftover lock files, we generally recommend clearing the > lock_path for each service on reboot before the services have started. > > Thank you for your information. > I think that I understood this issue completely. 
> > Best Regards, > > > >> > >> Best regards, > >> > >>> Thanks, > >>> Eric > >>> > >> > > > > -- > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > Rikimaru Honjo > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgolovat at redhat.com Tue Oct 15 12:55:41 2019 From: sgolovat at redhat.com (Sergii Golovatiuk) Date: Tue, 15 Oct 2019 14:55:41 +0200 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? In-Reply-To: References: Message-ID: Hi, Operator may run "rpm --rebuilddb" in case he needs some packages installed. Alternatively, he may build a new image with rpm/yum databases. вт, 15 окт. 2019 г. в 11:15, Dmitry Tantsur : > (adding TripleO because of potential effect) > > Hi all, > > I'm working on making ironic-python-agent images smaller than they > currently are. The proposed patches already reduce the default image (as > built by IPA-builder) size from around 420 MiB to around 380 MiB. > > My next idea is to get rid of RPM and YUM databases (in case of a > CentOS/RHEL image). They amount for nearly 100 MiB of the uncompressed > image: > $ du -sh var/lib/rpm > 91M var/lib/rpm > $ du -sh var/lib/yum > 6.6M var/lib/yum > > How important for anyone is the ability to install/inspect packages inside > a ramdisk? > > Dmitry > -- Sergii Golovatiuk Senior Software Developer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Tue Oct 15 13:21:39 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 15 Oct 2019 09:21:39 -0400 Subject: [ops] Shanghai Meetup Message-ID: Greetings Operators! We have been allocated a half-day session on Thursday afternoon as part of the PTG. We have room for 50 people and would like to use it as a mini Ops Meetup. For those who haven't attended one before, these are working sessions like the forum or PTG rather than formal presentations. Sessions don't need to be of fixed length for this, especially since it's a fairly short period of time. How much time we spend on a topic will be dictated by the cadence of the discussion and interest of the attendees. Please take a few minutes to add topic suggestions here, and +1 others that you would like to talk about. https://etherpad.openstack.org/p/PVG-OPS-Forum-Brainstorming Thanks, and see you in Shanghai! -Erik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From denise at openstack.org Tue Oct 15 12:13:20 2019 From: denise at openstack.org (denise at openstack.org) Date: Tue, 15 Oct 2019 06:13:20 -0600 (MDT) Subject: OSF booth at KubeCon+CloudNativeCon in San Diego Message-ID: <1571141600.054118968@apps.rackspace.com> Hello Everyone, We wanted to let you know that the OpenStack Foundation will have a booth at the upcoming KubeCon+CloudNativeCon event in San Diego, CA on November 18-21, 2019. We are in booth #S23 and we will be featuring the OpenStack Foundation in addition to all the projects - Airship, Kata Containers, StarlingX, OpenStack and Zuul. At the booth we will have stickers and educational collateral about each project to distribute. We would like to invite you to help us in the following areas: Represent your project by staffing the OSF booth If you can spare 1 hour/day (or any time at all!) to represent your specific project in the OSF booth Here is the [ link ]( https://docs.google.com/spreadsheets/d/1mZzK0GHm9OQ0IL9njTWLxPLuAeR6jqmyeM60T7sQQOc/edit#gid=0 ) to the google doc to sign up Project Demos in the OSF booth If you are interested in delivering a project-specific demo in the OSF booth, please contact [ Denise ]( http://denise at openstack.org ) Looking forward to seeing all of you in San Diego! Best regards, OSF Marketing team Denise, Claire, Allison and Ashlee -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 14:47:55 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 09:47:55 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: On 10/15/19 7:48 AM, Herve Beraud wrote: > I proposed some patches through heat templates and puppet-cinder to > remove lock files older than 1 week and avoid file system growing. > > This is a solution based on a cron job, to fix that on stable branches, > in a second time I'll help the fasteners project to fix the root cause > by reviewing and testing the proposed patch (lock based on file offset). > In next versions I hope we will use a patched fasteners and so we could > drop the cron based solution. > > Please can you give /reviews/feedbacks: > - https://review.opendev.org/688413 > - https://review.opendev.org/688414 > - https://review.opendev.org/688415 I'm rather hesitant to recommend this. It looks like the change is only removing the -delete lock files, which are a fraction of the total lock files created by Cinder, and I don't particularly want to encourage people to start monkeying with the lock files while a service is running. Even with this limited set of deletions, I think we need a Cinder person to look and verify that we aren't making bad assumptions about how the locks are used. In essence, I don't think this is going to meaningfully reduce the amount of leftover lock files and it sets a bad precedent for how to handle them. Personally, I'd rather see a boot-time service added for each OpenStack service that goes out and wipes the lock file directory before starting the service. On a more general note, I'm going to challenge the assertion that "Customer file system growing slowly and so customer risk to facing some issues to file system usage after a long period." I have yet to hear an actual bug report from the leftover lock files. 
Every time this comes up it's because someone noticed a lot of lock files and thought we were leaking them. I've never heard anyone report an actual functional or performance problem as a result of the lock files. I don't think we should "fix" this until someone reports that it's actually broken. Especially because previous attempts have all resulted in very real bugs that did break people. Maybe we should have oslo.concurrency drop a file named _README (or something else likely to sort first in the file listing) into the configured lock_path that explains why the files are there and the proper way to deal with them. > > Thanks > > > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo > > a écrit : > > On 2019/09/28 1:44, Ben Nemec wrote: > > > > > > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >> Hi Eric, > >> > >> On 2019/09/20 23:10, Eric Harney wrote: > >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >>>> Hi, > >>>> > >>>> I'm using Queens cinder with the following setting. > >>>> > >>>> --------------------------------- > >>>> [coordination] > >>>> backend_url = file://$state_path > >>>> --------------------------------- > >>>> > >>>> As a result, the files like the following were remained under > the state path after some operations.[1] > >>>> > >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >>>> > >>>> In my understanding, these are lock-files created for > synchronization by tooz. > >>>> But, these lock-files were not deleted after finishing operations. > >>>> Is this behaviour correct? > >>>> > >>>> [1] > >>>> e.g. Delete volume, Delete snapshot > >>> > >>> This is a known bug that's described here: > >>> > >>> https://github.com/harlowja/fasteners/issues/26 > >>> > >>> (The fasteners library is used by tooz, which is used by Cinder > for managing these lock files.) > >>> > >>> There's an old Cinder bug for it here: > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >>> > >>> but that's marked as "Won't Fix" because Cinder needs it to be > fixed in the underlying libraries. > >> Thank you for your explanation. > >> I understood the state. > >> > >> But, I have one more question. > >> Can I think this bug doesn't affect synchronization? > > > > It does not. In fact, it's important to not remove lock files > while a service is running or you can end up with synchronization > issues. > > > > To clean up the leftover lock files, we generally recommend > clearing the lock_path for each service on reboot before the > services have started. > > Thank you for your information. > I think that I understood this issue completely. 
> > Best Regards, > > > >> > >> Best regards, > >> > >>> Thanks, > >>> Eric > >>> > >> > > > > -- > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > Rikimaru Honjo > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From mihalis68 at gmail.com Tue Oct 15 14:51:48 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 15 Oct 2019 10:51:48 -0400 Subject: [Ops] ops meetups team meeting 2019-10-15 - minutes Message-ID: We had a brief meeting for the OpenStack Ops Meetups today on IRC, minutes linked below. The preparations for ops events at the Shanghai summit continue. There will be an "ops war stories" session during the forum, and then a mini ops meetup on day 4 (thursday) agenda still tbd, please make suggestions here : https://etherpad.openstack.org/p/PVG-OPS-Forum-Brainstorming Another bit of news is that Bloomberg intends to offer to host the next OpenStack Ops Meetup. If accepted this would be an event in our London headquarters on January 7th and 8th. More news will be shared here as and when available and via the ops notifications twitter account ( https://twitter.com/osopsmeetup) Today's meeting minutes: 10:40 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.html 10:40 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.txt 10:40 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.log.html Cheers, Chris - on behalf of the OpenStack Ops Meetups team -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 14:54:50 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 09:54:50 -0500 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? In-Reply-To: References: Message-ID: <8b106a3f-e999-31d8-8497-8d84bd992832@nemebean.com> On 10/15/19 4:13 AM, Dmitry Tantsur wrote: > (adding TripleO because of potential effect) > > Hi all, > > I'm working on making ironic-python-agent images smaller than they > currently are. The proposed patches already reduce the default image (as > built by IPA-builder) size from around 420 MiB to around 380 MiB. > > My next idea is to get rid of RPM and YUM databases (in case of a > CentOS/RHEL image). They amount for nearly 100 MiB of the uncompressed > image: > $ du -sh var/lib/rpm > 91M var/lib/rpm > $ du -sh var/lib/yum > 6.6M var/lib/yum > > How important for anyone is the ability to install/inspect packages > inside a ramdisk? 
Back when we were building the ramdisks with DIB I'm pretty sure we were wiping the RPM db too, so unless something has changed since then I would expect this to be fine. > > Dmitry From mriedemos at gmail.com Tue Oct 15 15:24:58 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 15 Oct 2019 10:24:58 -0500 Subject: [tc][all] Community-wide goal Ussuri and V cycle forum collaboration idea In-Reply-To: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> References: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> Message-ID: <05fa700e-dba6-36ce-cf42-c7023f2515c9@gmail.com> On 10/14/2019 5:52 PM, Ghanshyam Mann wrote: > Question is for V cycle goal planning, whether we should discuss the V cycle goal in Ussuri goal fourm sessoin[3] or > it is too early to kick off V cycle goal at least until we finalize U cycle goal first. I would like to list the below two > options to proceed further (at least to decide if we need to change the existing U cycle goal forum sessions title). > > 1. Merge the Forum session for both cycle goal discussion (divide both in two half). This need forum session title and description change. > 2. Keep forum session for U cycle goal only and start the V cycle over ML asynchronously. This will help to avoid any confusion or mixing the both cycle goal discussions. So you have 40 minutes to discuss something that is notoriously hard to sort out for one release let alone the future, and to date there are only 3 goals proposed for Ussuri. Why even consider goals for V at this point when settling on goals for Train was kind of a (train)wreck (get it?!) and goal champions for Ussuri aren't necessarily champing at the bit? I won't be there so I don't have a horse in this race (yay more idioms), just commenting from the peanut gallery. -- Thanks, Matt From hberaud at redhat.com Tue Oct 15 15:41:19 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 15 Oct 2019 17:41:19 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: Le mar. 15 oct. 2019 à 16:48, Ben Nemec a écrit : > > > On 10/15/19 7:48 AM, Herve Beraud wrote: > > I proposed some patches through heat templates and puppet-cinder to > > remove lock files older than 1 week and avoid file system growing. > > > > This is a solution based on a cron job, to fix that on stable branches, > > in a second time I'll help the fasteners project to fix the root cause > > by reviewing and testing the proposed patch (lock based on file offset). > > In next versions I hope we will use a patched fasteners and so we could > > drop the cron based solution. > > > > Please can you give /reviews/feedbacks: > > - https://review.opendev.org/688413 > > - https://review.opendev.org/688414 > > - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. > Yes these changes should be validated by the cinder team. 
I chose this approach to allow us to fix that on stable branches too,
and to avoid introducing a non-backportable new feature.

>
> In essence, I don't think this is going to meaningfully reduce the
> amount of leftover lock files and it sets a bad precedent for how to
> handle them.
>
> Personally, I'd rather see a boot-time service added for each OpenStack
> service that goes out and wipes the lock file directory before starting
> the service.
>

I agree it can be an alternative to the proposed changes.
I guess it's related to some sort of puppet code too, am I right? (the
boot-time service)

>
> On a more general note, I'm going to challenge the assertion that
> "Customer file system growing slowly and so customer risk to facing some
> issues to file system usage after a long period." I have yet to hear an
> actual bug report from the leftover lock files. Every time this comes up
> it's because someone noticed a lot of lock files and thought we were
> leaking them. I've never heard anyone report an actual functional or
> performance problem as a result of the lock files. I don't think we
> should "fix" this until someone reports that it's actually broken.
> Especially because previous attempts have all resulted in very real bugs
> that did break people.
>

Yes, I agree it's more an assumption than a reality; I have never seen
anybody report a disk usage issue or anything like that due to leftover
lock files.

> Maybe we should have oslo.concurrency drop a file named _README (or
> something else likely to sort first in the file listing) into the
> configured lock_path that explains why the files are there and the
> proper way to deal with them.
>

Good idea.

Anyway, even if nobody is facing a file system issue related to leftover
files, I think it's not a good thing to let the file system keep growing,
and we need to try to address it to prevent potential file system issues
related to disk usage and lock files, but in a safe way that avoids
introducing race conditions with cinder.

Cinder people need to confirm that my proposed changes fit well with
cinder's mechanisms, or choose a better approach.

>
>
> >
> > Thanks
> >
> >
> > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo
> > > a
> écrit :
> >
> >     On 2019/09/28 1:44, Ben Nemec wrote:
> >      >
> >      >
> >      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote:
> >      >> Hi Eric,
> >      >>
> >      >> On 2019/09/20 23:10, Eric Harney wrote:
> >      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote:
> >      >>>> Hi,
> >      >>>>
> >      >>>> I'm using Queens cinder with the following setting.
> >      >>>>
> >      >>>> ---------------------------------
> >      >>>> [coordination]
> >      >>>> backend_url = file://$state_path
> >      >>>> ---------------------------------
> >      >>>>
> >      >>>> As a result, the files like the following were remained under
> >     the state path after some operations.[1]
> >      >>>>
> >      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume
> >      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot
> >      >>>>
> >      >>>> In my understanding, these are lock-files created for
> >     synchronization by tooz.
> >      >>>> But, these lock-files were not deleted after finishing operations.
> >      >>>> Is this behaviour correct?
> >      >>>>
> >      >>>> [1]
> >      >>>> e.g. Delete volume, Delete snapshot
> >      >>>
> >      >>> This is a known bug that's described here:
> >      >>>
> >      >>> https://github.com/harlowja/fasteners/issues/26
> >      >>>
> >      >>> (The fasteners library is used by tooz, which is used by Cinder
> >     for managing these lock files.)
> > >>> > > >>> There's an old Cinder bug for it here: > > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > > >>> > > >>> but that's marked as "Won't Fix" because Cinder needs it to be > > fixed in the underlying libraries. > > >> Thank you for your explanation. > > >> I understood the state. > > >> > > >> But, I have one more question. > > >> Can I think this bug doesn't affect synchronization? > > > > > > It does not. In fact, it's important to not remove lock files > > while a service is running or you can end up with synchronization > > issues. > > > > > > To clean up the leftover lock files, we generally recommend > > clearing the lock_path for each service on reboot before the > > services have started. > > > > Thank you for your information. > > I think that I understood this issue completely. > > > > Best Regards, > > > > > > >> > > >> Best regards, > > >> > > >>> Thanks, > > >>> Eric > > >>> > > >> > > > > > > > -- > > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > > Rikimaru Honjo > > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > > > > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cjeanner at redhat.com Tue Oct 15 15:24:55 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 15 Oct 2019 17:24:55 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: On 10/15/19 4:47 PM, Ben Nemec wrote: > > > On 10/15/19 7:48 AM, Herve Beraud wrote: >> I proposed some patches through heat templates and puppet-cinder to >> remove lock files older than 1 week and avoid file system growing. >> >> This is a solution based on a cron job, to fix that on stable >> branches, in a second time I'll help the fasteners project to fix the >> root cause by reviewing and testing the proposed patch (lock based on >> file offset). In next versions I hope we will use a patched fasteners >> and so we could drop the cron based solution. >> >> Please can you give /reviews/feedbacks: >> - https://review.opendev.org/688413 >> - https://review.opendev.org/688414 >> - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. I'm also not that happy with the cron way - but apparently it might create some issues in some setup with the current way things are done: - inodes aren't infinit on ext* FS (ext3, ext4, blah) - see bellow for context - perfs might be bad (see bellow for context) So one way or another, cleanup is needed. > > In essence, I don't think this is going to meaningfully reduce the > amount of leftover lock files and it sets a bad precedent for how to > handle them. The filter is strict - on purpose, to address this specific issue. Of course, we might want to loosen it, but... do we really want that? IF we're to go with the cron thingy of course. Some more thinking is needed I guess. > > Personally, I'd rather see a boot-time service added for each OpenStack > service that goes out and wipes the lock file directory before starting > the service. Well.... former sysops here: don't count on regular reboot - once a year is a fair average - and it's usually due to some power cut... Sad world, I know ;). So a "boot-time cleanup" will help. A little. And wouldn't hurt anyway. So +1 for that idea, but I wouldn't rely only on it. And there might be some issues (see bellow) > > On a more general note, I'm going to challenge the assertion that > "Customer file system growing slowly and so customer risk to facing some > issues to file system usage after a long period." I have yet to hear an > actual bug report from the leftover lock files. Every time this comes up > it's because someone noticed a lot of lock files and thought we were > leaking them. I've never heard anyone report an actual functional or > performance problem as a result of the lock files. I don't think we > should "fix" this until someone reports that it's actually broken. > Especially because previous attempts have all resulted in very real bugs > that did break people. I'm not on your side here. Waiting to get a fire before thinking about correction and counter-measures isn't good. 
Since we know there's an issue, and that it might eventually become a really
big one, it would be good to address it before it explodes in our face.
The disk *space* is probably not the issue: files of 1k each on a couple of
gigabytes are fine.
But there are other concerns:

- inodes. Yes, we're supposed to run things on XFS, which allocates inodes
dynamically and so doesn't really run out of them. But some ops might want
to rely on the good old ext3 or ext4. There, we might get some issues, and
pretty quickly, depending on the speed of lock creation (so, linked to
cinder actions in this case). Or it might be some NFS share backed by an
ext4 FS.

- rm limits: you have probably never hit the "argument list too long" limit
with rm. But it does exist, and I did hit it (like 15 years ago - maybe it's
sorted out, but I have some doubts). This means that rm will NOT be able to
cope with the directory content after a certain number of files (which is
huge, right, but still... we're talking about some long-lasting process
filling a directory).
Of course, "find" might be the right tool in such a case, but it will take
a long, long time to delete (thinking about the "boot-time cleanup"
proposal, for instance).

- performance: it might impact the system, for instance if one has some
backup process running that wants to eat the /var/lib/cinder directory:
if the op doesn't know about this issue, they might get some surprises
with long, long, loooong running backups.
With a target on some ext4 - hello inodes!

Sooo... yeah. I think this issue should be addressed. Really. But I +1 the
fact that it should be done "the right way", whatever it is. The "cron"
might be the wrong one. Or not. We need some more feedback on that :).

>
> Maybe we should have oslo.concurrency drop a file named _README (or
> something else likely to sort first in the file listing) into the
> configured lock_path that explains why the files are there and the
> proper way to deal with them.

Hmmm... who reads a README anyway? Like, really. Better to get some
cleanup in place to avoid questions ;).

Cheers,

C.

>
>>
>> Thanks
>>
>>
>> Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo
>> > a
>> écrit :
>>
>>     On 2019/09/28 1:44, Ben Nemec wrote:
>>      >
>>      >
>>      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote:
>>      >> Hi Eric,
>>      >>
>>      >> On 2019/09/20 23:10, Eric Harney wrote:
>>      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote:
>>      >>>> Hi,
>>      >>>>
>>      >>>> I'm using Queens cinder with the following setting.
>>      >>>>
>>      >>>> ---------------------------------
>>      >>>> [coordination]
>>      >>>> backend_url = file://$state_path
>>      >>>> ---------------------------------
>>      >>>>
>>      >>>> As a result, the files like the following were remained under
>>     the state path after some operations.[1]
>>      >>>>
>>      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume
>>      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot
>>      >>>>
>>      >>>> In my understanding, these are lock-files created for
>>     synchronization by tooz.
>>      >>>> But, these lock-files were not deleted after finishing
>> operations.
>>      >>>> Is this behaviour correct?
>>      >>>>
>>      >>>> [1]
>>      >>>> e.g. Delete volume, Delete snapshot
>>      >>>
>>      >>> This is a known bug that's described here:
>>      >>>
>>      >>> https://github.com/harlowja/fasteners/issues/26
>>      >>>
>>      >>> (The fasteners library is used by tooz, which is used by Cinder
>>     for managing these lock files.)
>>      >>> >>      >>> There's an old Cinder bug for it here: >>      >>> https://bugs.launchpad.net/cinder/+bug/1432387 >>      >>> >>      >>> but that's marked as "Won't Fix" because Cinder needs it to be >>     fixed in the underlying libraries. >>      >> Thank you for your explanation. >>      >> I understood the state. >>      >> >>      >> But, I have one more question. >>      >> Can I think this bug doesn't affect synchronization? >>      > >>      > It does not. In fact, it's important to not remove lock files >>     while a service is running or you can end up with synchronization >>     issues. >>      > >>      > To clean up the leftover lock files, we generally recommend >>     clearing the lock_path for each service on reboot before the >>     services have started. >> >>     Thank you for your information. >>     I think that I understood this issue completely. >> >>     Best Regards, >> >> >>      >> >>      >> Best regards, >>      >> >>      >>> Thanks, >>      >>> Eric >>      >>> >>      >> >>      > >> >>     --     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ >>     Rikimaru Honjo >>     E-mail:honjo.rikimaru at ntt-tx.co.jp >>     >> >> >> >> >> -- >> Hervé Beraud >> Senior Software Engineer >> Red Hat - Openstack Oslo >> irc: hberaud >> -----BEGIN PGP SIGNATURE----- >> >> wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ >> Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ >> RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP >> F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G >> 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g >> glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw >> m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ >> hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 >> qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y >> F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 >> B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O >> v6rDpkeNksZ9fFSyoY2o >> =ECSj >> -----END PGP SIGNATURE----- >> > -- Cédric Jeanneret (He/Him/His) Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ From openstack at nemebean.com Tue Oct 15 17:00:38 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 12:00:38 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: <4083c539-6b3f-1908-16ac-edbbfe8eb04a@nemebean.com> On 10/15/19 10:24 AM, Cédric Jeanneret wrote: > > > On 10/15/19 4:47 PM, Ben Nemec wrote: >> >> >> On 10/15/19 7:48 AM, Herve Beraud wrote: >>> I proposed some patches through heat templates and puppet-cinder to >>> remove lock files older than 1 week and avoid file system growing. >>> >>> This is a solution based on a cron job, to fix that on stable >>> branches, in a second time I'll help the fasteners project to fix the >>> root cause by reviewing and testing the proposed patch (lock based on >>> file offset). In next versions I hope we will use a patched fasteners >>> and so we could drop the cron based solution. >>> >>> Please can you give /reviews/feedbacks: >>> - https://review.opendev.org/688413 >>> - https://review.opendev.org/688414 >>> - https://review.opendev.org/688415 >> >> I'm rather hesitant to recommend this. 
It looks like the change is >> only removing the -delete lock files, which are a fraction of the >> total lock files created by Cinder, and I don't particularly want to >> encourage people to start monkeying with the lock files while a >> service is running. Even with this limited set of deletions, I think >> we need a Cinder person to look and verify that we aren't making bad >> assumptions about how the locks are used. > > I'm also not that happy with the cron way - but apparently it might > create some issues in some setup with the current way things are done: > - inodes aren't infinit on ext* FS (ext3, ext4, blah) - see bellow for > context > - perfs might be bad (see bellow for context) > > So one way or another, cleanup is needed. > >> >> In essence, I don't think this is going to meaningfully reduce the >> amount of leftover lock files and it sets a bad precedent for how to >> handle them. > > The filter is strict - on purpose, to address this specific issue. Of > course, we might want to loosen it, but... do we really want that? IF > we're to go with the cron thingy of course. Some more thinking is needed > I guess. > >> >> Personally, I'd rather see a boot-time service added for each >> OpenStack service that goes out and wipes the lock file directory >> before starting the service. > > Well.... former sysops here: don't count on regular reboot - once a year > is a fair average - and it's usually due to some power cut... Sad world, > I know ;). > So a "boot-time cleanup" will help. A little. And wouldn't hurt anyway. > So +1 for that idea, but I wouldn't rely only on it. And there might be > some issues (see bellow) I understand that it doesn't happen regularly, but it's the easiest to automate safe time to clean locks. It can also be done when the service is down for maintenance, but even then you need to be careful because if you wipe the cinder locks when cinder-volume gets restarted, but cinder-scheduler is still running you might wipe an in-use lock. I don't know if that specific scenario is possible, but the point is that any process that could hold a lock needs to be down before you clear the lock directory. Since most OpenStack services have multiple separate OS services running that complicates the process. > >> >> On a more general note, I'm going to challenge the assertion that >> "Customer file system growing slowly and so customer risk to facing some >> issues to file system usage after a long period." I have yet to hear >> an actual bug report from the leftover lock files. Every time this >> comes up it's because someone noticed a lot of lock files and thought >> we were leaking them. I've never heard anyone report an actual >> functional or performance problem as a result of the lock files. I >> don't think we should "fix" this until someone reports that it's >> actually broken. Especially because previous attempts have all >> resulted in very real bugs that did break people. > > I'm not on your side here. Waiting to get a fire before thinking about > correction and counter-measures isn't good. Since we know there's an > issue, and that, eventually, it might be a really big one, it would be > good to address it before it explodes in our face. If you have a solution that fixes the problem without introducing concurrency problems then I'm all ears. :-) Until then, I'm not comfortable fixing a hypothetical problem by creating very real new problems. > The disk *space* is probably not the issue. File with 1k, on a couple of > gigas, it's good. 
> But there are other concerns: > > - inodes. Yes, we're supposed to get things on XFS, and that dude > doesn't have inodes. But some ops might want to rely on the good(?) old > ext3, or ext4. There, we might get some issues, and pretty quickly > depending on the speed of lock creation (so, linked to cinder actions in > this case). Or it might be some NFS with an ext4 FS. > > - rm limits: you probably never ever hit "rm argument list limit". But > it does exist, and I did hit it (like 15 years ago - maybe it's sorted > out, but I have some doubts). This means that rm will NOT be able to > cope with the directory content after a certain amount (which is huge, > right, but still... we're talking about some long-lasting process > filling a directory). > Of course, "find" might be the right tool in such case, but it will take > a long, long time to delete (thinking about the "boot-time cleanup > proposal" for instance). > > - performances: it might impact the system, for instance if one has some > backup process running and wanting to eat the /var/lib/cinder directory: > if the op doesn't know about this issue, they might get some surprises > with long, long, loooong running backups. > With a target on some ext4 - hello Inodes! Sure, but these are all still theoretical problems. I've _never_ heard of anyone running into them, and we have some pretty big OpenStack deployments in the wild. I feel like at one point I did the math to figure out what it would take to hit an inode limit because of lock files, and it was fairly absurd. Like you would have to leave a deployment running for a decade under heavy use to actually get there. I don't still have those numbers handy, but it might be a useful exercise to look at that again. And this reminds me of another thing that has been suggested in the past to address the lock file cleanup issue (we should really write all of this down if we haven't already...), which is to put them on tmpfs. That way they get cleared automatically on reboot and you don't have to manage anything. Locks don't persist over reboots anyway so it doesn't matter that it's on volatile storage. The whole file thing is actually a consequence of Linux IPC being complete garbage, not because we want persistent storage of locks. > > Sooo... yeah. I think this issue should be addressed. Really. But I +1 > the fact that it should be done "the right way", whatever it is. The > "cron" might be the wrong one. Or not. We need some more feedbacks on > that :). Patches welcome. But fair warning: This problem is a lot harder than it looks on the surface. Many solutions have been proposed over the years, all of them were worse than what we have now. > >> >> Maybe we should have oslo.concurrency drop a file named _README (or >> something else likely to sort first in the file listing) into the >> configured lock_path that explains why the files are there and the >> proper way to deal with them. > > Hmmm... who reads README anyway? Like, really. Better getting some > cleanup un place to avoid questions ;). If it heads off even one of these threads that happen every few months then it will have been worth it. :-D > > Cheers, > > C. > >> >>> >>> Thanks >>> >>> >>> Le lun. 30 sept. 
2019 à 03:35, Rikimaru Honjo >>> > a >>> écrit : >>> >>>     On 2019/09/28 1:44, Ben Nemec wrote: >>>      > >>>      > >>>      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: >>>      >> Hi Eric, >>>      >> >>>      >> On 2019/09/20 23:10, Eric Harney wrote: >>>      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: >>>      >>>> Hi, >>>      >>>> >>>      >>>> I'm using Queens cinder with the following setting. >>>      >>>> >>>      >>>> --------------------------------- >>>      >>>> [coordination] >>>      >>>> backend_url = file://$state_path >>>      >>>> --------------------------------- >>>      >>>> >>>      >>>> As a result, the files like the following were remained under >>>     the state path after some operations.[1] >>>      >>>> >>>      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume >>>      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot >>>      >>>> >>>      >>>> In my understanding, these are lock-files created for >>>     synchronization by tooz. >>>      >>>> But, these lock-files were not deleted after finishing >>> operations. >>>      >>>> Is this behaviour correct? >>>      >>>> >>>      >>>> [1] >>>      >>>> e.g. Delete volume, Delete snapshot >>>      >>> >>>      >>> This is a known bug that's described here: >>>      >>> >>>      >>> https://github.com/harlowja/fasteners/issues/26 >>>      >>> >>>      >>> (The fasteners library is used by tooz, which is used by Cinder >>>     for managing these lock files.) >>>      >>> >>>      >>> There's an old Cinder bug for it here: >>>      >>> https://bugs.launchpad.net/cinder/+bug/1432387 >>>      >>> >>>      >>> but that's marked as "Won't Fix" because Cinder needs it to be >>>     fixed in the underlying libraries. >>>      >> Thank you for your explanation. >>>      >> I understood the state. >>>      >> >>>      >> But, I have one more question. >>>      >> Can I think this bug doesn't affect synchronization? >>>      > >>>      > It does not. In fact, it's important to not remove lock files >>>     while a service is running or you can end up with synchronization >>>     issues. >>>      > >>>      > To clean up the leftover lock files, we generally recommend >>>     clearing the lock_path for each service on reboot before the >>>     services have started. >>> >>>     Thank you for your information. >>>     I think that I understood this issue completely. 
>>> >>>     Best Regards, >>> >>> >>>      >> >>>      >> Best regards, >>>      >> >>>      >>> Thanks, >>>      >>> Eric >>>      >>> >>>      >> >>>      > >>> >>>     --     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ >>>     Rikimaru Honjo >>>     E-mail:honjo.rikimaru at ntt-tx.co.jp >>>     >>> >>> >>> >>> >>> -- >>> Hervé Beraud >>> Senior Software Engineer >>> Red Hat - Openstack Oslo >>> irc: hberaud >>> -----BEGIN PGP SIGNATURE----- >>> >>> wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ >>> Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ >>> RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP >>> F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G >>> 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g >>> glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw >>> m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ >>> hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 >>> qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y >>> F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 >>> B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O >>> v6rDpkeNksZ9fFSyoY2o >>> =ECSj >>> -----END PGP SIGNATURE----- >>> >> > From openstack at nemebean.com Tue Oct 15 17:13:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 12:13:35 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> On 10/15/19 10:41 AM, Herve Beraud wrote: > > > Le mar. 15 oct. 2019 à 16:48, Ben Nemec > a écrit : > > > > On 10/15/19 7:48 AM, Herve Beraud wrote: > > I proposed some patches through heat templates and puppet-cinder to > > remove lock files older than 1 week and avoid file system growing. > > > > This is a solution based on a cron job, to fix that on stable > branches, > > in a second time I'll help the fasteners project to fix the root > cause > > by reviewing and testing the proposed patch (lock based on file > offset). > > In next versions I hope we will use a patched fasteners and so we > could > > drop the cron based solution. > > > > Please can you give /reviews/feedbacks: > > - https://review.opendev.org/688413 > > - https://review.opendev.org/688414 > > - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. > > > Yes these changes should be validated by the cinder team. > I chosen this approach to allow use to fix that on stable branches too, > and to avoid to introduce a non backportable new feature. > > > In essence, I don't think this is going to meaningfully reduce the > amount of leftover lock files and it sets a bad precedent for how to > handle them. > > Personally, I'd rather see a boot-time service added for each OpenStack > service that goes out and wipes the lock file directory before starting > the service. 
> > > I agree it can be an alternative to the proposed changes. > I guess it's related to some sort of puppet code too, I'm right? (the > boot-time service) That's probably how you'd implement it in TripleO. Or maybe Ansible now. Best to check with the TripleO team on that since my knowledge is quite out of date on that project now. > > > On a more general note, I'm going to challenge the assertion that > "Customer file system growing slowly and so customer risk to facing some > issues to file system usage after a long period." I have yet to hear an > actual bug report from the leftover lock files. Every time this > comes up > it's because someone noticed a lot of lock files and thought we were > leaking them. I've never heard anyone report an actual functional or > performance problem as a result of the lock files. I don't think we > should "fix" this until someone reports that it's actually broken. > Especially because previous attempts have all resulted in very real > bugs > that did break people. > > > Yes I agreee it's more an assumption than a reality, I never seen > anybody report a disk usage issue or things like this due to leftover > lock files. > > > Maybe we should have oslo.concurrency drop a file named _README (or > something else likely to sort first in the file listing) into the > configured lock_path that explains why the files are there and the > proper way to deal with them. > > > Good idea. > > Anyway, even if nobody facing a file system issue related to files > leftover, I think it's not a good thing to lets grow a FS, and we need > to try to address it to prevent potential file system issues related to > disk usage and lock files, but in a secure way to avoid to introduce > race conditions with cinder. > > Cinder people need to confirm that my proposed changes can fit well with > cinder's mechanismes or choose a better approach. I'm opposed in general to external solutions. If lock files are to be cleaned up, it needs to happen either when the service isn't running so there's no chance of deleting an in-use lock, or it needs to be done by the service itself when it knows that it is done with the lock. Any fixes outside the service run the risk of drifting from the implementation if, for example, Cinder made a change to its locking semantics such that locks you could safely remove previously no longer could be. I believe Neutron implements lock file cleanup using [0], which is really the only way runtime lock cleanup should be done IMHO. 0: https://github.com/openstack/oslo.concurrency/blob/85c341aced7b181724b68c9d883768b5c5f7e982/oslo_concurrency/lockutils.py#L194 > > > > > > Thanks > > > > > > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo > > > >> a écrit : > > > >     On 2019/09/28 1:44, Ben Nemec wrote: > >      > > >      > > >      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >      >> Hi Eric, > >      >> > >      >> On 2019/09/20 23:10, Eric Harney wrote: > >      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >      >>>> Hi, > >      >>>> > >      >>>> I'm using Queens cinder with the following setting. 
> >      >>>> > >      >>>> --------------------------------- > >      >>>> [coordination] > >      >>>> backend_url = file://$state_path > >      >>>> --------------------------------- > >      >>>> > >      >>>> As a result, the files like the following were remained > under > >     the state path after some operations.[1] > >      >>>> > >      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >      >>>> > >      >>>> In my understanding, these are lock-files created for > >     synchronization by tooz. > >      >>>> But, these lock-files were not deleted after finishing > operations. > >      >>>> Is this behaviour correct? > >      >>>> > >      >>>> [1] > >      >>>> e.g. Delete volume, Delete snapshot > >      >>> > >      >>> This is a known bug that's described here: > >      >>> > >      >>> https://github.com/harlowja/fasteners/issues/26 > >      >>> > >      >>> (The fasteners library is used by tooz, which is used by > Cinder > >     for managing these lock files.) > >      >>> > >      >>> There's an old Cinder bug for it here: > >      >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >      >>> > >      >>> but that's marked as "Won't Fix" because Cinder needs it > to be > >     fixed in the underlying libraries. > >      >> Thank you for your explanation. > >      >> I understood the state. > >      >> > >      >> But, I have one more question. > >      >> Can I think this bug doesn't affect synchronization? > >      > > >      > It does not. In fact, it's important to not remove lock files > >     while a service is running or you can end up with synchronization > >     issues. > >      > > >      > To clean up the leftover lock files, we generally recommend > >     clearing the lock_path for each service on reboot before the > >     services have started. > > > >     Thank you for your information. > >     I think that I understood this issue completely. 
> > > >     Best Regards, > > > > > >      >> > >      >> Best regards, > >      >> > >      >>> Thanks, > >      >>> Eric > >      >>> > >      >> > >      > > > > >     -- > >     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > >     Rikimaru Honjo > > E-mail:honjo.rikimaru at ntt-tx.co.jp > > >      > > > > > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From jimmy at openstack.org Tue Oct 15 17:47:14 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 15 Oct 2019 12:47:14 -0500 Subject: OpenStack COA Exam Update Message-ID: <5DA60622.9040102@openstack.org> We wanted to circle back on this email thread and provide an update on the Certified OpenStack Administrator (COA) exam. We’ve listened to community feedback and so has the OpenStack ecosystem. We are excited to collaborate with Mirantis who has stepped up to donate resources, including the administration of the vendor-neutral OpenStack certification exam now running on OpenStack Rocky. We are planning to continue COA exam sales starting this Thursday, October 17. If you’re interested in becoming a COA, you will be able to buy an exam through the OpenStack website or through one of the many OpenStack training partners in the marketplace . We are excited to continue growing the market of certified OpenStack professionals and appreciate the community’s patience as we identified a solution moving forward. Please reach out if you have any questions and we will continue to update openstack.org/coa as the certification program evolves. Thanks, Jimmy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Tue Oct 15 18:18:03 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 15 Oct 2019 13:18:03 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack Message-ID: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> Hello Everyone, Python 2.7 is going to retire in Jan 2020 [1] and we planned to drop the python 2 support from OpenStack during the start of the Ussuri cycle[2]. Time has come now to start the planning on dropping the Python2. It needs to be coordinated among various Projects, libraries, vendors driver, third party CI and testing frameworks. * Preparation for the Plan & Schedule: Etherpad: https://etherpad.openstack.org/p/drop-python2-support We discussed it in TC to come up with the plan, execute it smoothly and avoid breaking any dependent projects. I have prepared an etherpad[3](mentioned above also) to capture all the points related to this topic and most importantly the draft schedule about who can drop the support and when. The schedule is in the draft state and not final yet. The most important points are if you are dropping the support then all your consumers (OpenStack Projects, Vendors drivers etc) are ready for that. For example, oslo, os-bricks, client lib, testing framework projects will keep the python2 support until we make sure all the consumers of those projects do not require py2 support. If anyone require then how long they can support py2. These libraries, testing frameworks will be the last one to drop py2. We have planned to have a dedicated discussion in TC office hours on the 24th Thursday #openstack-tc channel. We will discuss what all need to be done and the schedules. You do not have to drop it immediately and keep eyes on this ML thread till we get the consensus on the community-level plan and schedule. Meanwhile, you can always start pre-planning for your projects, for example, stephenfin has started for Nova[4] to migrate the third party CI etc. Cinder has coordinated with all vendor drivers & their CI to migrate from py2 to py3. * Projects want to keep the py2 support? There is no mandate that projects have to drop the py2 support right now. If you want to keep the support then key things to discuss are what all you need and does all your dependent projects/libs provide the support of py2. This is something needs to be discussed case by case. If any project wants to keep the support, add that in the etherpad with a brief reason which will be helpful to discuss the need and feasibility. Feel free to provide feedback or add the missing point on the etherpad. Do not forget to attend the 24th Oct 2019, TC office hour on Thursday at 1500 UTC in #openstack-tc. [1] https://pythonclock.org/ [2] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html [3] https://etherpad.openstack.org/p/drop-python2-support [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html -gmann From emilien at redhat.com Tue Oct 15 19:47:58 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 15 Oct 2019 15:47:58 -0400 Subject: [tripleo] Deprecating paunch CLI? 
In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Fri, Oct 11, 2019 at 10:55 AM James Slagle wrote: > An idea for a future improvement I would like to see as we move in > this direction is to switch from reading the container startup configs > from a single file per step > (/var/lib/tripleo-config/container-startup-config-step_{{ step > }}.json), to using a directory per step instead. It would look > something like: > > /var/lib/tripleo-config/container-startup-config/step1 > > /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json > > /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json > etc. > https://review.opendev.org/#/c/688779/ is WIP and will address this idea. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 22:15:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 17:15:19 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> Message-ID: In the interest of not having to start this discussion from scratch every time, I've done a bit of a brain dump into https://review.opendev.org/#/c/688825/ that covers why things are the way they are and what we recommend people do about it. Please take a look and let me know if you see any issues with it. Thanks. -Ben From openstack at nemebean.com Tue Oct 15 22:19:31 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 17:19:31 -0500 Subject: [oslo] On PTO rest of the week Message-ID: <959ee3bc-ca31-c2e8-ddef-9f9b8a394c41@nemebean.com> Hey, I'm not working for the rest of the week. \o/ I also realized that I haven't announced that here. /o\ Things are pretty quiet in Oslo right now so I don't anticipate that I'll be needed in the interim, but I should have internet the whole time I'm out and I'll _probably_ be checking email. If something comes up that can't wait until Monday just holler and hopefully I'll see it. Thanks. -Ben From iwienand at redhat.com Tue Oct 15 23:03:16 2019 From: iwienand at redhat.com (Ian Wienand) Date: Wed, 16 Oct 2019 10:03:16 +1100 Subject: CentOS 8 nodes available now Message-ID: <20191015230316.GA29186@fedora19.localdomain> Hello, I'm happy to say that CentOS 8 images are now live in OpenDev infra. Using a node label of "centos-8" will get you started. --- The python environment setup on these images is different to our other images. Firstly, some background: currently during image build we go to some effort to: a) install the latest pip/virtualenv/setuptools b) ensure standard behaviour: /usr/bin/python -> python2 /usr/bin/pip -> python2 install /usr/bin/pip3 -> python3 install /usr/bin/virtualenv -> creates python2 environment by default; python3 virtualenv package in sync This means a number of things; hosts always have python2, and because we overwrite the system pip/virtualenv/setuptools we put these packages on "hold" (depending on the package manager) so jobs don't re-install them and create (even more of) a mess. 
This made sense in the past, when we had versions of pip/setuptools in distributions that couldn't understand syntax in requirements files (and other bugs) and didn't have the current fantastic job inheritance and modularity that Zuul (v3) provides. However, it also introduces a range of problems for various users, and has a high maintenance overhead. Thus these new images are, by default, python3 only, and have upstream pip and virtualenv packages installed. You will have a default situation: /usr/bin/python -> not provided /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip" /usr/bin/virtualenv -> create python3 environment; provided by upstream python3-virtualenv package /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv), use /usr/bin/pyvenv-3 or "python3 -m venv". Ergo, the "standard behaviour" is not so standard any more. I would suggest if you wish to write somewhat portable jobs/roles etc., you do the following: * in general don't rely on "unversioned" calls of tools at all (python/pip/virtualenv) -- they can all mean different things on different platforms. * scripts should always be #!/usr/bin/python3 * use "python3 -m venv" for virtual environments (if you really need "virtualenv" because of one of the features it provides, use "-m virtualenv") * use "python3 -m pip" to install global pip packages; but try not too -- mixing packages and pip installs never works that well. * if you need python2 for some reason, use a bindep file+role to install it (don't assume it is there) --- For any Zuul admins, note that to use python3-only images similar to what we make, you'll need to set "python-path" to python3 in nodepool so that Ansible calls the correct remote binary. Keep an eye on [1] which will automate this for Ansible >=2.8 after things are merged and released. --- Most of the job setup has been tested (network configs, setting mirrors, adding swap etc.) but there's always a chance of issues with a new platform. Please bring up any issues in #openstack-infra and we'll be sure to get them fixed. --- If you're interested in the images, they are exported at https://nb01.openstack.org/images/ although they are rather large, because we pre-cache a lot. If you'd like to build your own, [2] might help with: DISTRO=centos-minimal DIB_RELEASE=8 --- Thanks, -i [1] https://review.opendev.org/#/c/682797/ [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh From smooney at redhat.com Wed Oct 16 00:35:48 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 16 Oct 2019 01:35:48 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <20191015230316.GA29186@fedora19.localdomain> References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: On Wed, 2019-10-16 at 10:03 +1100, Ian Wienand wrote: > Hello, > > I'm happy to say that CentOS 8 images are now live in OpenDev infra. > Using a node label of "centos-8" will get you started. > > --- > > The python environment setup on these images is different to our other > images. Firstly, some background: currently during image build we go > to some effort to: > > a) install the latest pip/virtualenv/setuptools > b) ensure standard behaviour: > /usr/bin/python -> python2 python on centos8 should be a link to python3 infact ideally python 2 should not be installed at all. 
>    /usr/bin/pip -> python2 install
>    /usr/bin/pip3 -> python3 install
>    /usr/bin/virtualenv -> creates python2 environment by default;
>                           python3 virtualenv package in sync
>
> This means a number of things; hosts always have python2,
why would we want this? ideally we should try to ensure that there is no
python 2 on the system at all, so that we can be sure we are not using it
by mistake and can do an entirely python3-only install on centos8
----
later: i realised, reading on, that you are describing how things work on
the other images here.
> and because
> we overwrite the system pip/virtualenv/setuptools we put these
> packages on "hold" (depending on the package manager) so jobs don't
> re-install them and create (even more of) a mess.
>
> This made sense in the past, when we had versions of pip/setuptools in
> distributions that couldn't understand syntax in requirements files
> (and other bugs) and didn't have the current fantastic job inheritance
> and modularity that Zuul (v3) provides. However, it also introduces a
> range of problems for various users, and has a high maintenance
> overhead.
>
> Thus these new images are, by default, python3 only, and have upstream
> pip and virtualenv packages installed. You will have a default
> situation:
>
> /usr/bin/python -> not provided
> /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip"
> /usr/bin/virtualenv -> create python3 environment;
>                        provided by upstream python3-virtualenv package
> /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv),
>                    use /usr/bin/pyvenv-3 or "python3 -m venv".
>
oh, i see: you were describing the standard behavior of our other envs
above. this is closer to what i was expecting, although i would personally
prefer to have /usr/bin/python -> /usr/bin/python3.

linux distros seem to be a bit split on this. i believe arch, and maybe
debian (i saw one of the other major distro families adopt the same
approach), link python to python3.

the redhat family of operating systems does not provide python any more and
leaves it to the user to either use only the version-specific pythons or use
update-alternatives to set their default python.

> Ergo, the "standard behaviour" is not so standard any more.
>
> I would suggest if you wish to write somewhat portable jobs/roles
> etc., you do the following:
>
> * in general don't rely on "unversioned" calls of tools at all
>   (python/pip/virtualenv) -- they can all mean different things on
>   different platforms.
> * scripts should always be #!/usr/bin/python3
see, i think that is an anti-pattern. they could be, but i think
/usr/bin/python should map to /usr/bin/python3 and you should assume that
it is now python3. if you don't do that, then every script that has ever
been written or packaged needs to be updated to reference python3
explicitly. there were far fewer users of python 1 when that transition
happened, but python became a link to the default system python and
eventually pointed to python2. i think we should continue to do that, and
after a decade of deprecating python2 we should reclaim the python symlink
and point it to python3.
> * use "python3 -m venv" for virtual environments (if you really need
>   "virtualenv" because of one of the features it provides, use "-m
>   virtualenv")
> * use "python3 -m pip" to install global pip packages; but try not
>   too -- mixing packages and pip installs never works that well.
well, from a devstack point of view we almost exclusively install from pip,
so installing python packages from the distro is the anti-pattern, not
installing from pip.
that said we shoudl consider moving devstack to use --user at somepoint. > * if you need python2 for some reason, use a bindep file+role > to install it (don't assume it is there) +1 also dont assmue python will be python > > --- > > For any Zuul admins, note that to use python3-only images similar to > what we make, you'll need to set "python-path" to python3 in nodepool > so that Ansible calls the correct remote binary. Keep an eye on [1] > which will automate this for Ansible >=2.8 after things are merged and > released. > > --- > > Most of the job setup has been tested (network configs, setting > mirrors, adding swap etc.) but there's always a chance of issues with > a new platform. Please bring up any issues in #openstack-infra and > we'll be sure to get them fixed. > > --- > > If you're interested in the images, they are exported at > > https://nb01.openstack.org/images/ > > although they are rather large, because we pre-cache a lot. If you'd > like to build your own, [2] might help with: > > DISTRO=centos-minimal > DIB_RELEASE=8 > > --- > > Thanks, thanks for al the work on this. > > -i > > [1] https://review.opendev.org/#/c/682797/ > [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh > > From Richard.Pioso at dell.com Wed Oct 16 00:55:38 2019 From: Richard.Pioso at dell.com (Richard.Pioso at dell.com) Date: Wed, 16 Oct 2019 00:55:38 +0000 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt Message-ID: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Hi, The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". Does anyone have a suggestion on how to proceed? 
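(As a quick way for packagers or operators to check whether an existing
installation picked up the broken sushy, the minimum version can be
asserted from Python. The ">=2.0.0" floor below mirrors the second option
described above and is not what driver-requirements.txt currently says:

    import pkg_resources

    try:
        # Raises VersionConflict if, for example, sushy 1.9.0 is installed.
        pkg_resources.require("sushy>=2.0.0")
        print("sushy is new enough for the redfish hardware type")
    except (pkg_resources.VersionConflict,
            pkg_resources.DistributionNotFound) as exc:
        print("problematic sushy install: %s" % exc)

This is only a local sanity check, not a substitute for fixing the
requirement itself.)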
Thank you, Rick [1] https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 [2] https://review.opendev.org/#/c/666253/ [3] https://review.opendev.org/#/c/668936/ [4] https://review.opendev.org/#/c/669889/ [5] https://review.opendev.org/#/c/688551/ [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 From cboylan at sapwetik.org Wed Oct 16 01:42:52 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 15 Oct 2019 18:42:52 -0700 Subject: CentOS 8 nodes available now In-Reply-To: References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: <18bfb37e-d448-4b8a-a6c8-5c4f4ee57107@www.fastmail.com> On Tue, Oct 15, 2019, at 5:35 PM, Sean Mooney wrote: > On Wed, 2019-10-16 at 10:03 +1100, Ian Wienand wrote: > > Hello, > > > > I'm happy to say that CentOS 8 images are now live in OpenDev infra. > > Using a node label of "centos-8" will get you started. > > > > --- > > > > The python environment setup on these images is different to our other > > images. Firstly, some background: currently during image build we go > > to some effort to: > > > > a) install the latest pip/virtualenv/setuptools > > b) ensure standard behaviour: > > /usr/bin/python -> python2 > python on centos8 should be a link to python3 > infact ideally python 2 should not be installed at all. > > /usr/bin/pip -> python2 install > > /usr/bin/pip3 -> python3 install > > /usr/bin/virtualenv -> creates python2 environment by default; > > python3 virtualenv package in sync > > > > This means a number of things; hosts always have python2, > why would we want this. ideally we should try to ensure that > there is non python 2 on the system so that we can ensure we are not > using > it bay mistake and can do an entirly python3 only install on centos8 > ---- > later: i realised read later that you are descibing how thigns work on > other images > here. See below but being specific about the version of python we want is one way to help ensure we test with the correct python. Also, some of our platforms don't have python3 (so we will continue to install python2 there). > > and because > > we overwrite the system pip/virtualenv/setuptools we put these > > packages on "hold" (depending on the package manager) so jobs don't > > re-install them and create (even more of) a mess. > > > > This made sense in the past, when we had versions of pip/setuptools in > > distributions that couldn't understand syntax in requirements files > > (and other bugs) and didn't have the current fantastic job inheritance > > and modularity that Zuul (v3) provides. However, it also introduces a > > range of problems for various users, and has a high maintenance > > overhead. > > > > Thus these new images are, by default, python3 only, and have upstream > > pip and virtualenv packages installed. You will have a default > > situation: > > > > /usr/bin/python -> not provided > > /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip" > > /usr/bin/virtualenv -> create python3 environment; > > provided by upstream python3-virtualenv package > > /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv), > > use /usr/bin/pyvenv-3 or "python3 -m venv". 
> > > oh i see you were descirbinbg the standard behavior of our other envs before > this is closer to what i was expecting alther i would personally prefer to have > /usr/bin/python -> /user/bin/python3 > > linux distros seem to be a bit split on this > i belive arch and maybe debian (i saw on of the other majory disto > families adopt the same apparch) link > python to python3 > > the redhat family of operating systems do not provide python any more > an leave it to the user to either use > only the versions specific pythons or user update-alternitives to set > there default python This is actually one of the recommendations from PEP 394, https://www.python.org/dev/peps/pep-0394/#for-python-runtime-distributors. For our purposes I think it works well. Our jobs should be explicit about which version of python they use so there is no ambiguity in testing, but if jobs want to set up the alias they can opt into doing so. In general though I expect we'll stick to the various distro expectations for each distro as we build images for them. This avoids confusion when people discover `python` is something other than what you get if you download the image from the cloud provider. For this reason I don't think we should alias `python` to `python3` on CentOS8. > > Ergo, the "standard behaviour" is not so standard any more. > > > > I would suggest if you wish to write somewhat portable jobs/roles > > etc., you do the following: > > > > * in general don't rely on "unversioned" calls of tools at all > > (python/pip/virtualenv) -- they can all mean different things on > > different platforms. > > * scripts should always be #!/usr/bin/python3 > see i think that is an anti pattern they could be but i think > /usr/bin/python should map to /usr/bin/python3 and you should assume > that it now python3. if you dont do that hen > every script that has ever been > writtne or packaged needs to be updated to reference python3 > explictly. Or simply execute it with the interpreter you need: python3 /usr/local/bin/pbr freeze Is a common invocation for me. > there were much fewer user of python1 when that tansition happened > but > python became a link to the default systme python and eventully > pointed to python2 > i think we should continue to do that and after a decase of > deprecating python2 we > should reclaim the python symlink and point it to python3 > > * use "python3 -m venv" for virtual environments (if you really need > > "virtualenv" because of one of the features it provides, use "-m > > virtualenv") > > * use "python3 -m pip" to install global pip packages; but try not > > too -- mixing packages and pip installs never works that well. > well from a devstack point of view we almost exclucive install form > pip so installing python packages form the disto is the anti pattern > not installing form pip. that said we shoudl consider moving devstack to use > --user at somepoint. I actually resurrected the install into virtualenv idea when pip 10 (I think that was the version) happened as it refused to uninstall distutils installed packages. It is mostly doable though there are a few corner cases that kept it from happening. > > * if you need python2 for some reason, use a bindep file+role > > to install it (don't assume it is there) > +1 also dont assmue python will be python > > > > --- > > > > For any Zuul admins, note that to use python3-only images similar to > > what we make, you'll need to set "python-path" to python3 in nodepool > > so that Ansible calls the correct remote binary. 
Keep an eye on [1] > > which will automate this for Ansible >=2.8 after things are merged and > > released. > > > > --- > > > > Most of the job setup has been tested (network configs, setting > > mirrors, adding swap etc.) but there's always a chance of issues with > > a new platform. Please bring up any issues in #openstack-infra and > > we'll be sure to get them fixed. > > > > --- > > > > If you're interested in the images, they are exported at > > > > https://nb01.openstack.org/images/ > > > > although they are rather large, because we pre-cache a lot. If you'd > > like to build your own, [2] might help with: > > > > DISTRO=centos-minimal > > DIB_RELEASE=8 > > > > --- > > > > Thanks, > thanks for al the work on this. > > > > -i > > > > [1] https://review.opendev.org/#/c/682797/ > > [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh > > > > > > > From fungi at yuggoth.org Wed Oct 16 02:43:40 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 16 Oct 2019 02:43:40 +0000 Subject: CentOS 8 nodes available now In-Reply-To: References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: <20191016024339.s7s24wpcprra7f3x@yuggoth.org> On 2019-10-16 01:35:48 +0100 (+0100), Sean Mooney wrote: [...] > i would personally prefer to have /usr/bin/python -> > /user/bin/python3 > > linux distros seem to be a bit split on this i belive arch and > maybe debian (i saw on of the other majory disto families adopt > the same apparch) link python to python3 [...] Debian definitely does not. The current plan for when Debian stops shipping Python 2.7 is that it will have no /usr/bin/python installed. The unversioned /usr/bin/python is and has long been an interpreter for the Python 2 language. Python 3 is a different language, and its interpreter should not by default assume the command name of the Python 2 interpreter. I think Arch Linux made a huge mistake in pretending they were the same thing, and sincerely hope no other distribution does the same. > i think /usr/bin/python should map to /usr/bin/python3 and you > should assume that it now python3. I think that's a disaster waiting to happen. > if you dont do that hen every script that has ever been writtne or > packaged needs to be updated to reference python3 explictly. Yep, that has to happen anyway in most cases to address Python 2 vs 3 language compatibility differences. Being explicit about which language a script is written in is a good thing. > there were much fewer user of python1 when that tansition happened > but python became a link to the default systme python and > eventully pointed to python2 i think we should continue to do that > and after a decase of deprecating python2 we should reclaim the > python symlink and point it to python3 [...] The language did not change in significantly backward-incompatible ways with 2.0. On the other hand "Python 3000" (3.0) was essentially meant as a redesign of the language where backward-incompatibility was a tool to abandon broken paradigms. It's possible to write software which will run under both interpreters (and we have in fact, rather a lot even), but random scripts written for Python 2 without concerns with forward-compatibility usually won't work on a Python 3 interpreter. > well from a devstack point of view we almost exclucive install > form pip so installing python packages form the disto is the anti > pattern not installing form pip. that said we shoudl consider > moving devstack to use --user at somepoint. [...] 
It's hard not to call https://review.openstack.org/562884 an anti-pattern. The pip maintainers these days basically don't want to have to continue supporting system-context installs, as responses on https://github.com/pypa/pip/issues/4805 clearly demonstrate. DevStack's been working around that for a year and a half now, as have our image builds (until Ian's efforts to stop doing that for the centos-8 images). Yes doing --user or venv installs is likely the core of the solution for DevStack but it needs more folks actually working to make it happen, and the ugly hack has been in place for so long I have doubts we'll see a major overhaul like that any time soon. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emiller at genesishosting.com Wed Oct 16 04:50:19 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Tue, 15 Oct 2019 23:50:19 -0500 Subject: [Octavia] Amphora build issues Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> Hi, It seems that every build I have attempted of an amphora fails in some way. I have tried CentOS 7, Ubuntu Bionic, Xenial, and Trusty. Note that we are running Stein. I will concentrate on Ubuntu issues for now. I first create a fresh VM that is used to install the diskimage-create tool, then run (after sudo'ing to root): apt update apt -y upgrade apt-get -y install qemu qemu-system-common uuid-runtime curl kpartx git jq python-pip debootstrap libguestfs-tools pip install 'networkx==2.2' pip install argparse Babel dib-utils PyYAML git clone -b stable/stein https://github.com/openstack/octavia.git git clone https://git.openstack.org/openstack/diskimage-builder.git cd diskimage-builder pip install -r requirements.txt cd ../octavia/diskimage-create/ pip install -r requirements.txt # And finally, I run the diskimage-create script, specifying the image's OS, so ONE of these, depending on the OS: ./diskimage-create.sh -d bionic # or to use Xenial: ./diskimage-create.sh -d xenial # Note that when selecting Trusty, diskimage-create.sh error's, and so never finishes successfully. # Somewhat expected since it is quite old and unsupported. ./diskimage-create.sh -d trusty The amphorae launch when creating a load balancer, but the amphora agent fails to start, and thus is not responsive on TCP Port 9443. The log from inside the amphora is below. Has anyone successfully created an image? Am I missing something? Thanks! Eric Amphora agent fails to start inside amphora - this is logged when running the agent from the command line: 2019-10-16 03:41:04.835 1119 INFO octavia.common.config [-] /usr/local/bin/amphora-agent version 5.1.0.dev20 2019-10-16 03:41:04.835 1119 DEBUG octavia.common.config [-] command line: /usr/local/bin/amphora-agent --config-file /etc/octavia/amphora-agent.conf setup_logging /opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/common/confi g.py:779 2019-10-16 03:41:05.036 1124 INFO octavia.amphorae.backends.health_daemon.health_daemon [-] Health Manager Sender starting. 
2019-10-16 03:41:05.084 1119 CRITICAL octavia [-] Unhandled error: FileNotFoundError: [Errno 2] No such file or directory 2019-10-16 03:41:05.084 1119 ERROR octavia Traceback (most recent call last): 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/local/bin/amphora-agent", line 8, in 2019-10-16 03:41:05.084 1119 ERROR octavia sys.exit(main()) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p y", line 89, in main 2019-10-16 03:41:05.084 1119 ERROR octavia AmphoraAgent(server_instance.app, options).run() 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/app/base.p y", line 72, in run 2019-10-16 03:41:05.084 1119 ERROR octavia Arbiter(self).run() 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/arbiter.py ", line 60, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self.setup(app) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/arbiter.py ", line 95, in setup 2019-10-16 03:41:05.084 1119 ERROR octavia self.log = self.cfg.logger_class(app.cfg) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 200, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self.setup(cfg) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 227, in setup 2019-10-16 03:41:05.084 1119 ERROR octavia self.error_log, cfg, self.syslog_fmt, "error" 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 449, in _set_syslog_handler 2019-10-16 03:41:05.084 1119 ERROR octavia facility=facility, socktype=socktype) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/lib/python3.5/logging/handlers.py", line 806, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self._connect_unixsocket(address) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/lib/python3.5/logging/handlers.py", line 823, in _connect_unixsocket 2019-10-16 03:41:05.084 1119 ERROR octavia self.socket.connect(address) 2019-10-16 03:41:05.084 1119 ERROR octavia FileNotFoundError: [Errno 2] No such file or directory -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalyst.net.nz Wed Oct 16 05:04:41 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Wed, 16 Oct 2019 18:04:41 +1300 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: Message-ID: I tried to get a community goal to do project deletion per project, but we ended up deciding that a community goal wasn't ideal unless we did build a bulk delete API in each service: https://review.opendev.org/#/c/639010/ https://etherpad.openstack.org/p/community-goal-project-deletion https://etherpad.openstack.org/p/DEN-Deletion-of-resources https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming What we decided on, but didn't get a chance to work on, was building into the OpenstackSDK OS-purge like functionality, as well as reporting functionality (of all project resources to be deleted). That way we could have per project per resource deletion logic, and all of that defined in the SDK. 
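(Roughly the shape of what is being described, sketched with today's
openstacksdk. This is only an illustration, not the API that was proposed:
it touches compute servers only, and "my-cloud" is a placeholder clouds.yaml
entry:

    import openstack

    # Scope the connection to the project whose resources are being reviewed.
    conn = openstack.connect(cloud="my-cloud")

    # Report phase: list what a purge would remove.
    servers = list(conn.compute.servers())
    for server in servers:
        print("would delete server %s (%s)" % (server.name, server.id))

    # Delete phase, commented out so the sketch stays read-only:
    # for server in servers:
    #     conn.compute.delete_server(server)

A real implementation would loop over every service proxy the SDK exposes
and report before deleting, which is the per-project, per-resource logic
mentioned above.)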
I was up for doing some of the work, but ended up swamped with internal work and just didn't drive or push for the deletion work upstream. If you want to do something useful, don't pursue OS-Purge, help us add that official functionality to the SDK, and then we can push for bulk deletion APIs in each project to make resource deletion more pleasant. I'd be happy to help with the work, and Monty on the SDK team will most likely be happy to as well. :) Cheers, Adrian On 1/10/19 11:48 am, Adam Harwell wrote: > I haven't seen much activity on this project in a while, and it's been > moved to opendev/x since the opendev migration... Who is the current > owner of this project? Is there anyone who actually is maintaining it, > or would mind if others wanted to adopt the project to move it forward? > > Thanks, >    --Adam Harwell From yu.chengde at 99cloud.net Wed Oct 16 10:59:58 2019 From: yu.chengde at 99cloud.net (yu.chengde at 99cloud.net) Date: Wed, 16 Oct 2019 18:59:58 +0800 Subject: [nova] Which nova container service that nova/conf/compute.py map to Message-ID: Hi, I have deployed a stein version openstack on server thought Kolla-ansible method. Then, I git clone the nova code, and ready to do coding in " nova/nova/conf/compute.py" However, many of nova containers include this file. So, I want to know that I should modify them all, or just pick a specific one. Thanks [root at chantyu kolla-ansible]# docker ps | grep nova 05f72e539974 kolla/centos-source-nova-compute:stein "dumb-init --single-…" 28 hours ago Up 2 hours nova_compute 7393a7d566ee kolla/centos-source-nova-libvirt:stein "dumb-init --single-…" 28 hours ago Up 5 hours nova_libvirt 9d8357cfa334 kolla/centos-source-nova-scheduler:stein "dumb-init --single-…" 32 hours ago Up 3 hours nova_scheduler 085b9da918df kolla/centos-source-nova-api:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_api b80e9503e93e kolla/centos-source-nova-serialproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_serialproxy c15d41823a22 kolla/centos-source-nova-novncproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_novncproxy c30e47cd56c6 kolla/centos-source-nova-consoleauth:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_consoleauth b7d5e9ba1f11 kolla/centos-source-nova-ssh:stein "dumb-init --single-…" 7 days ago Up 5 hours nova_ssh 3f81cd0a97ce kolla/centos-source-nova-conductor:stein "dumb-init --single-…" 7 days ago Up 3 hours nova_conductor [root at chantyu kolla-ansible]# From smooney at redhat.com Wed Oct 16 11:05:20 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 16 Oct 2019 12:05:20 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <20191016024339.s7s24wpcprra7f3x@yuggoth.org> References: <20191015230316.GA29186@fedora19.localdomain> <20191016024339.s7s24wpcprra7f3x@yuggoth.org> Message-ID: <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> TL;DR ok the way the image will work makes sense :) all i really care about is /usr/bin/python should not be a symlink to /usr/bin/python2 on python3 "only" distors to ensure we dont execute code under python 2 by mistake. On Wed, 2019-10-16 at 02:43 +0000, Jeremy Stanley wrote: > On 2019-10-16 01:35:48 +0100 (+0100), Sean Mooney wrote: > [...] > > i would personally prefer to have /usr/bin/python -> > > /user/bin/python3 > > > > linux distros seem to be a bit split on this i belive arch and > > maybe debian (i saw on of the other majory disto families adopt > > the same apparch) link python to python3 > > [...] > > Debian definitely does not. 
> The current plan for when Debian stops shipping Python 2.7 is that it
> will have no /usr/bin/python installed. The unversioned /usr/bin/python
> is and has long been an interpreter for the Python 2 language. Python 3
> is a different language, and its interpreter should not by default
> assume the command name of the Python 2 interpreter. I think Arch Linux
> made a huge mistake in pretending they were the same thing, and
> sincerely hope no other distribution does the same.
perhaps, though i would argue that the name of the python2 interpreter was
always python2 or python2.7, not python. python was the name of the system
python; the fact that they happened to be the same thing was a historical
accident for the last decade. but if i extend your argument, then we never
should have had #!/usr/bin/python at all as a script entry point, which in
hindsight may have been correct.
https://www.python.org/dev/peps/pep-0394/#for-python-runtime-distributors
allows effectively all of the possible options, so there is no wrong
answer, just different tradeoffs.
>
> > i think /usr/bin/python should map to /usr/bin/python3 and you
> > should assume that it now python3.
>
> I think that's a disaster waiting to happen.
perhaps. the only thing i hope we really avoid going forward is a python
that maps to python2 and silently allows things to work when we think we
are running python 3 only. e.g. i prefer scripts that can run under python3
silently doing so rather than silently running on python2. if, on python3
systems, distros ensure that python does not map to python2 even when
python 2 is installed, unless the system admin explicitly sets up the
symlink, i guess that achieves the same goal. it seems that is the path
debian and rhel are taking, e.g. don't provide "python" via packages so
that it is only created if the system admin creates it.
>
> > if you dont do that hen every script that has ever been writtne or
> > packaged needs to be updated to reference python3 explictly.
>
> Yep, that has to happen anyway in most cases to address Python 2 vs
> 3 language compatibility differences. Being explicit about which
> language a script is written in is a good thing.
i guess, but it feels kind of sad to say that forever we will have to type
python3 instead of python just because legacy scripts could break. i would
prefer them to break and have the convenience. this is partly because most
python2 i have encountered has been valid python3, but i know that was not
the case for a lot of scripts.
>
> > there were much fewer user of python1 when that tansition happened
> > but python became a link to the default systme python and
> > eventully pointed to python2 i think we should continue to do that
> > and after a decase of deprecating python2 we should reclaim the
> > python symlink and point it to python3
> [...]
>
> The language did not change in significantly backward-incompatible
> ways with 2.0. On the other hand "Python 3000" (3.0) was essentially
> meant as a redesign of the language where backward-incompatibility
> was a tool to abandon broken paradigms. It's possible to write
> software which will run under both interpreters (and we have in
> fact, rather a lot even), but random scripts written for Python 2
> without concerns with forward-compatibility usually won't work on a
> Python 3 interpreter.
>
> > well from a devstack point of view we almost exclucive install
> > form pip so installing python packages form the disto is the anti
> > pattern not installing form pip.
that said we shoudl consider > > moving devstack to use --user at somepoint. > > [...] > > It's hard not to call https://review.openstack.org/562884 an > anti-pattern. i ment when using devstack prefering distro pacagke over pip would be an anti patteren as part of the reason we install form pip in the first place is to normalise the install so that it is contolled and as similar as possible beteen distors. the fact we have to do that in devstack ya does feel like a hack i was not aware we did that. > The pip maintainers these days basically don't want to > have to continue supporting system-context installs, as responses on > https://github.com/pypa/pip/issues/4805 clearly demonstrate. yep i was just trying to suggest that we shoudl avoid installing the disto version and basicaly only install the interpreter form the distro to avoid mixing packages as much as possible. > DevStack's been working around that for a year and a half now, as > have our image builds (until Ian's efforts to stop doing that for > the centos-8 images). Yes doing --user or venv installs is likely > the core of the solution for DevStack i do think its worth revisiting devstack venv install capablity at some point i know its there but have never really used it but hte fact that devstack installs but the requirement.txt and test-requriement.txt gloablly has caused issue in the past. i tried to remove install the test-requirement.txt in the past but some jobs depend on that to install optional packages so we cant. > but it needs more folks > actually working to make it happen, and the ugly hack has been in > place for so long I have doubts we'll see a major overhaul like that > any time soon. well the main thing that motivated me to even comment on this thread was the fact we currently have a hack that with lib_from_git where if you enable python 3 i will install the lib under python 2 and python3. the problem is the interperter line at the top of the entry point will be replaced with the python2 version due to the order of installs. so if you use libs_form _git with nova or with a lib that provides a setup tools console script entrypoint you can get into a situation where your python 3 only build can end up trying to run "python2" scripts. this has lead to some interesting errors to debug in the past. anyway i was looking forward to having a python3 only disto to not have to deal with that in the future with it looks like From radoslaw.piliszek at gmail.com Wed Oct 16 11:50:52 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 16 Oct 2019 13:50:52 +0200 Subject: [nova] Which nova container service that nova/conf/compute.py map to In-Reply-To: References: Message-ID: Hi Yu, you want to read: https://docs.openstack.org/kolla-ansible/latest/contributor/kolla-for-openstack-development.html In your case you should set: nova_dev_mode: yes in globals.yml Kind regards, Radek śr., 16 paź 2019 o 13:10 yu.chengde at 99cloud.net napisał(a): > Hi, > I have deployed a stein version openstack on server thought > Kolla-ansible method. > Then, I git clone the nova code, and ready to do coding in " > nova/nova/conf/compute.py" > However, many of nova containers include this file. > So, I want to know that I should modify them all, or just pick a > specific one. 
> Thanks > > > [root at chantyu kolla-ansible]# docker ps | grep nova > 05f72e539974 kolla/centos-source-nova-compute:stein > "dumb-init --single-…" 28 hours ago Up 2 hours > nova_compute > 7393a7d566ee kolla/centos-source-nova-libvirt:stein > "dumb-init --single-…" 28 hours ago Up 5 hours > nova_libvirt > 9d8357cfa334 kolla/centos-source-nova-scheduler:stein > "dumb-init --single-…" 32 hours ago Up 3 hours > nova_scheduler > 085b9da918df kolla/centos-source-nova-api:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_api > b80e9503e93e kolla/centos-source-nova-serialproxy:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_serialproxy > c15d41823a22 kolla/centos-source-nova-novncproxy:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_novncproxy > c30e47cd56c6 kolla/centos-source-nova-consoleauth:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_consoleauth > b7d5e9ba1f11 kolla/centos-source-nova-ssh:stein > "dumb-init --single-…" 7 days ago Up 5 hours > nova_ssh > 3f81cd0a97ce kolla/centos-source-nova-conductor:stein > "dumb-init --single-…" 7 days ago Up 3 hours > nova_conductor > [root at chantyu kolla-ansible]# > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed Oct 16 14:22:30 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 16 Oct 2019 09:22:30 -0500 Subject: OpenStack Train is officially released! Message-ID: <20191016142230.GB13004@sm-workstation> The official OpenStack Train release announcement has been sent out: http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html Thanks to all who were part of making the Train series a success! This marks the official opening of the releases repo for Ussuri, and freezes are now lifted. Train is now a full stable branch. Thanks! Sean From juliaashleykreger at gmail.com Wed Oct 16 14:26:33 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 16 Oct 2019 07:26:33 -0700 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: I'm okay if we just change driver-requirements.txt at this point and go ahead and cut new release for ironic. I actually feel like we should have bumped driver-requirements.txt after releasing sushy 2.0.0 anyway. The bottom line is we need to focus on the user experience of using the software. For ironic, if a vendor's class of gear just doesn't work with a possible combination, then we should try and take the least resistance and greatest impact path to remedying that situation. As for the "prevents a prior release" portion of policy, That is likely written for the projects that perform release candidates and not projects that do not. At least, that is my current feeling. That seems super counter-intuitive for ironic's release model if there is a major bug that is identified that needs to be fixed in the software we have shipped. -Julia On Tue, Oct 15, 2019 at 6:21 PM wrote: > > Hi, > > The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. 
Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. > > A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. > > Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? > > Thank you, > Rick > > > [1] https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > [2] https://review.opendev.org/#/c/666253/ > [3] https://review.opendev.org/#/c/668936/ > [4] https://review.opendev.org/#/c/669889/ > [5] https://review.opendev.org/#/c/688551/ > [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > From openstack at fried.cc Wed Oct 16 14:59:51 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 16 Oct 2019 09:59:51 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: <3aceecad-626b-99de-3ba5-512b178a941a@fried.cc> Update: > the nova-specs patch introducing > the "Core Liaison" concept [1]. This is merged (it's now called "Feature Liaison"). Here's the new spec template section [2] and the FAQ [3]. Thanks to those who helped shape it. > (A) Note that the idea of capping the number of specs is (mostly) > unrelated, and we still haven't closed on it. I feel like we've agreed > to have a targeted discussion around spec freeze time where we decide > whether to defer features for resource reasons. That would be a new (and > good, IMO) thing. But it's still TBD whether "30 approved for 25 > completed" will apply, and/or what criteria would be used to decide what > gets cut. Nothing new here. 
efried > [1] https://review.opendev.org/#/c/685857 [2] http://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/ussuri-template.html#feature-liaison [3] http://specs.openstack.org/openstack/nova-specs/readme.html#feature-liaison-faq From Arkady.Kanevsky at dell.com Wed Oct 16 15:14:18 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 16 Oct 2019 15:14:18 +0000 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: <9a68c819b4f34bff8bba7eeb2e862180@AUSX13MPS308.AMER.DELL.COM> Julia, I am for it also. But with Train just released, from Sean email, how does it get into Train? -----Original Message----- From: Julia Kreger Sent: Wednesday, October 16, 2019 9:27 AM To: Pioso, Richard Cc: openstack-discuss Subject: Re: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt [EXTERNAL EMAIL] I'm okay if we just change driver-requirements.txt at this point and go ahead and cut new release for ironic. I actually feel like we should have bumped driver-requirements.txt after releasing sushy 2.0.0 anyway. The bottom line is we need to focus on the user experience of using the software. For ironic, if a vendor's class of gear just doesn't work with a possible combination, then we should try and take the least resistance and greatest impact path to remedying that situation. As for the "prevents a prior release" portion of policy, That is likely written for the projects that perform release candidates and not projects that do not. At least, that is my current feeling. That seems super counter-intuitive for ironic's release model if there is a major bug that is identified that needs to be fixed in the software we have shipped. -Julia On Tue, Oct 15, 2019 at 6:21 PM wrote: > > Hi, > > The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. > > A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. 
However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. > > Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? > > Thank you, > Rick > > > [1] > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4 > a507e9a8b3a19e8a58/driver-requirements.txt#L14 > [2] https://review.opendev.org/#/c/666253/ > [3] https://review.opendev.org/#/c/668936/ > [4] https://review.opendev.org/#/c/669889/ > [5] https://review.opendev.org/#/c/688551/ > [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > From jim at jimrollenhagen.com Wed Oct 16 15:14:12 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Wed, 16 Oct 2019 11:14:12 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: On Wed, Oct 16, 2019 at 10:27 AM Julia Kreger wrote: > I'm okay if we just change driver-requirements.txt at this point and > go ahead and cut new release for ironic. I actually feel like we > should have bumped driver-requirements.txt after releasing sushy 2.0.0 > anyway. > Yeah, I think I agree here. The options I see are: * Ironic train depends on 2.0.0. This breaks stable policy. * We release 1.10.0 and depend on that. This breaks stable and release team policy. * We keep our dependencies the same and document that 1.9.0 is broken. This breaks no policy. The last option follows the letter of the law best, but doesn't actually help our users. If they need to use 2.0.0 to have a working system anyway, then the effect on users is the same as the first option, but in a backwards way. Let's just bring ironic to 2.0.0 and fix any breakage that comes with it. // jim > The bottom line is we need to focus on the user experience of using > the software. For ironic, if a vendor's class of gear just doesn't > work with a possible combination, then we should try and take the > least resistance and greatest impact path to remedying that situation. > > As for the "prevents a prior release" portion of policy, That is > likely written for the projects that perform release candidates and > not projects that do not. At least, that is my current feeling. That > seems super counter-intuitive for ironic's release model if there is a > major bug that is identified that needs to be fixed in the software we > have shipped. > > -Julia > > On Tue, Oct 15, 2019 at 6:21 PM wrote: > > > > Hi, > > > > The Ironic Train release can be broken due to an entry in its > driver-requirements.txt. driver-requirements.txt defines a dependency on > the sushy package [1] which can be satisfied by version 1.9.0. > Unfortunately, that version contains a few bugs which prevent Ironic from > being able to manage Dell EMC and perhaps other vendors' bare metal > hardware with its Redfish hardware type (driver). The fixes to them > [2][3][4] were merged into master before the creation of stable/train. > Therefore, they are available on stable/train and in the last sushy release > created during the Train cycle, 2.0.0, the only other version which can > satisfy the dependency today. 
However, consumers -- packagers, operators, > and users -- could, fighting time constraints or lacking solid visibility > into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the > dependency, but, in so doing, unknowingly render the package or > installation severely broken. > > > > A change [5] has been proposed as part of a prospective solution to this > issue. It creates a new release of sushy from the change which fixes the > first bug [2]. Review comments [6] discuss basing the new release on a more > recent stable/train change to pick up other bug fixes and, less > importantly, backward compatible feature modifications and enhancements > which merged before the change from which 2.0.0 was created. Backward > compatible feature modifications and enhancements are interspersed in time > among the bug fixes. Once a new release is available, the sushy entry in > driver-requirements.txt on stable/train would be updated. However, > apparently, the stable branch policy prevents releases from being done at a > point earlier than the last release within a given cycle [6], which was > 2.0.0. > > > > Another possible resolution which comes to mind is to change the > definition of the sushy dependency in driver-requirements.txt [1] from > "sushy>=1.9.0" to "sushy>=2.0.0". > > > > Does anyone have a suggestion on how to proceed? > > > > Thank you, > > Rick > > > > > > [1] > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > > [2] https://review.opendev.org/#/c/666253/ > > [3] https://review.opendev.org/#/c/668936/ > > [4] https://review.opendev.org/#/c/669889/ > > [5] https://review.opendev.org/#/c/688551/ > > [6] > https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Wed Oct 16 17:44:31 2019 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Wed, 16 Oct 2019 17:44:31 +0000 Subject: [stable][EM] Extended Maintenance - Queens Message-ID: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> Hi, As it was agreed during PTG, the planned date of Extended Maintenance transition of Queens is around two weeks after Train release (a less busy period) [1]. Now that Train is released, it is a good opportunity for teams to go through the list of open and unreleased changes in Queens [2] and schedule a final release for Queens if needed. Feel free to use / edit / modify the lists (I've generated the lists for repositories which have 'follows-policy' tag). I hope this helps. [1] https://releases.openstack.org/ [2] https://etherpad.openstack.org/p/queens-final-release-before-em Thanks, Előd From kennelson11 at gmail.com Wed Oct 16 19:02:45 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 16 Oct 2019 12:02:45 -0700 Subject: [PTL] PTG Team Photos In-Reply-To: References: Message-ID: Wanted to bring this to the top of people's inboxes as a reminder :) Definitely not required, but we have lots of slots left if your team is interested! -Kendall (diablo_rojo) On Wed, Oct 9, 2019 at 11:06 AM Kendall Nelson wrote: > Hello Everyone! > > We are excited to see you in a few weeks at the PTG and wanted to share > that we will be taking team photos again! > > Here is an ethercalc signup for the available time slots [1]. We will be > providing time on Thursday Morning/Afternoon and Friday morning to come as > a team to get your photo taken. 
Slots are only ten minutes so its *important > that everyone be on time*! > > The location is TBD at this point, but it will likely be in the > prefunction space near registration. > > Thanks, > > -Kendall Nelson (diablo_rojo) > > [1] https://ethercalc.openstack.org/lnupu1sx6ljl > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Wed Oct 16 18:51:46 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 16 Oct 2019 11:51:46 -0700 Subject: OpenStack Train is officially released! In-Reply-To: <20191016142230.GB13004@sm-workstation> References: <20191016142230.GB13004@sm-workstation> Message-ID: Woohoo! Onward to Ussuri! [image: image.png] -Kendall (diablo_rojo) On Wed, Oct 16, 2019 at 7:23 AM Sean McGinnis wrote: > The official OpenStack Train release announcement has been sent out: > > > http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html > > Thanks to all who were part of making the Train series a success! > > This marks the official opening of the releases repo for Ussuri, and > freezes > are now lifted. Train is now a full stable branch. > > Thanks! > Sean > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 173430 bytes Desc: not available URL: From fungi at yuggoth.org Wed Oct 16 19:09:55 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 16 Oct 2019 19:09:55 +0000 Subject: [tc] Feedback on Airship pilot project Message-ID: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> Hi TC members, The Airship project will start its confirmation process with the OSF Board of Directors at the Board meeting[1] Tuesday next week. A draft of the slide deck[2] they plan to present is available for reference. Per the confirmation guidelines[3], the OSF Board of directors will take into account the feedback from representative bodies of existing confirmed Open Infrastructure Projects (OpenStack, Zuul and Kata) when evaluating Airship for confirmation. Particularly worth calling out, guideline #4 "Open collaboration" asserts the following: Project behaves as a good neighbor to other confirmed and pilot projects. If you (our community at large, not just TC members) have any observations/interactions with the Airship project which could serve as useful examples for how these projects do or do not meet this and other guidelines, please provide them on the etherpad[4] ASAP. If possible, include a citation with links to substantiate your feedback. If a TC representative can assemble this feedback and send it to the Board (for example, to the foundation mailing list) for consideration before the meeting next week, that would be appreciated. Apologies for the short notice. [1] http://lists.openstack.org/pipermail/foundation/2019-October/002800.html [2] https://www.airshipit.org/collateral/AirshipConfirmation-Review-for-the-OSF-Board.pdf [3] https://wiki.openstack.org/wiki/Governance/Foundation/OSFProjectConfirmationGuidelines [4] https://etherpad.openstack.org/p/openstack-tc-airship-confirmation-feedback -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From zigo at debian.org Wed Oct 16 19:33:02 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 16 Oct 2019 21:33:02 +0200 Subject: Debian OpenStack Train packages are officially released! [was: OpenStack Train is officially released!] In-Reply-To: <20191016142230.GB13004@sm-workstation> References: <20191016142230.GB13004@sm-workstation> Message-ID: On 10/16/19 4:22 PM, Sean McGinnis wrote: > The official OpenStack Train release announcement has been sent out: > > http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html > > Thanks to all who were part of making the Train series a success! > > This marks the official opening of the releases repo for Ussuri, and freezes > are now lifted. Train is now a full stable branch. > > Thanks! > Sean Same, thanks everyone! Train packages for Debian have all been uploaded today, either to Sid when it was Horizon and its plugins, or to Experimental for everything else. A subsequent upload to Debian Sid will follow, but will take some time. For those willing to use Train on Buster, the usual repository scheme applies: deb http://buster-train.debian.net/debian buster-train-backports main deb-src http://buster-train.debian.net/debian buster-train-backports main deb http://buster-train.debian.net/debian buster-train-backports-nochange main deb-src http://buster-train.debian.net/debian buster-train-backports-nochange main Please report back any issue you see through the Debian bug tracker as usual. I still haven't had time to run tempest on this, but I could install Train in Buster and it worked. Cheers, Thomas Goirand (zigo) From flux.adam at gmail.com Wed Oct 16 19:48:02 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Wed, 16 Oct 2019 12:48:02 -0700 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: Message-ID: That's interesting -- we have already started working to add features and improve ospurge, and it seems like a plenty useful tool for our needs, but I think I agree that it would be nice to have that functionality built into the sdk. I might be able to help with both, since one is immediately useful and we (like everyone) have deadlines to meet, and the other makes sense to me as a possible future direction that could be more widely supported. Will you or someone else be hosting and discussion about this at the Shanghai summit? I'll be there and would be happy to join and discuss. --Adam On Tue, Oct 15, 2019, 22:04 Adrian Turjak wrote: > I tried to get a community goal to do project deletion per project, but > we ended up deciding that a community goal wasn't ideal unless we did > build a bulk delete API in each service: > https://review.opendev.org/#/c/639010/ > https://etherpad.openstack.org/p/community-goal-project-deletion > https://etherpad.openstack.org/p/DEN-Deletion-of-resources > https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming > > What we decided on, but didn't get a chance to work on, was building > into the OpenstackSDK OS-purge like functionality, as well as reporting > functionality (of all project resources to be deleted). That way we > could have per project per resource deletion logic, and all of that > defined in the SDK. > > I was up for doing some of the work, but ended up swamped with internal > work and just didn't drive or push for the deletion work upstream. 
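(Until that lands in the SDK, the "reporting" half of what is described above can be approximated from the command line; a rough sketch only, assuming admin credentials and a project called "demo":

    openstack server list --project demo -f value -c ID
    openstack volume list --project demo -f value -c ID
    openstack floating ip list --project demo -f value -c ID

An SDK implementation would do the same walk per service, plus deletion ordered to respect dependencies between resources.)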
> > If you want to do something useful, don't pursue OS-Purge, help us add > that official functionality to the SDK, and then we can push for bulk > deletion APIs in each project to make resource deletion more pleasant. > > I'd be happy to help with the work, and Monty on the SDK team will most > likely be happy to as well. :) > > Cheers, > Adrian > > On 1/10/19 11:48 am, Adam Harwell wrote: > > I haven't seen much activity on this project in a while, and it's been > > moved to opendev/x since the opendev migration... Who is the current > > owner of this project? Is there anyone who actually is maintaining it, > > or would mind if others wanted to adopt the project to move it forward? > > > > Thanks, > > --Adam Harwell > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Wed Oct 16 20:05:45 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 15:05:45 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Just an update on this. It appears that the diskimage-create script is pulling the master version of the Octavia amphora agent, instead of the Stein branch. I took a closer look at the first error line: 2019-10-16 19:47:46.389 1160 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p y", line 89, in main 2019-10-16 19:47:46.389 1160 ERROR octavia AmphoraAgent(server_instance.app, options).run() and it references line 89, which doesn't exist in agent.py except in the master branch. I had cloned the Stein branch of Octavia here: git clone -b stable/stein https://github.com/openstack/octavia.git I will keep looking... Eric From johnsomor at gmail.com Wed Oct 16 20:50:50 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 16 Oct 2019 13:50:50 -0700 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, You are correct, diskimage-builder defaults to pulling the master version of the amphora agent. You want to set the following variables: export DIB_REPOREF_amphora_agent=stable/stein Then run the diskimage-create script. See the guide for more information: https://docs.openstack.org/octavia/latest/admin/amphora-image-build.html#environment-variables Michael On Wed, Oct 16, 2019 at 1:09 PM Eric K. Miller wrote: > > Just an update on this. > > It appears that the diskimage-create script is pulling the master > version of the Octavia amphora agent, instead of the Stein branch. > > I took a closer look at the first error line: > > 2019-10-16 19:47:46.389 1160 ERROR octavia File > "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p > y", line 89, in main > 2019-10-16 19:47:46.389 1160 ERROR octavia > AmphoraAgent(server_instance.app, options).run() > > and it references line 89, which doesn't exist in agent.py except in the > master branch. > > I had cloned the Stein branch of Octavia here: > git clone -b stable/stein https://github.com/openstack/octavia.git > > I will keep looking... 
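(Putting the two messages above together, a Stein-pinned amphora image build looks roughly like this; the script name and layout are from the octavia repo's diskimage-create directory, other options are per the image build guide linked above:

    git clone -b stable/stein https://github.com/openstack/octavia.git
    cd octavia/diskimage-create
    # make diskimage-builder pull the amphora agent from stable/stein
    # instead of master
    export DIB_REPOREF_amphora_agent=stable/stein
    ./diskimage-create.sh

The resulting image is then uploaded to Glance as usual.)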
> > Eric > > > > From emiller at genesishosting.com Wed Oct 16 21:32:25 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 16:32:25 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A04@gmsxchsvr01.thecreation.com> Thank you Michael! After doing this, a different problem occurs, which is logged in the octavia_worker.log. This also happened with all of my tests with CentOS amphorae. See the log snippet below. Note that creation of the load balancer fails with a status of ERROR and amphorae are deleted right after, so I wasn't able to login to the amphorae. Also note that this was deployed with Kolla Ansible 8.0.2, in case that helps. Eric 2019-10-16 16:27:49.399 23 DEBUG octavia.controller.worker.controller_worker [-] Task 'octavia.controller.worker.tasks.lifecycle_tasks.LoadBalancerIDToErrorOnRevertTask' (0af6f3fc-83d0-4093-b5e4-9cb955bf3397) transitioned into state 'REVERTING' from state 'SUCCESS' _task_receiver /var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/listeners/logging.py:194 2019-10-16 16:27:49.404 23 WARNING octavia.controller.worker.controller_worker [-] Task 'octavia.controller.worker.tasks.lifecycle_tasks.LoadBalancerIDToErrorOnRevertTask' (0af6f3fc-83d0-4093-b5e4-9cb955bf3397) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None' 2019-10-16 16:27:49.415 23 WARNING octavia.controller.worker.controller_worker [-] Flow 'octavia-create-loadbalancer-flow' (337640bf-0f7c-4c9e-b903-994f2c5827dd) transitioned into state 'REVERTED' from state 'RUNNING' 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server [-] Exception during message handling: WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/octavia/controller/queue/endpoint.py", line 45, in create_load_balancer 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server self.worker.create_load_balancer(load_balancer_id, flavor) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 292, in wrapped_f 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self.call(f, *args, **kw) 2019-10-16 16:27:49.416 23 ERROR 
oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 358, in call 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server do = self.iter(retry_state=retry_state) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 319, in iter 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return fut.result() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/concurrent/futures/_base.py", line 455, in result 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self.__get_result() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 361, in call 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server result = fn(*args, **kwargs) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/octavia/controller/worker/controller_worker.py", line 343, in create_load_balancer 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server create_lb_tf.run() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 247, in run 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server for _state in self.run_iter(timeout=timeout): 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 340, in run_iter 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server failure.Failure.reraise_if_any(er_failures) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/types/failure.py", line 341, in reraise_if_any 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server raise exc.WrappedFailure(failures) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server ~ From emiller at genesishosting.com Wed Oct 16 21:43:16 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 16:43:16 -0500 Subject: [Octavia] Amphora build issues References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Looking at the error, it appears it can't find the exceptions.py script, but it appears to be in the octavia worker container: 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] (octavia-worker)[root at controller001 haproxy]# cd /var/lib/kolla/venv/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy (octavia-worker)[root at controller001 haproxy]# ls -al total 84 drwxr-xr-x. 2 root root 186 Sep 29 08:12 . drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc -rw-r--r--. 
1 root root 2298 Sep 29 08:12 exceptions.py -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc Eric From Arkady.Kanevsky at dell.com Wed Oct 16 21:52:52 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 16 Oct 2019 21:52:52 +0000 Subject: OpenStack Train is officially released! In-Reply-To: References: <20191016142230.GB13004@sm-workstation> Message-ID: <13b25b69a064487cb5b7f0ccabe23dc9@AUSX13MPS308.AMER.DELL.COM> Indeed! From: Kendall Nelson Sent: Wednesday, October 16, 2019 1:52 PM To: Sean McGinnis Cc: OpenStack Discuss Subject: Re: OpenStack Train is officially released! [EXTERNAL EMAIL] Woohoo! Onward to Ussuri! [image.png] -Kendall (diablo_rojo) On Wed, Oct 16, 2019 at 7:23 AM Sean McGinnis > wrote: The official OpenStack Train release announcement has been sent out: http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html Thanks to all who were part of making the Train series a success! This marks the official opening of the releases repo for Ussuri, and freezes are now lifted. Train is now a full stable branch. Thanks! Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 173430 bytes Desc: image001.png URL: From johnsomor at gmail.com Wed Oct 16 22:36:00 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 16 Oct 2019 15:36:00 -0700 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: This is caused by the controller version being older than the image version. Our upgrade strategy requires the control plane be updated before the image. A few lines above in the log you will see it is attempting to connect to /0.5 and not finding it. You have two options: 1. update the controllers to use the latest stable/stein verison 2. build a new image, but limit it to an older version of stein (such as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) We would recommend you run the latest minor release of stable/stein. (4.1.0 is the latest) https://releases.openstack.org/stein/index.html#octavia I'm not sure why kolla would install an old release. I'm not very familiar with it. Michael On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller wrote: > > Looking at the error, it appears it can't find the exceptions.py script, but it appears to be in the octavia worker container: > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] > > (octavia-worker)[root at controller001 haproxy]# cd /var/lib/kolla/venv/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy > (octavia-worker)[root at controller001 haproxy]# ls -al > total 84 > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > -rw-r--r--. 
1 root root 3400 Sep 29 08:12 data_models.py > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > Eric From emiller at genesishosting.com Wed Oct 16 22:39:37 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 17:39:37 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A08@gmsxchsvr01.thecreation.com> Thanks Michael! I will configure Kolla Ansible to pull the latest Stein release of the controller components and rebuild/install the containers and get back to you. Much appreciated for the assistance. Eric From emiller at genesishosting.com Thu Oct 17 05:14:58 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 17 Oct 2019 00:14:58 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. 
Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric From radoslaw.piliszek at gmail.com Thu Oct 17 06:14:41 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 17 Oct 2019 08:14:41 +0200 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): > Success! > > The current Kolla Ansible release simply has the 4.0.1 version specified > in kolla-build.conf, which can be easily updated to 4.1.0. > > So, I adjusted the kolla-build.conf file, re-built the Octavia containers, > deleted the containers from the controllers, re-deployed Octavia with Kolla > Ansible, and tested load balancer creation, and everything succeeded. > > Again, much appreciated for the assistance. > > Eric > > > -----Original Message----- > > From: Michael Johnson [mailto:johnsomor at gmail.com] > > Sent: Wednesday, October 16, 2019 5:36 PM > > To: Eric K. Miller > > Cc: openstack-discuss > > Subject: Re: [Octavia] Amphora build issues > > > > This is caused by the controller version being older than the image > > version. Our upgrade strategy requires the control plane be updated > > before the image. > > A few lines above in the log you will see it is attempting to connect > > to /0.5 and not finding it. > > > > You have two options: > > 1. update the controllers to use the latest stable/stein verison > > 2. 
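(In kolla terms, that recommendation is a one-line pin in the build configuration; sketched here with the section and key names kolla uses for source overrides, to be double-checked against the kolla release in use:

    # kolla-build.conf
    [octavia-base]
    type = url
    location = https://tarballs.openstack.org/octavia/octavia-4.1.0.tar.gz

followed by rebuilding the octavia images and redeploying, so the control plane is never older than the amphora image.)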
build a new image, but limit it to an older version of stein (such > > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > > > We would recommend you run the latest minor release of stable/stein. > > (4.1.0 is the latest) > > https://releases.openstack.org/stein/index.html#octavia > > > > I'm not sure why kolla would install an old release. I'm not very > > familiar with it. > > > > Michael > > > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > > wrote: > > > > > > Looking at the error, it appears it can't find the exceptions.py > script, but it > > appears to be in the octavia worker container: > > > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > > WrappedFailure: WrappedFailure: [Failure: > > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > > Found] > > > > > > (octavia-worker)[root at controller001 haproxy]# cd > > /var/lib/kolla/venv/lib/python2.7/site- > > packages/octavia/amphorae/drivers/haproxy > > > (octavia-worker)[root at controller001 haproxy]# ls -al > > > total 84 > > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindhugauri1 at gmail.com Thu Oct 17 06:07:10 2019 From: sindhugauri1 at gmail.com (Gauri Sindhu) Date: Thu, 17 Oct 2019 11:37:10 +0530 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh Message-ID: Hi all, As per the Rocky release notes , *cpu_util and *.rate meters are deprecated and will be removed in future release in favor of the Gnocchi rate calculation equivalent.* I have two doubts regarding this. Firstly, if the 'cpu_util' metric has been deprecated then why has it been used as an example in the documentation in the 'Using Alarms' section? I've attached an image of the same. Secondly, I'm using OpenStack Stein and want to use the cpu_util or its equivalent to create an alarm. If this metric is no longer available then what do I pass onto Aodh to create the alarm? There seems to be no documentation that to help me out with this. Additionally, even if Gnocchi rate calculation is to be used, how am I supposed to transfer the result to Aodh? I also cannot seem to find the Gnocchi documentation. Regards, Gauri Sindhu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cpu_util example in 'Using Alarms' section.PNG Type: image/png Size: 30607 bytes Desc: not available URL: From emiller at genesishosting.com Thu Oct 17 06:18:27 2019 From: emiller at genesishosting.com (Eric K. 
Miller) Date: Thu, 17 Oct 2019 01:18:27 -0500 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0C@gmsxchsvr01.thecreation.com> Hi Radek, In case this was useful, Kolla Ansible pulls the files (when using the "source" option) from: http://tarballs.openstack.org/octavia/ So, I just adjusted the version it pulled for Octavia. Eric From: Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] Sent: Thursday, October 17, 2019 1:15 AM To: Eric K. Miller Cc: Michael Johnson; openstack-discuss Subject: Re: [Octavia][Kolla] Amphora build issues Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 
6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Thu Oct 17 06:20:15 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 17 Oct 2019 01:20:15 -0500 Subject: [Octavia][Kolla] Amphora build issues References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> Correction - I meant "Kolla" (not Kolla Ansible) Eric From: Eric K. Miller Sent: Thursday, October 17, 2019 1:18 AM To: 'Radosław Piliszek' Cc: Michael Johnson; openstack-discuss Subject: RE: [Octavia][Kolla] Amphora build issues Hi Radek, In case this was useful, Kolla Ansible pulls the files (when using the "source" option) from: http://tarballs.openstack.org/octavia/ So, I just adjusted the version it pulled for Octavia. Eric From: Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] Sent: Thursday, October 17, 2019 1:15 AM To: Eric K. Miller Cc: Michael Johnson; openstack-discuss Subject: Re: [Octavia][Kolla] Amphora build issues Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. 
build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Thu Oct 17 06:32:39 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Thu, 17 Oct 2019 14:32:39 +0800 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Gauri, We received a lot of feedback about cpu_utils, And we had plan to add cpu_utils back in U release. Gauri Sindhu 于2019年10月17日 周四14:20写道: > Hi all, > > As per the Rocky release notes > , *cpu_util > and *.rate meters are deprecated and will be removed in future release in > favor of the Gnocchi rate calculation equivalent.* > > I have two doubts regarding this. > > Firstly, if the 'cpu_util' metric has been deprecated then why has it been > used as an example in the documentation > in > the 'Using Alarms' section? I've attached an image of the same. > > Secondly, I'm using OpenStack Stein and want to use the cpu_util or its > equivalent to create an alarm. If this metric is no longer available then > what do I pass onto Aodh to create the alarm? There seems to be no > documentation that to help me out with this. Additionally, even if Gnocchi > rate calculation is to be used, how am I supposed to transfer the result to > Aodh? I also cannot seem to find the Gnocchi documentation. > > Regards, > Gauri Sindhu > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radoslaw.piliszek at gmail.com Thu Oct 17 06:42:06 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 17 Oct 2019 08:42:06 +0200 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, that's exactly what the change I mentioned has done. :-) Kind regards, Radek czw., 17 paź 2019 o 08:20 Eric K. Miller napisał(a): > Correction - I meant "Kolla" (not Kolla Ansible) > > > > Eric > > > > *From:* Eric K. Miller > *Sent:* Thursday, October 17, 2019 1:18 AM > *To:* 'Radosław Piliszek' > *Cc:* Michael Johnson; openstack-discuss > *Subject:* RE: [Octavia][Kolla] Amphora build issues > > > > Hi Radek, > > > > In case this was useful, Kolla Ansible pulls the files (when using the > "source" option) from: > > http://tarballs.openstack.org/octavia/ > > > > So, I just adjusted the version it pulled for Octavia. > > > > Eric > > > > > > *From:* Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] > *Sent:* Thursday, October 17, 2019 1:15 AM > *To:* Eric K. Miller > *Cc:* Michael Johnson; openstack-discuss > *Subject:* Re: [Octavia][Kolla] Amphora build issues > > > > Hi Eric, > > > > Octavia has recently been bumped up to 4.1.0 [1] > > This applies to the cases when you are either using our (upstream, > in-registry) images or kolla from git. > > Released kolla is behind. > > > > @kolla > > Stable branches are really stable so I am not sure whether we should not > just point people willing to build their own images to use the version from > git rather than PyPI, especially since our images are built from branch, > not release. > > > > [1] https://review.opendev.org/688426 > > > > Kind regards, > > Radek > > > > czw., 17 paź 2019 o 07:23 Eric K. Miller > napisał(a): > > Success! > > The current Kolla Ansible release simply has the 4.0.1 version specified > in kolla-build.conf, which can be easily updated to 4.1.0. > > So, I adjusted the kolla-build.conf file, re-built the Octavia containers, > deleted the containers from the controllers, re-deployed Octavia with Kolla > Ansible, and tested load balancer creation, and everything succeeded. > > Again, much appreciated for the assistance. > > Eric > > > -----Original Message----- > > From: Michael Johnson [mailto:johnsomor at gmail.com] > > Sent: Wednesday, October 16, 2019 5:36 PM > > To: Eric K. Miller > > Cc: openstack-discuss > > Subject: Re: [Octavia] Amphora build issues > > > > This is caused by the controller version being older than the image > > version. Our upgrade strategy requires the control plane be updated > > before the image. > > A few lines above in the log you will see it is attempting to connect > > to /0.5 and not finding it. > > > > You have two options: > > 1. update the controllers to use the latest stable/stein verison > > 2. build a new image, but limit it to an older version of stein (such > > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > > > We would recommend you run the latest minor release of stable/stein. 
> > (4.1.0 is the latest) > > https://releases.openstack.org/stein/index.html#octavia > > > > I'm not sure why kolla would install an old release. I'm not very > > familiar with it. > > > > Michael > > > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > > wrote: > > > > > > Looking at the error, it appears it can't find the exceptions.py > script, but it > > appears to be in the octavia worker container: > > > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > > WrappedFailure: WrappedFailure: [Failure: > > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > > Found] > > > > > > (octavia-worker)[root at controller001 haproxy]# cd > > /var/lib/kolla/venv/lib/python2.7/site- > > packages/octavia/amphorae/drivers/haproxy > > > (octavia-worker)[root at controller001 haproxy]# ls -al > > > total 84 > > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindhugauri1 at gmail.com Thu Oct 17 08:34:26 2019 From: sindhugauri1 at gmail.com (Gauri Sindhu) Date: Thu, 17 Oct 2019 14:04:26 +0530 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Rong, Is there any workaround for this at the moment? Is there any other replacement for the cpu_util metric so that we can transfer the metric or data to the alarm? Regards, Gauri Sindhu On Thu, Oct 17, 2019 at 12:02 PM Rong Zhu wrote: > Hi Gauri, > > We received a lot of feedback about cpu_utils, And we had plan to add > cpu_utils back in U release. > > > Gauri Sindhu 于2019年10月17日 周四14:20写道: > >> Hi all, >> >> As per the Rocky release notes >> , *cpu_util >> and *.rate meters are deprecated and will be removed in future release in >> favor of the Gnocchi rate calculation equivalent.* >> >> I have two doubts regarding this. >> >> Firstly, if the 'cpu_util' metric has been deprecated then why has it >> been used as an example in the documentation >> in >> the 'Using Alarms' section? I've attached an image of the same. >> >> Secondly, I'm using OpenStack Stein and want to use the cpu_util or its >> equivalent to create an alarm. If this metric is no longer available then >> what do I pass onto Aodh to create the alarm? There seems to be no >> documentation that to help me out with this. Additionally, even if Gnocchi >> rate calculation is to be used, how am I supposed to transfer the result to >> Aodh? I also cannot seem to find the Gnocchi documentation. >> >> Regards, >> Gauri Sindhu >> > -- > Thanks, > Rong Zhu > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tetsuro.nakamura.bc at hco.ntt.co.jp Thu Oct 17 09:31:53 2019 From: tetsuro.nakamura.bc at hco.ntt.co.jp (Tetsuro Nakamura) Date: Thu, 17 Oct 2019 18:31:53 +0900 Subject: [placement][ptg] Ussuri Placement Topics Message-ID: <9db858d3-7fdd-fb69-a27e-2d9af0f86dfa@hco.ntt.co.jp_1> Hi Placementers, We won't have a specific meeting space for Shanghai PTG, but I'd like to have retrospective on Stein, and gather and note work items we have. Please put your ideas on the etherpad [1]. [1] https://etherpad.openstack.org/p/placement-shanghai-ptg -- Tetsuro Nakamura NTT Network Service Systems Laboratories TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11, Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan From sfinucan at redhat.com Thu Oct 17 09:34:42 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 17 Oct 2019 10:34:42 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> References: <20191015230316.GA29186@fedora19.localdomain> <20191016024339.s7s24wpcprra7f3x@yuggoth.org> <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> Message-ID: <84ef448aba6e07132ba2662fe0e7cd6fecf8df8b.camel@redhat.com> On Wed, 2019-10-16 at 12:05 +0100, Sean Mooney wrote: > > but it needs more folks > > actually working to make it happen, and the ugly hack has been in > > place for so long I have doubts we'll see a major overhaul like that > > any time soon. > well the main thing that motivated me to even comment on this thread > was the fact we currently have a hack that with lib_from_git where if you enable > python 3 i will install the lib under python 2 and python3. the problem is the > interperter line at the top of the entry point will be replaced with the python2 version > due to the order of installs. so if you use libs_form _git with nova or with a lib that provides > a setup tools console script entrypoint you can get into a situation where your python 3 only > build can end up trying to run "python2" scripts. this has lead to some interesting > errors to debug in the past. anyway i was looking forward to having a python3 only disto > to not have to deal with that in the future with it looks like You should probably look at https://review.opendev.org/#/c/687585/ so Stephen From hberaud at redhat.com Thu Oct 17 09:51:07 2019 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 17 Oct 2019 11:51:07 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> Message-ID: Thanks Ben for your feedbacks. I already tried to follow the `remove_external_lock_file` few months ago, but unfortunately, I don't think we can goes like this with Cinder... As Gorka has explained to me few months ago: > Those are not the only type of locks we use in Cinder. Those are the > ones we call "Global locks" and use TooZ so the DLM can be configured > for Cinder Active-Active. > > We also use Oslo's synchronized locks. > > More information is available in the Cinder HA dev ref I wrote last > year. It has a section dedicated to the topic of mutual exclusion and > the 4 types we currently have in Cinder [1]: > > - Database locking using resource states. > - Process locks. > - Node locks. > - Global locks. 
> > As for calling the remove_external_lock_file_with_prefix directly on > delete, I don't think that's something we can do, as the locks may still > be in use. Example: > > - Start deleting volume -> get lock > - Try to clone volume -> wait for lock > - Finish deleting volume -> release and delete lock > - Cloning recreates the lock when acquiring it > - Cloning fails because the volume no longer exists but leaves the lock So the Cinder workflow and mechanisms seems to definitively forbid to us the possibility to use the remove features of oslo.concurrency... Also like discussed on the review (https://review.opendev.org/#/c/688413), this issue can't be fixed in the underlying libraries, and I think that if we want to fix that on stable branches then Cinder need to address it directly by adding some piece of code who will be triggered if needed and in a safely manner, in other words, only Cinder can really address it and remove safely these file. See the discussion extract on the review ( https://review.opendev.org/#/c/688413): > Thanks Gorka for your feedback, then in view of all the discussions > about this topic I suppose only Cinder can really address it safely > on stable branches. > > > It is not a safe assumption that *-delete_volume file locks can be > > removed just because they have not been used in a couple of days. > > A new volume clone could come in that would use it and then we > > could have a race condition if the cron job was running. > > > > The only way to be sure that it can be removed is checking in the > > Cinder DB and making sure that the volume has been deleted or it > > doesn't even exist (DB has been purged). > > > > Same thing with detach_volume, delete_snapshot, and those that are > > directly volume ids locks. > > I definitely think that it can't be fixed in the underlying > libraries like Eric has suggested [1], indeed, as you has explained > only Cinder can know if a lock file can be removed safely. > > > In my opinion the fix should be done in fasteners, or we should add > > code in Cinder that cleans up all locks related to a volume or > > snapshot when this one is deleted. > > I agree the most better solution is to fix the root cause and so to > fix fasteners, but I don't think it's can be backported to stable > branches because we will need to bump a requirement version on > stable branche in this case and also because it'll introduce new > features, so I guess Cinder need to add some code to remove these > files and possibly backport it to stable branches. > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009563.html The Fasteners fix IMHO can only be used by future versions of openstack, due to the version bump and due to the new features added. I think that it could be available only from the ussuri or future cycle like V. The main goal of the cron approach was to definitively avoid to unearth this topic each 6 months, try to address it on stable branches, and try to take care of the file system usage even if it's a theoretical issue, but by getting feedbacks from the Cinder team and their warnings I don't think that this track is still followable. Definitely, this is not an oslo.concurrency bug. Anyway your proposed "Administrator Guide" is a must to have, to track things in one place, inform users and avoid to spend time to explain the same things again and again about this topic... so it's worth-it. I'll review it and propose my related knowledge on this topic. 
oslo.concurrency can't address this safely because we risk to introduce race conditions and worse situations than the leftover lock files. So, due to all these elements, only cinder can address it for the moment and for fix that on stable branches too. Le mer. 16 oct. 2019 à 00:15, Ben Nemec a écrit : > In the interest of not having to start this discussion from scratch > every time, I've done a bit of a brain dump into > https://review.opendev.org/#/c/688825/ that covers why things are the > way they are and what we recommend people do about it. Please take a > look and let me know if you see any issues with it. > > Thanks. > > -Ben > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Thu Oct 17 10:24:19 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Thu, 17 Oct 2019 12:24:19 +0200 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> Message-ID: <20191017102419.pa3qqlqgrlp2b7qx@localhost> On 10/10, Matt Riedemann wrote: > On 10/10/2019 5:00 AM, Gorka Eguileor wrote: > > > 1. Yeah if the existing legacy attachment record doesn't have a connector I > > > was worried about not properly cleaning on for that old connection, which is > > > something I mentioned before, but also as mentioned we potentially have that > > > case when a server is deleted and we can't get to the compute host to get > > > the host connector, right? > > > > > Hi, > > > > Not really... In that case we still have the BDM info in the DB, so we > > can just make the 3 Cinder REST API calls ourselves (begin_detaching, > > terminate_connection and detach) to have the volume unmapped, the export > > removed, and the volume return to available as usual, without needing to > > go to the storage array manually. > > I'm not sure what you mean. Yes we have the BDM in nova but if it's really > old it won't have the host connector stashed away in the connection_info > dict and we won't be able to pass that to the terminate_connection API: > > https://github.com/openstack/nova/blob/19.0.0/nova/compute/api.py#L2186 > > Are you talking about something else? I realize ^ is very edge case since > we've been storing the connector in the BDM.connection_info since I think at > least Liberty or Mitaka. Hi, I didn't know that Nova didn't use to store the connector... For those cases it is definitely going to be a problem. 
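For reference, the "3 Cinder REST API calls" mentioned in the quoted text map to roughly the following when driven through python-cinderclient; the auth details, volume ID and connector contents are placeholders:

    from cinderclient import client as cinder_client
    from keystoneauth1 import session as ks_session
    from keystoneauth1.identity import v3

    auth = v3.Password(auth_url='http://controller:5000/v3',
                       username='admin', password='secret',
                       project_name='admin',
                       user_domain_id='default', project_domain_id='default')
    cinder = cinder_client.Client('3', session=ks_session.Session(auth=auth))

    volume_id = 'VOLUME_UUID'
    connector = {'host': 'compute-1',
                 'initiator': 'iqn.1994-05.com.example:compute-1'}

    cinder.volumes.begin_detaching(volume_id)                  # -> detaching
    cinder.volumes.terminate_connection(volume_id, connector)  # unmap/unexport
    cinder.volumes.detach(volume_id)                           # -> available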
If you have one such cases, and the Nova compute node is down (so you cannot get the connector info), then we should just wait until the node is back up to do the migration. > > > > > > > > 2. If I were to use os-terminate_connection, I seem to have a tricky > > > situation on the migration flow because right now I'm doing: > > > > > > a) create new attachment with host connector > > > b) complete new attachment (put the volume back to in-use status) > > > - if this fails I attempt to delete the new attachment > > > c) delete the legacy attachment - I intentionally left this until the end to > > > make sure (a) and (b) were successful. > > > > > > If I change (c) to be os-terminate_connection, will that screw up the > > > accounting on the attachment created in (a)? > > > > > > If I did the terminate_connection first (before creating a new attachment), > > > could that leave a window of time where the volume is shown as not > > > attached/in-use? Maybe not since it's not the begin_detaching/os-detach > > > API...I'm fuzzy on the cinder volume state machine here. > > > > > > Or maybe the flow would become: > > > > > > a) create new attachment with host connector > > This is a good idea in itself, but it's not taking into account weird > > behaviors that some Cinder drivers may have when you call them twice to > > initialize the connection on the same host. Some drivers end up > > creating a different mapping for the volume instead of returning the > > existing one; we've had bugs like this before, and that's why Nova made > > a change in its live instance migration code to not call > > intialize_connection on the source host to get the connection_info for > > detaching. > > Huh...I thought attachments in cinder were a dime a dozen and you could > create/delete them as needed, or that was the idea behind the new v3 > attachments stuff. It seems to at least be what I remember John Griffith > always saying we should be able to do. Sure, you can create them freely, but the old and new API's were not meant to be mixed, which is what we would be doing here. The more I look at this, the more I think it is a bad idea. First, when you create the new attachment on a volume that was attached using the old APIs (Cinder attachment exists in DB without connection info): - Cinder DB attachment entry has the instance_uuid: then, since the volume to be migrated is not multi-attach, we will reuse the attachment that already exists [1] and a new one will not be created. So we will end up with just 1 attachment entry like if we didn't do anything. - If the DB attachment entry doesn't have the instance_uuid, then the attachment creation will fail [2] because it is not a multi-attach volume. If somehow we were able to create a second attachment entry, then the attachment_update will raise an exception [3], because it's expecting to have the connector information in all the attachments (because old and new attachment flows were not meant to be mixed). Even if we pass through that, this is not a multi-attach volume, so it will still fail because it cannot have 2 attachments [4]. Even if we get past that, when we create the attachment with the connector info or create it and then update it with the connection info, we'll get a new call to initialize_connection, and depending on the driver this will create a new export and mapping or reuse the old one. - If we create a new one the code could be fine when you call Cinder to remove that attachment, because we have 2 exports and we'll remove one. 
"Hopefully" the one we want. - The problem is if the driver's initialize_connection was idempotent, because then the new attach API expects it to return "True" when called a second time to initialize_connection with the same connector info, yet I don't think this was documented anywhere [5], so I don't think there are any drivers that are doing this. If drivers are idempotent and they don't return "True", then when we terminate the old attach connection we'll remove the export that is used by both connections, breaking the attachment. And this is not even thinking of what will happen on the OS-Brick side, which is most likely not good. For example, if we have a multipath iSCSI volume and the driver created a new target-portal, then the new devices that will appear on the system will be aggregated to the already existing multipath DM, which means that the terminate connection of the first one will flush the shared multipath DM, thus destroying the mapping from the new API flow. I stand by my initial recommendation, being able to update the existing attachment to add the connection information from Nova. Cheers, Gorka. [1]: https://opendev.org/openstack/cinder/src/commit/b8198de09aa13113d16d0cef8916223e66f9d8c1/cinder/volume/api.py#L2107-L2118 [2]: https://opendev.org/openstack/cinder/src/commit/b8198de09aa13113d16d0cef8916223e66f9d8c1/cinder/volume/api.py#L2121-L2127 [3]: https://opendev.org/openstack/cinder/src/commit/b8198de09aa13113d16d0cef8916223e66f9d8c1/cinder/volume/api.py#L2223 [4]: https://opendev.org/openstack/cinder/src/commit/b8198de09aa13113d16d0cef8916223e66f9d8c1/cinder/volume/api.py#L2227-L2233 [5]: https://opendev.org/openstack/cinder/src/commit/b8198de09aa13113d16d0cef8916223e66f9d8c1/cinder/volume/driver.py#L1470-L1476 > > Also if you can't refresh the connection info on the same host then a change > like this: > > https://review.opendev.org/#/c/579004/ > > Which does just that - refreshes the connection info doing reboot and start > instance operations - would break on those volume drivers if I'm following > you. > The part related to the new API looks fine, but doing that for the old initialize_connection doesn't look right to me. Cheers, Gorka. > > > > > > > b) terminate the connection for the legacy attachment > > > - if this fails, delete the new attachment created in (a) > > > c) complete the new attachment created in (a) > > > - if this fails...? > > > > > > Without digging into the flow of a cold or live migration I want to say > > > that's closer to what we do there, e.g. initialize_connection for the new > > > host, terminate_connection for the old host, complete the new attachment. > > > > > I think any workaround we try to find has a good chance of resulting in > > a good number of bugs. > > > > In my opinion our options are: > > > > 1- Completely detach and re-attach the volume > > I'd really like to avoid this if possible because it could screw up running > applications and the migration operation itself is threaded out to not hold > up the restart of the compute service. But maybe that's also true of what > I've got written up today though it's closer to what we do during > resize/cold migrate (though those of course involve downtime for the guest). I agree, this is not an idea we should pursue. 
> > > 2- Write new code in Cinder > > > > The new code can be either a new action or we can just add a > > microversion to attachment create to also accept "connection_info", and > > when we provide connection_info on the call the method confirms that > > it's a "migration" (the volume is 'in-use' and doesn't have any > > attachments) and it doesn't bother to call the cinder-volume to export > > and map the volume again and simply saves this information in the DB. > > If the volume is in-use it would have attachments, so I'm not > following you there. Even if the volume is attached the "legacy" way > from a nova perspective, using os-initialize_connection, there is a > volume attachment record in the cinder DB (I confirmed this in my > devstack testing and the notes are in my patch). It's also precisely > the problem I'm trying to solve which is without deleting the old > legacy attachment, when you delete the server the volume is detached > but still shows up in cinder as in-use because of the orphaned > attachment. > > > > > I know the solution it's not "clean/nice/elegant", and I'd rather go > > with option 1, but that would be terrible user experience, so I'll > > settle for a solution that doesn't add much code to Cinder, is simple > > for Nova, and is less likely to result in bugs. > > > > What do you think? > > > > Regards, > > Gorka. > > > > PS: In this week's meeting we briefly discussed this topic and agreed to > > continue the conversation here and retake it on next week's meeting. > > > > Thanks for discussing it and getting back to me. > > -- > > Thanks, > > Matt From sean.mcginnis at gmx.com Thu Oct 17 11:52:10 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 17 Oct 2019 06:52:10 -0500 Subject: [openstacksdk][RelMgmt] release-post job for openstack/releases for ref refs/heads/master failed Message-ID: <20191017115210.GA1913@sm-workstation> ----- Forwarded message from zuul at openstack.org ----- Date: Thu, 17 Oct 2019 10:19:24 +0000 From: zuul at openstack.org To: release-job-failures at lists.openstack.org Subject: [Release-job-failures] release-post job for openstack/releases for ref refs/heads/master failed Reply-To: openstack-discuss at lists.openstack.org Build failed. - tag-releases https://zuul.opendev.org/t/openstack/build/571c0f0e43cb469d8727121b77a3c92b : FAILURE in 3m 14s - publish-tox-docs-static https://zuul.opendev.org/t/openstack/build/None : SKIPPED _______________________________________________ Release-job-failures mailing list Release-job-failures at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures ----- End forwarded message ----- The tagging of the new stable/queens OpenStackSDK release failed. It appears to be due to the .gitreview file in that branch still referring to the python-openstacksdk name. That caused the automation to fail on trying to interact with that repo, but it has since been renamed to openstack/openstacksdk. Since this wasn't able to process the release, even though the patch had merged, I think the fix is actually fairly easy. I've proposed a backport of the repo name changes with: https://review.opendev.org/#/c/689131/ If we were to do another release to pick up that change, I believe it would actually fail validation because it would not find the current 0.11.4 tag and therefore think that two releases were needed instead of just one. 
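For anyone not familiar with the file in question, a .gitreview simply points git-review at the Gerrit project, so the corrected stable/queens copy would look roughly like this (standard format, not the literal content of the backport):

    [gerrit]
    host=review.opendev.org
    port=29418
    project=openstack/openstacksdk.git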
So since it was not able to process anything to actually perform the release, I believe the easy path forward is to just revert the release patch of this last release: https://review.opendev.org/#/c/689132/ Then wait for the .gitreview patch to merge, then propose the 0.11.4 release again with the hash updated to include that rename fix. Please let me know if there are any questions or if I've overlooked anything. Thanks, Sean From emilien at redhat.com Thu Oct 17 13:07:24 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 17 Oct 2019 09:07:24 -0400 Subject: [tripleo] tripleoclient and tripleo-common have stable/train branch Message-ID: We branched stable/train for python-tripleoclient and tripleo-common. Please do the backports to that branch when necessary. We'll continue with branching hopefully today or tomorrow. Let me know if any questions, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 17 15:22:02 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Oct 2019 10:22:02 -0500 Subject: [nova][gate] Hold off on rechecks until https://review.opendev.org/689152/ merges Message-ID: <14815036-4a5b-4db4-6350-a4c0859b8402@gmail.com> The gate is blocked for nova changes until [1] is merged so hold off on rechecking anything until then. [1] https://review.opendev.org/#/c/689152/ -- Thanks, Matt From mriedemos at gmail.com Thu Oct 17 15:39:49 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Oct 2019 10:39:49 -0500 Subject: [nova] Stance on trivial features for driver configs without integration testing Message-ID: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> This was brought up in the nova meeting today [1] as: "Do we have a particular stance on features to the libvirt driver for non-integration tested configurations, e.g. lxc [2] and xen [3], meaning if they are trivial enough do we just say the driver's quality warning on startup is sufficient to let them land since these are changes from casual contributors scratching an itch?" We agreed to move this to the mailing list. We don't have tempest jobs for the libvirt+lxc or libvirt+xen configurations (Citrix used to host 3rd party CI for the latter) and for the changes referenced they are from part-time contributors, minor and self-contained, and therefore I wouldn't expect them to build CI jobs for those configurations or stand up 3rd party CI. There are cases in the past where we've held features out due to lack of CI, e.g. live migration support in the vSphere driver. That's quite a bit different in my opinion because (1) it's a much more complicated feature, (2) there already was 3rd party CI for the vSphere driver and (3) there is a big rich corporation maintaining the driver so I figured they could pony up the resources to make that testing happen (and it eventually did). For these other small changes are we OK with letting them in knowing that the libvirt driver already logs a quality warning on startup for these configs [4]? In this case I am but wanted to ask and I don't think this sets a precedent as not all changes are equal. 
[1] http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 [2] https://review.opendev.org/#/c/667976/ [3] https://review.opendev.org/#/c/687827/ [4] https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 -- Thanks, Matt From mihalis68 at gmail.com Thu Oct 17 16:10:17 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Thu, 17 Oct 2019 12:10:17 -0400 Subject: [tc] Feedback on Airship pilot project In-Reply-To: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> References: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> Message-ID: I notice that although the code is released under the Apache license, looking at a conceivable real at-scale deployment one would need to read docs still marked as belonging to AT&T, for example https://opendev.org/airship/treasuremap links to https://airship-treasuremap.readthedocs.io/en/latest/index.html which is marked "© Copyright 2018 AT&T Intellectual Property. Revision 93aed048." I do not know if this is a problem, per se, but does not seem entirely openstack-like to me. Chris On Wed, Oct 16, 2019 at 3:17 PM Jeremy Stanley wrote: > Hi TC members, > > The Airship project will start its confirmation process with the OSF > Board of Directors at the Board meeting[1] Tuesday next week. A > draft of the slide deck[2] they plan to present is available for > reference. > > Per the confirmation guidelines[3], the OSF Board of directors will > take into account the feedback from representative bodies of > existing confirmed Open Infrastructure Projects (OpenStack, Zuul and > Kata) when evaluating Airship for confirmation. > > Particularly worth calling out, guideline #4 "Open collaboration" > asserts the following: > > Project behaves as a good neighbor to other confirmed and pilot > projects. > > If you (our community at large, not just TC members) have any > observations/interactions with the Airship project which could serve > as useful examples for how these projects do or do not meet this and > other guidelines, please provide them on the etherpad[4] ASAP. If > possible, include a citation with links to substantiate your > feedback. > > If a TC representative can assemble this feedback and send it to the > Board (for example, to the foundation mailing list) for > consideration before the meeting next week, that would be > appreciated. Apologies for the short notice. > > [1] > http://lists.openstack.org/pipermail/foundation/2019-October/002800.html > [2] > https://www.airshipit.org/collateral/AirshipConfirmation-Review-for-the-OSF-Board.pdf > [3] > https://wiki.openstack.org/wiki/Governance/Foundation/OSFProjectConfirmationGuidelines > [4] > https://etherpad.openstack.org/p/openstack-tc-airship-confirmation-feedback > > -- > Jeremy Stanley > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Thu Oct 17 16:33:55 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 17 Oct 2019 09:33:55 -0700 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> Message-ID: <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> On 10/17/19 08:39, Matt Riedemann wrote: > This was brought up in the nova meeting today [1] as: > > "Do we have a particular stance on features to the libvirt driver for > non-integration tested configurations, e.g. 
lxc [2] and xen [3], meaning > if they are trivial enough do we just say the driver's quality warning > on startup is sufficient to let them land since these are changes from > casual contributors scratching an itch?" > > We agreed to move this to the mailing list. > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > configurations (Citrix used to host 3rd party CI for the latter) and for > the changes referenced they are from part-time contributors, minor and > self-contained, and therefore I wouldn't expect them to build CI jobs > for those configurations or stand up 3rd party CI. > > There are cases in the past where we've held features out due to lack of > CI, e.g. live migration support in the vSphere driver. That's quite a > bit different in my opinion because (1) it's a much more complicated > feature, (2) there already was 3rd party CI for the vSphere driver and > (3) there is a big rich corporation maintaining the driver so I figured > they could pony up the resources to make that testing happen (and it > eventually did). > > For these other small changes are we OK with letting them in knowing > that the libvirt driver already logs a quality warning on startup for > these configs [4]? In this case I am but wanted to ask and I don't think > this sets a precedent as not all changes are equal. I'm OK with this and I think the quality warning sets an appropriate expectation. As I mentioned in the meeting, my opinion is I think sufficiently trivial changes are fine on this basis. I also wouldn't try to set a hard precedent because each thing needs review on whether it's "trivial", but I support a spirit of accepting simple changes without requiring full blown 3rd party CI, given the quality warnings we have for the configs mentioned. -melanie > [1] > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > [2] https://review.opendev.org/#/c/667976/ > [3] https://review.opendev.org/#/c/687827/ > [4] > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 > > From paye600 at gmail.com Thu Oct 17 16:38:15 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Thu, 17 Oct 2019 18:38:15 +0200 Subject: [tc] Feedback on Airship pilot project In-Reply-To: References: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> Message-ID: <9AA9FBE4-99BA-4751-8B4D-895D28526067@gmail.com> Chris, thank you for the feedback. I think it comes from the way how Apache 2.0 license needs to get applied, with a line like described here – [0]. > Copyright [yyyy] [name of copyright owner] But the concern is valid, and I have submitted a change to get it changed to "Airship Team” – [1]. [0] https://www.apache.org/licenses/LICENSE-2.0#apply [1] https://review.opendev.org/#/c/689212/ Best regards, — Roman Gorshunov > On 17 Oct 2019, at 18:10, Chris Morgan wrote: > > I notice that although the code is released under the Apache license, looking at a conceivable real at-scale deployment one would need to read docs still marked as belonging to AT&T, for example > > https://opendev.org/airship/treasuremap links to > https://airship-treasuremap.readthedocs.io/en/latest/index.html > > which is marked "© Copyright 2018 AT&T Intellectual Property. Revision 93aed048." > > I do not know if this is a problem, per se, but does not seem entirely openstack-like to me. 
> > Chris > > On Wed, Oct 16, 2019 at 3:17 PM Jeremy Stanley > wrote: > Hi TC members, > > The Airship project will start its confirmation process with the OSF > Board of Directors at the Board meeting[1] Tuesday next week. A > draft of the slide deck[2] they plan to present is available for > reference. > > Per the confirmation guidelines[3], the OSF Board of directors will > take into account the feedback from representative bodies of > existing confirmed Open Infrastructure Projects (OpenStack, Zuul and > Kata) when evaluating Airship for confirmation. > > Particularly worth calling out, guideline #4 "Open collaboration" > asserts the following: > > Project behaves as a good neighbor to other confirmed and pilot > projects. > > If you (our community at large, not just TC members) have any > observations/interactions with the Airship project which could serve > as useful examples for how these projects do or do not meet this and > other guidelines, please provide them on the etherpad[4] ASAP. If > possible, include a citation with links to substantiate your > feedback. > > If a TC representative can assemble this feedback and send it to the > Board (for example, to the foundation mailing list) for > consideration before the meeting next week, that would be > appreciated. Apologies for the short notice. > > [1] http://lists.openstack.org/pipermail/foundation/2019-October/002800.html > [2] https://www.airshipit.org/collateral/AirshipConfirmation-Review-for-the-OSF-Board.pdf > [3] https://wiki.openstack.org/wiki/Governance/Foundation/OSFProjectConfirmationGuidelines > [4] https://etherpad.openstack.org/p/openstack-tc-airship-confirmation-feedback > > -- > Jeremy Stanley > > > -- > Chris Morgan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Oct 17 16:39:05 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Oct 2019 16:39:05 +0000 Subject: [tc] Feedback on Airship pilot project In-Reply-To: References: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> Message-ID: <20191017163905.wyl6pl7ijgi4pjfg@yuggoth.org> On 2019-10-17 12:10:17 -0400 (-0400), Chris Morgan wrote: > I notice that although the code is released under the Apache license, > looking at a conceivable real at-scale deployment one would need to read > docs still marked as belonging to AT&T, for example > > https://opendev.org/airship/treasuremap links to > https://airship-treasuremap.readthedocs.io/en/latest/index.html > > which is marked "© Copyright 2018 AT&T Intellectual Property. Revision > 93aed048." > > I do not know if this is a problem, per se, but does not seem entirely > openstack-like to me. [...] Thanks for pointing this out! It looks like that's coming from here: In OpenStack we've also not been great about consistency when it comes to the copyright directive in our Sphinx configs. A lot of those indicate "OpenStack Foundation" as the copyright holder, which isn't right either: But more to the point, trying to express specific copyright in that field is not a good idea anyway, since various files within the documentation source are likely to have copyright from a variety of entities. 
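Concretely, the field in question is the copyright string in the documentation's Sphinx conf.py, which the theme renders into the footer of every generated page; the path is illustrative, the value is the one currently being rendered:

    # doc/source/conf.py
    copyright = u'2018 AT&T Intellectual Property'  # shows up on every page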
What's usually worked best is a vague expression like "Airship Contributors" similar to that of oslo.messaging (I picked this example at random because I know the Oslo team tends to pay closer attention to these sorts of details): It also doesn't help that the copyright holder's legal entity name is actually "AT&T Intellectual Property" rather than just "AT&T" or something, which makes it sound all the more possessive, but that's a corporate legal thing on their end for which we're unlikely to convince them otherwise. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Thu Oct 17 16:44:10 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 17 Oct 2019 11:44:10 -0500 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> Message-ID: I'm fine with this from the perspective of: >> meaning if they are trivial enough do we just say the driver's quality >> warning on startup is sufficient to let them land since these are >> changes from casual contributors scratching an itch?" However, we have a large and ever-growing backlog of reviews in nova-land. If a core is going to spend X amount of time reviewing, I would rather (s)he prioritize things in supported areas of code. efried . From witold.bedyk at suse.com Thu Oct 17 17:06:56 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 17 Oct 2019 19:06:56 +0200 Subject: [monasca] Ussuri planning meeting summary Message-ID: Hello everyone, yesterday we held a planning meeting for the next release cycle (Ussuri). You can find the summary of the meeting in the etherpad [1]. Please discuss with your teams and vote for the proposed topics in this spreadsheet [2] until next Wednesday. We meet next week again to agree on priorities during the regular Team Meeting hour, Wednesday, 15 UTC [3]. Thanks Witek [1] https://etherpad.openstack.org/p/monasca-planning-ussuri [2] https://docs.google.com/spreadsheets/d/17PLO8PJr28jXuPaFJ7efZCwgdgMpinZyDRTRJ1o5r8A/edit?usp=sharing [3] https://global.gotomeeting.com/join/817873693 From sfinucan at redhat.com Thu Oct 17 17:12:24 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 17 Oct 2019 18:12:24 +0100 Subject: PostgreSQL driver has been removed from DevStack Message-ID: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> Just a heads up that the PostgreSQL DB driver has been removed from DevStack as of today [1]. This was deprecated in Pike due to lack of maintenance and has been on the chopping block since then. If you have jobs that relied on this, it would be advisable to drop these jobs now. I don't think the DevStack team have the resources to maintain this so if someone *really* needed to keep it around, they should probably put together a plugin and consume it that way. 
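For anyone wondering what is actually going away from a job's point of view, this is the database selection that used to be made in a DevStack local.conf, along these lines (service names are recalled from DevStack's database support, so treat them as an assumption):

    [[local|localrc]]
    disable_service mysql
    enable_service postgresql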
Stephen [1] https://review.opendev.org/#/c/678496/ From smooney at redhat.com Thu Oct 17 17:26:32 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 17 Oct 2019 18:26:32 +0100 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> Message-ID: <142f623be6a67359f029dcaea6ffd2fdc037315c.camel@redhat.com> On Thu, 2019-10-17 at 09:33 -0700, melanie witt wrote: > On 10/17/19 08:39, Matt Riedemann wrote: > > This was brought up in the nova meeting today [1] as: > > > > "Do we have a particular stance on features to the libvirt driver for > > non-integration tested configurations, e.g. lxc [2] and xen [3], meaning > > if they are trivial enough do we just say the driver's quality warning > > on startup is sufficient to let them land since these are changes from > > casual contributors scratching an itch?" > > > > We agreed to move this to the mailing list. > > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > > configurations (Citrix used to host 3rd party CI for the latter) and for > > the changes referenced they are from part-time contributors, minor and > > self-contained, and therefore I wouldn't expect them to build CI jobs > > for those configurations or stand up 3rd party CI. > > > > There are cases in the past where we've held features out due to lack of > > CI, e.g. live migration support in the vSphere driver. That's quite a > > bit different in my opinion because (1) it's a much more complicated > > feature, (2) there already was 3rd party CI for the vSphere driver and > > (3) there is a big rich corporation maintaining the driver so I figured > > they could pony up the resources to make that testing happen (and it > > eventually did). > > > > For these other small changes are we OK with letting them in knowing > > that the libvirt driver already logs a quality warning on startup for > > these configs [4]? In this case I am but wanted to ask and I don't think > > this sets a precedent as not all changes are equal. > > I'm OK with this and I think the quality warning sets an appropriate > expectation. As I mentioned in the meeting, my opinion is I think > sufficiently trivial changes are fine on this basis. I also wouldn't try > to set a hard precedent because each thing needs review on whether it's > "trivial", but I support a spirit of accepting simple changes without > requiring full blown 3rd party CI, given the quality warnings we have > for the configs mentioned. i cant speak for the xen change but at least in the case of the lxc change i was able to test it myself and comment as such on the patch. the author also explained how they were testing and i was able to follow there steps to reproduce it. so while i dont think we shoudl have to deploy each change ourselves if it does not have automated ci if it trival and someone volunteers to test it then i think that is another reason to allow such changes. i think the openstack mantra of if its not in ci its probably broke still applies but we have that quality warning in the logs on start up which was mentioned so operator know what are getting into upfront. 
> > -melanie > > > [1] > > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > > > [2] https://review.opendev.org/#/c/667976/ > > [3] https://review.opendev.org/#/c/687827/ > > [4] > > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 > > > > > > From mriedemos at gmail.com Thu Oct 17 18:00:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Oct 2019 13:00:33 -0500 Subject: PostgreSQL driver has been removed from DevStack In-Reply-To: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> References: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> Message-ID: <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> On 10/17/2019 12:12 PM, Stephen Finucane wrote: > Just a heads up that the PostgreSQL DB driver has been removed from > DevStack as of today [1]. This was deprecated in Pike due to lack of > maintenance and has been on the chopping block since then. If you have > jobs that relied on this, it would be advisable to drop these jobs now. > I don't think the DevStack team have the resources to maintain this so > if someone*really* needed to keep it around, they should probably put > together a plugin and consume it that way. > > Stephen > > [1]https://review.opendev.org/#/c/678496/ This would have been useful information before actually merging the change to drop this code. We have at least a few postgres jobs (the grenade one is relatively new) because we have postgres support in a few places, like placement. What maintenance burden did people actually have when the pg jobs didn't break? -- Thanks, Matt From fungi at yuggoth.org Thu Oct 17 18:14:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Oct 2019 18:14:52 +0000 Subject: [tc] Feedback on Airship pilot project In-Reply-To: <9AA9FBE4-99BA-4751-8B4D-895D28526067@gmail.com> References: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> <9AA9FBE4-99BA-4751-8B4D-895D28526067@gmail.com> Message-ID: <20191017181452.fxpmgenarjjhlli4@yuggoth.org> On 2019-10-17 18:38:15 +0200 (+0200), Roman Gorshunov wrote: [...] > I think it comes from the way how Apache 2.0 license needs to get > applied [...] Well, the Sphinx configuration directive could be considered independent from asserting copyright in individual source files. It's a general blurb which the theme incorporates into the rendered footer of all pages, so if some pages' content are copyrighted by other contributing organizations then the copyright info displayed on that page becomes incorrect. This is why it tends to be simpler to just use a vague copyright entity in the Sphinx config field so that some copyright is asserted/implied, while allowing the individual copyrights of various files to differ from one another. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aj at suse.com Thu Oct 17 18:41:37 2019 From: aj at suse.com (Andreas Jaeger) Date: Thu, 17 Oct 2019 20:41:37 +0200 Subject: PostgreSQL driver has been removed from DevStack In-Reply-To: <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> References: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> Message-ID: <783019b6-29f2-c83e-f921-59a4981d65e8@suse.com> On 17/10/2019 20.00, Matt Riedemann wrote: > On 10/17/2019 12:12 PM, Stephen Finucane wrote: >> Just a heads up that the PostgreSQL DB driver has been removed from >> DevStack as of today [1]. This was deprecated in Pike due to lack of >> maintenance and has been on the chopping block since then. If you have >> jobs that relied on this, it would be advisable to drop these jobs now. >> I don't think the DevStack team have the resources to maintain this so >> if someone*really*  needed to keep it around, they should probably put >> together a plugin and consume it that way. >> >> Stephen >> >> [1]https://review.opendev.org/#/c/678496/ > > This would have been useful information before actually merging the > change to drop this code. We have at least a few postgres jobs (the > grenade one is relatively new) because we have postgres support in a few > places, like placement. FYI: http://zuul.opendev.org/t/openstack/jobs allows searching for jobs, just add "postgres" in the search box and see which ones are used, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From fsbiz at yahoo.com Thu Oct 17 18:51:39 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Thu, 17 Oct 2019 18:51:39 +0000 (UTC) Subject: [neutron]: How to find the version of the L2 agent References: <2094645143.3124047.1571338299474.ref@mail.yahoo.com> Message-ID: <2094645143.3124047.1571338299474@mail.yahoo.com> I know neutron-server --version  gives me the version of the neutron server currently running.For the recent queens release it is 12.1.0 Is there a command or easy way to check the version of the L2 agent on the compute hosts? We are running the the neutron-linuxbridge-agent. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 17 19:26:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Oct 2019 14:26:08 -0500 Subject: PostgreSQL driver has been removed from DevStack In-Reply-To: <783019b6-29f2-c83e-f921-59a4981d65e8@suse.com> References: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> <783019b6-29f2-c83e-f921-59a4981d65e8@suse.com> Message-ID: <8360c18b-f01b-2e04-06be-9ca428347908@gmail.com> On 10/17/2019 1:41 PM, Andreas Jaeger wrote: > FYI: > http://zuul.opendev.org/t/openstack/jobs allows searching for jobs, just > add "postgres" > in the search box and see which ones are used, Yeah so I count about 56 jobs between -postgres and -pg in the name. 
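If anyone wants to script that count instead of using the web UI, the same list is exposed through Zuul's REST API; the jq filter here is just an example:

    curl -s https://zuul.opendev.org/api/tenant/openstack/jobs \
        | jq -r '.[].name' | grep -E -- '-postgres|-pg' | wc -l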
-- Thanks, Matt From cboylan at sapwetik.org Thu Oct 17 19:32:12 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 17 Oct 2019 12:32:12 -0700 Subject: =?UTF-8?Q?Re:_[nova]_Stance_on_trivial_features_for_driver_configs_witho?= =?UTF-8?Q?ut_integration_testing?= In-Reply-To: <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> Message-ID: On Thu, Oct 17, 2019, at 9:33 AM, melanie witt wrote: > On 10/17/19 08:39, Matt Riedemann wrote: > > This was brought up in the nova meeting today [1] as: > > > > "Do we have a particular stance on features to the libvirt driver for > > non-integration tested configurations, e.g. lxc [2] and xen [3], meaning > > if they are trivial enough do we just say the driver's quality warning > > on startup is sufficient to let them land since these are changes from > > casual contributors scratching an itch?" > > > > We agreed to move this to the mailing list. > > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > > configurations (Citrix used to host 3rd party CI for the latter) and for > > the changes referenced they are from part-time contributors, minor and > > self-contained, and therefore I wouldn't expect them to build CI jobs > > for those configurations or stand up 3rd party CI. > > > > There are cases in the past where we've held features out due to lack of > > CI, e.g. live migration support in the vSphere driver. That's quite a > > bit different in my opinion because (1) it's a much more complicated > > feature, (2) there already was 3rd party CI for the vSphere driver and > > (3) there is a big rich corporation maintaining the driver so I figured > > they could pony up the resources to make that testing happen (and it > > eventually did). > > > > For these other small changes are we OK with letting them in knowing > > that the libvirt driver already logs a quality warning on startup for > > these configs [4]? In this case I am but wanted to ask and I don't think > > this sets a precedent as not all changes are equal. > > I'm OK with this and I think the quality warning sets an appropriate > expectation. As I mentioned in the meeting, my opinion is I think > sufficiently trivial changes are fine on this basis. I also wouldn't try > to set a hard precedent because each thing needs review on whether it's > "trivial", but I support a spirit of accepting simple changes without > requiring full blown 3rd party CI, given the quality warnings we have > for the configs mentioned. I think it is important to clarify that for both LXC and Xen testing should be able to happen in the upstream CI system. Both are open source projects and the major gotcha to testing Xen in the past (reboots) should be possible with Zuulv3. Typically we should fall back on third party CI when there are special hardware requirements or licenses that prevent us from running software freely on our existing CI system. For open source tools like LXC and Xen I don't believe this is a problem. 
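As a very rough sketch of what such an upstream job could look like, assuming the usual devstack-tempest parent job and DevStack's LIBVIRT_TYPE knob (the job name is made up, this is not an existing job):

    - job:
        name: devstack-libvirt-lxc
        parent: devstack-tempest
        vars:
          devstack_localrc:
            LIBVIRT_TYPE: lxc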
> > -melanie > > > [1] > > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > > > [2] https://review.opendev.org/#/c/667976/ > > [3] https://review.opendev.org/#/c/687827/ > > [4] > > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 From mriedemos at gmail.com Thu Oct 17 19:42:58 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Oct 2019 14:42:58 -0500 Subject: PostgreSQL driver has been removed from DevStack In-Reply-To: <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> References: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> Message-ID: <971bace6-b5ac-7963-5cb1-d606d8226bf4@gmail.com> On 10/17/2019 1:00 PM, Matt Riedemann wrote: > This would have been useful information before actually merging the > change to drop this code. We have at least a few postgres jobs (the > grenade one is relatively new) because we have postgres support in a few > places, like placement. > > What maintenance burden did people actually have when the pg jobs didn't > break? I've proposed a revert: https://review.opendev.org/#/c/689250/ -- Thanks, Matt From sean.mcginnis at gmx.com Thu Oct 17 20:31:52 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 17 Oct 2019 15:31:52 -0500 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> Message-ID: <20191017203152.GA828@sm-workstation> On Wed, Oct 16, 2019 at 05:44:31PM +0000, Elõd Illés wrote: > Hi, > > As it was agreed during PTG, the planned date of Extended Maintenance > transition of Queens is around two weeks after Train release (a less > busy period) [1]. Now that Train is released, it is a good opportunity > for teams to go through the list of open and unreleased changes in > Queens [2] and schedule a final release for Queens if needed. Feel free > to use / edit / modify the lists (I've generated the lists for > repositories which have 'follows-policy' tag). I hope this helps. > > [1] https://releases.openstack.org/ > [2] https://etherpad.openstack.org/p/queens-final-release-before-em > > Thanks, > > Előd > Trying to amplify this. The date for Queens to transition to Extended Maintenance is next week. Late in the week we will be proposing a patch to tag all deliverables with a "queens-em" tag. After this point, no additional releases will be allowed. I took a quick look through our stable/queens deliverables, and there are several that look to have a sizable amount of patches landed that have not been released. Elod was super nice by including all of that for easy checking in [2] above. As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to stable/queens. But once we enter Extended Maintenance, there won't be any official releases and it will be up to downstream consumers to pick up these fixes locally as they need them. So consider this a last call for stable/queens releases. Thanks! Sean From emiller at genesishosting.com Thu Oct 17 20:48:20 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 17 Oct 2019 15:48:20 -0500 Subject: [Neutron] Metering agent issues Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A1F@gmsxchsvr01.thecreation.com> Hi, We have a Stein deployment using Kolla Ansible 8.0.1 (the latest) and have enabled the deployment of the neutron metering agent. 
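For reference, the agent in question is configured through metering_agent.ini; a typical minimal configuration looks like the following, with the intervals shown being the upstream defaults rather than values from this deployment:

    [DEFAULT]
    driver = neutron.services.metering.drivers.iptables.iptables_driver.IptablesMeteringDriver
    measure_interval = 30
    report_interval = 300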
We have separate network nodes and compute nodes and use DVR. DVR is working properly, where ingress and egress traffic to floating IPs binded to fixed IPs flows directly to our upstream routers, not through the network nodes, whereas SNAT traffic flows through the network nodes. I configured the agent using the simplest of rules to test the agent (as cloud admin): openstack network meter create --description Shared --share bandwidth openstack network meter rule create --egress --remote-ip-prefix 0.0.0.0/0 bandwidth openstack network meter rule create --ingress --remote-ip-prefix 0.0.0.0/0 bandwidth None of the compute nodes appeared to be metering. So I checked the metering agent log (see below my signature for a snippet of the respective log entries), which shows that no routers are being found. Note that the Debug flag is set to True (in metering_agent.ini). On the primary network node, the "get_traffic_counters" method returns a list of all routers for every project, and runs a respective meter retrieval for each, such as the following, for each router: 2019-10-17 15:29:30.437 6 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-e0d2ee4f-76ec-4043-93f7-a47fae488d59', 'iptables', '-t', 'filter', '-L', 'neutron-meter-l-fb2c912d-9ce', '-n', '-v', '-x', '-w', '10', '-Z'] execute_rootwrap_daemon /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/agent/linux/util s.py:103 However, a single metering report is being created. Only one debug row appears in the log, and only one "bandwidth metric" measure is added to Gnocchi: 2019-10-17 15:38:15.494 6 DEBUG neutron.services.metering.agents.metering_agent [-] Send metering report: {'label_id': u'fb2c912d-9ced-433b-914e-df6526e034d1', 'time': 30, 'pkts': 0, 'tenant_id': u'96b2e5246a7744219e2af51f0e7b91d6', 'first_update': 1571300562, 'bytes': 0, 'host': 'networknode002', 'last_update': 1571344695} _metering_notification /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/services/meterin g/agents/metering_agent.py:103 No errors are being reported in the logs. Am I missing something? Thanks! 
Eric 2019-10-17 15:25:12.418 6 DEBUG oslo_concurrency.lockutils [-] Lock "metering-agent" acquired by "neutron.services.metering.agents.metering_agent._invoke_driver" :: waited 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockuti ls.py:327 2019-10-17 15:25:12.418 6 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [-] neutron.services.metering.drivers.iptables.iptables_driver.IptablesMeter ingDriver method get_traffic_counters called with arguments (, []) {} wrapper /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_log/helpers.py:66 2019-10-17 15:25:12.418 6 DEBUG oslo_concurrency.lockutils [-] Lock "metering-agent" released by "neutron.services.metering.agents.metering_agent._invoke_driver" :: held 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockuti ls.py:339 2019-10-17 15:25:27.421 6 DEBUG neutron.services.metering.agents.metering_agent [-] Get router traffic counters _get_traffic_counters /var/lib/kolla/venv/lib/python2.7/site-packages/neutron/services/meterin g/agents/metering_agent.py:215 2019-10-17 15:25:27.421 6 DEBUG oslo_concurrency.lockutils [-] Lock "metering-agent" acquired by "neutron.services.metering.agents.metering_agent._invoke_driver" :: waited 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockuti ls.py:327 2019-10-17 15:25:27.421 6 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [-] neutron.services.metering.drivers.iptables.iptables_driver.IptablesMeter ingDriver method get_traffic_counters called with arguments (, []) {} wrapper /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_log/helpers.py:66 2019-10-17 15:25:27.421 6 DEBUG oslo_concurrency.lockutils [-] Lock "metering-agent" released by "neutron.services.metering.agents.metering_agent._invoke_driver" :: held 0.000s inner /var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockuti ls.py:339 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Thu Oct 17 21:58:56 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Fri, 18 Oct 2019 08:58:56 +1100 Subject: [tripleo] owls at ptg In-Reply-To: References: Message-ID: <20191017215856.GF8065@thor.bakeyournoodle.com> On Tue, Oct 08, 2019 at 10:54:40AM -0600, Wesley Hayutin wrote: > Greetings, > > A number of folks from TripleO will be at the OpenDev PTG. If you would > like to discuss anything and collaborate please list your topic on this > etherpad [1] Looks like there is enough interest that it's worth doing. Currently we don't have any space allocated to tripleo Can someone closer to the project ask the OSF (Kendall Nelson probably) if there is space we can have? Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From mthode at mthode.org Thu Oct 17 22:44:51 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 17 Oct 2019 17:44:51 -0500 Subject: PostgreSQL driver has been removed from DevStack In-Reply-To: <971bace6-b5ac-7963-5cb1-d606d8226bf4@gmail.com> References: <057b4d3a973c11e3a7d230fda39c627054cffb3c.camel@redhat.com> <4ce6ae68-a11e-abf0-5dc1-01af55374e5e@gmail.com> <971bace6-b5ac-7963-5cb1-d606d8226bf4@gmail.com> Message-ID: <20191017224451.yvce3o6efica3yxv@mthode.org> On 19-10-17 14:42:58, Matt Riedemann wrote: > On 10/17/2019 1:00 PM, Matt Riedemann wrote: > > This would have been useful information before actually merging the > > change to drop this code. We have at least a few postgres jobs (the > > grenade one is relatively new) because we have postgres support in a few > > places, like placement. > > > > What maintenance burden did people actually have when the pg jobs didn't > > break? > > I've proposed a revert: > > https://review.opendev.org/#/c/689250/ > Thanks, some of us are still using postgres (think it was installed grizzly / folsom time frame). I understand the desire for removal but doesn't make me any less sad. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From hello at dincercelik.com Thu Oct 17 23:45:51 2019 From: hello at dincercelik.com (=?utf-8?B?RGluw6dlciDDh2VsaWs=?=) Date: Fri, 18 Oct 2019 02:45:51 +0300 Subject: [openstack-operators] RBD problems after data center power outage Message-ID: Greetings, Today I had a data center power outage, and the OpenStack cluster went down. After taking the cluster up again, I cannot start some VMs due to error below. I've tried "rbd object-map rebuild" but it didn't work. What's the proper way to re-create the missing "_disk.config" files? Thanks. [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z qemu-system-x86_64: -drive file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ From aaronzhu1121 at gmail.com Fri Oct 18 01:41:57 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Fri, 18 Oct 2019 09:41:57 +0800 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Gauri, Please have a try with this link[0], Our production haven't use the latest release and haven't use gnoochi, So sorry for can not provider more advice about this. [0] https://stackoverflow.com/questions/56216683/openstack-get-vm-cpu-util-with-stein-version Gauri Sindhu 于2019年10月17日 周四16:34写道: > Hi Rong, > > Is there any workaround for this at the moment? 
Is there any other > replacement for the cpu_util metric so that we can transfer the metric or > data to the alarm? > > Regards, > Gauri Sindhu > > > On Thu, Oct 17, 2019 at 12:02 PM Rong Zhu wrote: > >> Hi Gauri, >> >> We received a lot of feedback about cpu_utils, And we had plan to add >> cpu_utils back in U release. >> >> >> Gauri Sindhu 于2019年10月17日 周四14:20写道: >> >>> Hi all, >>> >>> As per the Rocky release notes >>> , *cpu_util >>> and *.rate meters are deprecated and will be removed in future release in >>> favor of the Gnocchi rate calculation equivalent.* >>> >>> I have two doubts regarding this. >>> >>> Firstly, if the 'cpu_util' metric has been deprecated then why has it >>> been used as an example in the documentation >>> in >>> the 'Using Alarms' section? I've attached an image of the same. >>> >>> Secondly, I'm using OpenStack Stein and want to use the cpu_util or its >>> equivalent to create an alarm. If this metric is no longer available then >>> what do I pass onto Aodh to create the alarm? There seems to be no >>> documentation that to help me out with this. Additionally, even if Gnocchi >>> rate calculation is to be used, how am I supposed to transfer the result to >>> Aodh? I also cannot seem to find the Gnocchi documentation. >>> >>> Regards, >>> Gauri Sindhu >>> >> -- >> Thanks, >> Rong Zhu >> > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Fri Oct 18 03:20:14 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 18 Oct 2019 11:20:14 +0800 Subject: [aodh] [heat] Stein: How to create alarms based on rate metrics like CPU utilization? In-Reply-To: References: Message-ID: FYI, there's a new ML with same topic in [1] [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010210.html On Sun, Aug 4, 2019 at 3:55 PM Bernd Bausch wrote: > Prior to Stein, Ceilometer issued a metric named *cpu_util*, which I > could use to trigger alarms and autoscaling when CPU utilization was too > high. > > cpu_util doesn't exist anymore. Instead, we are asked to use Gnocchi's > *rate* feature. However, when using rates, alarms on a group of resources > require more parameters than just one metric: Both an aggregation and a > reaggregation method are needed. > > For example, a group of instances that implement "myapp": > > gnocchi measures aggregation -m cpu --reaggregation mean --aggregation > rate:mean --query server_group=myapp --resource-type instance > > Actually, this command uses a deprecated API (but from what I can see, > Aodh still uses it). The new way is like this: > > gnocchi aggregates --resource-type instance '(aggregate rate:mean (metric > cpu mean))' server_group=myapp > > If rate:mean is in the archive policy, it also works the other way around: > > gnocchi aggregates --resource-type instance '(aggregate mean (metric cpu > rate:mean))' server_group=myapp > > Without reaggregation, I get quite unexpected numbers, including negative > CPU rates. If you want to understand why, see this discussion with one of > the Gnocchi maintainers [1]. > > *My problem*: Aodh allows me to set an aggregation method, but not a > reaggregation method. How can I create alarms based on rates? The problem > extends to Heat and autoscaling. > > Thanks much, > > Bernd. > > [1] https://github.com/gnocchixyz/gnocchi/issues/1044 > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... 
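One commonly suggested way to get something equivalent on Stein, hedged because the exact flags are from memory: make sure the archive policy used for the cpu metric includes rate:mean (the "other way around" case shown above), then point an Aodh gnocchi threshold alarm at that aggregation method, roughly:

    aodh alarm create --name cpu_rate_high \
        --type gnocchi_aggregation_by_resources_threshold \
        --metric cpu --aggregation-method rate:mean \
        --comparison-operator gt --threshold 240000000000 \
        --granularity 300 --resource-type instance \
        --query '{"=": {"server_group": "myapp"}}'

The threshold here is nanoseconds of CPU time per granularity period (240e9 ns over 300 s is roughly 80% of one vCPU), purely as an illustration; it does not remove the reaggregation limitation described in the quoted message.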
URL: From rico.lin.guanyu at gmail.com Fri Oct 18 03:21:13 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 18 Oct 2019 11:21:13 +0800 Subject: AW: [metrics] [telemetry] [stein] cpu_util In-Reply-To: References: <9058e09f-a5ce-4db9-5077-1217ece1695a@gmail.com> Message-ID: FYI, there's a new ML with similar topic: [1] [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010210.html On Mon, Sep 9, 2019 at 6:19 PM Tobias Urdin wrote: > The cpu_util is a pain-point for us as well, we will unfortunately need > to add that metric back > to keep backward compatibility to our customers. > > Best regards > Tobias > > On 9/9/19 11:37 AM, Blom, Merlin, NMU-OI wrote: > > From Witek Bedyk on Re: [aodh] [heat] Stein: > How to create alarms based on rate metrics like CPU utilization? > > Fr 16.08.2019 17:11 > > ' > > Hi all, > > > > You can also collect `cpu.utilization_perc` metric with Monasca and > trigger Heat auto-scaling as we demonstrated in the hands-on workshop at > the last Summit in Denver. > > > > Here the Heat template we've used [1]. > > You can find the workshop material here [2]. > > > > Cheers > > Witek > > > > [1] > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sjamgade_monasca-2Dautoscaling_blob_master_final_autoscaling.yaml&d=DwICaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=hTUN4-Trlb-8Fh11dR6m5VD1uYA15z7v9WL8kYigkr8&m=KDzBi0a41i4kfZG7LrvMjx6tKJCAZHM71I9snAHtDbU&s=wZLSXjvqYiPmMVbz8fgezCE1iwxZcQXRe3zZZW1JBFo&e= > > [2] > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sjamgade_monasca-2Dautoscaling&d=DwICaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=hTUN4-Trlb-8Fh11dR6m5VD1uYA15z7v9WL8kYigkr8&m=KDzBi0a41i4kfZG7LrvMjx6tKJCAZHM71I9snAHtDbU&s=M1D9BENrKX7HD43HfcYFuB8vdP9fKgAuGOTXtRq5aZI&e= > > ' > > > > Cheers > > Merlin > > > > -----Ursprüngliche Nachricht----- > > Von: Budai Laszlo > > Gesendet: Freitag, 16. August 2019 18:10 > > An: OpenStack Discuss > > Betreff: [metrics] [telemetry] [stein] cpu_util > > > > Hello all, > > > > the release release announce of ceilometer rocky is deprecating the > cpu_util and *.rate metrics > > "* cpu_util and *.rate meters are deprecated and will be removed in > > future release in favor of the Gnocchi rate calculation equivalent." > > > > so we don't have them in Stein. Can you direct me to some document that > describes how to achieve these with Gnocchi rate calculation? > > > > Thank you, > > Laszlo > > > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindhugauri1 at gmail.com Fri Oct 18 05:32:09 2019 From: sindhugauri1 at gmail.com (Gauri Sindhu) Date: Fri, 18 Oct 2019 11:02:09 +0530 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Rong, Thanks for the help! Regards, Gauri Sindhu On Fri, Oct 18, 2019 at 7:12 AM Rong Zhu wrote: > Hi Gauri, > > Please have a try with this link[0], Our production haven't use the latest > release and haven't use gnoochi, So sorry for can not provider more advice > about this. > > [0] > > https://stackoverflow.com/questions/56216683/openstack-get-vm-cpu-util-with-stein-version > > > Gauri Sindhu 于2019年10月17日 周四16:34写道: > >> Hi Rong, >> >> Is there any workaround for this at the moment? Is there any other >> replacement for the cpu_util metric so that we can transfer the metric or >> data to the alarm? 
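[A minimal sketch of what passing the Gnocchi rate to Aodh can look like, as discussed in this thread. It assumes the instances' cpu metric is archived with a rate:mean aggregate at 60s granularity and that the instances carry a server_group=myapp metadata value; the alarm name, query and threshold are illustrative only, not a documented recipe.]

  # Aodh alarm on the Gnocchi cpu rate, as a stand-in for the removed cpu_util.
  # rate:mean on the cumulative cpu metric is nanoseconds of CPU time consumed
  # per granularity period, so ~80% of one vCPU over 60s is roughly
  # 0.8 * 60 * 1e9 = 48e9 (adjust for vCPU count and granularity).
  openstack alarm create \
    --name cpu_high_myapp \
    --type gnocchi_aggregation_by_resources_threshold \
    --metric cpu \
    --aggregation-method rate:mean \
    --granularity 60 \
    --threshold 48000000000 \
    --comparison-operator gt \
    --evaluation-periods 3 \
    --resource-type instance \
    --query '{"=": {"server_group": "myapp"}}'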
>> >> Regards, >> Gauri Sindhu >> >> >> On Thu, Oct 17, 2019 at 12:02 PM Rong Zhu wrote: >> >>> Hi Gauri, >>> >>> We received a lot of feedback about cpu_utils, And we had plan to add >>> cpu_utils back in U release. >>> >>> >>> Gauri Sindhu 于2019年10月17日 周四14:20写道: >>> >>>> Hi all, >>>> >>>> As per the Rocky release notes >>>> , *cpu_util >>>> and *.rate meters are deprecated and will be removed in future release in >>>> favor of the Gnocchi rate calculation equivalent.* >>>> >>>> I have two doubts regarding this. >>>> >>>> Firstly, if the 'cpu_util' metric has been deprecated then why has it >>>> been used as an example in the documentation >>>> in >>>> the 'Using Alarms' section? I've attached an image of the same. >>>> >>>> Secondly, I'm using OpenStack Stein and want to use the cpu_util or its >>>> equivalent to create an alarm. If this metric is no longer available then >>>> what do I pass onto Aodh to create the alarm? There seems to be no >>>> documentation that to help me out with this. Additionally, even if Gnocchi >>>> rate calculation is to be used, how am I supposed to transfer the result to >>>> Aodh? I also cannot seem to find the Gnocchi documentation. >>>> >>>> Regards, >>>> Gauri Sindhu >>>> >>> -- >>> Thanks, >>> Rong Zhu >>> >> -- > Thanks, > Rong Zhu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Oct 18 06:44:34 2019 From: eblock at nde.ag (Eugen Block) Date: Fri, 18 Oct 2019 06:44:34 +0000 Subject: [openstack-operators] RBD problems after data center power outage In-Reply-To: Message-ID: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> Hi, I've recently found this post [1] to recover a failing header, but I haven't tried it myself. I'm curios if it works though. Regards, Eugen https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ Zitat von Dinçer Çelik : > Greetings, > > Today I had a data center power outage, and the OpenStack cluster > went down. After taking the cluster up again, I cannot start some > VMs due to error below. I've tried "rbd object-map rebuild" but it > didn't work. What's the proper way to re-create the missing > "_disk.config" files? > > Thanks. > > [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start > libvirt guest: libvirt.libvirtError: internal error: qemu > unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z > qemu-system-x86_64: -drive > file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or > directory > > /* Please encrypt every message you can. Privacy is your right, > don't let anyone take it from you. 
*/ > > /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ From katonalala at gmail.com Fri Oct 18 07:51:02 2019 From: katonalala at gmail.com (Lajos Katona) Date: Fri, 18 Oct 2019 09:51:02 +0200 Subject: [neutron]: How to find the version of the L2 agent In-Reply-To: <2094645143.3124047.1571338299474@mail.yahoo.com> References: <2094645143.3124047.1571338299474.ref@mail.yahoo.com> <2094645143.3124047.1571338299474@mail.yahoo.com> Message-ID: Hi, Similarly on the host where your agent is running you can use the command neutron-linuxbridge-agent --version to check the agent's version. The same works for l3-agent and other deployed neutron (openstack?) related things as well (i.e.: neutron-dhcp-agent --version) Regards Lajos fsbiz at yahoo.com ezt írta (időpont: 2019. okt. 17., Cs, 20:55): > > I know neutron-server --version gives me the version of the neutron server currently running. > For the recent queens release it is 12.1.0 > > Is there a command or easy way to check the version of the L2 agent on the compute hosts? > > We are running the the neutron-linuxbridge-agent. > > thanks, > Fred. From thierry at openstack.org Fri Oct 18 08:37:21 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 18 Oct 2019 10:37:21 +0200 Subject: Deployment tools capabilities now displayed on openstack.org/software Message-ID: Hi everyone, During the Train cycle, we pushed to provide better information to users of OpenStack on the differences between the various upstream ways to deploy OpenStack. We first defined a number of deployment tools capabilities[1] and then asked our various deployment tools to fill out which capabilities applied to them[2]. [1] https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools_capabilities.yaml [2] https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools.yaml Those capabilities are now displayed on the website at: https://www.openstack.org/software/project-navigator/deployment-tools Thanks for everyone who helped defining and providing this information. Please feel free to propose improvements through changes to the osf/openstack-map repository. Cheers, -- Thierry Carrez (ttx) From smooney at redhat.com Fri Oct 18 09:54:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Oct 2019 10:54:54 +0100 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> Message-ID: <89408bfef395c92e07660aba8a5ac41debe500fc.camel@redhat.com> On Thu, 2019-10-17 at 12:32 -0700, Clark Boylan wrote: > On Thu, Oct 17, 2019, at 9:33 AM, melanie witt wrote: > > On 10/17/19 08:39, Matt Riedemann wrote: > > > This was brought up in the nova meeting today [1] as: > > > > > > "Do we have a particular stance on features to the libvirt driver for > > > non-integration tested configurations, e.g. lxc [2] and xen [3], meaning > > > if they are trivial enough do we just say the driver's quality warning > > > on startup is sufficient to let them land since these are changes from > > > casual contributors scratching an itch?" > > > > > > We agreed to move this to the mailing list. 
> > > > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > > > configurations (Citrix used to host 3rd party CI for the latter) and for > > > the changes referenced they are from part-time contributors, minor and > > > self-contained, and therefore I wouldn't expect them to build CI jobs > > > for those configurations or stand up 3rd party CI. > > > > > > There are cases in the past where we've held features out due to lack of > > > CI, e.g. live migration support in the vSphere driver. That's quite a > > > bit different in my opinion because (1) it's a much more complicated > > > feature, (2) there already was 3rd party CI for the vSphere driver and > > > (3) there is a big rich corporation maintaining the driver so I figured > > > they could pony up the resources to make that testing happen (and it > > > eventually did). > > > > > > For these other small changes are we OK with letting them in knowing > > > that the libvirt driver already logs a quality warning on startup for > > > these configs [4]? In this case I am but wanted to ask and I don't think > > > this sets a precedent as not all changes are equal. > > > > I'm OK with this and I think the quality warning sets an appropriate > > expectation. As I mentioned in the meeting, my opinion is I think > > sufficiently trivial changes are fine on this basis. I also wouldn't try > > to set a hard precedent because each thing needs review on whether it's > > "trivial", but I support a spirit of accepting simple changes without > > requiring full blown 3rd party CI, given the quality warnings we have > > for the configs mentioned. > > I think it is important to clarify that for both LXC and Xen testing should be able to happen in the upstream CI > system. Both are open source projects and the major gotcha to testing Xen in the past (reboots) should be possible > with Zuulv3. Typically we should fall back on third party CI when there are special hardware requirements or licenses > that prevent us from running software freely on our existing CI system. For open source tools like LXC and Xen I don't > believe this is a problem. yes that is all true. at present we have not had time to sit down and create a job for each. matt did have a poc job proposed for lxc which was partly working. i might consier creatign a job for each that runs as a periodic job and or on a subset of nova patches but at present i have some other ci jobs that i want to work on first. lxc/xen work for me is very much in the hobby work catagory but i did enjoy using nova with libvirt/lxc in the past and would like to see it work again in the future. it was quite nice for doing nested devstack without the overhead of two levels of vms. although when you get nested virt working its close from a performance point of view but still uses more memory then lxc. the lxc job itself in partalar is not much work to enabel the issue is that we cant actully pass tempest with lxc currently the cloud init fix will help but we also need to modify devstack to create a useable lxc image which is also relitivly trivial just need to be done. 
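[For reference, the devstack side of such a job is small; here is a sketch of the local.conf fragment involved. VIRT_DRIVER and LIBVIRT_TYPE are the standard devstack variables for this, everything else would match a normal devstack-tempest run, and the image caveat above still applies (the current cirros handling does not produce a usable lxc rootfs).]

  # Illustrative local.conf fragment for a libvirt+lxc devstack run.
  [[local|localrc]]
  VIRT_DRIVER=libvirt
  LIBVIRT_TYPE=lxc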
> > > > > -melanie > > > > > [1] > > > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > > > > > [2] https://review.opendev.org/#/c/667976/ > > > [3] https://review.opendev.org/#/c/687827/ > > > [4] > > > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 > > From dtantsur at redhat.com Fri Oct 18 10:19:12 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 18 Oct 2019 12:19:12 +0200 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: Hi all, I think we should update global-requirement (on master and train) to exclude sushy 1.9.0, like sushy!=1.9.0 Since train has >=1.9.0 currently, it will be a good excuse to change it to 2.0.0. I'll leave the final word to the stable team though. Dmitry On Wed, Oct 16, 2019 at 3:17 AM wrote: > Hi, > > The Ironic Train release can be broken due to an entry in its > driver-requirements.txt. driver-requirements.txt defines a dependency on > the sushy package [1] which can be satisfied by version 1.9.0. > Unfortunately, that version contains a few bugs which prevent Ironic from > being able to manage Dell EMC and perhaps other vendors' bare metal > hardware with its Redfish hardware type (driver). The fixes to them > [2][3][4] were merged into master before the creation of stable/train. > Therefore, they are available on stable/train and in the last sushy release > created during the Train cycle, 2.0.0, the only other version which can > satisfy the dependency today. However, consumers -- packagers, operators, > and users -- could, fighting time constraints or lacking solid visibility > into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the > dependency, but, in so doing, unknowingly render the package or > installation severely broken. > > A change [5] has been proposed as part of a prospective solution to this > issue. It creates a new release of sushy from the change which fixes the > first bug [2]. Review comments [6] discuss basing the new release on a more > recent stable/train change to pick up other bug fixes and, less > importantly, backward compatible feature modifications and enhancements > which merged before the change from which 2.0.0 was created. Backward > compatible feature modifications and enhancements are interspersed in time > among the bug fixes. Once a new release is available, the sushy entry in > driver-requirements.txt on stable/train would be updated. However, > apparently, the stable branch policy prevents releases from being done at a > point earlier than the last release within a given cycle [6], which was > 2.0.0. > > Another possible resolution which comes to mind is to change the > definition of the sushy dependency in driver-requirements.txt [1] from > "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? 
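[In the meantime, a quick deployer-side check for exposure, plus the pip-level form of the exclusion Dmitry suggests above; 2.0.0 is the Train sushy release that contains the fixes listed in the thread. This is a sketch for pip-based installs only; environments using distro packages would do the equivalent through their package manager.]

  # Check the installed sushy version on the node running ironic-conductor.
  pip show sushy | grep '^Version'
  # If it reports 1.9.0, move past the known-bad release; either form works:
  pip install --upgrade 'sushy>=2.0.0'
  pip install --upgrade 'sushy!=1.9.0,>=1.9.0'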
> > Thank you, > Rick > > > [1] > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > [2] https://review.opendev.org/#/c/666253/ > [3] https://review.opendev.org/#/c/668936/ > [4] https://review.opendev.org/#/c/669889/ > [5] https://review.opendev.org/#/c/688551/ > [6] > https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Fri Oct 18 10:22:46 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 18 Oct 2019 12:22:46 +0200 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: <89408bfef395c92e07660aba8a5ac41debe500fc.camel@redhat.com> References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> <89408bfef395c92e07660aba8a5ac41debe500fc.camel@redhat.com> Message-ID: Hey Clark, re Xen in CI: that's interesting - kolla ansible users request Xen support from time to time and we just spread our hands. We also do not have complete IPv6 support for Xen due to the lack of a testbed. If there was effort to provide Xen-enabled nodes in CI, I would be glad to hear of results. Kind regards, Radek pt., 18 paź 2019 o 12:02 Sean Mooney napisał(a): > On Thu, 2019-10-17 at 12:32 -0700, Clark Boylan wrote: > > On Thu, Oct 17, 2019, at 9:33 AM, melanie witt wrote: > > > On 10/17/19 08:39, Matt Riedemann wrote: > > > > This was brought up in the nova meeting today [1] as: > > > > > > > > "Do we have a particular stance on features to the libvirt driver > for > > > > non-integration tested configurations, e.g. lxc [2] and xen [3], > meaning > > > > if they are trivial enough do we just say the driver's quality > warning > > > > on startup is sufficient to let them land since these are changes > from > > > > casual contributors scratching an itch?" > > > > > > > > We agreed to move this to the mailing list. > > > > > > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > > > > configurations (Citrix used to host 3rd party CI for the latter) and > for > > > > the changes referenced they are from part-time contributors, minor > and > > > > self-contained, and therefore I wouldn't expect them to build CI > jobs > > > > for those configurations or stand up 3rd party CI. > > > > > > > > There are cases in the past where we've held features out due to > lack of > > > > CI, e.g. live migration support in the vSphere driver. That's quite > a > > > > bit different in my opinion because (1) it's a much more complicated > > > > feature, (2) there already was 3rd party CI for the vSphere driver > and > > > > (3) there is a big rich corporation maintaining the driver so I > figured > > > > they could pony up the resources to make that testing happen (and it > > > > eventually did). > > > > > > > > For these other small changes are we OK with letting them in knowing > > > > that the libvirt driver already logs a quality warning on startup > for > > > > these configs [4]? In this case I am but wanted to ask and I don't > think > > > > this sets a precedent as not all changes are equal. > > > > > > I'm OK with this and I think the quality warning sets an appropriate > > > expectation. As I mentioned in the meeting, my opinion is I think > > > sufficiently trivial changes are fine on this basis. 
I also wouldn't > try > > > to set a hard precedent because each thing needs review on whether > it's > > > "trivial", but I support a spirit of accepting simple changes without > > > requiring full blown 3rd party CI, given the quality warnings we have > > > for the configs mentioned. > > > > I think it is important to clarify that for both LXC and Xen testing > should be able to happen in the upstream CI > > system. Both are open source projects and the major gotcha to testing > Xen in the past (reboots) should be possible > > with Zuulv3. Typically we should fall back on third party CI when there > are special hardware requirements or licenses > > that prevent us from running software freely on our existing CI system. > For open source tools like LXC and Xen I don't > > believe this is a problem. > yes that is all true. at present we have not had time to sit down and > create a job for each. > matt did have a poc job proposed for lxc which was partly working. i might > consier creatign a job for each that runs as > a periodic job and or on a subset of nova patches but at present i have > some other ci jobs > that i want to work on first. lxc/xen work for me is very much in the > hobby work catagory but i > did enjoy using nova with libvirt/lxc in the past and would like to see it > work again in the future. > it was quite nice for doing nested devstack without the overhead of two > levels of vms. although when you > get nested virt working its close from a performance point of view but > still uses more memory then lxc. > > the lxc job itself in partalar is not much work to enabel the issue is > that we cant actully pass tempest with lxc > currently the cloud init fix will help but we also need to modify devstack > to create a useable lxc image which is > also relitivly trivial just need to be done. > > > > > > > > -melanie > > > > > > > [1] > > > > > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > > > > > > > [2] https://review.opendev.org/#/c/667976/ > > > > [3] https://review.opendev.org/#/c/687827/ > > > > [4] > > > > > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Fri Oct 18 10:37:41 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 18 Oct 2019 07:37:41 -0300 Subject: AW: [metrics] [telemetry] [stein] cpu_util In-Reply-To: References: <9058e09f-a5ce-4db9-5077-1217ece1695a@gmail.com> Message-ID: He guys, we are working to address this issue of Ceilometer. The first PR of a series that we are working on is the following: https://review.opendev.org/#/c/677031/. The idea is to leverage Ceilometer to enable operators/admins to create pollsters on the fly (without coding). Reviews are welcome to help us get the first commit of the feature in. Also, feedback and suggestions for further improvements. On Fri, Oct 18, 2019 at 12:23 AM Rico Lin wrote: > FYI, there's a new ML with similar topic: [1] > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010210.html > > On Mon, Sep 9, 2019 at 6:19 PM Tobias Urdin > wrote: > >> The cpu_util is a pain-point for us as well, we will unfortunately need >> to add that metric back >> to keep backward compatibility to our customers. 
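[One practical check worth adding to this thread before wiring rate-based metrics into Aodh or Heat: the archive policy attached to the cpu metric has to carry a rate aggregate. A sketch of the read-only checks; the policy name shown is only a common default in recent releases and <instance-uuid> is a placeholder.]

  # Confirm at least one archive policy offers rate:mean among its
  # aggregation methods.
  gnocchi archive-policy list
  gnocchi archive-policy show ceilometer-high-rate
  # Check which policy/aggregates an actual instance's cpu metric is using.
  gnocchi resource show <instance-uuid> --type instance
  gnocchi metric show cpu --resource-id <instance-uuid>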
>> >> Best regards >> Tobias >> >> On 9/9/19 11:37 AM, Blom, Merlin, NMU-OI wrote: >> > From Witek Bedyk on Re: [aodh] [heat] Stein: >> How to create alarms based on rate metrics like CPU utilization? >> > Fr 16.08.2019 17:11 >> > ' >> > Hi all, >> > >> > You can also collect `cpu.utilization_perc` metric with Monasca and >> trigger Heat auto-scaling as we demonstrated in the hands-on workshop at >> the last Summit in Denver. >> > >> > Here the Heat template we've used [1]. >> > You can find the workshop material here [2]. >> > >> > Cheers >> > Witek >> > >> > [1] >> > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sjamgade_monasca-2Dautoscaling_blob_master_final_autoscaling.yaml&d=DwICaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=hTUN4-Trlb-8Fh11dR6m5VD1uYA15z7v9WL8kYigkr8&m=KDzBi0a41i4kfZG7LrvMjx6tKJCAZHM71I9snAHtDbU&s=wZLSXjvqYiPmMVbz8fgezCE1iwxZcQXRe3zZZW1JBFo&e= >> > [2] >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sjamgade_monasca-2Dautoscaling&d=DwICaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=hTUN4-Trlb-8Fh11dR6m5VD1uYA15z7v9WL8kYigkr8&m=KDzBi0a41i4kfZG7LrvMjx6tKJCAZHM71I9snAHtDbU&s=M1D9BENrKX7HD43HfcYFuB8vdP9fKgAuGOTXtRq5aZI&e= >> > ' >> > >> > Cheers >> > Merlin >> > >> > -----Ursprüngliche Nachricht----- >> > Von: Budai Laszlo >> > Gesendet: Freitag, 16. August 2019 18:10 >> > An: OpenStack Discuss >> > Betreff: [metrics] [telemetry] [stein] cpu_util >> > >> > Hello all, >> > >> > the release release announce of ceilometer rocky is deprecating the >> cpu_util and *.rate metrics >> > "* cpu_util and *.rate meters are deprecated and will be removed in >> > future release in favor of the Gnocchi rate calculation equivalent." >> > >> > so we don't have them in Stein. Can you direct me to some document that >> describes how to achieve these with Gnocchi rate calculation? >> > >> > Thank you, >> > Laszlo >> > >> >> >> > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Oct 18 10:47:39 2019 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 18 Oct 2019 06:47:39 -0400 Subject: [openstack-operators] RBD problems after data center power outage In-Reply-To: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> References: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> Message-ID: <12145433-E163-485B-B499-51A842EC8D6E@gmail.com> Very interesting post. Sent from my iPhone > On Oct 18, 2019, at 2:44 AM, Eugen Block wrote: > > Hi, > > I've recently found this post [1] to recover a failing header, but I haven't tried it myself. I'm curios if it works though. > > Regards, > Eugen > > https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ > > > Zitat von Dinçer Çelik : > >> Greetings, >> >> Today I had a data center power outage, and the OpenStack cluster went down. After taking the cluster up again, I cannot start some VMs due to error below. I've tried "rbd object-map rebuild" but it didn't work. What's the proper way to re-create the missing "_disk.config" files? >> >> Thanks. 
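[A hedged sketch of how to confirm from the Ceph side what is actually missing for the instance in the error; the pool name and UUID are the ones from the error message, so adjust them for other affected VMs. This only inspects state, it does not change anything.]

  # Does the config-drive image still exist at the rbd level?
  rbd -p vms ls | grep c2b54eac-179b-4907-9d61-8e075edc21cf
  rbd -p vms info c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config
  # Are the underlying rados objects (header/data) still present?
  rados -p vms ls | grep c2b54eac-179b-4907-9d61-8e075edc21cf
  # If only the header is gone but data objects remain, the header-recovery
  # write-up linked earlier in the thread is the relevant procedure; if the
  # whole image is gone, the config drive contents have to be regenerated
  # rather than recovered.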
>> >> [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z qemu-system-x86_64: -drive file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory >> >> /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ >> >> /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ > > > > From hello at dincercelik.com Fri Oct 18 10:49:54 2019 From: hello at dincercelik.com (=?utf-8?B?RGluw6dlciDDh2VsaWs=?=) Date: Fri, 18 Oct 2019 13:49:54 +0300 Subject: [openstack-operators] RBD problems after data center power outage In-Reply-To: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> References: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> Message-ID: <827E7ADD-43A5-440F-AD5E-855006572946@dincercelik.com> Hi Eugen, I think this is not the same situation with I’m facing because I can get rbd headers. Regards /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ > On 18 Oct 2019, at 09:44, Eugen Block wrote: > > Hi, > > I've recently found this post [1] to recover a failing header, but I haven't tried it myself. I'm curios if it works though. > > Regards, > Eugen > > https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ > > > Zitat von Dinçer Çelik : > >> Greetings, >> >> Today I had a data center power outage, and the OpenStack cluster went down. After taking the cluster up again, I cannot start some VMs due to error below. I've tried "rbd object-map rebuild" but it didn't work. What's the proper way to re-create the missing "_disk.config" files? >> >> Thanks. >> >> [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z qemu-system-x86_64: -drive file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory >> >> /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ >> >> /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From donny at fortnebula.com Fri Oct 18 10:50:22 2019 From: donny at fortnebula.com (Donny Davis) Date: Fri, 18 Oct 2019 06:50:22 -0400 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> <89408bfef395c92e07660aba8a5ac41debe500fc.camel@redhat.com> Message-ID: So would it help if those resources were able in CI 1st party? I would really like to see the lxc driver move forward, but also am happy to work at xen support too. Obviously with CI resources its nice to have them available in all the providers, but for these non-mainline drivers I think something could be done to address it so we can continue to support them. Donny Davis c: 805 814 6800 On Fri, Oct 18, 2019, 6:25 AM Radosław Piliszek wrote: > Hey Clark, > > re Xen in CI: > that's interesting - kolla ansible users request Xen support from time to > time and we just spread our hands. > We also do not have complete IPv6 support for Xen due to the lack of a > testbed. > If there was effort to provide Xen-enabled nodes in CI, I would be glad to > hear of results. > > Kind regards, > Radek > > pt., 18 paź 2019 o 12:02 Sean Mooney napisał(a): > >> On Thu, 2019-10-17 at 12:32 -0700, Clark Boylan wrote: >> > On Thu, Oct 17, 2019, at 9:33 AM, melanie witt wrote: >> > > On 10/17/19 08:39, Matt Riedemann wrote: >> > > > This was brought up in the nova meeting today [1] as: >> > > > >> > > > "Do we have a particular stance on features to the libvirt driver >> for >> > > > non-integration tested configurations, e.g. lxc [2] and xen [3], >> meaning >> > > > if they are trivial enough do we just say the driver's quality >> warning >> > > > on startup is sufficient to let them land since these are changes >> from >> > > > casual contributors scratching an itch?" >> > > > >> > > > We agreed to move this to the mailing list. >> > > > >> > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen >> > > > configurations (Citrix used to host 3rd party CI for the latter) >> and for >> > > > the changes referenced they are from part-time contributors, minor >> and >> > > > self-contained, and therefore I wouldn't expect them to build CI >> jobs >> > > > for those configurations or stand up 3rd party CI. >> > > > >> > > > There are cases in the past where we've held features out due to >> lack of >> > > > CI, e.g. live migration support in the vSphere driver. That's quite >> a >> > > > bit different in my opinion because (1) it's a much more >> complicated >> > > > feature, (2) there already was 3rd party CI for the vSphere driver >> and >> > > > (3) there is a big rich corporation maintaining the driver so I >> figured >> > > > they could pony up the resources to make that testing happen (and >> it >> > > > eventually did). >> > > > >> > > > For these other small changes are we OK with letting them in >> knowing >> > > > that the libvirt driver already logs a quality warning on startup >> for >> > > > these configs [4]? In this case I am but wanted to ask and I don't >> think >> > > > this sets a precedent as not all changes are equal. >> > > >> > > I'm OK with this and I think the quality warning sets an appropriate >> > > expectation. As I mentioned in the meeting, my opinion is I think >> > > sufficiently trivial changes are fine on this basis. 
I also wouldn't >> try >> > > to set a hard precedent because each thing needs review on whether >> it's >> > > "trivial", but I support a spirit of accepting simple changes without >> > > requiring full blown 3rd party CI, given the quality warnings we have >> > > for the configs mentioned. >> > >> > I think it is important to clarify that for both LXC and Xen testing >> should be able to happen in the upstream CI >> > system. Both are open source projects and the major gotcha to testing >> Xen in the past (reboots) should be possible >> > with Zuulv3. Typically we should fall back on third party CI when there >> are special hardware requirements or licenses >> > that prevent us from running software freely on our existing CI system. >> For open source tools like LXC and Xen I don't >> > believe this is a problem. >> yes that is all true. at present we have not had time to sit down and >> create a job for each. >> matt did have a poc job proposed for lxc which was partly working. i >> might consier creatign a job for each that runs as >> a periodic job and or on a subset of nova patches but at present i have >> some other ci jobs >> that i want to work on first. lxc/xen work for me is very much in the >> hobby work catagory but i >> did enjoy using nova with libvirt/lxc in the past and would like to see >> it work again in the future. >> it was quite nice for doing nested devstack without the overhead of two >> levels of vms. although when you >> get nested virt working its close from a performance point of view but >> still uses more memory then lxc. >> >> the lxc job itself in partalar is not much work to enabel the issue is >> that we cant actully pass tempest with lxc >> currently the cloud init fix will help but we also need to modify >> devstack to create a useable lxc image which is >> also relitivly trivial just need to be done. >> > >> > > >> > > -melanie >> > > >> > > > [1] >> > > > >> http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 >> > > > >> > > > [2] https://review.opendev.org/#/c/667976/ >> > > > [3] https://review.opendev.org/#/c/687827/ >> > > > [4] >> > > > >> https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 >> > >> > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Oct 18 11:27:12 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Oct 2019 12:27:12 +0100 Subject: [nova] Stance on trivial features for driver configs without integration testing In-Reply-To: References: <748a09a3-b6c0-b8e8-58b2-88e06c166aa1@gmail.com> <185e0d3c-4a40-8386-9018-35c561c360ee@gmail.com> <89408bfef395c92e07660aba8a5ac41debe500fc.camel@redhat.com> Message-ID: <61a1863db6c3f7a1acbbcef8d4c6f2bd209d623b.camel@redhat.com> On Fri, 2019-10-18 at 06:50 -0400, Donny Davis wrote: > So would it help if those resources were able in CI 1st party? am that is not really the blocker. unlike kvm to use xen you need to reboot into dom0 before you can then install openstack in the vm. that was previously not possibel with jenkins but shoudl be possibel to do with zuulv3. so all that is really requried to get xen to work is to create a pre playbook that installs xen, set the defualt for devstack/kolla to use xen and then run the standard tempest jobs. 
we may need to also use a l1 vm that supports nested vrit as i think xen will need acess to the vmx/svm instruction set to fucntion but we now have a lable for that so i dont think that would be a blocker anymore. > I would > really like to see the lxc driver move forward, but also am happy to work > at xen support too. lxc is much simpeler. its actully instlled by default on ubunutu so it is quite easy to enable test of it and no reboots are required. the main chalange is that since we went to cirros 4 the devstack code for creating a cirros image does not work. the patch that fixed cloud init however described how to create a functioning image which i did by hand so we coudl automate that. there are also prebuilt images we coudl use instead of the cirros one. so if we actully matt basically had the job working in https://review.opendev.org/#/c/676024/ if we used a different image and fix one or two other edgecases in nova it likely would work. > > Obviously with CI resources its nice to have them available in all the > providers, but for these non-mainline drivers I think something could be > done to address it so we can continue to support them. > > > > Donny Davis > c: 805 814 6800 > > On Fri, Oct 18, 2019, 6:25 AM Radosław Piliszek > wrote: > > > Hey Clark, > > > > re Xen in CI: > > that's interesting - kolla ansible users request Xen support from time to > > time and we just spread our hands. > > We also do not have complete IPv6 support for Xen due to the lack of a > > testbed. > > If there was effort to provide Xen-enabled nodes in CI, I would be glad to > > hear of results. > > > > Kind regards, > > Radek > > > > pt., 18 paź 2019 o 12:02 Sean Mooney napisał(a): > > > > > On Thu, 2019-10-17 at 12:32 -0700, Clark Boylan wrote: > > > > On Thu, Oct 17, 2019, at 9:33 AM, melanie witt wrote: > > > > > On 10/17/19 08:39, Matt Riedemann wrote: > > > > > > This was brought up in the nova meeting today [1] as: > > > > > > > > > > > > "Do we have a particular stance on features to the libvirt driver > > > > > > for > > > > > > non-integration tested configurations, e.g. lxc [2] and xen [3], > > > > > > meaning > > > > > > if they are trivial enough do we just say the driver's quality > > > > > > warning > > > > > > on startup is sufficient to let them land since these are changes > > > > > > from > > > > > > casual contributors scratching an itch?" > > > > > > > > > > > > We agreed to move this to the mailing list. > > > > > > > > > > > > We don't have tempest jobs for the libvirt+lxc or libvirt+xen > > > > > > configurations (Citrix used to host 3rd party CI for the latter) > > > > > > and for > > > > > > the changes referenced they are from part-time contributors, minor > > > > > > and > > > > > > self-contained, and therefore I wouldn't expect them to build CI > > > > > > jobs > > > > > > for those configurations or stand up 3rd party CI. > > > > > > > > > > > > There are cases in the past where we've held features out due to > > > > > > lack of > > > > > > CI, e.g. live migration support in the vSphere driver. That's quite > > > > > > a > > > > > > bit different in my opinion because (1) it's a much more > > > > > > complicated > > > > > > feature, (2) there already was 3rd party CI for the vSphere driver > > > > > > and > > > > > > (3) there is a big rich corporation maintaining the driver so I > > > > > > figured > > > > > > they could pony up the resources to make that testing happen (and > > > > > > it > > > > > > eventually did). 
> > > > > > > > > > > > For these other small changes are we OK with letting them in > > > > > > knowing > > > > > > that the libvirt driver already logs a quality warning on startup > > > > > > for > > > > > > these configs [4]? In this case I am but wanted to ask and I don't > > > > > > think > > > > > > this sets a precedent as not all changes are equal. > > > > > > > > > > I'm OK with this and I think the quality warning sets an appropriate > > > > > expectation. As I mentioned in the meeting, my opinion is I think > > > > > sufficiently trivial changes are fine on this basis. I also wouldn't > > > > > > try > > > > > to set a hard precedent because each thing needs review on whether > > > > > > it's > > > > > "trivial", but I support a spirit of accepting simple changes without > > > > > requiring full blown 3rd party CI, given the quality warnings we have > > > > > for the configs mentioned. > > > > > > > > I think it is important to clarify that for both LXC and Xen testing > > > > > > should be able to happen in the upstream CI > > > > system. Both are open source projects and the major gotcha to testing > > > > > > Xen in the past (reboots) should be possible > > > > with Zuulv3. Typically we should fall back on third party CI when there > > > > > > are special hardware requirements or licenses > > > > that prevent us from running software freely on our existing CI system. > > > > > > For open source tools like LXC and Xen I don't > > > > believe this is a problem. > > > > > > yes that is all true. at present we have not had time to sit down and > > > create a job for each. > > > matt did have a poc job proposed for lxc which was partly working. i > > > might consier creatign a job for each that runs as > > > a periodic job and or on a subset of nova patches but at present i have > > > some other ci jobs > > > that i want to work on first. lxc/xen work for me is very much in the > > > hobby work catagory but i > > > did enjoy using nova with libvirt/lxc in the past and would like to see > > > it work again in the future. > > > it was quite nice for doing nested devstack without the overhead of two > > > levels of vms. although when you > > > get nested virt working its close from a performance point of view but > > > still uses more memory then lxc. > > > > > > the lxc job itself in partalar is not much work to enabel the issue is > > > that we cant actully pass tempest with lxc > > > currently the cloud init fix will help but we also need to modify > > > devstack to create a useable lxc image which is > > > also relitivly trivial just need to be done. 
> > > > > > > > > > > > > > -melanie > > > > > > > > > > > [1] > > > > > > > > > > > > http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-287 > > > > > > > > > > > > [2] https://review.opendev.org/#/c/667976/ > > > > > > [3] https://review.opendev.org/#/c/687827/ > > > > > > [4] > > > > > > > > > > > > https://github.com/openstack/nova/blob/1a226aaa9e8c969ddfdfe198c36f7966b1f692f3/nova/virt/libvirt/driver.py#L609 > > > > > > > > > > > > > > > > > From eblock at nde.ag Fri Oct 18 11:29:58 2019 From: eblock at nde.ag (Eugen Block) Date: Fri, 18 Oct 2019 11:29:58 +0000 Subject: [openstack-operators] RBD problems after data center power outage In-Reply-To: <827E7ADD-43A5-440F-AD5E-855006572946@dincercelik.com> References: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> <827E7ADD-43A5-440F-AD5E-855006572946@dincercelik.com> Message-ID: <20191018112958.Horde.ucU5QftVSf29_CNJX57eH6S@webmail.nde.ag> I assumed the header was missing because of this message: > error reading header from > c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or > directory If you can stat the header file can you share the output of rados -p vms listomapvals rbd_header. Are there rbd_data objects left in the pool from that config drive? rados -p images ls | grep rbd_object_map.1cbc666b8b4567 rbd_data.1cbc666b8b4567.0000000000000000 rbd_header.1cbc666b8b4567 If yes, maybe there's a way to set things back together, which I haven't done yet. Are all affected VMs referring to a config drive and is it always the config drive object that's missing? Zitat von Dinçer Çelik : > Hi Eugen, > > I think this is not the same situation with I’m facing because I can > get rbd headers. > > Regards > > /* Please encrypt every message you can. Privacy is your right, > don't let anyone take it from you. */ > > /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ > >> On 18 Oct 2019, at 09:44, Eugen Block wrote: >> >> Hi, >> >> I've recently found this post [1] to recover a failing header, but >> I haven't tried it myself. I'm curios if it works though. >> >> Regards, >> Eugen >> >> https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ >> >> >> Zitat von Dinçer Çelik : >> >>> Greetings, >>> >>> Today I had a data center power outage, and the OpenStack cluster >>> went down. After taking the cluster up again, I cannot start some >>> VMs due to error below. I've tried "rbd object-map rebuild" but it >>> didn't work. What's the proper way to re-create the missing >>> "_disk.config" files? >>> >>> Thanks. >>> >>> [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start >>> libvirt guest: libvirt.libvirtError: internal error: qemu >>> unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z >>> qemu-system-x86_64: -drive >>> file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or >>> directory >>> >>> /* Please encrypt every message you can. Privacy is your right, >>> don't let anyone take it from you. 
*/ >>> >>> /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ >> >> >> >> From geguileo at redhat.com Fri Oct 18 14:43:21 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 18 Oct 2019 16:43:21 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> Message-ID: <20191018144321.ovl4lqv2hxveblcd@localhost> On 17/10, Herve Beraud wrote: > Thanks Ben for your feedbacks. > > I already tried to follow the `remove_external_lock_file` few months ago, > but unfortunately, I don't think we can goes like this with Cinder... > > As Gorka has explained to me few months ago: > > > Those are not the only type of locks we use in Cinder. Those are the > > ones we call "Global locks" and use TooZ so the DLM can be configured > > for Cinder Active-Active. > > > > We also use Oslo's synchronized locks. > > > > More information is available in the Cinder HA dev ref I wrote last > > year. It has a section dedicated to the topic of mutual exclusion and > > the 4 types we currently have in Cinder [1]: > > > > - Database locking using resource states. > > - Process locks. > > - Node locks. > > - Global locks. > > > > As for calling the remove_external_lock_file_with_prefix directly on > > delete, I don't think that's something we can do, as the locks may still > > be in use. Example: > > > > - Start deleting volume -> get lock > > - Try to clone volume -> wait for lock > > - Finish deleting volume -> release and delete lock > > - Cloning recreates the lock when acquiring it > > - Cloning fails because the volume no longer exists but leaves the lock > > So the Cinder workflow and mechanisms seems to definitively forbid to > us the possibility to use the remove features of oslo.concurrency... > > Also like discussed on the review (https://review.opendev.org/#/c/688413), > this issue can't be fixed in the underlying libraries, and I think that if > we want to fix that on stable branches then Cinder need to address it > directly by adding some piece of code who will be triggered if needed and > in a safely manner, in other words, only Cinder can really address it and > remove safely these file. > > See the discussion extract on the review ( > https://review.opendev.org/#/c/688413): > Hi, I've given it some more thought, and I am now on the side of those that argue that "something imperfect" is better than what we currently have, so maybe we can reach some sort of compromise doing the following: - Cleanup locks directory on node start - Remove locks on delete volume/snapshot operation - Remove locks on missing source on create volume (volume/snapshot) It may not cover 100% of the cases, and in a long running system with few deletions will not help, but it should at least alleviate the issue for some users. To illustrate the Cinder part of this approach I have written a WIP patch [1] (and an oslo.concurrency one [3] that is not needed, but would improve the user experience). I haven't bothered to test it yet, but I'll do it if we agree this is a reasonable compromise we are comfortable with. Cheers, Gorka. [1]: https://review.opendev.org/689486 [2]: https://review.opendev.org/689482 > > Thanks Gorka for your feedback, then in view of all the discussions > > about this topic I suppose only Cinder can really address it safely > > on stable branches. 
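[For operators wanting to gauge the scale of the leftovers in the meantime, a sketch of a read-only check; the lock_path shown is an assumption, use whatever [oslo_concurrency]/lock_path is set to in cinder.conf.]

  LOCK_PATH=/var/lib/cinder/tmp   # assumption: check [oslo_concurrency]/lock_path
  ls -l "$LOCK_PATH" | grep -E 'delete_volume|delete_snapshot|detach_volume'
  # Do not blindly remove matches: as described above, a lock can be recreated
  # and held by an in-flight operation (e.g. a clone racing a delete). Only
  # files whose volume/snapshot no longer exists in the cinder DB are safe
  # to clean up.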
> > > > > It is not a safe assumption that *-delete_volume file locks can be > > > removed just because they have not been used in a couple of days. > > > A new volume clone could come in that would use it and then we > > > could have a race condition if the cron job was running. > > > > > > The only way to be sure that it can be removed is checking in the > > > Cinder DB and making sure that the volume has been deleted or it > > > doesn't even exist (DB has been purged). > > > > > > Same thing with detach_volume, delete_snapshot, and those that are > > > directly volume ids locks. > > > > I definitely think that it can't be fixed in the underlying > > libraries like Eric has suggested [1], indeed, as you has explained > > only Cinder can know if a lock file can be removed safely. > > > > > In my opinion the fix should be done in fasteners, or we should add > > > code in Cinder that cleans up all locks related to a volume or > > > snapshot when this one is deleted. > > > > I agree the most better solution is to fix the root cause and so to > > fix fasteners, but I don't think it's can be backported to stable > > branches because we will need to bump a requirement version on > > stable branche in this case and also because it'll introduce new > > features, so I guess Cinder need to add some code to remove these > > files and possibly backport it to stable branches. > > > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009563.html > > The Fasteners fix IMHO can only be used by future versions of openstack, > due to the version bump and due to the new features added. I think that it > could be available only from the ussuri or future cycle like V. > > The main goal of the cron approach was to definitively avoid to unearth > this topic each 6 months, try to address it on stable branches, and try to > take care of the file system usage even if it's a theoretical issue, but by > getting feedbacks from the Cinder team and their warnings I don't think > that this track is still followable. > > Definitely, this is not an oslo.concurrency bug. Anyway your proposed > "Administrator Guide" is a must to have, to track things in one place, > inform users and avoid to spend time to explain the same things again and > again about this topic... so it's worth-it. I'll review it and propose my > related knowledge on this topic. oslo.concurrency can't address this safely > because we risk to introduce race conditions and worse situations than the > leftover lock files. > > So, due to all these elements, only cinder can address it for the moment > and for fix that on stable branches too. > > Le mer. 16 oct. 2019 à 00:15, Ben Nemec a écrit : > > > In the interest of not having to start this discussion from scratch > > every time, I've done a bit of a brain dump into > > https://review.opendev.org/#/c/688825/ that covers why things are the > > way they are and what we recommend people do about it. Please take a > > look and let me know if you see any issues with it. > > > > Thanks. 
> > > > -Ben > > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- From mthode at mthode.org Fri Oct 18 16:06:09 2019 From: mthode at mthode.org (Matthew Thode) Date: Fri, 18 Oct 2019 11:06:09 -0500 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: <20191018160609.eadalm2qwwpjsigc@mthode.org> On 19-10-18 12:19:12, Dmitry Tantsur wrote: > Hi all, > > I think we should update global-requirement (on master and train) to > exclude sushy 1.9.0, like > > sushy!=1.9.0 > > Since train has >=1.9.0 currently, it will be a good excuse to change it to > 2.0.0. > > I'll leave the final word to the stable team though. > > Dmitry > > On Wed, Oct 16, 2019 at 3:17 AM wrote: > > > Hi, > > > > The Ironic Train release can be broken due to an entry in its > > driver-requirements.txt. driver-requirements.txt defines a dependency on > > the sushy package [1] which can be satisfied by version 1.9.0. > > Unfortunately, that version contains a few bugs which prevent Ironic from > > being able to manage Dell EMC and perhaps other vendors' bare metal > > hardware with its Redfish hardware type (driver). The fixes to them > > [2][3][4] were merged into master before the creation of stable/train. > > Therefore, they are available on stable/train and in the last sushy release > > created during the Train cycle, 2.0.0, the only other version which can > > satisfy the dependency today. However, consumers -- packagers, operators, > > and users -- could, fighting time constraints or lacking solid visibility > > into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the > > dependency, but, in so doing, unknowingly render the package or > > installation severely broken. > > > > A change [5] has been proposed as part of a prospective solution to this > > issue. It creates a new release of sushy from the change which fixes the > > first bug [2]. Review comments [6] discuss basing the new release on a more > > recent stable/train change to pick up other bug fixes and, less > > importantly, backward compatible feature modifications and enhancements > > which merged before the change from which 2.0.0 was created. Backward > > compatible feature modifications and enhancements are interspersed in time > > among the bug fixes. Once a new release is available, the sushy entry in > > driver-requirements.txt on stable/train would be updated. 
However, > > apparently, the stable branch policy prevents releases from being done at a > > point earlier than the last release within a given cycle [6], which was > > 2.0.0. > > > > Another possible resolution which comes to mind is to change the > > definition of the sushy dependency in driver-requirements.txt [1] from > > "sushy>=1.9.0" to "sushy>=2.0.0". > > > > Does anyone have a suggestion on how to proceed? > > > > Thank you, > > Rick > > > > > > [1] > > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > > [2] https://review.opendev.org/#/c/666253/ > > [3] https://review.opendev.org/#/c/668936/ > > [4] https://review.opendev.org/#/c/669889/ > > [5] https://review.opendev.org/#/c/688551/ > > [6] > > https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > > > > > Excluding known bad versions to efectively update the minimum constraint IS allowed by policy as far as I know (from a reqs team perspective). So this sgtm. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From doug at doughellmann.com Fri Oct 18 17:36:37 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 18 Oct 2019 13:36:37 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <20191018160609.eadalm2qwwpjsigc@mthode.org> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> Message-ID: <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> > On Oct 18, 2019, at 12:06 PM, Matthew Thode wrote: > > On 19-10-18 12:19:12, Dmitry Tantsur wrote: >> Hi all, >> >> I think we should update global-requirement (on master and train) to >> exclude sushy 1.9.0, like >> >> sushy!=1.9.0 >> >> Since train has >=1.9.0 currently, it will be a good excuse to change it to >> 2.0.0. >> >> I'll leave the final word to the stable team though. >> >> Dmitry >> >> On Wed, Oct 16, 2019 at 3:17 AM wrote: >> >>> Hi, >>> >>> The Ironic Train release can be broken due to an entry in its >>> driver-requirements.txt. driver-requirements.txt defines a dependency on >>> the sushy package [1] which can be satisfied by version 1.9.0. >>> Unfortunately, that version contains a few bugs which prevent Ironic from >>> being able to manage Dell EMC and perhaps other vendors' bare metal >>> hardware with its Redfish hardware type (driver). The fixes to them >>> [2][3][4] were merged into master before the creation of stable/train. >>> Therefore, they are available on stable/train and in the last sushy release >>> created during the Train cycle, 2.0.0, the only other version which can >>> satisfy the dependency today. However, consumers -- packagers, operators, >>> and users -- could, fighting time constraints or lacking solid visibility >>> into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the >>> dependency, but, in so doing, unknowingly render the package or >>> installation severely broken. >>> >>> A change [5] has been proposed as part of a prospective solution to this >>> issue. It creates a new release of sushy from the change which fixes the >>> first bug [2]. 
Review comments [6] discuss basing the new release on a more >>> recent stable/train change to pick up other bug fixes and, less >>> importantly, backward compatible feature modifications and enhancements >>> which merged before the change from which 2.0.0 was created. Backward >>> compatible feature modifications and enhancements are interspersed in time >>> among the bug fixes. Once a new release is available, the sushy entry in >>> driver-requirements.txt on stable/train would be updated. However, >>> apparently, the stable branch policy prevents releases from being done at a >>> point earlier than the last release within a given cycle [6], which was >>> 2.0.0. >>> >>> Another possible resolution which comes to mind is to change the >>> definition of the sushy dependency in driver-requirements.txt [1] from >>> "sushy>=1.9.0" to "sushy>=2.0.0". >>> >>> Does anyone have a suggestion on how to proceed? >>> >>> Thank you, >>> Rick >>> >>> >>> [1] >>> https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 >>> [2] https://review.opendev.org/#/c/666253/ >>> [3] https://review.opendev.org/#/c/668936/ >>> [4] https://review.opendev.org/#/c/669889/ >>> [5] https://review.opendev.org/#/c/688551/ >>> [6] >>> https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 >>> >>> >>> > > Excluding known bad versions to efectively update the minimum constraint > IS allowed by policy as far as I know (from a reqs team perspective). > So this sgtm. > > -- > Matthew Thode I agree, we should exclude the bad version in constraints. But we shouldn’t *only* do that as a way to side-step our other policies and raise the minimum version. We don’t typically change the minimum version of a dependency just because of a bug fix. In this case, though, we have what sounds like a significant incompatibility, and IIRC we have allowed updates in the past to resolve those problems. In this case, I think it’s safe to call the current dependency setting for sushy in the Ironic stable/train branch a bug (in ironic) and to allow that minimum to be updated without considering it a break in the stable policy, because Ironic is broken without the update. Normally we would want the new release after the dependency update to be a feature update (the Y in X.Y.Z) because we haven’t added an incompatible dependency or changed the API but we have updated the dependencies. So, I think that means we will need an Ironic 2.1.0 release off of stable/train after the dependency is fixed (rather than 2.0.1 for a bug fix or 3.0.0 for a major update). 
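For readers following along, the two requirement changes being discussed would look roughly like this (a sketch only; the exact bounds and comment wording are up to the ironic and requirements teams):

# openstack/requirements global-requirements.txt (sketch)
sushy!=1.9.0  # 1.9.0 cannot manage some Redfish BMCs

# ironic driver-requirements.txt on stable/train (sketch)
sushy>=2.0.0  # raise the floor past the broken release

Either form keeps installers away from sushy 1.9.0; the first merely excludes it, while the second raises the minimum as described above.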
Doug From doug at doughellmann.com Fri Oct 18 17:46:08 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 18 Oct 2019 13:46:08 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> Message-ID: <6607B630-EDAE-48F4-BFFB-F3FCA39FDD3B@doughellmann.com> > On Oct 18, 2019, at 1:36 PM, Doug Hellmann wrote: > > > >> On Oct 18, 2019, at 12:06 PM, Matthew Thode wrote: >> >> On 19-10-18 12:19:12, Dmitry Tantsur wrote: >>> Hi all, >>> >>> I think we should update global-requirement (on master and train) to >>> exclude sushy 1.9.0, like >>> >>> sushy!=1.9.0 >>> >>> Since train has >=1.9.0 currently, it will be a good excuse to change it to >>> 2.0.0. >>> >>> I'll leave the final word to the stable team though. >>> >>> Dmitry >>> >>> On Wed, Oct 16, 2019 at 3:17 AM wrote: >>> >>>> Hi, >>>> >>>> The Ironic Train release can be broken due to an entry in its >>>> driver-requirements.txt. driver-requirements.txt defines a dependency on >>>> the sushy package [1] which can be satisfied by version 1.9.0. >>>> Unfortunately, that version contains a few bugs which prevent Ironic from >>>> being able to manage Dell EMC and perhaps other vendors' bare metal >>>> hardware with its Redfish hardware type (driver). The fixes to them >>>> [2][3][4] were merged into master before the creation of stable/train. >>>> Therefore, they are available on stable/train and in the last sushy release >>>> created during the Train cycle, 2.0.0, the only other version which can >>>> satisfy the dependency today. However, consumers -- packagers, operators, >>>> and users -- could, fighting time constraints or lacking solid visibility >>>> into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the >>>> dependency, but, in so doing, unknowingly render the package or >>>> installation severely broken. >>>> >>>> A change [5] has been proposed as part of a prospective solution to this >>>> issue. It creates a new release of sushy from the change which fixes the >>>> first bug [2]. Review comments [6] discuss basing the new release on a more >>>> recent stable/train change to pick up other bug fixes and, less >>>> importantly, backward compatible feature modifications and enhancements >>>> which merged before the change from which 2.0.0 was created. Backward >>>> compatible feature modifications and enhancements are interspersed in time >>>> among the bug fixes. Once a new release is available, the sushy entry in >>>> driver-requirements.txt on stable/train would be updated. However, >>>> apparently, the stable branch policy prevents releases from being done at a >>>> point earlier than the last release within a given cycle [6], which was >>>> 2.0.0. >>>> >>>> Another possible resolution which comes to mind is to change the >>>> definition of the sushy dependency in driver-requirements.txt [1] from >>>> "sushy>=1.9.0" to "sushy>=2.0.0". >>>> >>>> Does anyone have a suggestion on how to proceed? 
>>>> >>>> Thank you, >>>> Rick >>>> >>>> >>>> [1] >>>> https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 >>>> [2] https://review.opendev.org/#/c/666253/ >>>> [3] https://review.opendev.org/#/c/668936/ >>>> [4] https://review.opendev.org/#/c/669889/ >>>> [5] https://review.opendev.org/#/c/688551/ >>>> [6] >>>> https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 >>>> >>>> >>>> >> >> Excluding known bad versions to efectively update the minimum constraint >> IS allowed by policy as far as I know (from a reqs team perspective). >> So this sgtm. >> >> -- >> Matthew Thode > > I agree, we should exclude the bad version in constraints. But we shouldn’t *only* do that as a way to side-step our other policies and raise the minimum version. > > We don’t typically change the minimum version of a dependency just because of a bug fix. In this case, though, we have what sounds like a significant incompatibility, and IIRC we have allowed updates in the past to resolve those problems. > > In this case, I think it’s safe to call the current dependency setting for sushy in the Ironic stable/train branch a bug (in ironic) and to allow that minimum to be updated without considering it a break in the stable policy, because Ironic is broken without the update. > > Normally we would want the new release after the dependency update to be a feature update (the Y in X.Y.Z) because we haven’t added an incompatible dependency or changed the API but we have updated the dependencies. So, I think that means we will need an Ironic 2.1.0 release off of stable/train after the dependency is fixed (rather than 2.0.1 for a bug fix or 3.0.0 for a major update). > > Doug > Sorry, ignore my version number math there. I picked up the sushy version number rather than the ironic version number. Just do a feature release of ironic on stable/train, whatever the right number is. Doug From sean.mcginnis at gmx.com Fri Oct 18 19:14:39 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 18 Oct 2019 14:14:39 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: <20191018144321.ovl4lqv2hxveblcd@localhost> References: <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> <20191018144321.ovl4lqv2hxveblcd@localhost> Message-ID: <20191018191439.GA26163@sm-workstation> > > > > Hi, > > I've given it some more thought, and I am now on the side of those that > argue that "something imperfect" is better than what we currently have, > so maybe we can reach some sort of compromise doing the following: > > - Cleanup locks directory on node start Didn't we determine this was not safe since multiple services can be configured to use the same lock directory? In fact, that was the recommended configuration I think when running Cinder and Nova services on the same node so os-brick locks would actually work right (or something like that, it was awhile ago now). 
> - Remove locks on delete volume/snapshot operation > - Remove locks on missing source on create volume (volume/snapshot) +1 From sean.mcginnis at gmx.com Fri Oct 18 19:18:28 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 18 Oct 2019 14:18:28 -0500 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> Message-ID: <20191018191828.GB26163@sm-workstation> > > > > Excluding known bad versions to efectively update the minimum constraint > > IS allowed by policy as far as I know (from a reqs team perspective). > > So this sgtm. > > > > -- > > Matthew Thode > > I agree, we should exclude the bad version in constraints. But we shouldn’t *only* do that as a way to side-step our other policies and raise the minimum version. > > We don’t typically change the minimum version of a dependency just because of a bug fix. In this case, though, we have what sounds like a significant incompatibility, and IIRC we have allowed updates in the past to resolve those problems. > > In this case, I think it’s safe to call the current dependency setting for sushy in the Ironic stable/train branch a bug (in ironic) and to allow that minimum to be updated without considering it a break in the stable policy, because Ironic is broken without the update. +1, I think this should be fine for a stable backport. The feature version bump will be a good indication to anyone picking it up that something (semi)significant has changed, so they should pay attention. Might be a good idea to include a release note with the change to make sure it is called out in a way that consumers will be aware of the change. > > Normally we would want the new release after the dependency update to be a feature update (the Y in X.Y.Z) because we haven’t added an incompatible dependency or changed the API but we have updated the dependencies. So, I think that means we will need an Ironic 2.1.0 release off of stable/train after the dependency is fixed (rather than 2.0.1 for a bug fix or 3.0.0 for a major update). > > Doug > > > From hello at dincercelik.com Fri Oct 18 19:22:12 2019 From: hello at dincercelik.com (=?utf-8?B?RGluw6dlciDDh2VsaWs=?=) Date: Fri, 18 Oct 2019 22:22:12 +0300 Subject: [openstack-operators] RBD problems after data center power outage In-Reply-To: <20191018112958.Horde.ucU5QftVSf29_CNJX57eH6S@webmail.nde.ag> References: <20191018064434.Horde.B1S61lbGw5Cyp2me9iOzESF@webmail.nde.ag> <827E7ADD-43A5-440F-AD5E-855006572946@dincercelik.com> <20191018112958.Horde.ucU5QftVSf29_CNJX57eH6S@webmail.nde.ag> Message-ID: <2E9FD81A-3855-4C13-8D28-3BC4BFD16C01@dincercelik.com> Hi, I've fixed the issue. First, I would like to thank to all Ceph developers for making it bulletproof. The root cause was "force_config_drive"[1] option of Nova that I had enabled few weeks ago. When you enable this option, Nova creates a new disk with the same name ending with ".config". The reason why I had enabled this option is, I am facing dhcp related issues sometimes. Temporary disabling this option fixed the issue. Regards [1] https://docs.openstack.org/nova/stein/configuration/config.html#DEFAULT.force_config_drive /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. 
*/ /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ > On 18 Oct 2019, at 14:29, Eugen Block wrote: > > I assumed the header was missing because of this message: > >> error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory > > If you can stat the header file can you share the output of > > rados -p vms listomapvals rbd_header. > > Are there rbd_data objects left in the pool from that config drive? > > rados -p images ls | grep > rbd_object_map.1cbc666b8b4567 > rbd_data.1cbc666b8b4567.0000000000000000 > rbd_header.1cbc666b8b4567 > > If yes, maybe there's a way to set things back together, which I haven't done yet. Are all affected VMs referring to a config drive and is it always the config drive object that's missing? > > > Zitat von Dinçer Çelik : > >> Hi Eugen, >> >> I think this is not the same situation with I’m facing because I can get rbd headers. >> >> Regards >> >> /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ >> >> /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ >> >>> On 18 Oct 2019, at 09:44, Eugen Block wrote: >>> >>> Hi, >>> >>> I've recently found this post [1] to recover a failing header, but I haven't tried it myself. I'm curios if it works though. >>> >>> Regards, >>> Eugen >>> >>> https://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/ >>> >>> >>> Zitat von Dinçer Çelik : >>> >>>> Greetings, >>>> >>>> Today I had a data center power outage, and the OpenStack cluster went down. After taking the cluster up again, I cannot start some VMs due to error below. I've tried "rbd object-map rebuild" but it didn't work. What's the proper way to re-create the missing "_disk.config" files? >>>> >>>> Thanks. >>>> >>>> [instance: c2b54eac-179b-4907-9d61-8e075edc21cf] Failed to start libvirt guest: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-10-17T23:19:41.103720Z qemu-system-x86_64: -drive file=rbd:vms/c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config:id=nova:auth_supported=cephx\;none:mon_host=10.250.129.10\:6789\;10.250.129.11\:6789\;10.250.129.12\:6789\;10.250.129.15\:6789,file.password-secret=ide0-0-0-secret0,format=raw,if=none,id=drive-ide0-0-0,readonly=on,cache=writeback,discard=unmap: error reading header from c2b54eac-179b-4907-9d61-8e075edc21cf_disk.config: No such file or directory >>>> >>>> /* Please encrypt every message you can. Privacy is your right, don't let anyone take it from you. */ >>>> >>>> /* My fingerprint is: 5E50 ABB0 F108 24DA 10CC BD43 D2AE DD2A 7893 0EAA */ >>> >>> >>> >>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Fri Oct 18 19:31:43 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 18 Oct 2019 12:31:43 -0700 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <20191018191828.GB26163@sm-workstation> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <5AA858C6-0F6C-433D-86DC-5067C89C072E@doughellmann.com> <20191018191828.GB26163@sm-workstation> Message-ID: Since there seems to be agreement, the change to require the bump has been posted to master branch[0]. Once backported and merged to the stable branch we can go ahead and release 13.1.0. 
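For anyone helping to move that along: once the master change merges, the stable change is the usual cherry-pick workflow (a sketch, assuming the standard git-review setup; <master-commit-sha> and the local branch name are placeholders):

git fetch origin
git checkout -b train-sushy-requirement origin/stable/train
git cherry-pick -x <master-commit-sha>
git review stable/train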
-Julia [0]: https://review.opendev.org/#/c/687983/ On Fri, Oct 18, 2019 at 12:23 PM Sean McGinnis wrote: > > > > > > > Excluding known bad versions to efectively update the minimum constraint > > > IS allowed by policy as far as I know (from a reqs team perspective). > > > So this sgtm. > > > > > > -- > > > Matthew Thode > > > > I agree, we should exclude the bad version in constraints. But we shouldn’t *only* do that as a way to side-step our other policies and raise the minimum version. > > > > We don’t typically change the minimum version of a dependency just because of a bug fix. In this case, though, we have what sounds like a significant incompatibility, and IIRC we have allowed updates in the past to resolve those problems. > > > > In this case, I think it’s safe to call the current dependency setting for sushy in the Ironic stable/train branch a bug (in ironic) and to allow that minimum to be updated without considering it a break in the stable policy, because Ironic is broken without the update. > > +1, I think this should be fine for a stable backport. > > The feature version bump will be a good indication to anyone picking it up that > something (semi)significant has changed, so they should pay attention. > > Might be a good idea to include a release note with the change to make sure it > is called out in a way that consumers will be aware of the change. > > > > > Normally we would want the new release after the dependency update to be a feature update (the Y in X.Y.Z) because we haven’t added an incompatible dependency or changed the API but we have updated the dependencies. So, I think that means we will need an Ironic 2.1.0 release off of stable/train after the dependency is fixed (rather than 2.0.1 for a bug fix or 3.0.0 for a major update). > > > > Doug > > > > > > > From openstack at fried.cc Fri Oct 18 22:18:25 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 18 Oct 2019 17:18:25 -0500 Subject: [oslo][security] Are config files vetted for ownership/permissions? Message-ID: When $service loads up a config file like /etc/nova/nova.conf via oslo.config, is there anything that makes sure the dir and/or file are owned by the process user/group and have appropriate permissions? E.g. to prevent $hacker from modifying/replacing config opts and making $service do horrible things to my system/cloud. (I'm less concerned with $hacker seeing passwords etc., though I expect we would be accounting for both or neither.) efried . From tony at bakeyournoodle.com Fri Oct 18 23:21:10 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Sat, 19 Oct 2019 10:21:10 +1100 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <20191018160609.eadalm2qwwpjsigc@mthode.org> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> Message-ID: <20191018232110.GH8065@thor.bakeyournoodle.com> On Fri, Oct 18, 2019 at 11:06:09AM -0500, Matthew Thode wrote: > Excluding known bad versions to efectively update the minimum constraint > IS allowed by policy as far as I know (from a reqs team perspective). > So this sgtm. This is not allowed with the stable policy. The only policy exception fro raising a minim requirement is because of a security issue. I think The VMT classification was A or B? but I'm fuzzy on that. I say this only to clarify the reasons for allowing fixes like this, not as a statement about this specific case. 
In this specific case we have: $ tools/grep-all.sh sushy Requirements ------------ 1.2.0-2605-g9806c0bb : sushy # Apache-2.0 origin/master : sushy # Apache-2.0 origin/stable/train : sushy # Apache-2.0 Constraints ----------- 1.2.0-2605-g9806c0bb : sushy===2.0.0 origin/master : sushy===2.0.0 origin/stable/train : sushy===2.0.0 $ So *iff* distros (Debian, UCA, RDO, OpenSuse) are already packaging 2.0.0 for *train* I'm okay with plan to: Master: * Exclude 1.9.0 from global-requirements * Bump the acceptable lower-constraints in ironic Train: * Backport the above changes * release ironic X.(y+1).0 I don't have the where-with-all to do the checking right now, but if someone else does I'll sign off on things (hint hint ;P) If the pre-condition isn't met we can investigate the impact of meeting it vs the UX for the train release. I'm aware we want to get this fixed ASAP Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Fri Oct 18 23:34:50 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Sat, 19 Oct 2019 10:34:50 +1100 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <6ad1f914-c43e-5ae8-57fc-51d3e000b953@nemebean.com> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> <6ad1f914-c43e-5ae8-57fc-51d3e000b953@nemebean.com> Message-ID: <20191018233450.GI8065@thor.bakeyournoodle.com> On Mon, Oct 14, 2019 at 10:52:56AM -0500, Ben Nemec wrote: > Okay, circling back to wrap this topic up. It sounds like this is a pretty > big win in terms of avoiding random failures either from trying to migrate a > VM with nested guests on older qemu or using newer qemu with older > OpenStack. Since it's a pretty simple patch and it allows our stable > branches to behave more sanely, I'm inclined to go with the backport. If > anyone strongly objects, please let me know ASAP before we release it. It's a little strange but from a stable POV it's okay to backport. We had to do something similar after meltdown, as kernel fixes for that introduced 'anti-features we needed to account for that clearly didn't exist when we built the older releases of OpenStack. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From fungi at yuggoth.org Fri Oct 18 23:47:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 18 Oct 2019 23:47:29 +0000 Subject: [oslo][security] Are config files vetted for ownership/permissions? In-Reply-To: References: Message-ID: <20191018234729.p3dmot3e635o6asc@yuggoth.org> On 2019-10-18 17:18:25 -0500 (-0500), Eric Fried wrote: > When $service loads up a config file like /etc/nova/nova.conf via > oslo.config, is there anything that makes sure the dir and/or file are > owned by the process user/group and have appropriate permissions? E.g. > to prevent $hacker from modifying/replacing config opts and making > $service do horrible things to my system/cloud. (I'm less concerned with > $hacker seeing passwords etc., though I expect we would be accounting > for both or neither.) As with most software, taking care of this is generally up to whoever implements deployment and packaging solutions. 
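(For illustration, the kind of check being asked about is nothing more exotic than verifying ownership and mode on the files the service reads -- not something oslo.config is known to do itself, just what an operator or deployment tool can assert by hand:

stat -c '%U:%G %a' /etc/nova/nova.conf   # e.g. expect nova:nova 640
ls -ld /etc/nova                         # directory owner/group/mode

Baking that in is exactly the sort of thing deployment and packaging tooling handles.)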
Those are in the best position to know what user and group IDs have been created for this purpose, and to set permissions and ownership for them accordingly. If you're asking whether any of our software implements "this conffile's permissions are too loose!" warnings (like how OpenSSH refuses to start if your private key is world-readable), I'm not aware of any, no. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From juliaashleykreger at gmail.com Sat Oct 19 00:52:10 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 18 Oct 2019 17:52:10 -0700 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <20191018232110.GH8065@thor.bakeyournoodle.com> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <20191018232110.GH8065@thor.bakeyournoodle.com> Message-ID: Debian, RDO Project, and OpenSuse Train packaging pipelines show 2.0.0. UCA doesn't seem to have train packaging at this point. -Julia On Fri, Oct 18, 2019 at 4:23 PM Tony Breeds wrote: > > On Fri, Oct 18, 2019 at 11:06:09AM -0500, Matthew Thode wrote: > > > Excluding known bad versions to efectively update the minimum constraint > > IS allowed by policy as far as I know (from a reqs team perspective). > > So this sgtm. > > This is not allowed with the stable policy. The only policy exception > fro raising a minim requirement is because of a security issue. I > think The VMT classification was A or B? but I'm fuzzy on that. > > > I say this only to clarify the reasons for allowing fixes like this, not > as a statement about this specific case. > > In this specific case we have: > > $ tools/grep-all.sh sushy > > Requirements > ------------ > 1.2.0-2605-g9806c0bb : sushy # Apache-2.0 > origin/master : sushy # Apache-2.0 > origin/stable/train : sushy # Apache-2.0 > > > Constraints > ----------- > 1.2.0-2605-g9806c0bb : sushy===2.0.0 > origin/master : sushy===2.0.0 > origin/stable/train : sushy===2.0.0 > > $ > > So *iff* distros (Debian, UCA, RDO, OpenSuse) are already packaging > 2.0.0 for *train* I'm okay with plan to: > > Master: > * Exclude 1.9.0 from global-requirements > * Bump the acceptable lower-constraints in ironic > Train: > * Backport the above changes > * release ironic X.(y+1).0 > > I don't have the where-with-all to do the checking right now, but if > someone else does I'll sign off on things (hint hint ;P) > > If the pre-condition isn't met we can investigate the impact of meeting > it vs the UX for the train release. I'm aware we want to get this fixed > ASAP > > Yours Tony. From fsbiz at yahoo.com Sat Oct 19 01:21:03 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Sat, 19 Oct 2019 01:21:03 +0000 (UTC) Subject: [neutron]: How to find the version of the L2 agent In-Reply-To: References: <2094645143.3124047.1571338299474.ref@mail.yahoo.com> <2094645143.3124047.1571338299474@mail.yahoo.com> Message-ID: <81640440.3796942.1571448063876@mail.yahoo.com> Thank you. Exactly what I wanted. Regards,Fred. On Friday, October 18, 2019, 12:56:23 AM PDT, Lajos Katona wrote: Hi, Similarly on the host where your agent is running you can use the command neutron-linuxbridge-agent --version to check the agent's version. The same works for l3-agent and other deployed neutron (openstack?) 
related things as well (i.e.: neutron-dhcp-agent --version) Regards Lajos fsbiz at yahoo.com wrote (on Thu, 17 Oct 2019 at 20:55): > > I know neutron-server --version  gives me the version of the neutron server currently running. > For the recent queens release it is 12.1.0 > > Is there a command or easy way to check the version of the L2 agent on the compute hosts? > > We are running the neutron-linuxbridge-agent. > > thanks, > Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Sat Oct 19 01:28:24 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Sat, 19 Oct 2019 12:28:24 +1100 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <20191018232110.GH8065@thor.bakeyournoodle.com> Message-ID: <20191019012824.GJ8065@thor.bakeyournoodle.com> On Fri, Oct 18, 2019 at 05:52:10PM -0700, Julia Kreger wrote: > Debian, RDO Project, and OpenSuse Train packaging pipelines show > 2.0.0. UCA doesn't seem to have train packaging at this point. UCA for train is here: https://launchpad.net/~ubuntu-cloud-archive/+archive/ubuntu/train-staging it doesn't seem to have a python-sushy package, which means it's getting the one from bionic[1], which already violates the lower constraint for ironic. Even 'eoan' is at 1.8.1? I've added James to CC to see if he can clarify my thinking and/or arrange to update sushy. Tony. [1] https://packages.ubuntu.com/search?keywords=python-sushy -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From colleen at gazlene.net Sat Oct 19 03:22:01 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 18 Oct 2019 20:22:01 -0700 Subject: [keystone] Keystone Team Update - Week of 14 October 2019 Message-ID: <92a536e9-f5a3-4d4c-9d6a-9bd36369b980@www.fastmail.com> # Keystone Team Update - Week of 14 October 2019 ## News ### Train Released Keystone has been successfully released for the Train cycle! Congratulations everyone on a smooth release.
Check out the release notes[1] for details on what's new in this release. [1] https://docs.openstack.org/releasenotes/keystone/train.html ## Office Hours When there are topics to cover, the keystone team holds office hours on Tuesdays at 17:00 UTC. No office hours next week due to no topics. Add topics you would like to see covered during office hours to the etherpad: https://etherpad.openstack.org/p/keystone-office-hours-topics ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 21 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 39 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ### Priority Reviews * Roadmap Stories - Keystone to Keystone Testing https://tree.taiga.io/project/keystone-ussuri-roadmap/us/39 + https://review.opendev.org/580041 Keystone to Keystone tests + https://review.opendev.org/689208 Use up-to-date federation job names + https://review.opendev.org/689222 Add option to disable testing against external idp + https://review.opendev.org/689223 Add voting k2k tests - Federated Attributes for Users https://tree.taiga.io/project/keystone-ussuri-roadmap/us/32 + https://review.opendev.org/448755 Add federated support for creating a user + https://review.opendev.org/448730 https://review.opendev.org/448730 * Needs Discussion - https://review.opendev.org/687753 Drop project.id foreign keys - https://review.opendev.org/687756 Revert "Resource backend is SQL only now" * Oldest - https://review.opendev.org/373983 OpenID Connect improved support * Special Requests - https://review.opendev.org/687362 Set up for Ussuri - https://review.opendev.org/689558 Fix line-length PEP8 errors for c7fae97 - https://review.opendev.org/689559 Re-enable line-length linter * Closes Bugs - https://review.opendev.org/687990 Stop adding entry in local_user while updating ephemerals - https://review.opendev.org/688939 Remove group deletion for non-sql driver when removing domains. - https://review.opendev.org/685042 Fetch discovery documents with auth when needed ## Bugs This week we opened 5 new bugs and closed 2. Bugs opened (5) Bug #1848238 (keystone:Medium) opened by Sami Makki https://bugs.launchpad.net/keystone/+bug/1848238 Bug #1848342 (keystone:Medium) opened by Pedro Henrique Pereira Martins https://bugs.launchpad.net/keystone/+bug/1848342 Bug #1848400 (keystone:Undecided) opened by Eric Xie https://bugs.launchpad.net/keystone/+bug/1848400 Bug #1848470 (keystone:Undecided) opened by yongyi,ren https://bugs.launchpad.net/keystone/+bug/1848470 Bug #1848625 (keystone:Undecided) opened by David Coronel https://bugs.launchpad.net/keystone/+bug/1848625 Bugs closed (2) Bug #1848400 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1848400 Bug #1848625 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1848625 Bugs fixed (0) ## Milestone Outlook https://releases.openstack.org/ussuri/schedule.html Keystone-specific deadlines have been proposed[2]. [2] https://review.opendev.org/689549 ## Shout-outs Thanks to the whole team for making this a smooth release! 
## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From info at dantalion.nl Sat Oct 19 06:41:12 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Sat, 19 Oct 2019 08:41:12 +0200 Subject: [oslo][futurist] Possible bug in futurist waiters.wait_for_any In-Reply-To: <92a536e9-f5a3-4d4c-9d6a-9bd36369b980@www.fastmail.com> References: <92a536e9-f5a3-4d4c-9d6a-9bd36369b980@www.fastmail.com> Message-ID: <9352fc18-cba7-c2ea-91c7-f3784ff310a3@dantalion.nl> Hello everyone, I think I have found a significant bug in the futurist concurrency library that is breaking for my application. If I launch a threadpool and submit n > 1 number of tasks. If now for any n of these tasks such a task calls Condition.wait(). Than when waiters.wait_for_any is called from the main thread it will block indefinitely instead of returning the n - 1 tasks that have completed. Furthermore setting the timeout parameter in wait_for_any is subsequently ignored. I have submitted this as a bug report on launchpad: https://bugs.launchpad.net/futurist/+bug/1848457 Any help on this is really appreciated as I think it is a significant issue. Kind regards, Corne Lukken (Dantali0n) PS: I also submitted this over on stackoverflow: https://stackoverflow.com/questions/58410610/calling-condition-wait-inside-thread-causes-retrieval-of-any-future-to-block-o From laurentfdumont at gmail.com Sat Oct 19 17:06:24 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sat, 19 Oct 2019 13:06:24 -0400 Subject: [kolla][ovs][networking] LLDP support within an internal network? Message-ID: Hey everyone, Somewhat of a random question, but is there any built-in support within OVS to carry LLDP traffic between instances running on the same Tenant network? I know that linux-br will eat the LLDP traffic by default. So far, enabling LLDP on the port "qbr" allows the hypervisor to see the instances hosted on that hypervisor. The instances can see the hypervisor as well. That said, instances cannot see each other. All are residing within the same L2 network and are able to have L2 between each other. I am seeing the LLDP packets from the hypervisor arriving on the instance internal interface but nothing from the other instances. I am seeing the outgoing LLDP packets on each interfaces as well. Let me know if anything else is required. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Sat Oct 19 17:15:22 2019 From: zigo at debian.org (Thomas Goirand) Date: Sat, 19 Oct 2019 19:15:22 +0200 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: <73ae78bd-1a19-f3e7-6406-364ef8497bd6@debian.org> On 10/16/19 2:55 AM, Richard.Pioso at dell.com wrote: > Hi, > > The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. 
Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. At least in Debian, sushy 2.0.0 is included in the Train release (sushy 2.0.0 was uploaded on the 26th of September). I don't know for other distros. > A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. > > Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? I'm not sure if I understand you correctly. Is sushy 2.0.0 enough? Or should I expect newer tags coming soon? Thomas Goirand (zigo) From geguileo at redhat.com Sat Oct 19 17:44:34 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Sat, 19 Oct 2019 19:44:34 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: <20191018191439.GA26163@sm-workstation> References: <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> <20191018144321.ovl4lqv2hxveblcd@localhost> <20191018191439.GA26163@sm-workstation> Message-ID: <20191019174434.va6vozmoaujwunwy@localhost> On 18/10, Sean McGinnis wrote: > > > > > > > Hi, > > > > I've given it some more thought, and I am now on the side of those that > > argue that "something imperfect" is better than what we currently have, > > so maybe we can reach some sort of compromise doing the following: > > > > - Cleanup locks directory on node start > > Didn't we determine this was not safe since multiple services can be configured > to use the same lock directory? In fact, that was the recommended configuration > I think when running Cinder and Nova services on the same node so os-brick > locks would actually work right (or something like that, it was awhile ago > now). > Hi, Reading it again I see that I wasn't very clear with the explanation, sorry about that. I didn't mean that Cinder would clean up the directory on start, because, like you say, this may be shared with other services. What I meant to say (but didn't) is that the installer (for example TripleO or openstack-ansible) should create a service unit that executes on start and removes all the contents of the file lock directory, and makes all the other OpenStack services dependent on this one. That way we can be sure that we are cleaning up the directory before any service has the opportunity to start using it. Cheers, Gorka. 
> > - Remove locks on delete volume/snapshot operation > > - Remove locks on missing source on create volume (volume/snapshot) > > +1 From radoslaw.piliszek at gmail.com Sat Oct 19 18:06:32 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 19 Oct 2019 20:06:32 +0200 Subject: [kolla][ovs][networking] LLDP support within an internal network? In-Reply-To: References: Message-ID: Hey Laurent, LLDP is terminated at the neighbor. In a physical setup hosts would see the switch and switch would see the hosts but hosts would not see each other. That's LLDP for you. For whatever purpose, you probably want to use a different discovery protocol. PS: This is very irrelevant to Kolla. Kind regards, Radek sob., 19 paź 2019 o 19:09 Laurent Dumont napisał(a): > Hey everyone, > > Somewhat of a random question, but is there any built-in support within > OVS to carry LLDP traffic between instances running on the same Tenant > network? I know that linux-br will eat the LLDP traffic by default. > > So far, enabling LLDP on the port "qbr" allows the hypervisor to see the > instances hosted on that hypervisor. > > The instances can see the hypervisor as well. That said, instances cannot > see each other. All are residing within the same L2 network and are able to > have L2 between each other. I am seeing the LLDP packets from the > hypervisor arriving on the instance internal interface but nothing from the > other instances. I am seeing the outgoing LLDP packets on each interfaces > as well. > > Let me know if anything else is required. > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sat Oct 19 18:53:59 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sat, 19 Oct 2019 14:53:59 -0400 Subject: [kolla][ovs][networking] LLDP support within an internal network? In-Reply-To: References: Message-ID: I thought there might have been some weird magic within OVS to flood the LLDP frames. Thanks! On Sat, Oct 19, 2019 at 2:06 PM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > Hey Laurent, > > LLDP is terminated at the neighbor. > In a physical setup hosts would see the switch and switch would see the > hosts but hosts would not see each other. > That's LLDP for you. > For whatever purpose, you probably want to use a different discovery > protocol. > > PS: This is very irrelevant to Kolla. > > Kind regards, > Radek > > sob., 19 paź 2019 o 19:09 Laurent Dumont > napisał(a): > >> Hey everyone, >> >> Somewhat of a random question, but is there any built-in support within >> OVS to carry LLDP traffic between instances running on the same Tenant >> network? I know that linux-br will eat the LLDP traffic by default. >> >> So far, enabling LLDP on the port "qbr" allows the hypervisor to see the >> instances hosted on that hypervisor. >> >> The instances can see the hypervisor as well. That said, instances cannot >> see each other. All are residing within the same L2 network and are able to >> have L2 between each other. I am seeing the LLDP packets from the >> hypervisor arriving on the instance internal interface but nothing from the >> other instances. I am seeing the outgoing LLDP packets on each interfaces >> as well. >> >> Let me know if anything else is required. >> >> Thanks! >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hongbin034 at gmail.com Mon Oct 21 01:57:30 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sun, 20 Oct 2019 21:57:30 -0400 Subject: [neutron] Bug deputy Oct 14 - Oct 20 Message-ID: Hi all, I am on the bug deputy role last week. Below is the report: High * https://bugs.launchpad.net/neutron/+bug/1848545 Remove PostgreSQL support in Neutron Medium * https://bugs.launchpad.net/neutron/+bug/1848152 Update mtu of network has no validation * https://bugs.launchpad.net/neutron/+bug/1848187 DHCP agent timing out spawning dnsmasq * https://bugs.launchpad.net/neutron/+bug/1848738 Sometimes dnsmasq may not be restarted after adding new subnet * https://bugs.launchpad.net/neutron/+bug/1848213 Do not pass port-range to backend if all ports specified in security group rule Low * https://bugs.launchpad.net/neutron/+bug/1848147 not mocking openstacksdk exception may cause random non detected errors in unit tests * https://bugs.launchpad.net/neutron/+bug/1848201 [neutron-vpnaas] Neutron installed inside venv makes VPNaaS broken * https://bugs.launchpad.net/neutron/+bug/1848220 TestMinBwQoSOvs is not calling the correct methods * https://bugs.launchpad.net/neutron/+bug/1848851 move fwaas_v2_log constants to neutron-lib Wishlist * https://bugs.launchpad.net/neutron/+bug/1848500 Implement an OpenFlow monitor Undecided * https://bugs.launchpad.net/neutron/+bug/1848311 trunk + subports not working * https://bugs.launchpad.net/neutron/+bug/1848131 [FWaaS] Support blacklist filtering Best regards, Hongbin -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Oct 21 08:44:23 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 21 Oct 2019 10:44:23 +0200 Subject: [oslo][futurist] Possible bug in futurist waiters.wait_for_any In-Reply-To: <9352fc18-cba7-c2ea-91c7-f3784ff310a3@dantalion.nl> References: <92a536e9-f5a3-4d4c-9d6a-9bd36369b980@www.fastmail.com> <9352fc18-cba7-c2ea-91c7-f3784ff310a3@dantalion.nl> Message-ID: Hello, Thanks for the heads up. You are right this is related to https://bugs.python.org/issue20319 futurist implement a similar code than the cpython concurrent.future code but not fixed yet. I just submitted a patch to fix that, feel free to review it: - https://review.opendev.org/689691 Thanks Le sam. 19 oct. 2019 à 08:44, info at dantalion.nl a écrit : > Hello everyone, > > I think I have found a significant bug in the futurist concurrency > library that is breaking for my application. > > If I launch a threadpool and submit n > 1 number of tasks. If now for > any n of these tasks such a task calls Condition.wait(). Than when > waiters.wait_for_any is called from the main thread it will block > indefinitely instead of returning the n - 1 tasks that have completed. > Furthermore setting the timeout parameter in wait_for_any is > subsequently ignored. > > I have submitted this as a bug report on launchpad: > https://bugs.launchpad.net/futurist/+bug/1848457 > > Any help on this is really appreciated as I think it is a significant > issue. 
> > Kind regards, > Corne Lukken (Dantali0n) > > PS: I also submitted this over on stackoverflow: > > https://stackoverflow.com/questions/58410610/calling-condition-wait-inside-thread-causes-retrieval-of-any-future-to-block-o > > > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon Oct 21 09:36:58 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 21 Oct 2019 10:36:58 +0100 Subject: Deployment tools capabilities now displayed on openstack.org/software In-Reply-To: References: Message-ID: On Fri, 18 Oct 2019 at 09:40, Thierry Carrez wrote: > > Hi everyone, > > During the Train cycle, we pushed to provide better information to users > of OpenStack on the differences between the various upstream ways to > deploy OpenStack. > > We first defined a number of deployment tools capabilities[1] and then > asked our various deployment tools to fill out which capabilities > applied to them[2]. > > [1] > https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools_capabilities.yaml > [2] > https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools.yaml > > Those capabilities are now displayed on the website at: > https://www.openstack.org/software/project-navigator/deployment-tools > > Thanks for everyone who helped defining and providing this information. > Please feel free to propose improvements through changes to the > osf/openstack-map repository. Thanks for pushing this Thierry, it should be useful. I don't see Kayobe on the list though. Mark > > Cheers, > > -- > Thierry Carrez (ttx) > From thierry at openstack.org Mon Oct 21 10:04:51 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 12:04:51 +0200 Subject: [Release-job-failures] release-post job for openstack/releases for ref refs/heads/master failed In-Reply-To: References: Message-ID: <300d9b91-a19a-e449-5ffe-1af8689e88e7@openstack.org> zuul at openstack.org wrote: > Build failed. > > - tag-releases https://zuul.opendev.org/t/openstack/build/f5018416e3e443af88c0e1c7981bacfd : RETRY_LIMIT in 1m 41s > - publish-tox-docs-static https://zuul.opendev.org/t/openstack/build/None : SKIPPED This is the second similar issue in a row in tag-releases: https://zuul.opendev.org/t/openstack/builds?job_name=tag-releases Root cause seems to be: LOOP [Add origin remote to enable notes fetching] fatal: remote origin already exists. 
Coincides with start of usage of 'prepare-workspace-git' (instead of 'use-cached-repos' which apparently worked better): https://opendev.org/opendev/base-jobs/commit/59f84f00db1c6be614bb713c055b13cf9502f8bd use-cached-repos removed the origin, while prepare-workspace-git replaces it by a zuul fake origin. We should probably tweak: https://opendev.org/openstack/project-config/src/branch/master/playbooks/release/pre.yaml#L4 Once fixed we'll need to reenqueue this one to process the tags correctly. -- Thierry Carrez (ttx) From thierry at openstack.org Mon Oct 21 10:16:19 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 12:16:19 +0200 Subject: [oslo] Adoption of microversion-parse Message-ID: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> Hi Osloites, A recent review of the OpenStack namespace yielded a library that no project team owns, and yet is specific to openstack and present in global-requirements: https://opendev.org/openstack/microversion-parse Ideally it should be adopted by one of our project teams. After discussing it with cdent, and since it is openstack-specific and depended on by most openstack projects, it feels like Oslo would be a good home for it. It would follow an "independent" release model and I don't think it would trigger much updates or maintenance. I can post patches to make it happen if you confirm you're OK to host it. Just keep me posted :) Cheers, -- Thierry Carrez (ttx) From thierry at openstack.org Mon Oct 21 10:19:11 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 12:19:11 +0200 Subject: Deployment tools capabilities now displayed on openstack.org/software In-Reply-To: References: Message-ID: <11f3c242-0a20-c531-1442-c5c7ce4eda8d@openstack.org> Mark Goddard wrote: >> [...] >> Those capabilities are now displayed on the website at: >> https://www.openstack.org/software/project-navigator/deployment-tools >> >> Thanks for everyone who helped defining and providing this information. >> Please feel free to propose improvements through changes to the >> osf/openstack-map repository. > > Thanks for pushing this Thierry, it should be useful. I don't see > Kayobe on the list though. Hmm, that's a bug. I'll raise it with the web team and keep you posted. -- Thierry From emilien at redhat.com Mon Oct 21 10:41:05 2019 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 21 Oct 2019 12:41:05 +0200 Subject: [tripleo] tripleoclient and tripleo-common have stable/train branch In-Reply-To: References: Message-ID: We just branched all the other projects, from now please make sure your patches merged in master are backported into stable/train if needed. Thanks, On Thu, Oct 17, 2019 at 3:07 PM Emilien Macchi wrote: > We branched stable/train for python-tripleoclient and tripleo-common. > Please do the backports to that branch when necessary. > > We'll continue with branching hopefully today or tomorrow. > Let me know if any questions, > -- > Emilien Macchi > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Oct 21 12:06:12 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 14:06:12 +0200 Subject: [Release-job-failures] release-post job for openstack/releases for ref refs/heads/master failed In-Reply-To: <300d9b91-a19a-e449-5ffe-1af8689e88e7@openstack.org> References: <300d9b91-a19a-e449-5ffe-1af8689e88e7@openstack.org> Message-ID: Thierry Carrez wrote: > [...] 
> We should probably tweak: > https://opendev.org/openstack/project-config/src/branch/master/playbooks/release/pre.yaml#L4 Proposed fix at https://review.opendev.org/#/c/689725/ -- Thierry Carrez (ttx) From thierry at openstack.org Mon Oct 21 12:14:46 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 14:14:46 +0200 Subject: [helm][loci][rpm][ironic] Please fill deployment tool capabilities Message-ID: <3dea3216-bc9e-044b-4869-c4e96b09d3a4@openstack.org> OpenStack-Helm, LOCI, RPM-packaging & Bifrost (ironic) folks: We recently started to display deployment tools capabilities over at: https://www.openstack.org/software/project-navigator/deployment-tools However OpenStack-Helm, LOCI, RPM Packaging and Bifrost display only the (default) component:keystone capability, because they were not filled when we we last asked[1][2]. As such, they look a bit sad. Please propose a patch to the following file: https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools.yaml Capabilities are described in detail in: https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools_capabilities.yaml Let me know if you have any question, [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007585.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/008150.html -- Thierry Carrez (ttx) From hberaud at redhat.com Mon Oct 21 12:47:09 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 21 Oct 2019 14:47:09 +0200 Subject: [oslo] Adoption of microversion-parse In-Reply-To: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> Message-ID: Hey Thierry, It make sense to me, I'm personally OK to host it inside oslo. Let's see other oslo's cores feedbacks. Cheers Le lun. 21 oct. 2019 à 12:25, Thierry Carrez a écrit : > Hi Osloites, > > A recent review of the OpenStack namespace yielded a library that no > project team owns, and yet is specific to openstack and present in > global-requirements: > > https://opendev.org/openstack/microversion-parse > > Ideally it should be adopted by one of our project teams. After > discussing it with cdent, and since it is openstack-specific and > depended on by most openstack projects, it feels like Oslo would be a > good home for it. It would follow an "independent" release model and I > don't think it would trigger much updates or maintenance. > > I can post patches to make it happen if you confirm you're OK to host > it. Just keep me posted :) > > Cheers, > > -- > Thierry Carrez (ttx) > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From moguimar at redhat.com Mon Oct 21 13:12:56 2019 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Mon, 21 Oct 2019 15:12:56 +0200 Subject: [oslo] Adoption of microversion-parse In-Reply-To: References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> Message-ID: +1 On Mon, Oct 21, 2019 at 2:48 PM Herve Beraud wrote: > Hey Thierry, > > It make sense to me, I'm personally OK to host it inside oslo. > Let's see other oslo's cores feedbacks. > > Cheers > > Le lun. 21 oct. 2019 à 12:25, Thierry Carrez a > écrit : > >> Hi Osloites, >> >> A recent review of the OpenStack namespace yielded a library that no >> project team owns, and yet is specific to openstack and present in >> global-requirements: >> >> https://opendev.org/openstack/microversion-parse >> >> Ideally it should be adopted by one of our project teams. After >> discussing it with cdent, and since it is openstack-specific and >> depended on by most openstack projects, it feels like Oslo would be a >> good home for it. It would follow an "independent" release model and I >> don't think it would trigger much updates or maintenance. >> >> I can post patches to make it happen if you confirm you're OK to host >> it. Just keep me posted :) >> >> Cheers, >> >> -- >> Thierry Carrez (ttx) >> >> > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil at tigera.io Mon Oct 21 13:55:19 2019 From: neil at tigera.io (Neil Jerram) Date: Mon, 21 Oct 2019 14:55:19 +0100 Subject: [infra] Trying to reproduce an OpenStack CI run Message-ID: I'm trying to reproduce an OpenStack CI run on a fresh Ubuntu Bionic VM, like this: sudo apt-get update sudo apt-get install -y python-all-dev build-essential git libssl-dev ntp ntpdate libre2-dev python-pip sudo pip install virtualenv wget https://1a5e1b314637dd59968a-fddd31f569f44d98bd401bfd88253d97.ssl.cf5.rackcdn.com/682338/1/check/networking-calico-tempest-dsvm/6801298/logs/reproduce.sh chmod +x reproduce.sh sudo ./reproduce.sh The last line runs for a short while but then fails with: [...] 
Successfully built zuul PyYAML voluptuous PrettyTable ansible sqlalchemy alembic cachecontrol psutil fb-re2 paho-mqtt ws4py Mako repoze.lru pycparser Installing collected packages: pbr, six, python-dateutil, uritemplate, urllib3, certifi, chardet, idna, requests, enum34, ipaddress, pycparser, cffi, cryptography, jwcrypto, github3.py, PyYAML, pynacl, bcrypt, paramiko, smmap2, gitdb2, GitPython, docutils, lockfile, python-daemon, extras, statsd, voluptuous, gear, pytz, tzlocal, funcsigs, futures, apscheduler, PrettyTable, babel, MarkupSafe, jinja2, ansible, netaddr, kazoo, sqlalchemy, Mako, python-editor, alembic, msgpack, cachecontrol, pyjwt, iso8601, psutil, fb-re2, paho-mqtt, contextlib2, more-itertools, zc.lockfile, backports.functools-lru-cache, jaraco.functools, tempora, portend, cheroot, cherrypy, ws4py, repoze.lru, routes, zuul Successfully installed GitPython-2.1.14 Mako-1.1.0 MarkupSafe-1.1.1 PrettyTable-0.7.2 PyYAML-5.1.2 alembic-1.2.1 ansible-2.5.15 apscheduler-3.6.1 babel-2.7.0 backports.functools-lru-cache-1.5 bcrypt-3.1.7 cachecontrol-0.12.5 certifi-2019.9.11 cffi-1.13.0 chardet-3.0.4 cheroot-8.2.1 cherrypy-17.4.2 contextlib2-0.6.0.post1 cryptography-2.8 docutils-0.15.2 enum34-1.1.6 extras-1.0.0 fb-re2-1.0.7 funcsigs-1.0.2 futures-3.3.0 gear-0.14.0 gitdb2-2.0.6 github3.py-1.3.0 idna-2.8 ipaddress-1.0.23 iso8601-0.1.12 jaraco.functools-2.0 jinja2-2.10.3 jwcrypto-0.6.0 kazoo-2.6.1 lockfile-0.12.2 more-itertools-5.0.0 msgpack-0.6.2 netaddr-0.7.19 paho-mqtt-1.4.0 paramiko-2.6.0 pbr-5.4.3 portend-2.5 psutil-5.6.3 pycparser-2.19 pyjwt-1.7.1 pynacl-1.3.0 python-daemon-2.0.6 python-dateutil-2.8.0 python-editor-1.0.4 pytz-2019.3 repoze.lru-0.7 requests-2.22.0 routes-2.4.1 six-1.12.0 smmap2-2.0.5 sqlalchemy-1.3.10 statsd-3.3.0 tempora-1.14.1 tzlocal-2.0.0 uritemplate-3.0.0 urllib3-1.25.6 voluptuous-0.11.7 ws4py-0.5.1 zc.lockfile-2.0 zuul-3.2.0 + cat + /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir /opt/git https://opendev.org openstack/devstack-gate ./reproduce.sh: line 112: /usr/zuul-env/bin/zuul-cloner: No such file or directory Can you see what I'm doing wrong? Many thanks, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Mon Oct 21 13:57:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 21 Oct 2019 08:57:19 -0500 Subject: [oslo] Adoption of microversion-parse In-Reply-To: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> Message-ID: <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> On 10/21/19 5:16 AM, Thierry Carrez wrote: > Hi Osloites, > > A recent review of the OpenStack namespace yielded a library that no > project team owns, and yet is specific to openstack and present in > global-requirements: > > https://opendev.org/openstack/microversion-parse > > Ideally it should be adopted by one of our project teams. After > discussing it with cdent, and since it is openstack-specific and > depended on by most openstack projects, it feels like Oslo would be a > good home for it. It would follow an "independent" release model and I > don't think it would trigger much updates or maintenance. > > I can post patches to make it happen if you confirm you're OK to host > it. Just keep me posted :) Makes sense. We probably want to have an independent core team for it in addition to oslo-core so we can add people like Chris to it. 
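For anyone who has not poked at the library yet, it is deliberately tiny: it pulls the requested microversion out of a set of request headers and hands back something comparable. A rough usage sketch (from memory, so treat the helper names and signatures as approximate and check the project docs):

import microversion_parse

# headers as found in a WSGI environ or a client response
headers = {'OpenStack-API-Version': 'compute 2.72'}

# pull out the microversion requested for this service type (None if absent)
raw = microversion_parse.get_version(headers, service_type='compute')

# turn the '2.72' string into a comparable version object
wanted = microversion_parse.parse_version_string(raw)
minimum = microversion_parse.parse_version_string('2.60')
print(wanted >= minimum)

That is more or less the whole surface area, which is why it should not add much maintenance load.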
> > Cheers, > From openstack at fried.cc Mon Oct 21 14:08:41 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 21 Oct 2019 09:08:41 -0500 Subject: [oslo] Adoption of microversion-parse In-Reply-To: <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> Message-ID: > Makes sense. We probably want to have an independent core team for it in > addition to oslo-core so we can add people like Chris to it. I volunteer to help maintain it, if you'll have me. efried . From thierry at openstack.org Mon Oct 21 14:09:45 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 16:09:45 +0200 Subject: [oslo] Adoption of microversion-parse In-Reply-To: <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> Message-ID: Ben Nemec wrote: > > > On 10/21/19 5:16 AM, Thierry Carrez wrote: >> Hi Osloites, >> >> A recent review of the OpenStack namespace yielded a library that no >> project team owns, and yet is specific to openstack and present in >> global-requirements: >> >> https://opendev.org/openstack/microversion-parse >> >> Ideally it should be adopted by one of our project teams. After >> discussing it with cdent, and since it is openstack-specific and >> depended on by most openstack projects, it feels like Oslo would be a >> good home for it. It would follow an "independent" release model and I >> don't think it would trigger much updates or maintenance. >> >> I can post patches to make it happen if you confirm you're OK to host >> it. Just keep me posted :) > > Makes sense. We probably want to have an independent core team for it in > addition to oslo-core so we can add people like Chris to it. It already has one: microversion-parse-core https://review.opendev.org/#/admin/groups/1345,members oslo-core should probably be added to that. I'll propose the project addition so you can all vote directly on it :) -- Thierry Carrez (ttx) From thierry at openstack.org Mon Oct 21 14:14:32 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 21 Oct 2019 16:14:32 +0200 Subject: [oslo] Adoption of microversion-parse In-Reply-To: References: <7d239dce-77c5-0e23-e1cb-57785e241b07@openstack.org> <72c50e25-0a26-b3d8-5f3a-64c48570824d@nemebean.com> Message-ID: <30c14806-6c43-a2be-9612-0a54de6c4323@openstack.org> Thierry Carrez wrote: > [...] > I'll propose the project addition so you can all vote directly on it :) https://review.opendev.org/#/c/689754/ -- Thierry Carrez (ttx) From jim at jimrollenhagen.com Mon Oct 21 14:42:17 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Mon, 21 Oct 2019 10:42:17 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <73ae78bd-1a19-f3e7-6406-364ef8497bd6@debian.org> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <73ae78bd-1a19-f3e7-6406-364ef8497bd6@debian.org> Message-ID: On Sat, Oct 19, 2019 at 1:16 PM Thomas Goirand wrote: > On 10/16/19 2:55 AM, Richard.Pioso at dell.com wrote: > > Hi, > > > > The Ironic Train release can be broken due to an entry in its > driver-requirements.txt. driver-requirements.txt defines a dependency on > the sushy package [1] which can be satisfied by version 1.9.0. 
> Unfortunately, that version contains a few bugs which prevent Ironic from > being able to manage Dell EMC and perhaps other vendors' bare metal > hardware with its Redfish hardware type (driver). The fixes to them > [2][3][4] were merged into master before the creation of stable/train. > Therefore, they are available on stable/train and in the last sushy release > created during the Train cycle, 2.0.0, the only other version which can > satisfy the dependency today. However, consumers -- packagers, operators, > and users -- could, fighting time constraints or lacking solid visibility > into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the > dependency, but, in so doing, unknowingly render the package or > installation severely broken. > > At least in Debian, sushy 2.0.0 is included in the Train release (sushy > 2.0.0 was uploaded on the 26th of September). I don't know for other > distros. > > > A change [5] has been proposed as part of a prospective solution to this > issue. It creates a new release of sushy from the change which fixes the > first bug [2]. Review comments [6] discuss basing the new release on a more > recent stable/train change to pick up other bug fixes and, less > importantly, backward compatible feature modifications and enhancements > which merged before the change from which 2.0.0 was created. Backward > compatible feature modifications and enhancements are interspersed in time > among the bug fixes. Once a new release is available, the sushy entry in > driver-requirements.txt on stable/train would be updated. However, > apparently, the stable branch policy prevents releases from being done at a > point earlier than the last release within a given cycle [6], which was > 2.0.0. > > > > Another possible resolution which comes to mind is to change the > definition of the sushy dependency in driver-requirements.txt [1] from > "sushy>=1.9.0" to "sushy>=2.0.0". > > > > Does anyone have a suggestion on how to proceed? > > I'm not sure if I understand you correctly. Is sushy 2.0.0 enough? Or > should I expect newer tags coming soon? > Yes, sushy 2.0.0 works as expected. Of course, there may be future bugfix releases from the train branch, but we don't have any planned right now. // jim > > Thomas Goirand (zigo) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimrollenhagen.com Mon Oct 21 14:43:39 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Mon, 21 Oct 2019 10:43:39 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <20191019012824.GJ8065@thor.bakeyournoodle.com> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <20191018232110.GH8065@thor.bakeyournoodle.com> <20191019012824.GJ8065@thor.bakeyournoodle.com> Message-ID: On Fri, Oct 18, 2019 at 9:33 PM Tony Breeds wrote: > On Fri, Oct 18, 2019 at 05:52:10PM -0700, Julia Kreger wrote: > > Debian, RDO Project, and OpenSuse Train packaging pipelines show > > 2.0.0. UCA doesn't seem to have train packaging at this point. > > UCA for train is here: > > https://launchpad.net/~ubuntu-cloud-archive/+archive/ubuntu/train-staging > it doesn't seem to have a python-sushy package, which means it's > getting the one from bionic[1] which is already violating the > lower-constraint for ironic > > Even 'eon' is 1.8.1? > > I've added James to CC to see if he can clarify my thinking and or > arrange to update sushy. 
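For reference, the ironic-side fix that keeps being referred to upthread is a one-line bump in driver-requirements.txt, something along these lines (illustrative only; the actual review may phrase it differently):

-sushy>=1.9.0
+sushy>=2.0.0

That would stop the broken 1.9.0 release from satisfying the dependency in the first place.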
> If we need to update sushy in UCA anyway, sounds like we could probably go straight to 2.0.0 and then go ahead and do the requirements update dance mentioned upthread? // jim > > Tony. > [1] https://packages.ubuntu.com/search?keywords=python-sushy > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Mon Oct 21 14:54:46 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 21 Oct 2019 07:54:46 -0700 Subject: [infra] Trying to reproduce an OpenStack CI run In-Reply-To: References: Message-ID: <67da7f1e-0580-493b-aa27-7f8ff132e27c@www.fastmail.com> On Mon, Oct 21, 2019, at 6:55 AM, Neil Jerram wrote: > I'm trying to reproduce an OpenStack CI run on a fresh Ubuntu Bionic > VM, like this: > > sudo apt-get update > sudo apt-get install -y python-all-dev build-essential git libssl-dev > ntp ntpdate libre2-dev python-pip > sudo pip install virtualenv > wget > https://1a5e1b314637dd59968a-fddd31f569f44d98bd401bfd88253d97.ssl.cf5.rackcdn.com/682338/1/check/networking-calico-tempest-dsvm/6801298/logs/reproduce.sh > chmod +x reproduce.sh > sudo ./reproduce.sh > > The last line runs for a short while but then fails with: > > [...] > Successfully built zuul PyYAML voluptuous PrettyTable ansible > sqlalchemy alembic cachecontrol psutil fb-re2 paho-mqtt ws4py Mako > repoze.lru pycparser > Installing collected packages: pbr, six, python-dateutil, uritemplate, > urllib3, certifi, chardet, idna, requests, enum34, ipaddress, > pycparser, cffi, cryptography, jwcrypto, github3.py, PyYAML, pynacl, > bcrypt, paramiko, smmap2, gitdb2, GitPython, docutils, lockfile, > python-daemon, extras, statsd, voluptuous, gear, pytz, tzlocal, > funcsigs, futures, apscheduler, PrettyTable, babel, MarkupSafe, jinja2, > ansible, netaddr, kazoo, sqlalchemy, Mako, python-editor, alembic, > msgpack, cachecontrol, pyjwt, iso8601, psutil, fb-re2, paho-mqtt, > contextlib2, more-itertools, zc.lockfile, > backports.functools-lru-cache, jaraco.functools, tempora, portend, > cheroot, cherrypy, ws4py, repoze.lru, routes, zuul > Successfully installed GitPython-2.1.14 Mako-1.1.0 MarkupSafe-1.1.1 > PrettyTable-0.7.2 PyYAML-5.1.2 alembic-1.2.1 ansible-2.5.15 > apscheduler-3.6.1 babel-2.7.0 backports.functools-lru-cache-1.5 > bcrypt-3.1.7 cachecontrol-0.12.5 certifi-2019.9.11 cffi-1.13.0 > chardet-3.0.4 cheroot-8.2.1 cherrypy-17.4.2 contextlib2-0.6.0.post1 > cryptography-2.8 docutils-0.15.2 enum34-1.1.6 extras-1.0.0 fb-re2-1.0.7 > funcsigs-1.0.2 futures-3.3.0 gear-0.14.0 gitdb2-2.0.6 github3.py-1.3.0 > idna-2.8 ipaddress-1.0.23 iso8601-0.1.12 jaraco.functools-2.0 > jinja2-2.10.3 jwcrypto-0.6.0 kazoo-2.6.1 lockfile-0.12.2 > more-itertools-5.0.0 msgpack-0.6.2 netaddr-0.7.19 paho-mqtt-1.4.0 > paramiko-2.6.0 pbr-5.4.3 portend-2.5 psutil-5.6.3 pycparser-2.19 > pyjwt-1.7.1 pynacl-1.3.0 python-daemon-2.0.6 python-dateutil-2.8.0 > python-editor-1.0.4 pytz-2019.3 repoze.lru-0.7 requests-2.22.0 > routes-2.4.1 six-1.12.0 smmap2-2.0.5 sqlalchemy-1.3.10 statsd-3.3.0 > tempora-1.14.1 tzlocal-2.0.0 uritemplate-3.0.0 urllib3-1.25.6 > voluptuous-0.11.7 ws4py-0.5.1 zc.lockfile-2.0 zuul-3.2.0 > + cat > + /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir /opt/git > https://opendev.org openstack/devstack-gate > ./reproduce.sh: line 112: /usr/zuul-env/bin/zuul-cloner: No such file > or directory > > Can you see what I'm doing wrong? 
The legacy devstack-gate jobs (which produce the reproduce.sh file) rely on the old zuulv2 (not 3) zuul-cloner utility to ensure git repos are in the correct location for the job. Our current images (which you can find at https://nb01.openstack.org/images and https://nb02.openstack.org/images) install a compatbility shim between zuul-cloner and zuulv3. Another option is to checkout zuul's latest version 2.x tag and pip install that to a virtualenv at /usr/zuul-env. Clark From neil at tigera.io Mon Oct 21 16:48:51 2019 From: neil at tigera.io (Neil Jerram) Date: Mon, 21 Oct 2019 17:48:51 +0100 Subject: [infra] Trying to reproduce an OpenStack CI run In-Reply-To: <67da7f1e-0580-493b-aa27-7f8ff132e27c@www.fastmail.com> References: <67da7f1e-0580-493b-aa27-7f8ff132e27c@www.fastmail.com> Message-ID: On Mon, Oct 21, 2019 at 3:55 PM Clark Boylan wrote: > On Mon, Oct 21, 2019, at 6:55 AM, Neil Jerram wrote: > > I'm trying to reproduce an OpenStack CI run on a fresh Ubuntu Bionic > > VM, like this: > > > > sudo apt-get update > > sudo apt-get install -y python-all-dev build-essential git libssl-dev > > ntp ntpdate libre2-dev python-pip > > sudo pip install virtualenv > > wget > > > https://1a5e1b314637dd59968a-fddd31f569f44d98bd401bfd88253d97.ssl.cf5.rackcdn.com/682338/1/check/networking-calico-tempest-dsvm/6801298/logs/reproduce.sh > > chmod +x reproduce.sh > > sudo ./reproduce.sh > > > > The last line runs for a short while but then fails with: > > > > [...] > > Successfully built zuul PyYAML voluptuous PrettyTable ansible > > sqlalchemy alembic cachecontrol psutil fb-re2 paho-mqtt ws4py Mako > > repoze.lru pycparser > > Installing collected packages: pbr, six, python-dateutil, uritemplate, > > urllib3, certifi, chardet, idna, requests, enum34, ipaddress, > > pycparser, cffi, cryptography, jwcrypto, github3.py, PyYAML, pynacl, > > bcrypt, paramiko, smmap2, gitdb2, GitPython, docutils, lockfile, > > python-daemon, extras, statsd, voluptuous, gear, pytz, tzlocal, > > funcsigs, futures, apscheduler, PrettyTable, babel, MarkupSafe, jinja2, > > ansible, netaddr, kazoo, sqlalchemy, Mako, python-editor, alembic, > > msgpack, cachecontrol, pyjwt, iso8601, psutil, fb-re2, paho-mqtt, > > contextlib2, more-itertools, zc.lockfile, > > backports.functools-lru-cache, jaraco.functools, tempora, portend, > > cheroot, cherrypy, ws4py, repoze.lru, routes, zuul > > Successfully installed GitPython-2.1.14 Mako-1.1.0 MarkupSafe-1.1.1 > > PrettyTable-0.7.2 PyYAML-5.1.2 alembic-1.2.1 ansible-2.5.15 > > apscheduler-3.6.1 babel-2.7.0 backports.functools-lru-cache-1.5 > > bcrypt-3.1.7 cachecontrol-0.12.5 certifi-2019.9.11 cffi-1.13.0 > > chardet-3.0.4 cheroot-8.2.1 cherrypy-17.4.2 contextlib2-0.6.0.post1 > > cryptography-2.8 docutils-0.15.2 enum34-1.1.6 extras-1.0.0 fb-re2-1.0.7 > > funcsigs-1.0.2 futures-3.3.0 gear-0.14.0 gitdb2-2.0.6 github3.py-1.3.0 > > idna-2.8 ipaddress-1.0.23 iso8601-0.1.12 jaraco.functools-2.0 > > jinja2-2.10.3 jwcrypto-0.6.0 kazoo-2.6.1 lockfile-0.12.2 > > more-itertools-5.0.0 msgpack-0.6.2 netaddr-0.7.19 paho-mqtt-1.4.0 > > paramiko-2.6.0 pbr-5.4.3 portend-2.5 psutil-5.6.3 pycparser-2.19 > > pyjwt-1.7.1 pynacl-1.3.0 python-daemon-2.0.6 python-dateutil-2.8.0 > > python-editor-1.0.4 pytz-2019.3 repoze.lru-0.7 requests-2.22.0 > > routes-2.4.1 six-1.12.0 smmap2-2.0.5 sqlalchemy-1.3.10 statsd-3.3.0 > > tempora-1.14.1 tzlocal-2.0.0 uritemplate-3.0.0 urllib3-1.25.6 > > voluptuous-0.11.7 ws4py-0.5.1 zc.lockfile-2.0 zuul-3.2.0 > > + cat > > + /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir 
/opt/git > > https://opendev.org openstack/devstack-gate > > ./reproduce.sh: line 112: /usr/zuul-env/bin/zuul-cloner: No such file > > or directory > > > > Can you see what I'm doing wrong? > > The legacy devstack-gate jobs (which produce the reproduce.sh file) rely > on the old zuulv2 (not 3) zuul-cloner utility to ensure git repos are in > the correct location for the job. Our current images (which you can find at > https://nb01.openstack.org/images and https://nb02.openstack.org/images) > install a compatbility shim between zuul-cloner and zuulv3. Another option > is to checkout zuul's latest version 2.x tag and pip install that to a > virtualenv at /usr/zuul-env. > Thanks Clark. Do you happen to know if I can run one of those images on GCP? -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Mon Oct 21 16:54:22 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Mon, 21 Oct 2019 16:54:22 +0000 Subject: RMQ notifications.info has no consumers Message-ID: I installed the RMQ management plugin in my dev and qa clusters and noticed a queue called notifications.info that has 0 consumers: root at us01odc-qa-ctrl1:~# rabbitmqctl list_queues|grep notific notifications.info 15496 If I cold-start RMQ it starts without a notifications.info queue but after I create a VM it appears with 1 or 2 messages and then gradually grows over time. The messages in it appear to be Openstack operations: {"oslo.message": "{\"_context_domain\": null, \"_context_roles\": [\"admin\"], \"_context_global_request_id\": null, \"event_type\": \"dns.domain.exists\", \"_context_edit_managed_records\": false, \"_context_service_catalog\": null, \"timestamp\": \"2019-10-10 21:02:01.524052\", \"_context_all_tenants\": true, \"_unique_id\": \"326b8e30d10f40279d806384e7023954\", \"_context_resource_uuid\": null, \"_context_request_id\": \"req-a16c0cd5-94c3-4bf7-90ed-3ab677cc8e66\", \"_context_is_admin_project\": true, \"_context_tsigkey_id\": null, \"_context_client_addr\": null, \"payload\": {\"updated_at\": \"2019-09-24T15:08:55.000000\", \"minimum\": 3600, \"ttl\": 60, \"serial\": 1569337730, \"deleted_at\": null, \"id\": \"67840d5f-6b98-449c-80c4-4d3e3ee74d19\", \"parent_zone_id\": null, \"retry\": 600, \"transferred_at\": null, \"version\": 470, \"type\": \"PRIMARY\", \"email\": \"linux-admin-core at synopsys.com\", \"status\": \"ACTIVE\", \"description\": null, \"deleted\": \"0\", \"shard\": 1656, \"action\": \"NONE\", \"expire\": 86400, \"audit_period_beginning\": \"2019-10-10T20:02:01.486946\", \"masters\": [], \"name\": \"sg.us01-qa.synopsys.com.\", \"tenant_id\": \"80a60cda79be4ec48e12683478f0cf3b\", \"created_at\": \"2019-09-05T17:18:47.000000\", \"pool_id\": \"794ccc2c-d751-44fe-b57f-8894c9f5c842\", \"refresh\": 3545, \"delayed_notify\": false, \"audit_period_ending\": \"2019-10-10T21:02:01.486946\", \"attributes\": []}, \"_context_auth_token\": null, \"_context_system_scope\": null, \"_context_original_tenant\": null, \"_context_user_identity\": \"- - - - -\", \"_context_show_deleted\": false, \"_context_tenant\": null, \"_context_hide_counts\": false, \"priority\": \"INFO\", \"_context_read_only\": false, \"_context_is_admin\": true, \"_context_abandon\": null, \"_context_project_domain\": null, \"_context_user\": null, \"_context_user_domain\": null, \"publisher_id\": \"producer.us01odc-qa-ctrl2\", \"message_id\": \"0ceae8e3-e6df-4b08-8b1e-c6d8e75d456d\", \"_context_project\": null}", "oslo.version": "2.0"} I didn't install the management 
plugin in prod, but it appears that I have the same issue there: root at us01odc-p01-ctrl1:~# rabbitmqctl list_queues|grep notific notifications.info 226869 What should be consuming the notifications.info queue? -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Mon Oct 21 16:55:53 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 21 Oct 2019 09:55:53 -0700 Subject: [infra] Trying to reproduce an OpenStack CI run In-Reply-To: References: <67da7f1e-0580-493b-aa27-7f8ff132e27c@www.fastmail.com> Message-ID: <3613b838-9b10-42f1-8583-fce421a59b66@www.fastmail.com> On Mon, Oct 21, 2019, at 9:48 AM, Neil Jerram wrote: > > On Mon, Oct 21, 2019 at 3:55 PM Clark Boylan wrote: > > On Mon, Oct 21, 2019, at 6:55 AM, Neil Jerram wrote: > > > I'm trying to reproduce an OpenStack CI run on a fresh Ubuntu Bionic > > > VM, like this: > > > > > > sudo apt-get update > > > sudo apt-get install -y python-all-dev build-essential git libssl-dev > > > ntp ntpdate libre2-dev python-pip > > > sudo pip install virtualenv > > > wget > > > https://1a5e1b314637dd59968a-fddd31f569f44d98bd401bfd88253d97.ssl.cf5.rackcdn.com/682338/1/check/networking-calico-tempest-dsvm/6801298/logs/reproduce.sh > > > chmod +x reproduce.sh > > > sudo ./reproduce.sh > > > > > > The last line runs for a short while but then fails with: > > > > > > [...] > > > Successfully built zuul PyYAML voluptuous PrettyTable ansible > > > sqlalchemy alembic cachecontrol psutil fb-re2 paho-mqtt ws4py Mako > > > repoze.lru pycparser > > > Installing collected packages: pbr, six, python-dateutil, uritemplate, > > > urllib3, certifi, chardet, idna, requests, enum34, ipaddress, > > > pycparser, cffi, cryptography, jwcrypto, github3.py, PyYAML, pynacl, > > > bcrypt, paramiko, smmap2, gitdb2, GitPython, docutils, lockfile, > > > python-daemon, extras, statsd, voluptuous, gear, pytz, tzlocal, > > > funcsigs, futures, apscheduler, PrettyTable, babel, MarkupSafe, jinja2, > > > ansible, netaddr, kazoo, sqlalchemy, Mako, python-editor, alembic, > > > msgpack, cachecontrol, pyjwt, iso8601, psutil, fb-re2, paho-mqtt, > > > contextlib2, more-itertools, zc.lockfile, > > > backports.functools-lru-cache, jaraco.functools, tempora, portend, > > > cheroot, cherrypy, ws4py, repoze.lru, routes, zuul > > > Successfully installed GitPython-2.1.14 Mako-1.1.0 MarkupSafe-1.1.1 > > > PrettyTable-0.7.2 PyYAML-5.1.2 alembic-1.2.1 ansible-2.5.15 > > > apscheduler-3.6.1 babel-2.7.0 backports.functools-lru-cache-1.5 > > > bcrypt-3.1.7 cachecontrol-0.12.5 certifi-2019.9.11 cffi-1.13.0 > > > chardet-3.0.4 cheroot-8.2.1 cherrypy-17.4.2 contextlib2-0.6.0.post1 > > > cryptography-2.8 docutils-0.15.2 enum34-1.1.6 extras-1.0.0 fb-re2-1.0.7 > > > funcsigs-1.0.2 futures-3.3.0 gear-0.14.0 gitdb2-2.0.6 github3.py-1.3.0 > > > idna-2.8 ipaddress-1.0.23 iso8601-0.1.12 jaraco.functools-2.0 > > > jinja2-2.10.3 jwcrypto-0.6.0 kazoo-2.6.1 lockfile-0.12.2 > > > more-itertools-5.0.0 msgpack-0.6.2 netaddr-0.7.19 paho-mqtt-1.4.0 > > > paramiko-2.6.0 pbr-5.4.3 portend-2.5 psutil-5.6.3 pycparser-2.19 > > > pyjwt-1.7.1 pynacl-1.3.0 python-daemon-2.0.6 python-dateutil-2.8.0 > > > python-editor-1.0.4 pytz-2019.3 repoze.lru-0.7 requests-2.22.0 > > > routes-2.4.1 six-1.12.0 smmap2-2.0.5 sqlalchemy-1.3.10 statsd-3.3.0 > > > tempora-1.14.1 tzlocal-2.0.0 uritemplate-3.0.0 urllib3-1.25.6 > > > voluptuous-0.11.7 ws4py-0.5.1 zc.lockfile-2.0 zuul-3.2.0 > > > + cat > > > + /usr/zuul-env/bin/zuul-cloner -m clonemap.yaml --cache-dir /opt/git > > > 
https://opendev.org openstack/devstack-gate > > > ./reproduce.sh: line 112: /usr/zuul-env/bin/zuul-cloner: No such file > > > or directory > > > > > > Can you see what I'm doing wrong? > > > > The legacy devstack-gate jobs (which produce the reproduce.sh file) rely on the old zuulv2 (not 3) zuul-cloner utility to ensure git repos are in the correct location for the job. Our current images (which you can find at https://nb01.openstack.org/images and https://nb02.openstack.org/images) install a compatbility shim between zuul-cloner and zuulv3. Another option is to checkout zuul's latest version 2.x tag and pip install that to a virtualenv at /usr/zuul-env. > > Thanks Clark. Do you happen to know if I can run one of those images on GCP? > The images should boot fine on GCP. Where you may have problems is they rely on OpenStack's config-drive metadata to configure the root user's ssh key and networking if the networking doesn't rely on DHCP for ipv4 and stateless RAs for ipv6. If the user config is a problem you could update the image locally with your key then upload to GCP. I don't know anything about their networking so can't really suggest workarounds there if they don't DHCP. From fungi at yuggoth.org Mon Oct 21 16:56:16 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 21 Oct 2019 16:56:16 +0000 Subject: [infra] Trying to reproduce an OpenStack CI run In-Reply-To: References: <67da7f1e-0580-493b-aa27-7f8ff132e27c@www.fastmail.com> Message-ID: <20191021165616.7vudwlnyuazefujz@yuggoth.org> On 2019-10-21 17:48:51 +0100 (+0100), Neil Jerram wrote: > On Mon, Oct 21, 2019 at 3:55 PM Clark Boylan wrote: [...] > > Our current images (which you can find at > > https://nb01.openstack.org/images and > > https://nb02.openstack.org/images) install a compatbility shim > > between zuul-cloner and zuulv3. [...] > Thanks Clark. Do you happen to know if I can run one of those images on > GCP? It will probably depend on how well Glean can intuit GCP network configuration in the absence of a Nova configdrive. If it's just basic DHCP there then it might work? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gr at ham.ie Mon Oct 21 17:15:04 2019 From: gr at ham.ie (Graham Hayes) Date: Mon, 21 Oct 2019 18:15:04 +0100 Subject: RMQ notifications.info has no consumers In-Reply-To: References: Message-ID: On 21/10/2019 17:54, Albert Braden wrote: > I installed the RMQ management plugin in my dev and qa clusters and > noticed a queue called notifications.info that has 0 consumers: > >   > > root at us01odc-qa-ctrl1:~# rabbitmqctl list_queues|grep notific > > notifications.info      15496 > >   > > If I cold-start RMQ it starts without a notifications.info queue but > after I create a VM it appears with 1 or 2 messages and then gradually > grows over time. The messages in it appear to be Openstack operations: >   > > I didn’t install the management plugin in prod, but it appears that I > have the same issue there: > >   > > root at us01odc-p01-ctrl1:~# rabbitmqctl list_queues|grep notific > > notifications.info      226869 > >   > > What should be consuming the notifications.info queue? Usually something like ceilometer or other telemetry projects, or projects like Designate and searchlight that don't always have hooks into nova or neutron, or custom tooling. If nothing is consuming them,. 
you can either set the ttl on the RMQ queue [1] to something small (just in case you want to look at them) or disable them [2] > >   > >   > 1 - https://www.rabbitmq.com/ttl.html 2 - e.g. in Nova https://docs.openstack.org/nova/latest/reference/notifications.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From ruslanas at lpic.lt Mon Oct 21 18:41:48 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Mon, 21 Oct 2019 21:41:48 +0300 Subject: RMQ notifications.info has no consumers In-Reply-To: References: Message-ID: Yes, fully agree with Graham. I do rabbitmqctl queue_purge notifications.* In cron every hour + add policy to remove them after 1 min... Also double check other queues: rabbitmqctl list_qeues | awk '{if($2>5)print $0}' or smth like that. On Mon, 21 Oct 2019, 20:17 Graham Hayes, wrote: > On 21/10/2019 17:54, Albert Braden wrote: > > I installed the RMQ management plugin in my dev and qa clusters and > > noticed a queue called notifications.info that has 0 consumers: > > > > > > > > root at us01odc-qa-ctrl1:~# rabbitmqctl list_queues|grep notific > > > > notifications.info 15496 > > > > > > > > If I cold-start RMQ it starts without a notifications.info queue but > > after I create a VM it appears with 1 or 2 messages and then gradually > > grows over time. The messages in it appear to be Openstack operations: > > > > > > > I didn’t install the management plugin in prod, but it appears that I > > have the same issue there: > > > > > > > > root at us01odc-p01-ctrl1:~# rabbitmqctl list_queues|grep notific > > > > notifications.info 226869 > > > > > > > > What should be consuming the notifications.info queue? > > Usually something like ceilometer or other telemetry projects, > or projects like Designate and searchlight that don't always have hooks > into nova or neutron, or custom tooling. > > If nothing is consuming them,. you can either set the ttl on the > RMQ queue [1] to something small (just in case you want to look at them) > or disable them [2] > > > > > > > > > > > > 1 - https://www.rabbitmq.com/ttl.html > 2 - e.g. in Nova > https://docs.openstack.org/nova/latest/reference/notifications.html > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Mon Oct 21 19:03:04 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Mon, 21 Oct 2019 19:03:04 +0000 Subject: RMQ notifications.info has no consumers In-Reply-To: References: Message-ID: I just set max length on the queue to something like 1k. Sent from my iPhone On Oct 21, 2019, at 11:46 AM, Ruslanas Gžibovskis wrote:  Yes, fully agree with Graham. I do rabbitmqctl queue_purge notifications.* In cron every hour + add policy to remove them after 1 min... Also double check other queues: rabbitmqctl list_qeues | awk '{if($2>5)print $0}' or smth like that. On Mon, 21 Oct 2019, 20:17 Graham Hayes, > wrote: On 21/10/2019 17:54, Albert Braden wrote: > I installed the RMQ management plugin in my dev and qa clusters and > noticed a queue called notifications.info that has 0 consumers: > > > > root at us01odc-qa-ctrl1:~# rabbitmqctl list_queues|grep notific > > notifications.info 15496 > > > > If I cold-start RMQ it starts without a notifications.info queue but > after I create a VM it appears with 1 or 2 messages and then gradually > grows over time. 
The messages in it appear to be Openstack operations: > > > I didn’t install the management plugin in prod, but it appears that I > have the same issue there: > > > > root at us01odc-p01-ctrl1:~# rabbitmqctl list_queues|grep notific > > notifications.info 226869 > > > > What should be consuming the notifications.info queue? Usually something like ceilometer or other telemetry projects, or projects like Designate and searchlight that don't always have hooks into nova or neutron, or custom tooling. If nothing is consuming them,. you can either set the ttl on the RMQ queue [1] to something small (just in case you want to look at them) or disable them [2] > > > > > 1 - https://www.rabbitmq.com/ttl.html 2 - e.g. in Nova https://docs.openstack.org/nova/latest/reference/notifications.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasonanderson at uchicago.edu Mon Oct 21 21:35:33 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Mon, 21 Oct 2019 21:35:33 +0000 Subject: [keystone] Federated users who wish to use CLI Message-ID: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Hi all, I'm in the process of prototyping a federated Keystone using OpenID Connect, which will place ephemeral users in a group that has roles in existing projects. I was testing how it felt from the user's perspective and am confused how I'm supposed to be able to use the openstacksdk with federation. For one thing, the RC files I can download from the "API Access" section of Horizon don't seem like they work; the domain is hard-coded to "Federated", and it also uses a username/password authentication method. I can see that there is a way to use KSA to use an existing OIDC token, which I think is probably the most "user-friendly" way, but the user still has to obtain this token themselves out-of-band, which is not trivial. Has anybody else set this up for users who liked to use the CLI? Is the solution to educate users about creating application credentials instead? Thank you in advance, -- Jason Anderson Chameleon DevOps Lead Consortium for Advanced Science and Engineering, The University of Chicago Mathematics & Computer Science Division, Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From pramchan at yahoo.com Tue Oct 22 01:08:12 2019 From: pramchan at yahoo.com (prakash RAMCHANDRAN) Date: Tue, 22 Oct 2019 01:08:12 +0000 (UTC) Subject: [openstack-discuss][Airship] Call for PTG Participation for Airship References: <667972095.121647.1571706492365.ref@mail.yahoo.com> Message-ID: <667972095.121647.1571706492365@mail.yahoo.com> Hi all, We seek active participation from community to enable Airship design and deployment for new release v2 in progress.Join us at PTG & mini-PTG to complete the Specs for MVP Beta (v2) before we get to November monthly sprint targets.Add your name topics and any topics you want to be bring forth. PTG at Shnnghai -  Airship PTG Ussuri Half day Thursday Nov 7th, full day Friday Nov 8th https://etherpad.openstack.org/p/airship-ptg-ussuri Airship KubeCon mini-PTG San DiegoMonday, 12:30-6:30 [planning on kicking off @ 1pm] https://etherpad.openstack.org/p/airship-kubecon-san-diego ThanksPrakash / For Airship PTG/mini-PTGs -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yangyi01 at inspur.com Tue Oct 22 02:39:10 2019 From: yangyi01 at inspur.com (=?gb2312?B?WWkgWWFuZyAo0e6gRCkt1Ma3/s7xvK/NxQ==?=) Date: Tue, 22 Oct 2019 02:39:10 +0000 Subject: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? Message-ID: Hi, Folks We need to create containers on baremetal for different tenants, so pod belongs to tenant VPC or tenant subnet, can we specify subnet by namespace or annotation in pod spec? I don’t mean multiple VIFs by additional subnets, I just need single OVS port for pod. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From skaplons at redhat.com Tue Oct 22 05:53:15 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 22 Oct 2019 07:53:15 +0200 Subject: [neutron] CI meeting cancelled Message-ID: Hi, I’m on training this week and I can’t lead CI meeting. As also some other people who usually attends our CI meeting are also on the same training, lets cancel the meeting this week. We will back with CI meetings next week, so 29.10. — Slawek Kaplonski Senior software engineer Red Hat From arnaud.morin at gmail.com Tue Oct 22 10:19:43 2019 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 22 Oct 2019 10:19:43 +0000 Subject: [ops] nova wsgi config Message-ID: <20191022101943.GG14827@sync> Hey all, I am trying to configure apache as a WSGI. Is there any other documentation than this: https://docs.openstack.org/nova/stein/user/wsgi.html Is there any recommendations? Thanks in advance! -- Arnaud Morin From ltomasbo at redhat.com Tue Oct 22 11:20:49 2019 From: ltomasbo at redhat.com (Luis Tomas Bolivar) Date: Tue, 22 Oct 2019 13:20:49 +0200 Subject: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? In-Reply-To: References: Message-ID: Hi Yi Yang, On Tue, Oct 22, 2019 at 4:43 AM Yi Yang (杨燚)-云服务集团 wrote: > Hi, Folks > > > > We need to create containers on baremetal for different tenants, so pod > belongs to tenant VPC or tenant subnet, can we specify subnet by namespace > or annotation in pod spec? I don’t mean multiple VIFs by additional > subnets, I just need single OVS port for pod. > There is a namespace handler (and namespace_subnet driver) that creates a different subnet/network per K8s namespace, but those networks are created by that handle in the same tenant account (kuryr is single tenant). -- LUIS TOMÁS BOLÍVAR Senior Software Engineer Red Hat Madrid, Spain ltomasbo at redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Tue Oct 22 11:45:49 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Tue, 22 Oct 2019 19:45:49 +0800 (CST) Subject: =?UTF-8?B?W1dhdGNoZXJdIE5vIG1lZXRpbmcgb24gT2N0b2JlciAyMyBhbmQgTm92ZW1iZXIgNg==?= Message-ID: <201910221945498908128@zte.com.cn> Hi all, I'm no time tomorrow and Nov 6 is during the Openstack Shanghai Summit, so canceling the IRC meeting. thanks! licanwei -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Tue Oct 22 12:43:47 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Tue, 22 Oct 2019 14:43:47 +0200 Subject: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? 
In-Reply-To: References: Message-ID: <5246a23457ba43c19d475faf4d4e3d649347d253.camel@redhat.com> On Tue, 2019-10-22 at 13:20 +0200, Luis Tomas Bolivar wrote: > Hi Yi Yang, > > On Tue, Oct 22, 2019 at 4:43 AM Yi Yang (杨燚)-云服务集团 < > yangyi01 at inspur.com> wrote: > > Hi, Folks > > > > > > > > We need to create containers on baremetal for different tenants, so > > pod belongs to tenant VPC or tenant subnet, can we specify subnet > > by namespace or annotation in pod spec? I don’t mean multiple VIFs > > by additional subnets, I just need single OVS port for pod. > > > > > > > There is a namespace handler (and namespace_subnet driver) that > creates a different subnet/network per K8s namespace, but those > networks are created by that handle in the same tenant account (kuryr > is single tenant). > > Adding to Luis reply: The short answer is no, but we're totally open to such a contribution in this area. We thought about it but it was never a priority, so there have never been enough resources to get it done properly. The long answer is that it should be pretty easy to implement by having your own PodSubnetsDriver, very similar to the default [1], that would do that logic. While we would totally welcome such implementation upstream, you can also easily keep it in another Python package and use entrypoints to configure Kuryr-Kubernetes with it. [1] https://github.com/openstack/kuryr-kubernetes/blob/5fa529efa46695ae2f29a9ad9c35386d952e6a32/kuryr_kubernetes/controller/drivers/default_subnet.py#L23-L37 From paye600 at gmail.com Tue Oct 22 14:00:43 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Tue, 22 Oct 2019 16:00:43 +0200 Subject: [helm][loci][rpm][ironic] Please fill deployment tool capabilities In-Reply-To: <3dea3216-bc9e-044b-4869-c4e96b09d3a4@openstack.org> References: <3dea3216-bc9e-044b-4869-c4e96b09d3a4@openstack.org> Message-ID: Hello Thierry, Thank you for the reminder and your great work. I have submitted patches to update OpenStack-Helm [0] and LOCI [1] information and added reviewers from respective teams to hear their feedback. [0] https://review.opendev.org/690068 [1] https://review.opendev.org/690078 Best regards, -- Roman Gorshunov AT&T On Mon, Oct 21, 2019 at 2:21 PM Thierry Carrez wrote: > > OpenStack-Helm, LOCI, RPM-packaging & Bifrost (ironic) folks: > > We recently started to display deployment tools capabilities over at: > https://www.openstack.org/software/project-navigator/deployment-tools > > However OpenStack-Helm, LOCI, RPM Packaging and Bifrost display only the > (default) component:keystone capability, because they were not filled > when we we last asked[1][2]. As such, they look a bit sad. 
> > Please propose a patch to the following file: > https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools.yaml > > Capabilities are described in detail in: > https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools_capabilities.yaml > > Let me know if you have any question, > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007585.html > [2] > http://lists.openstack.org/pipermail/openstack-discuss/2019-July/008150.html > > -- > Thierry Carrez (ttx) > From yangyi01 at inspur.com Tue Oct 22 14:11:18 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Tue, 22 Oct 2019 14:11:18 +0000 Subject: =?utf-8?B?562U5aSNOiBba3VyeXJdW2t1cnlyLWt1YmVybmV0ZXNdIGRvZXMga3VyeXIt?= =?utf-8?B?a3ViZXJuZXRlcyBzdXBwb3J0IGR5bmFtaWMgc3VibmV0IGJ5IHBvZCBuYW1l?= =?utf-8?Q?space_or_annotation=3F?= In-Reply-To: References: Message-ID: <5bb1eaa841ad422584fd90e2300e95e8@inspur.com> Thanks Luis, what if I have created network and subnet with network name and subnet name the namespace driver will create? I just want to check if it can use an existing tenant network and subnet which can be specified by namespace or annotation. 发件人: Luis Tomas Bolivar [mailto:ltomasbo at redhat.com] 发送时间: 2019年10月22日 19:21 收件人: Yi Yang (杨燚)-云服务集团 抄送: openstack-discuss at lists.openstack.org 主题: Re: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? Hi Yi Yang, On Tue, Oct 22, 2019 at 4:43 AM Yi Yang (杨燚)-云服务集团 > wrote: Hi, Folks We need to create containers on baremetal for different tenants, so pod belongs to tenant VPC or tenant subnet, can we specify subnet by namespace or annotation in pod spec? I don’t mean multiple VIFs by additional subnets, I just need single OVS port for pod. There is a namespace handler (and namespace_subnet driver) that creates a different subnet/network per K8s namespace, but those networks are created by that handle in the same tenant account (kuryr is single tenant). -- LUIS TOMÁS BOLÍVAR Senior Software Engineer Red Hat Madrid, Spain ltomasbo at redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From yangyi01 at inspur.com Tue Oct 22 14:13:35 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Tue, 22 Oct 2019 14:13:35 +0000 Subject: =?utf-8?B?562U5aSNOiBba3VyeXJdW2t1cnlyLWt1YmVybmV0ZXNdIGRvZXMga3VyeXIt?= =?utf-8?B?a3ViZXJuZXRlcyBzdXBwb3J0IGR5bmFtaWMgc3VibmV0IGJ5IHBvZCBuYW1l?= =?utf-8?Q?space_or_annotation=3F?= In-Reply-To: <5246a23457ba43c19d475faf4d4e3d649347d253.camel@redhat.com> References: <5246a23457ba43c19d475faf4d4e3d649347d253.camel@redhat.com> Message-ID: Thanks Michal, got it, I'll try it, I'd like to upstream it if possible. -----邮件原件----- 发件人: Michał Dulko [mailto:mdulko at redhat.com] 发送时间: 2019年10月22日 20:44 收件人: Luis Tomas Bolivar ; Yi Yang (杨燚)-云服务集团 抄送: openstack-discuss at lists.openstack.org 主题: Re: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? 
On Tue, 2019-10-22 at 13:20 +0200, Luis Tomas Bolivar wrote: > Hi Yi Yang, > > On Tue, Oct 22, 2019 at 4:43 AM Yi Yang (杨燚)-云服务集团 < > yangyi01 at inspur.com> wrote: > > Hi, Folks > > > > > > > > We need to create containers on baremetal for different tenants, so > > pod belongs to tenant VPC or tenant subnet, can we specify subnet by > > namespace or annotation in pod spec? I don’t mean multiple VIFs by > > additional subnets, I just need single OVS port for pod. > > > > > > > There is a namespace handler (and namespace_subnet driver) that > creates a different subnet/network per K8s namespace, but those > networks are created by that handle in the same tenant account (kuryr > is single tenant). > > Adding to Luis reply: The short answer is no, but we're totally open to such a contribution in this area. We thought about it but it was never a priority, so there have never been enough resources to get it done properly. The long answer is that it should be pretty easy to implement by having your own PodSubnetsDriver, very similar to the default [1], that would do that logic. While we would totally welcome such implementation upstream, you can also easily keep it in another Python package and use entrypoints to configure Kuryr-Kubernetes with it. [1] https://github.com/openstack/kuryr-kubernetes/blob/5fa529efa46695ae2f29a9ad9c35386d952e6a32/kuryr_kubernetes/controller/drivers/default_subnet.py#L23-L37 -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From gouthampravi at gmail.com Tue Oct 22 15:13:17 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Tue, 22 Oct 2019 08:13:17 -0700 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <20191017203152.GA828@sm-workstation> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> Message-ID: On Thu, Oct 17, 2019 at 1:37 PM Sean McGinnis wrote: > > On Wed, Oct 16, 2019 at 05:44:31PM +0000, Elõd Illés wrote: > > Hi, > > > > As it was agreed during PTG, the planned date of Extended Maintenance > > transition of Queens is around two weeks after Train release (a less > > busy period) [1]. Now that Train is released, it is a good opportunity > > for teams to go through the list of open and unreleased changes in > > Queens [2] and schedule a final release for Queens if needed. Feel free > > to use / edit / modify the lists (I've generated the lists for > > repositories which have 'follows-policy' tag). I hope this helps. > > > > [1] https://releases.openstack.org/ > > [2] https://etherpad.openstack.org/p/queens-final-release-before-em > > > > Thanks, > > > > Előd > > > > Trying to amplify this. > > The date for Queens to transition to Extended Maintenance is next week. Late in > the week we will be proposing a patch to tag all deliverables with a > "queens-em" tag. After this point, no additional releases will be allowed. > > I took a quick look through our stable/queens deliverables, and there are > several that look to have a sizable amount of patches landed that have not been > released. Elod was super nice by including all of that for easy checking in [2] > above. > > As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to > stable/queens. But once we enter Extended Maintenance, there won't be any > official releases and it will be up to downstream consumers to pick up these > fixes locally as they need them. 
> > So consider this a last call for stable/queens releases. > > Thanks! > Sean > Thank you Elod and Sean. In Manila, we went through the exercise of seeing if we missed out any bug fixes for stable/queens and addressed most of them. However, there's one additional fix [1] due that would be nice to get in. We hope to get that in in the next couple of days and propose a final release with that commit hash. Hope that's okay.. Thanks, Goutham [1] https://launchpad.net/bugs/1845135 From gmann at ghanshyammann.com Tue Oct 22 15:13:38 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 22 Oct 2019 10:13:38 -0500 Subject: [tc][all] Ussuri community goal candidate 1: 'Project Specific New Contributor & PTL Docs' Message-ID: <16df407dbf8.11d25468592036.8156563932102242889@ghanshyammann.com> Hello Everyone, We are starting the next step for the Ussuri Cycle Community Goals. We have four candidates till now as proposed in etherpad[1]. The first candidate is "Project Specific New Contributor & PTL Docs". Kendall (diablo_rojo) volunteered to lead this goal as Champion. Thanks to her for stepping up for this job. This idea was brought up during Train cycle goal discussions also[2]. The idea here is to have a consistent and mandatory contributors guide in each project which will help new contributors to get onboard in upstream activities. Also, create PTL duties guide on the project's side. Few projects might have the PTL duties documented and making it consistent and for all projects is something easy for transferring the knowledge. Kendall can put up more details and highlights based on queries. We would like to open this idea to get wider feedback from the community and projects team before we start defining the goal in Gerrit. What do you think of this as a community goal? Any query or Improvement Feedback? [1] https://etherpad.openstack.org/p/PVG-u-series-goals [2] https://etherpad.openstack.org/p/BER-t-series-goals -gmann & diablo_rojo From mdulko at redhat.com Tue Oct 22 15:28:34 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Tue, 22 Oct 2019 17:28:34 +0200 Subject: =?UTF-8?Q?=E7=AD=94=E5=A4=8D=3A?= [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? In-Reply-To: <5bb1eaa841ad422584fd90e2300e95e8@inspur.com> References: <5bb1eaa841ad422584fd90e2300e95e8@inspur.com> Message-ID: <2bb02e40e6c7d712eea123f0496ca9c7affb2fb8.camel@redhat.com> Oh, I actually should have thought about it. So if you'll precreate the network, subnet and a KuryrNet Custom Resource [1] it should actually work. The definition of KuryrNet can be find here [2], fields are pretty self-explanatory. Please note that you also need to link KuryrNet to the namespace by adding an annotation to the namespace: "openstack.org/kuryr-net-crd": "ns-" Also, just for safety, make sure the KuryrNet itself is named "ns- " - I'm not sure if some code isn't looking it up by name. Please note that this was never tested, so maybe there's something I don't see that might prevent it from working. [1] https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ [2] https://github.com/openstack/kuryr-kubernetes/blob/a85a7bc8b1761eb748ccf16430fe77587bc764c2/kubernetes_crds/kuryrnet.yaml On Tue, 2019-10-22 at 14:11 +0000, Yi Yang (杨燚)-云服务集团 wrote: > Thanks Luis, what if I have created network and subnet with network > name and subnet name the namespace driver will create? 
I just want to > check if it can use an existing tenant network and subnet which can > be specified by namespace or annotation. > > 发件人: Luis Tomas Bolivar [mailto:ltomasbo at redhat.com] > 发送时间: 2019年10月22日 19:21 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support > dynamic subnet by pod namespace or annotation? > > Hi Yi Yang, > > On Tue, Oct 22, 2019 at 4:43 AM Yi Yang (杨燚)-云服务集团 < > yangyi01 at inspur.com> wrote: > > Hi, Folks > > > > We need to create containers on baremetal for different tenants, so > > pod belongs to tenant VPC or tenant subnet, can we specify subnet > > by namespace or annotation in pod spec? I don’t mean multiple VIFs > > by additional subnets, I just need single OVS port for pod. > > > > There is a namespace handler (and namespace_subnet driver) that > creates a different subnet/network per K8s namespace, but those > networks are created by that handle in the same tenant account (kuryr > is single tenant). > > > -- > LUIS TOMÁS BOLÍVAR > Senior Software Engineer > Red Hat > Madrid, Spain > ltomasbo at redhat.com > From mdulko at redhat.com Tue Oct 22 15:34:45 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Tue, 22 Oct 2019 17:34:45 +0200 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <20191017203152.GA828@sm-workstation> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> Message-ID: <2b7014655fa57e7f6bfbeb5bd304a9d1544019e4.camel@redhat.com> On Thu, 2019-10-17 at 15:31 -0500, Sean McGinnis wrote: > On Wed, Oct 16, 2019 at 05:44:31PM +0000, Elõd Illés wrote: > > Hi, > > > > As it was agreed during PTG, the planned date of Extended Maintenance > > transition of Queens is around two weeks after Train release (a less > > busy period) [1]. Now that Train is released, it is a good opportunity > > for teams to go through the list of open and unreleased changes in > > Queens [2] and schedule a final release for Queens if needed. Feel free > > to use / edit / modify the lists (I've generated the lists for > > repositories which have 'follows-policy' tag). I hope this helps. > > > > [1] https://releases.openstack.org/ > > [2] https://etherpad.openstack.org/p/queens-final-release-before-em > > > > Thanks, > > > > Előd > > > > Trying to amplify this. > > The date for Queens to transition to Extended Maintenance is next week. Late in > the week we will be proposing a patch to tag all deliverables with a > "queens-em" tag. After this point, no additional releases will be allowed. > > I took a quick look through our stable/queens deliverables, and there are > several that look to have a sizable amount of patches landed that have not been > released. Elod was super nice by including all of that for easy checking in [2] > above. > > As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to > stable/queens. But once we enter Extended Maintenance, there won't be any > official releases and it will be up to downstream consumers to pick up these > fixes locally as they need them. > > So consider this a last call for stable/queens releases. Here's the patch [1] to do last kuryr-kubernetes release for Queens. It's not a big deal, most people build their own containers, but I guess it doesn't hurt. [1] https://review.opendev.org/#/c/690109/ > Thanks! 
> Sean
>

From gmann at ghanshyammann.com Tue Oct 22 16:00:17 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Tue, 22 Oct 2019 11:00:17 -0500
Subject: [tc][goal]: Ussuri community goal candidate 2: 'Switch remaining legacy jobs to Zuul v3 and drop legacy support'
Message-ID: <16df432923a.1169e5bdd93925.5868281999311653240@ghanshyammann.com>

Hello Everyone,

As you might have seen in the ML thread collecting Ussuri community goal feedback on each proposed candidate [1], the 2nd candidate is 'Switch remaining legacy jobs to Zuul v3 and drop legacy support' [2].

Andreas has summarized this goal in the etherpad [2]:

"Convert the remaining legacy jobs to Zuul v3 features. Grenade might be the largest open issue. This means especially that devstack-gate will not support Ussuri but only older releases. Having all jobs migrated to Zuul v3 native will allow easier maintenance and sharing between them."

Zuul has good documentation about migrating legacy jobs to Zuul v3 native. devstack and tempest have already migrated all of their legacy jobs, and many other projects have migrated at least some of theirs. Keeping legacy jobs is very costly, not just in terms of maintenance but also for doing community-wide work. I experienced this during the migration of upstream CI from Xenial to Bionic in Stein and during the IPv6 goal in Train.

Currently, we do not have any volunteers to lead this goal. I would like to ask for a volunteer who can drive it: do the pre-work of defining the goal and then be its champion. Feel free to reply to this email or ping me on the #openstack-tc IRC channel.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010287.html
[2] https://etherpad.openstack.org/p/PVG-u-series-goals

-gmann

From gmann at ghanshyammann.com Tue Oct 22 16:50:33 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Tue, 22 Oct 2019 11:50:33 -0500
Subject: [tc][all][goal]: Ussuri community goal candidate 3: 'Consistent and secure default policies'
Message-ID: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com>

Hello Everyone,

This is the 3rd proposed candidate for the Ussuri cycle community-wide goals; the other two are [1]. Colleen proposed this idea for the Ussuri cycle community goal.

Projects that have implemented or plan to implement this:
* Keystone has already implemented it, with all the necessary support in oslo.policy and nice documentation.
* We discussed implementing it in nova at the Train PTG [2]. The nova spec was merged in Train but could not be implemented; I have re-proposed the spec for the Ussuri cycle [3].

This is a nice idea as a goal from the user perspective. Colleen has less bandwidth to drive this goal alone, so we are looking for a champion or co-champions (1-2 people would be much better) to drive it along with her. Also, let us know what you think of this as a community goal. Any queries or improvement feedback?

You can refer to the details on the etherpad, or read on; I am copying them here too so we can collect feedback/queries on each item.

Existing policy defaults suffer from three major faults:

#1: the admin-ness problem: use of policy rules like 'is_admin' or hard-coded is-admin checks results in the admin-anywhere-admin-everywhere problem and drastically inhibits true multitenancy, since by default customers cannot have admin rights on their own projects or domains.

#2: insecure custom roles: many policy rules simply use "" as the rule, which means there is no rule: anyone can perform that action. This means creation of a custom role (say, "nova-autoscaler") requires editing every policy file across every service to block users with such a rule from performing actions unrelated to their role.

#3: related to #2, no support for read-only roles: keystone now has a "reader" role that comes out of the box when keystone is bootstrapped, but it currently has very little value because of the use of empty rules in service policies: users with the "reader" role can still perform write actions on services if the policy rule for such an action is empty.

The keystone project has migrated all of its default policies to 1) use oslo.policy's scope_types attribute, which allows the policy engine to understand "system scope" and distinguish between an admin role assignment on a project versus an admin role assignment on the entire system, and 2) ensure all rules use one of the default roles (admin, member, and reader), which both ensures support for a read-only role and prevents custom roles from accidental over-permissiveness.

Although the problems being solved are slightly different, the keystone team found it was easiest to migrate everything at once. The rest of the OpenStack services can use this migration as a template for securing their own policies.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010287.html
http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010290.html
[2] https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone
[3] https://review.opendev.org/#/q/topic:bp/policy-defaults-refresh+(status:open+OR+status:merged)

-gmann
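To make #1-#3 concrete, a default registered the keystone way looks roughly like this in oslo.policy terms. This is an illustrative sketch only; the rule name, paths and description below are made up and not taken from keystone or nova:

from oslo_policy import policy

rules = [
    policy.DocumentedRuleDefault(
        # hypothetical rule name for a hypothetical service
        name='example_service:widgets:list',
        # "reader" role with a system-scoped token, per keystone's pattern
        check_str='role:reader and system_scope:all',
        scope_types=['system'],
        description='List all widgets in the deployment.',
        operations=[{'path': '/v2/widgets', 'method': 'GET'}],
    ),
]


def list_rules():
    # consumed by the service and by oslopolicy-sample-generator
    return rules

With defaults along these lines, a project-scoped admin token no longer satisfies a system-scoped rule (fault #1), the empty "" rules from fault #2 disappear, and the "reader" role from fault #3 becomes genuinely read-only.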
This means creation of a custom role (say, "nova-autoscaler") requires editing every policy file across every service to block users with such a rule from performing actions unrelated to their role. #3: related to #2, no support for read-only roles: keystone now has a "reader" role that comes out of the box when keystone is bootstrapped, but it currently has very little value because of the use of empty rules in service policies: users with the "reader" role can still perform write actions on services if the policy rule for such an action is empty. The keystone project has migrated all of its default policies to 1) use oslo.policy's scope_types attribute, which allows the policy engine to understand "system scope" and distinguish between an admin role assignment on a project versus an admin role assignment on the entire system, 2) ensure all rules use one of the default roles (admin, member, and reader) which both ensures support for a read-only role and prevents custom roles from accidental over-permissiveness. Although the problems being solved are slightly different, the keystone team found it was easiest to migrate everything at once. The rest of the OpenStack services can use this migration as a template for securing their own policies. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010287.html http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010290.html [2] https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone [3] https://review.opendev.org/#/q/topic:bp/policy-defaults-refresh+(status:open+OR+status:merged) -gmann From ltoscano at redhat.com Tue Oct 22 19:28:31 2019 From: ltoscano at redhat.com (Luigi Toscano) Date: Tue, 22 Oct 2019 21:28:31 +0200 Subject: [tc][goal]: Ussuri community goal candidate 2: 'Switch remaining legacy jobs to Zuul v3 and drop legacy support' In-Reply-To: <16df432923a.1169e5bdd93925.5868281999311653240@ghanshyammann.com> References: <16df432923a.1169e5bdd93925.5868281999311653240@ghanshyammann.com> Message-ID: <1767778.LsA9bvU4DT@whitebase.usersys.redhat.com> On Tuesday, 22 October 2019 18:00:17 CEST Ghanshyam Mann wrote: > Hello Everyone, > > As you might have seen on the ML about Ussuri community goal feedback for > each proposed candidate [1], 2nd candidate is 'Switch remaining legacy > jobs to Zuul v3 and drop legacy support' [2]. > > [...] > > Currently, we do not have any volunteers to lead this goal. I would like to > ask for the volunteer who can drive this goal do the pre-work about > defining the goal and then be champion for this goal. Feel Free to reply to > this email or ping me on #openstack-tc IRC channel. I think I can help with this goal. I've had my fair share of migrations so far, and the grenade jobs are almost there as well. Ciao -- Luigi From fsbiz at yahoo.com Tue Oct 22 19:45:50 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 22 Oct 2019 19:45:50 +0000 (UTC) Subject: [ironic]: Timeout reached while waiting for callback for node References: <413811384.587474.1571773550206.ref@mail.yahoo.com> Message-ID: <413811384.587474.1571773550206@mail.yahoo.com> Hi folks, We have a 300-strong ironic cluster which has been relatively stable for the most part. Occasionally we get the IPA timeout errors.
| | 2019-10-20 23:33:37.994 | 2019-10-20 23:33:37.421 265272 ERROR nova.virt.ironic.driver [req-71933c39-2ac4-497b-8cdc-9578889f42fe cda0725e2a68445db857f56c49f77faf 53a4a4738d9f49e99fe580c1bd51f950 - default default] Error deploying instance c51368f6-b9d8-4c3c-bd63-d1f84ee26bfb on baremetal node 83dc67d4-36be-4004-8847-afd9b969de0e.: InstanceDeployFailure: Failed to provision instance c51368f6-b9d8-4c3c-bd63-d1f84ee26bfb: Timeout reached while waiting for callback for node 83dc67d4-36be-4004-8847-afd9b969de0e | Here's my troubleshooting so far. DHCP: All good. Oct 22 09:49:58 sc-control03 dnsmasq-dhcp[965699]: 3076961402 DHCPDISCOVER(ns-601d0738-39) 6c:b3:11:4f:87:acOct 22 09:49:58 sc-control03 dnsmasq-dhcp[965699]: 3076961402 DHCPOFFER(ns-601d0738-39) 10.33.23.94 6c:b3:11:4f:87:acOct 22 09:50:02 sc-control03 dnsmasq-dhcp[965699]: 3076961402 DHCPREQUEST(ns-601d0738-39) 10.33.23.94 6c:b3:11:4f:87:acOct 22 09:50:02 sc-control03 dnsmasq-dhcp[965699]: 3076961402 DHCPACK(ns-601d0738-39) 10.33.23.94 6c:b3:11:4f:87:ac host-10-33-23-94Oct 22 10:19:47 sc-control03 dnsmasq-dhcp[965699]: 0 DHCPRELEASE(ns-601d0738-39) 10.33.23.94 6c:b3:11:4f:87:acOct 22 10:21:23 sc-control03 dnsmasq-dhcp[965699]: 2188330678 DHCPDISCOVER(ns-601d0738-39) 6c:b3:11:4f:87:acOct 22 10:21:23 sc-control03 dnsmasq-dhcp[965699]: 2188330678 DHCPOFFER(ns-601d0738-39) 10.33.23.187 6c:b3:11:4f:87:acOct 22 10:21:26 sc-control03 dnsmasq-dhcp[965699]: 2188330678 DHCPREQUEST(ns-601d0738-39) 10.33.23.187 6c:b3:11:4f:87:acOct 22 10:21:26 sc-control03 dnsmasq-dhcp[965699]: 2188330678 DHCPACK(ns-601d0738-39) 10.33.23.187 6c:b3:11:4f:87:ac host-10-33-23-Oct 22 10:22:46 sc-control03 dnsmasq-dhcp[965699]: 3549042952 DHCPDISCOVER(ns-601d0738-39) 6c:b3:11:4f:87:ac Ironic conductor logs: (shows ironic state proceeding to Power ON, ironic state: deploying.  Then 30 minutes later timeout since IPA failed to check-in 2019-10-22 09:49:03.746 267066 DEBUG ironic.drivers.modules.deploy_utils [req-fabea5a4-da6c-48be-834a-49763c23b0e9 1ded19959aa94e53b6487d7c4f8d3371 acf8cd411e5e4751a61d1ed54e8e874d - default default] Deploy boot mode is uefi for d465fa17-a2ea-43ff-a3a7-033cbb497774. get_boot_mode_for_deploy /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:798 2019-10-22 09:49:03.747 267066 DEBUG ironic.common.pxe_utils [req-fabea5a4-da6c-48be-834a-49763c23b0e9 1ded19959aa94e53b6487d7c4f8d3371 acf8cd411e5e4751a61d1ed54e8e874d - default default] Building PXE config for node d465fa17-a2ea-43ff-a3a7-033cbb497774 create_pxe_config /usr/lib/python2.7/site-packages/ironic/common/pxe_utils.py:222 2019-10-22 09:49:03.748 267066 DEBUG ironic.drivers.modules.deploy_utils [req-fabea5a4-da6c-48be-834a-49763c23b0e9 1ded19959aa94e53b6487d7c4f8d3371 acf8cd411e5e4751a61d1ed54e8e874d - default default] Deploy boot mode is uefi for d465fa17-a2ea-43ff-a3a7-033cbb497774. 
get_boot_mode_for_deploy /usr/lib/python2.7/site-packages/ironic/drivers/modules/deploy_utils.py:798 2019-10-22 09:49:04.516 267066 DEBUG ironic.drivers.modules.pxe [req-fabea5a4-da6c-48be-834a-49763c23b0e9  1ded19959aa94e53b6487d7c4f8d3371 acf8cd411e5e4751a61d1ed54e8e874d - default default] Fetching necessary kernel and ramdisk for node d465fa17-a2ea-43ff-a3a7-033cbb497774 _cache_ramdisk_kernel /usr/lib/python2.7/site-packages/ironic/drivers/modules/pxe.py:382 2019-10-22 09:49:17.761 267066 INFO ironic.conductor.utils [req-fabea5a4-da6c-48be-834a-49763c23b0e9 1ded19959aa94e53b6487d7c4f8d3371 acf8cd411e5e4751a61d1ed54e8e874d - default default] Successfully set node d465fa17-a2ea-43ff-a3a7-033cbb497774 power state to power on by rebooting. ===== 30 minutes later timeout.  IPA failed to check-in 2019-10-22 10:19:37.448 267066 INFO ironic.conductor.task_manager [req-762d8acf-0a55-4d8d-99d8-93020a14547c - - - - -] Node d465fa17-a2ea-43ff-a3a7-033cbb497774 moved to provision state "deploy failed" from state "wait call-back"; target provision state is "active" 2019-10-22 10:19:37.471 267066 DEBUG ironic.common.pxe_utils [req-762d8acf-0a55-4d8d-99d8-93020a14547c - - - - -] Cleaning up PXE config for node d465fa17-a2ea-43ff-a3a7-033cbb497774 clean_up_pxe_config /usr/lib/python2.7/site-packages/ironic/common/pxe_utils.py:288 TFTP logs: shows TFTP client timed out (weird).  Any pointers here?tftpd shows ramdisk_deployed completed.  Then, it reports that the client timed out. Oct 22 09:50:15 sc-ironic04 in.tftpd[318194]: RRQ from 10.33.23.94 filename d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdisk remapped to /tftpboot/d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdisk Oct 22 09:50:44 sc-ironic04 in.tftpd[318194]: Client 10.33.23.94 finished d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdiskOct 22 09:50:44 sc-ironic04 in.tftpd[318194]: Client 10.33.23.94 timed out Oct 22 09:50:03 sc-ironic04 in.tftpd[318148]: remap: input: bootx64.efiOct 22 09:50:03 sc-ironic04 in.tftpd[318148]: remap: rule 3: rewrite: /tftpboot/bootx64.efiOct 22 09:50:03 sc-ironic04 in.tftpd[318148]: remap: rule 3: exitOct 22 09:50:03 sc-ironic04 in.tftpd[318148]: RRQ from 10.33.23.94 filename bootx64.efi remapped to /tftpboot/bootx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318148]: Error code 8: User aborted the transferOct 22 09:50:04 sc-ironic04 in.tftpd[318150]: remap: input: bootx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318150]: remap: rule 3: rewrite: /tftpboot/bootx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318150]: remap: rule 3: exitOct 22 09:50:04 sc-ironic04 in.tftpd[318150]: RRQ from 10.33.23.94 filename bootx64.efi remapped to /tftpboot/bootx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318150]: Client 10.33.23.94 finished bootx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318151]: remap: input: grubx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318151]: remap: rule 3: rewrite: /tftpboot/grubx64.efiOct 22 09:50:04 sc-ironic04 in.tftpd[318151]: remap: rule 3: exitOct 22 09:50:04 sc-ironic04 in.tftpd[318151]: RRQ from 10.33.23.94 filename grubx64.efi remapped to /tftpboot/grubx64.efiOct 22 09:50:05 sc-ironic04 in.tftpd[318151]: Client 10.33.23.94 finished grubx64.efiOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: remap: input: /grub.cfg-01-6c-b3-11-4f-87-acOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: remap: rule 2: rewrite: /tftpboot//grub.cfg-01-6c-b3-11-4f-87-acOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: RRQ from 10.33.23.94 filename 
/grub.cfg-01-6c-b3-11-4f-87-ac remapped to /tftpboot//grub.cfg-01-6c-b3-11-4f-87-acOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-01-6c-b3-11-4f-87-acOct 22 09:50:05 sc-ironic04 in.tftpd[318152]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318153]: remap: input: /grub.cfg-0A21175EOct 22 09:50:05 sc-ironic04 in.tftpd[318153]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A21175EOct 22 09:50:05 sc-ironic04 in.tftpd[318153]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318153]: RRQ from 10.33.23.94 filename /grub.cfg-0A21175E remapped to /tftpboot//grub.cfg-0A21175EOct 22 09:50:05 sc-ironic04 in.tftpd[318153]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-0A21175EOct 22 09:50:05 sc-ironic04 in.tftpd[318153]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318154]: remap: input: /grub.cfg-0A21175Oct 22 09:50:05 sc-ironic04 in.tftpd[318154]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A21175Oct 22 09:50:05 sc-ironic04 in.tftpd[318154]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318154]: RRQ from 10.33.23.94 filename /grub.cfg-0A21175 remapped to /tftpboot//grub.cfg-0A21175Oct 22 09:50:05 sc-ironic04 in.tftpd[318154]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-0A21175Oct 22 09:50:05 sc-ironic04 in.tftpd[318154]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318155]: remap: input: /grub.cfg-0A2117Oct 22 09:50:05 sc-ironic04 in.tftpd[318155]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A2117Oct 22 09:50:05 sc-ironic04 in.tftpd[318155]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318155]: RRQ from 10.33.23.94 filename /grub.cfg-0A2117 remapped to /tftpboot//grub.cfg-0A2117Oct 22 09:50:05 sc-ironic04 in.tftpd[318155]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-0A2117Oct 22 09:50:05 sc-ironic04 in.tftpd[318155]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318156]: remap: input: /grub.cfg-0A211Oct 22 09:50:05 sc-ironic04 in.tftpd[318156]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A211Oct 22 09:50:05 sc-ironic04 in.tftpd[318156]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318156]: RRQ from 10.33.23.94 filename /grub.cfg-0A211 remapped to /tftpboot//grub.cfg-0A211Oct 22 09:50:05 sc-ironic04 in.tftpd[318156]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-0A211Oct 22 09:50:05 sc-ironic04 in.tftpd[318156]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318157]: remap: input: /grub.cfg-0A21Oct 22 09:50:05 sc-ironic04 in.tftpd[318157]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A21Oct 22 09:50:05 sc-ironic04 in.tftpd[318157]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318157]: RRQ from 10.33.23.94 filename /grub.cfg-0A21 remapped to /tftpboot//grub.cfg-0A21Oct 22 09:50:05 sc-ironic04 in.tftpd[318157]: Client 10.33.23.94 File not found /tftpboot//grub.cfg-0A21Oct 22 09:50:05 sc-ironic04 in.tftpd[318157]: sending NAK (1, File not found) to 10.33.23.94Oct 22 09:50:05 sc-ironic04 in.tftpd[318158]: remap: input: /grub.cfg-0A2Oct 22 09:50:05 sc-ironic04 in.tftpd[318158]: remap: rule 2: rewrite: /tftpboot//grub.cfg-0A2Oct 22 09:50:05 sc-ironic04 in.tftpd[318158]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318178]: RRQ from 10.33.23.94 filename /EFI/centos/grub.cfg remapped to /tftpboot//EFI/centos/grub.cfgOct 22 09:50:05 sc-ironic04 in.tftpd[318178]: Client 
10.33.23.94 finished /EFI/centos/grub.cfgOct 22 09:50:05 sc-ironic04 in.tftpd[318179]: remap: input: /EFI/centos/grub.cfgOct 22 09:50:05 sc-ironic04 in.tftpd[318179]: remap: rule 2: rewrite: /tftpboot//EFI/centos/grub.cfgOct 22 09:50:05 sc-ironic04 in.tftpd[318179]: remap: rule 2: exitOct 22 09:50:05 sc-ironic04 in.tftpd[318179]: RRQ from 10.33.23.94 filename /EFI/centos/grub.cfg remapped to /tftpboot//EFI/centos/grub.cfgOct 22 09:50:05 sc-ironic04 in.tftpd[318179]: Client 10.33.23.94 finished /EFI/centos/grub.cfgOct 22 09:50:07 sc-ironic04 ironic-conductor: 2019-10-22 09:50:07.298 267066 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._sync_local_state' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639Oct 22 09:50:07 sc-ironic04 ironic-conductor: 2019-10-22 09:50:07.298 267066 INFO ironic.conductor.manager [-] Executing sync_local_stateOct 22 09:50:07 sc-ironic04 ironic-conductor: 2019-10-22 09:50:07.305 267066 DEBUG ironic.common.hash_ring [-] Rebuilding cached hash rings ring /usr/lib/python2.7/site-packages/ironic/common/hash_ring.py:51Oct 22 09:50:07 sc-ironic04 ironic-conductor: 2019-10-22 09:50:07.466 267066 DEBUG ironic.common.hash_ring [-] Finished rebuilding hash rings, available drivers are ipmi, fake, snmp, fake_pxe, pxe_snmp, agent_ipmitool, pxe_ipmitool ring /usr/lib/python2.7/site-packages/ironic/common/hash_ring.py:56Oct 22 09:50:10 sc-ironic04 in.tftpd[318184]: remap: input: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318184]: remap: rule 0: rewrite: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318184]: remap: rule 0: exitOct 22 09:50:10 sc-ironic04 in.tftpd[318184]: RRQ from 10.33.23.94 filename /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318184]: Client 10.33.23.94 finished /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318185]: remap: input: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318185]: remap: rule 0: rewrite: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318185]: remap: rule 0: exitOct 22 09:50:10 sc-ironic04 in.tftpd[318185]: RRQ from 10.33.23.94 filename /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318185]: Client 10.33.23.94 finished /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318186]: remap: input: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318186]: remap: rule 0: rewrite: /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318186]: remap: rule 0: exitOct 22 09:50:10 sc-ironic04 in.tftpd[318186]: RRQ from 10.33.23.94 filename /tftpboot/10.33.23.94.confOct 22 09:50:10 sc-ironic04 in.tftpd[318186]: Client 10.33.23.94 finished /tftpboot/10.33.23.94.confOct 22 09:50:15 sc-ironic04 in.tftpd[318192]: remap: input: d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_kernelOct 22 09:50:15 sc-ironic04 in.tftpd[318192]: remap: rule 3: rewrite: /tftpboot/d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_kernelOct 22 09:50:15 sc-ironic04 in.tftpd[318192]: remap: rule 3: exitOct 22 09:50:15 sc-ironic04 in.tftpd[318192]: RRQ from 10.33.23.94 filename d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_kernel remapped to /tftpboot/d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_kernelOct 22 09:50:15 sc-ironic04 ironic-conductor: 2019-10-22 09:50:15.898 267066 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_rescuewait_timeouts' _process_scheduled 
/usr/lib/python2.7/site-packages/futurist/periodics.py:639Oct 22 09:50:15 sc-ironic04 in.tftpd[318192]: Client 10.33.23.94 finished d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_kernelOct 22 09:50:15 sc-ironic04 in.tftpd[318194]: remap: input: d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdiskOct 22 09:50:15 sc-ironic04 in.tftpd[318194]: remap: rule 3: rewrite: /tftpboot/d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdiskOct 22 09:50:15 sc-ironic04 in.tftpd[318194]: remap: rule 3: exitOct 22 09:50:15 sc-ironic04 in.tftpd[318194]: RRQ from 10.33.23.94 filename d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdisk remapped to /tftpboot/d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdisk Oct 22 09:50:44 sc-ironic04 in.tftpd[318194]: Client 10.33.23.94 finished d465fa17-a2ea-43ff-a3a7-033cbb497774/deploy_ramdiskOct 22 09:50:44 sc-ironic04 in.tftpd[318194]: Client 10.33.23.94 timed out This has me stumped here.  This exact failure seems to be happening 3 to 4 times a week on different nodes.Any pointers appreciated. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jp.methot at planethoster.info Tue Oct 22 20:57:56 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Tue, 22 Oct 2019 16:57:56 -0400 Subject: [nova] Is config drive data stored in ephemeral storage? Message-ID: Hi, We currently use Ceph for storage on our Openstack cluster. We set up a pool to use for ephemeral nova storage, but we never actually use it, as we prefer to use cinder block devices. However, we notice that objects are being created inside that pool. Could it be that when you use config drives, the actual config drive is created inside nova’s ephemeral storage instead of on the compute node’s disk? Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikal at stillhq.com Tue Oct 22 21:16:36 2019 From: mikal at stillhq.com (Michael Still) Date: Wed, 23 Oct 2019 08:16:36 +1100 Subject: [nova] Is config drive data stored in ephemeral storage? In-Reply-To: References: Message-ID: Yes, this is how it works IIRC. Michael On Wed., 23 Oct. 2019, 7:58 am Jean-Philippe Méthot, < jp.methot at planethoster.info> wrote: > Hi, > > We currently use Ceph for storage on our Openstack cluster. We set up a > pool to use for ephemeral nova storage, but we never actually use it, as we > prefer to use cinder block devices. However, we notice that objects are > being created inside that pool. Could it be that when you use config > drives, the actual config drive is created inside nova’s ephemeral storage > instead of on the compute node’s disk? > > Best regards, > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue Oct 22 22:05:41 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 22 Oct 2019 17:05:41 -0500 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <20191017203152.GA828@sm-workstation> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> Message-ID: On 10/17/2019 3:31 PM, Sean McGinnis wrote: > The date for Queens to transition to Extended Maintenance is next week. 
Late in > the week we will be proposing a patch to tag all deliverables with a > "queens-em" tag. After this point, no additional releases will be allowed. > > I took a quick look through our stable/queens deliverables, and there are > several that look to have a sizable amount of patches landed that have not been > released. Elod was super nice by including all of that for easy checking in [2] > above. > > As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to > stable/queens. But once we enter Extended Maintenance, there won't be any > official releases and it will be up to downstream consumers to pick up these > fixes locally as they need them. > > So consider this a last call for stable/queens releases. Nova is still working through a non-trivial backlog of backports, the main slowness being some of the stuff yet to land in queens still has to make it through stein and rocky first. I've been bugging stable cores in nova the last week or so but it's slow going. So just FYI that I'm not sure nova will be ready for the queens-em tag this week but I'm trying to get us close. -- Thanks, Matt From gmann at ghanshyammann.com Wed Oct 23 00:26:10 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 22 Oct 2019 19:26:10 -0500 Subject: [tc][goal]: Ussuri community goal candidate 2: 'Switch remaining legacy jobs to Zuul v3 and drop legacy support' In-Reply-To: <1767778.LsA9bvU4DT@whitebase.usersys.redhat.com> References: <16df432923a.1169e5bdd93925.5868281999311653240@ghanshyammann.com> <1767778.LsA9bvU4DT@whitebase.usersys.redhat.com> Message-ID: <16df601b6e0.c86721b199301.1357270482485818652@ghanshyammann.com> ---- On Tue, 22 Oct 2019 14:28:31 -0500 Luigi Toscano wrote ---- > On Tuesday, 22 October 2019 18:00:17 CEST Ghanshyam Mann wrote: > > Hello Everyone, > > > > As you might have seen on the ML about Ussuri community goal feedback for > > each proposed candidate [1], 2nd candidate is 'Switch remaining legacy > > jobs to Zuul v3 and drop legacy support' [2]. > > > > [...] > > > > Currently, we do not have any volunteers to lead this goal. I would like to > > ask for the volunteer who can drive this goal do the pre-work about > > defining the goal and then be champion for this goal. Feel Free to reply to > > this email or ping me on #openstack-tc IRC channel. > > I think I can help with this goal. I've had my good share of migrations so > far, and the grenade jobs are almost there as well. Perfect and thanks for volunteering. As next step, while we are getting the broader feedback on ML, can you start defining this goal in governance repo under goals/proposed dir - https://opendev.org/openstack/governance/src/branch/master/goals/proposed You can find the template here- https://opendev.org/openstack/governance/src/branch/master/goals/template.rst -gmann > > Ciao > -- > Luigi > > > > From dharmendra.kushwaha at gmail.com Wed Oct 23 06:33:28 2019 From: dharmendra.kushwaha at gmail.com (Dharmendra Kushwaha) Date: Wed, 23 Oct 2019 12:03:28 +0530 Subject: [tacker][ptg] Ussuri PTG Planning for Tacker Message-ID: Hello Everyone, I have created an etherpad [1] to collect the topics for Shanghai PTG discussion for Tacker. Please start drafting/adding your topics which you want to discuss in PTG. Even if anyone not be physically available for PTG, but want to discuss their topics, please add in etherpad, we will discuss. 
[1]: https://etherpad.openstack.org/p/Tacker-PTG-Ussuri Thanks & Regards Dharmendra Kushwaha -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.chengde at 99cloud.net Wed Oct 23 07:03:44 2019 From: yu.chengde at 99cloud.net (yu.chengde at 99cloud.net) Date: Wed, 23 Oct 2019 15:03:44 +0800 Subject: [nova] Which nova container service that nova/conf/compute.py map to In-Reply-To: References: Message-ID: <6C1B4E0B-91B3-4F97-A15D-54C7A7E8A846@99cloud.net> Hi Radosiaw: Thanks for answer. But, I got another issue while deploy OpenStack:stein with adding “nova_dev_mode” in globals.yml Please refer to log listed below for detail. The placement where I face the problem is "nova : Running Nova bootstrap container” ... "AttributeError: 'module' object has no attribute 'COMPUTE_IMAGE_TYPE_AKI'" I can’t find any related code with it. Need your help to handle this, many thanks. Kolla-ansible vision : stable/stein Ansible version : stable/stein Environment: intel Grantley i7-6950 Method to reproduce : $ kolla-ansible -i all-in-one deploy TASK [nova : Running Nova bootstrap container] ********************************* fatal: [chantyu -> chantyu]: FAILED! => {"changed": true, "msg": "Container exited with non-zero return code 1", "rc": 1, "stderr": "+ sudo -E kolla_set_configs\nINFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json\nINFO:__main__:Validating config file\nINFO:__main__:Kolla config strategy set to: COPY_ALWAYS\nINFO:__main__:Copying service configuration files\nINFO:__main__:Copying /var/lib/kolla/config_files/nova.conf to /etc/nova/nova.conf\nINFO:__main__:Setting permission for /etc/nova/nova.conf\nINFO:__main__:Writing out command to execute\nINFO:__main__:Setting permission for /var/log/kolla/nova\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-manage.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-conductor.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-scheduler.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-consoleauth.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-novncproxy.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-serialproxy.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-api.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/nova-compute.log\nINFO:__main__:Setting permission for /var/log/kolla/nova/privsep-helper.log\n++ cat /run_command\n+ CMD=nova-api\n+ ARGS=\n+ [[ ! -n '' ]]\n+ . kolla_extend_start\n++ [[ ! -d /var/log/kolla/nova ]]\n+++ stat -c %a /var/log/kolla/nova\n++ [[ 2755 != \\7\\5\\5 ]]\n++ chmod 755 /var/log/kolla/nova\n++ . /usr/local/bin/kolla_nova_extend_start\n+++ [[ -n '' ]]\n+++ [[ -n 0 ]]\n+++ nova-manage api_db sync\n/var/lib/kolla/venv/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use \"pip install psycopg2-binary\" instead. 
For details see: .\n \"\"\")\nTraceback (most recent call last):\n File \"/var/lib/kolla/venv/bin/nova-manage\", line 6, in \n from nova.cmd.manage import main\n File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/cmd/manage.py\", line 51, in \n from nova.compute import api as compute_api\n File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/api.py\", line 40, in \n from nova import block_device\n File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/block_device.py\", line 26, in \n from nova.virt import driver\n File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/driver.py\", line 115, in \n \"supports_image_type_aki\": os_traits.COMPUTE_IMAGE_TYPE_AKI,\nAttributeError: 'module' object has no attribute 'COMPUTE_IMAGE_TYPE_AKI'\n", "stderr_lines": ["+ sudo -E kolla_set_configs", "INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json", "INFO:__main__:Validating config file", "INFO:__main__:Kolla config strategy set to: COPY_ALWAYS", "INFO:__main__:Copying service configuration files", "INFO:__main__:Copying /var/lib/kolla/config_files/nova.conf to /etc/nova/nova.conf", "INFO:__main__:Setting permission for /etc/nova/nova.conf", "INFO:__main__:Writing out command to execute", "INFO:__main__:Setting permission for /var/log/kolla/nova", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-manage.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-conductor.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-scheduler.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-consoleauth.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-novncproxy.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-serialproxy.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-api.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/nova-compute.log", "INFO:__main__:Setting permission for /var/log/kolla/nova/privsep-helper.log", "++ cat /run_command", "+ CMD=nova-api", "+ ARGS=", "+ [[ ! -n '' ]]", "+ . kolla_extend_start", "++ [[ ! -d /var/log/kolla/nova ]]", "+++ stat -c %a /var/log/kolla/nova", "++ [[ 2755 != \\7\\5\\5 ]]", "++ chmod 755 /var/log/kolla/nova", "++ . /usr/local/bin/kolla_nova_extend_start", "+++ [[ -n '' ]]", "+++ [[ -n 0 ]]", "+++ nova-manage api_db sync", "/var/lib/kolla/venv/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use \"pip install psycopg2-binary\" instead. 
For details see: .", " \"\"\")", "Traceback (most recent call last):", " File \"/var/lib/kolla/venv/bin/nova-manage\", line 6, in ", " from nova.cmd.manage import main", " File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/cmd/manage.py\", line 51, in ", " from nova.compute import api as compute_api", " File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/compute/api.py\", line 40, in ", " from nova import block_device", " File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/block_device.py\", line 26, in ", " from nova.virt import driver", " File \"/var/lib/kolla/venv/lib/python2.7/site-packages/nova/virt/driver.py\", line 115, in ", " \"supports_image_type_aki\": os_traits.COMPUTE_IMAGE_TYPE_AKI,", "AttributeError: 'module' object has no attribute 'COMPUTE_IMAGE_TYPE_AKI'"], "stdout": "", "stdout_lines": []} NO MORE HOSTS LEFT ************************************************************* PLAY RECAP ********************************************************************* chantyu : ok=245 changed=8 unreachable=0 failed=1 skipped=125 rescued=0 ignored=0 Command failed ansible-playbook -i ../../all-in-one -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla -e kolla_action=deploy /home/openstack/src/kolla-ansible/ansible/site.yml > Radosław Piliszek 於 2019年10月16日 下午7:50 寫道: > > Hi Yu, > > you want to read: https://docs.openstack.org/kolla-ansible/latest/contributor/kolla-for-openstack-development.html > > In your case you should set: > nova_dev_mode: yes > in globals.yml > > Kind regards, > Radek > > śr., 16 paź 2019 o 13:10 yu.chengde at 99cloud.net > napisał(a): > Hi, > I have deployed a stein version openstack on server thought Kolla-ansible method. > Then, I git clone the nova code, and ready to do coding in " nova/nova/conf/compute.py" > However, many of nova containers include this file. > So, I want to know that I should modify them all, or just pick a specific one. > Thanks > > > [root at chantyu kolla-ansible]# docker ps | grep nova > 05f72e539974 kolla/centos-source-nova-compute:stein "dumb-init --single-…" 28 hours ago Up 2 hours nova_compute > 7393a7d566ee kolla/centos-source-nova-libvirt:stein "dumb-init --single-…" 28 hours ago Up 5 hours nova_libvirt > 9d8357cfa334 kolla/centos-source-nova-scheduler:stein "dumb-init --single-…" 32 hours ago Up 3 hours nova_scheduler > 085b9da918df kolla/centos-source-nova-api:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_api > b80e9503e93e kolla/centos-source-nova-serialproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_serialproxy > c15d41823a22 kolla/centos-source-nova-novncproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_novncproxy > c30e47cd56c6 kolla/centos-source-nova-consoleauth:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_consoleauth > b7d5e9ba1f11 kolla/centos-source-nova-ssh:stein "dumb-init --single-…" 7 days ago Up 5 hours nova_ssh > 3f81cd0a97ce kolla/centos-source-nova-conductor:stein "dumb-init --single-…" 7 days ago Up 3 hours nova_conductor > [root at chantyu kolla-ansible]# -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Oct 23 09:57:20 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 23 Oct 2019 11:57:20 +0200 Subject: "Large Scale SIG" discussion in Shanghai Message-ID: <3292569b-6107-3a74-dfeb-7db44b8b1977@openstack.org> Hi everyone, As OpenStack clusters grow larger, they hit scaling limitations in a number of components. 
To work around this problem, operators create multiple clusters. In recent visits to OpenStack users, I noticed a common theme around people interested in working to scale single clusters past their current limits. In the past we had various groups trying to identify bottlenecks or work on alternate technologies (RabbitMQ replacements): the Performance WG, or the "Large deployment" WG. However those groups had a bit of a wide focus, and/or are not active anymore. In Shanghai we'll discuss creating a "Large Scale SIG" that would tackle specifically those single-cluster-scaling issues. The idea is that by sharing performance analysis, it can help identify key bottlenecks. By pooling development resources, it can push fixes or new features to address those issues. I directly reached out to several people and organizations interested in working on that. Please join the session in Shanghai if you'll be present there and are interested in discussing this topic: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24405/facilitating-running-openstack-at-scale-join-the-large-scale-sig -- Thierry Carrez (ttx) From lyarwood at redhat.com Wed Oct 23 10:14:41 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 23 Oct 2019 11:14:41 +0100 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> Message-ID: <20191023101441.oisscim5w5pkzrek@lyarwood.usersys.redhat.com> On 22-10-19 17:05:41, Matt Riedemann wrote: > On 10/17/2019 3:31 PM, Sean McGinnis wrote: > > The date for Queens to transition to Extended Maintenance is next week. Late in > > the week we will be proposing a patch to tag all deliverables with a > > "queens-em" tag. After this point, no additional releases will be allowed. > > > > I took a quick look through our stable/queens deliverables, and there are > > several that look to have a sizable amount of patches landed that have not been > > released. Elod was super nice by including all of that for easy checking in [2] > > above. > > > > As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to > > stable/queens. But once we enter Extended Maintenance, there won't be any > > official releases and it will be up to downstream consumers to pick up these > > fixes locally as they need them. > > > > So consider this a last call for stable/queens releases. > > Nova is still working through a non-trivial backlog of backports, the main > slowness being some of the stuff yet to land in queens still has to make it > through stein and rocky first. I've been bugging stable cores in nova the > last week or so but it's slow going. So just FYI that I'm not sure nova will > be ready for the queens-em tag this week but I'm trying to get us close. I've finally found time to try and push things through this morning. FWIW I've also updated some dashboards I've been using for active and em stable branch review below, hopefully others find these useful: Add EM dashboard for Nova https://review.opendev.org/#/c/690536/ Nova Stable Maintenance Review Inbox http://shorturl.at/cj156 Nova Extended Maintenance Review Inbox http://shorturl.at/moDFH -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From sean.mcginnis at gmx.com Wed Oct 23 11:04:55 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Oct 2019 06:04:55 -0500 Subject: [RelMgmt] Release team meeting cancelled Oct 24 Message-ID: <20191023110455.GA887@sm-workstation> Hey all, We discussed this in last week's meeting, but just a reminder that we will be skipping the release team meeting this Thursday, October 24. We are reaching the final deadline for the stable/queens branch to transition to Extended Maintenance. Please watch for anything related to that and help review final release patches if you can. Anything else that comes up, we can just address ad hoc in the #openstack-release channel. Thanks! Sean From e0ne at e0ne.info Wed Oct 23 12:41:21 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Wed, 23 Oct 2019 15:41:21 +0300 Subject: [tc][horizon][all] Horizon plugins maintenance Message-ID: Hi team, As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. There are a lot of Horizon changes which affect plugins and horizon team is ready to help: - new Django versions - dependencies updates - Horizon API changes - etc. To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 [3] https://etherpad.openstack.org/p/horizon-u-ptg [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From corey.bryant at canonical.com Wed Oct 23 13:00:25 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Wed, 23 Oct 2019 09:00:25 -0400 Subject: [Openstack] OpenStack Train for Ubuntu 18.04 LTS Message-ID: The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack Train on Ubuntu 18.04 LTS via the Ubuntu Cloud Archive. 
Details of the Train release can be found at: https://www.openstack.org/software/train To get access to the Ubuntu Train packages: Ubuntu 18.04 LTS ----------------------- You can enable the Ubuntu Cloud Archive pocket for OpenStack Train on Ubuntu 18.04 installations by running the following commands: sudo add-apt-repository cloud-archive:train sudo apt update The Ubuntu Cloud Archive for Train includes updates for: aodh, barbican, ceilometer, ceph (14.2.2), cinder, designate, designate-dashboard, dpdk (18.11.2), glance, gnocchi, heat, heat-dashboard, horizon, ironic, keystone, libvirt (5.4.0), magnum, manila, manila-ui, mistral, murano, murano-dashboard, networking-arista, networking-bagpipe, networking-bgpvpn, networking-hyperv, networking-l2gw, networking-mlnx, networking-odl, networking-ovn, networking-sfc, neutron, neutron-dynamic-routing, neutron-fwaas, neutron-lbaas, neutron-lbaas-dashboard, neutron-vpnaas, nova, octavia, openstack-trove, openvswitch (2.12.0), panko, placement, qemu (4.0), sahara, sahara-dashboard, senlin, swift, trove-dashboard, vmware-nsx, watcher, and zaqar. For a full list of packages and versions, please refer to: http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/train_versions.html Python support ------------------- The Train release of Ubuntu OpenStack is Python 3 only; all Python 2 packages have been dropped in Train. Branch package builds ----------------------------- If you would like to try out the latest updates to branches, we deliver continuously integrated packages on each upstream commit via the following PPA’s: sudo add-apt-repository ppa:openstack-ubuntu-testing/mitaka sudo add-apt-repository ppa:openstack-ubuntu-testing/ocata sudo add-apt-repository ppa:openstack-ubuntu-testing/queens sudo add-apt-repository ppa:openstack-ubuntu-testing/rocky sudo add-apt-repository ppa:openstack-ubuntu-testing/train Reporting bugs ------------------- If you have any issues please report bugs using the 'ubuntu-bug' tool to ensure that bugs get logged in the right place in Launchpad: sudo ubuntu-bug nova-conductor Thanks to everyone who has contributed to OpenStack Train, both upstream and downstream. Special thanks to the Puppet OpenStack modules team and the OpenStack Charms team for their continued early testing of the Ubuntu Cloud Archive, as well as the Ubuntu and Debian OpenStack teams for all of their contributions. Enjoy and see you in Ussuri! Corey (on behalf of the Ubuntu OpenStack team) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Wed Oct 23 13:56:46 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Wed, 23 Oct 2019 09:56:46 -0400 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: I added horizon-core to zun-ui core. Feel free to exercice the +2 privilegae whenever it is necessary. On Wed., Oct. 23, 2019, 8:51 a.m. Ivan Kolodyazhny wrote: > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. > Unfortunately, not all of them are in active development due to the lack of > resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our best > to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We raised > this topic during the last Horizon weekly meeting [2] and we'll have some > discussion during the PTG [3] too. 
> > There are a lot of Horizon changes which affect plugins and horizon team > is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the > horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the > heat-dashboard-core team. Also, we've got +2 on other plugins via global > project config [4] and via Gerrit configuration for > (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last > meeting [5]. > > > Of course, it's up to each project to maintain horizon plugins and it's > responsibilities but I would like to raise this topic to the TC too. I > really sure, that it will speed up some critical fixes for Horizon plugins > and makes users and operators experience better. > > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] > http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] > http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed Oct 23 14:40:32 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 23 Oct 2019 09:40:32 -0500 Subject: [nova] Which nova container service that nova/conf/compute.py map to In-Reply-To: <6C1B4E0B-91B3-4F97-A15D-54C7A7E8A846@99cloud.net> References: <6C1B4E0B-91B3-4F97-A15D-54C7A7E8A846@99cloud.net> Message-ID: <7a2e7b39-ed05-c690-41a0-61b0686e4719@gmail.com> On 10/23/2019 2:03 AM, yu.chengde at 99cloud.net wrote: >    "AttributeError: 'module' object has no attribute > 'COMPUTE_IMAGE_TYPE_AKI'" You need os-traits >= 0.12.0: https://review.opendev.org/#/c/648147/ Nova's lower-constraints.txt should also specify a version that includes that if you're honoring the lower-constraints file packaged in the repo. -- Thanks, Matt From cdent+os at anticdent.org Wed Oct 23 15:26:23 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 23 Oct 2019 16:26:23 +0100 (BST) Subject: [ops] nova wsgi config In-Reply-To: <20191022101943.GG14827@sync> References: <20191022101943.GG14827@sync> Message-ID: On Tue, 22 Oct 2019, Arnaud Morin wrote: > I am trying to configure apache as a WSGI. > Is there any other documentation than this: > https://docs.openstack.org/nova/stein/user/wsgi.html > > Is there any recommendations? There are a lot of options, and which you use can mostly come down to personal preference and other aspects of your environment. For example, if you're running in a kubernetes environment, using apache at all can be overkill: the nova-api container(s) can expose an http service which are reached through the ingress. Adding apache in that environment would mean you had proxy -> proxy/apache -> service. If you're trying save some space, that's overkill. However, if what you want is some kind of combination where apache is in front of the nova-api you have three basic options: * Use apache plus mod_proxy to talk to the eventlet driven `nova-api` process. * Use apache plus mod_wsgi to talk to the `nova-api-wsgi` application, probably using WSGIDaemonProcess. 
* Use apache plus mod_proxy_uwsgi to talk to the `nova-api-wsgi` application, itself being run by uwsgi, where the uwsgi process is started and managed by something like systemd or uwsgi emperor mode. If you use either of the latter two you need to be aware of a potential issue with eventlet as described in the release notes for stein: https://docs.openstack.org/releasenotes/nova/stein.html#known-issues There's some boilerplate documentation for using mod wsgi and uwsgi for various projects. Here's the one for zun: https://docs.openstack.org/zun/train/contributor/mod-wsgi.html There's some documentation in placement which has links to the changes that added placement in devstack, first using mod_wsgi and then using uwsgi: https://docs.openstack.org/placement/latest/install/ That can be a useful guide, just remember to replace placement names with the corresponding nova names. Where placement uses `placement-api`, nova wants `nova-api-wsgi`. There are many options for how to do this, so there's no straightforward cookiecutter answer. The important thing to remember is that `nova-api-wsgi` is a standard WSGI application and there are all kinds of resources on the web for how to host a WSGI application on a variety of web servers in various ways. Things you learn about handling a WSGI application of one sort can be transferred to others (with the important caveat about nova and eventlet described above). My current way for doing this kind of thing is to run uwsgi in a container and then have a proxy talk to that container. See https://github.com/cdent/placedock for how I've done this with Placement. If there's no container involved, I simply run uwsgi standalone. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From clclchen at cn.ibm.com Wed Oct 23 09:05:46 2019 From: clclchen at cn.ibm.com (Ling LL Chen) Date: Wed, 23 Oct 2019 17:05:46 +0800 Subject: [Infra][Cinder] IBM Storage OpenStack CI Gerrit account issue Message-ID: Hi Dear, This is Linda from IBM Storage cinder team. Currently we need to set up the OpenStack CI environment. But now we failed to configure the Gerrit server with ssh key pair. We only know the Gerrit account is "ibm_storage_ci", but we don't know which mail address used for this account. Would you please help to check this Gerrit account "ibm_storage_ci" and provide the detail information with us? Thanks. Regards, Chen Ling(Linda,陈玲) ------------------------------------------- Cloud Storage Solutions Development (CSSD) IBM System SRA troubleshooting wiki: https://apps.na.collabserv.com/wikis/home?lang=en-us#!/wiki/W9c4e08533b49_46bf_b9ee_e46fb712a085/page/SRA%20Troubleshoting Tel: (8621)6092-8292 Ext:28292 Address: 3F, Building 10, 399 Keyuan Road, Zhangjiang Hi-Tech Park, Pudong New District, Shanghai 201203, China ------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas at goirand.fr Wed Oct 23 11:45:54 2019 From: thomas at goirand.fr (Thomas Goirand) Date: Wed, 23 Oct 2019 13:45:54 +0200 Subject: [ops] nova wsgi config In-Reply-To: <20191022101943.GG14827@sync> References: <20191022101943.GG14827@sync> Message-ID: <659657f1-89ba-63b6-f2dc-6d8c42430d08@goirand.fr> On 10/22/19 12:19 PM, Arnaud Morin wrote: > Hey all, > > I am trying to configure apache as a WSGI. > Is there any other documentation than this: > https://docs.openstack.org/nova/stein/user/wsgi.html > > Is there any recommendations? > > Thanks in advance! 
Hi Arnaud, If you wish, you can have a look at the nova-api package in Debian, which by default, is set to use uwsgi for both the compute and the metadata API. This consist of a sysv-rc / systemd startup script, plus a configuration file. Note that this system also has support over SSL if you just drop the certificate + key in the right folder. Nearly all API package in Debian are configured this way. If it's not the case, it means that either we didn't have time to do the switch yet (which is kind of rare, I hope), or there's no such support upstream for running under a wsgi server. Cheers, Thomas Goirand (zigo) From tobias.rydberg at citynetwork.eu Wed Oct 23 16:51:05 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Wed, 23 Oct 2019 20:51:05 +0400 Subject: [sigs][publiccloud][publiccloud-wg][publiccloud-sig] Bi-weekly meeting for the Public Cloud SIG tomorrow CANCELLED Message-ID: <535dba08-e831-3b71-3831-32d0709a3d5a@citynetwork.eu> Hi all, Need to cancel tomorrows meeting since I will be on a flight at that time. Hope to see most of you in Shanghai in less than 2 weeks. Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4017 bytes Desc: S/MIME Cryptographic Signature URL: From cboylan at sapwetik.org Wed Oct 23 16:52:46 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 23 Oct 2019 09:52:46 -0700 Subject: [Infra][Cinder] IBM Storage OpenStack CI Gerrit account issue In-Reply-To: References: Message-ID: On Wed, Oct 23, 2019, at 2:05 AM, Ling LL Chen wrote: > Hi Dear, > > This is Linda from IBM Storage cinder team. Currently we need to set up > the OpenStack CI environment. But now we failed to configure the Gerrit > server with ssh key pair. > > We only know the Gerrit account is "ibm_storage_ci", but we don't know > which mail address used for this account. > > Would you please help to check this Gerrit account "ibm_storage_ci" and > provide the detail information with us? Thanks. If you got to https://review.opendev.org and in the search bar enter owner:ibm_storage_ci Gerrit helpfully does a lookahead search and shows us "IBM Storage CI ". If we got to the wiki we can find this info about the CI system too: https://wiki.openstack.org/wiki/ThirdPartySystems/IBM_Storage_CI. Possible one of those individuals has the info necessary to update this account. > > Regards, > > Chen Ling(Linda,陈玲) From gmann at ghanshyammann.com Wed Oct 23 16:53:44 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 23 Oct 2019 11:53:44 -0500 Subject: [tc][all] Community-wide goal Ussuri and V cycle forum collaboration idea In-Reply-To: <05fa700e-dba6-36ce-cf42-c7023f2515c9@gmail.com> References: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> <05fa700e-dba6-36ce-cf42-c7023f2515c9@gmail.com> Message-ID: <16df989dd4b.114969d1e133046.5952283484072407692@ghanshyammann.com> ---- On Tue, 15 Oct 2019 10:24:58 -0500 Matt Riedemann wrote ---- > On 10/14/2019 5:52 PM, Ghanshyam Mann wrote: > > Question is for V cycle goal planning, whether we should discuss the V cycle goal in Ussuri goal fourm sessoin[3] or > > it is too early to kick off V cycle goal at least until we finalize U cycle goal first. 
I would like to list the below two > > options to proceed further (at least to decide if we need to change the existing U cycle goal forum sessions title). > > > > 1. Merge the Forum session for both cycle goal discussion (divide both in two half). This need forum session title and description change. > > 2. Keep forum session for U cycle goal only and start the V cycle over ML asynchronously. This will help to avoid any confusion or mixing the both cycle goal discussions. > > So you have 40 minutes to discuss something that is notoriously hard to > sort out for one release let alone the future, and to date there are > only 3 goals proposed for Ussuri. Why even consider goals for V at this > point when settling on goals for Train was kind of a (train)wreck (get > it?!) and goal champions for Ussuri aren't necessarily champing at the bit? > > I won't be there so I don't have a horse in this race (yay more idioms), > just commenting from the peanut gallery. Agree with you and let's keep the Forum session for Ussuri cycle goal discussions and we can start V cycle one on ML later. -gmann > > -- > > Thanks, > > Matt > > From colleen at gazlene.net Wed Oct 23 16:55:38 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Wed, 23 Oct 2019 09:55:38 -0700 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Hi Jason, On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: > Hi all, > > I'm in the process of prototyping a federated Keystone using OpenID > Connect, which will place ephemeral users in a group that has roles in > existing projects. I was testing how it felt from the user's > perspective and am confused how I'm supposed to be able to use the > openstacksdk with federation. For one thing, the RC files I can > download from the "API Access" section of Horizon don't seem like they > work; the domain is hard-coded to "Federated", This should be fixed in the latest version of keystone... > and it also uses a > username/password authentication method. ...but this is not, horizon only knows about the 'password' authentication method and can't provide RC files for other types of auth methods (unless you create an application credential). > > I can see that there is a way to use KSA to use an existing OIDC > token, which I think is probably the most "user-friendly" way, but the > user still has to obtain this token themselves out-of-band, which is > not trivial. Has anybody else set this up for users who liked to use > the CLI? All of KSA's auth types are supported by the openstack CLI. Which one you use depends on your OpenID Connect provider. If your provider supports it, you can use the "v3oidcpassword" auth method with the openstack CLI, following this example: https://support.massopen.cloud/kb/faq.php?id=16 On the other hand if you are using something like Google which only supports the authorization_code grant type, then you would have to get the authorization code out of band and then use the "v3oidcauthcode" auth type, and personally I've never gotten that to work with Google. > Is the solution to educate users about creating application > credentials instead? This is the best option. It's much easier to manage and horizon provides openrc and clouds.yaml files for app creds. 
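For reference, here is a minimal sketch of that workflow (the credential name, cloud name, auth URL and region below are made up, and the id/secret are placeholders returned by the create step):

    # create the application credential while authenticated as the federated user
    # (this can also be done from the Identity > Application Credentials panel in horizon)
    openstack application credential create my-app-cred

    # clouds.yaml entry using the id/secret returned above
    clouds:
      mycloud:
        auth_type: v3applicationcredential
        auth:
          auth_url: https://keystone.example.com:5000/v3
          application_credential_id: "<id from the create step>"
          application_credential_secret: "<secret from the create step>"
        region_name: RegionOne

    # after that the CLI works with no federation round-trip
    export OS_CLOUD=mycloud
    openstack server list

Keep in mind the secret is only shown once at creation time, so it has to be saved right away.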
Hope this helps, Colleen > > Thank you in advance, > > -- > Jason Anderson > > Chameleon DevOps Lead > *Consortium for Advanced Science and Engineering, The University of Chicago* > *Mathematics & Computer Science Division, Argonne National Laboratory* From openstack at fried.cc Wed Oct 23 16:58:49 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 23 Oct 2019 11:58:49 -0500 Subject: [nova] Which nova container service that nova/conf/compute.py map to In-Reply-To: <7a2e7b39-ed05-c690-41a0-61b0686e4719@gmail.com> References: <6C1B4E0B-91B3-4F97-A15D-54C7A7E8A846@99cloud.net> <7a2e7b39-ed05-c690-41a0-61b0686e4719@gmail.com> Message-ID: <7a10cd5e-1d52-e875-b29e-d0d03c5dc1fb@fried.cc> You're clearly running post-stein nova-compute code, because the error is coming from a line that was introduced in train via [1] ([2]). As Matt says, nova's lower constraint for os-traits in train (0.16.0 [3]) includes that trait. Any of the normal packages/distros should be accounting for that properly; if not, we need to know about it. Thanks, efried [1] https://review.opendev.org/#/c/652710/ [2] https://review.opendev.org/#/c/652710/3/nova/virt/driver.py at 115 [3] https://opendev.org/openstack/nova/src/branch/stable/train/lower-constraints.txt#L71 On 10/23/19 9:40 AM, Matt Riedemann wrote: > On 10/23/2019 2:03 AM, yu.chengde at 99cloud.net wrote: >>     "AttributeError: 'module' object has no attribute >> 'COMPUTE_IMAGE_TYPE_AKI'" > > You need os-traits >= 0.12.0: > > https://review.opendev.org/#/c/648147/ > > Nova's lower-constraints.txt should also specify a version that includes > that if you're honoring the lower-constraints file packaged in the repo. > From rafaelweingartner at gmail.com Wed Oct 23 17:59:08 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 23 Oct 2019 14:59:08 -0300 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Hello Colleen, Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode and multiple IdPs configured in Keystone? We are currently debugging and discussing on how to enable this support in the CLI. So far, we were not able to make it work with the current code. This also happens with Horizon. If one has multiple IdPs in Keystone, the "discovery" process would happen twice, one in Horizon and another in Keystone, which is executed by the OIDC plugin in the HTTPD. We already fixed the Horizon issue, but the CLI we are still investigating, and we suspect that is probably the same problem. On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy wrote: > Hi Jason, > > On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: > > Hi all, > > > > I'm in the process of prototyping a federated Keystone using OpenID > > Connect, which will place ephemeral users in a group that has roles in > > existing projects. I was testing how it felt from the user's > > perspective and am confused how I'm supposed to be able to use the > > openstacksdk with federation. For one thing, the RC files I can > > download from the "API Access" section of Horizon don't seem like they > > work; the domain is hard-coded to "Federated", > > This should be fixed in the latest version of keystone... > > > and it also uses a > > username/password authentication method. > > ...but this is not, horizon only knows about the 'password' authentication > method and can't provide RC files for other types of auth methods (unless > you create an application credential). 
> > > > > I can see that there is a way to use KSA to use an existing OIDC > > token, which I think is probably the most "user-friendly" way, but the > > user still has to obtain this token themselves out-of-band, which is > > not trivial. Has anybody else set this up for users who liked to use > > the CLI? > > All of KSA's auth types are supported by the openstack CLI. Which one you > use depends on your OpenID Connect provider. If your provider supports it, > you can use the "v3oidcpassword" auth method with the openstack CLI, > following this example: > > https://support.massopen.cloud/kb/faq.php?id=16 > > On the other hand if you are using something like Google which only > supports the authorization_code grant type, then you would have to get the > authorization code out of band and then use the "v3oidcauthcode" auth type, > and personally I've never gotten that to work with Google. > > > Is the solution to educate users about creating application > > credentials instead? > > This is the best option. It's much easier to manage and horizon provides > openrc and clouds.yaml files for app creds. > > Hope this helps, > > Colleen > > > > > Thank you in advance, > > > > -- > > Jason Anderson > > > > Chameleon DevOps Lead > > *Consortium for Advanced Science and Engineering, The University of > Chicago* > > *Mathematics & Computer Science Division, Argonne National Laboratory* > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Wed Oct 23 18:14:43 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 23 Oct 2019 11:14:43 -0700 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: I believe this is totally reasonable and will raise it with the ironic team during our next meeting. Thanks for bringing this up! -Julia On Wed, Oct 23, 2019 at 5:43 AM Ivan Kolodyazhny wrote: > > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon team is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. > > > Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. 
> > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From gmann at ghanshyammann.com Wed Oct 23 19:08:38 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 23 Oct 2019 14:08:38 -0500 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin Message-ID: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> Hello Everyone, We are in Ussuri development cycle which is planned to drop the py2.7 support[1]. I was holding and planning to support the py2.7 in Tempest due to its branchless model[2]. My main point to keep py2 support is if any users running the Tempest on py27 env cloud and not in virtual env then they can keep running in the same way. But this might be just me overthinking on this usage. What happens if we drop py2.7 from Tempest: * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use the Tempest tag if they do not need latest Tempest. * No change in users using Tempest on py3 or in py3 virtual env. * Upstream testing of master or stable branch is no issue as we install the Tempest in virtual env. Tempest in py3 venv can always test the py2.7 jobs. * other than that no change in Tempest usage or at least it would not break anything. Why we cannot keep the support: There is no big cost of supporting the py2.7 in Tempest itself but that require Temepst dependency (OpenStack lib like Oslo and non-openstack maintained lib [3]) to keep supporting the py2.7 which is not feasible. Other solution: One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from Tempest master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that then we can discuss this option in more detail. I have given a second thought on this and now ok to drop the py2.7 from Tempest by considering all the above points. Please reply if any disagreement on this or add if anything I missed to consider. NOTE: Tempest includes its plugins also. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html [2] https://review.opendev.org/#/c/681203/ [3] https://opendev.org/openstack/tempest/src/commit/7fdd39c6dbde37bccd419c4037e1e352a5189c5a/requirements.txt -gmann From mriedemos at gmail.com Wed Oct 23 19:22:54 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 23 Oct 2019 14:22:54 -0500 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> Message-ID: <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> On 10/23/2019 2:08 PM, Ghanshyam Mann wrote: > What happens if we drop py2.7 from Tempest: > * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use > the Tempest tag if they do not need latest Tempest. 
This seems sufficient to me and testing from a tag is what we're doing upstream in stable/ocata and stable/pike branches anyway - not because of python version stuff but because of extended maintenance and backward incompatible changes since those branches which break testing in ocata and pike with tempest from master. > Other solution: > One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from Tempest > master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that then > we can discuss this option in more detail. I would avoid creating a stable branch for tempest if at all possible since we have valid options to workaround it (above) and I just don't think we want toy with that idea and the precedent it could set for relaxing other rules around how tempest is developed. -- Thanks, Matt From mtreinish at kortar.org Wed Oct 23 19:34:30 2019 From: mtreinish at kortar.org (Matthew Treinish) Date: Wed, 23 Oct 2019 15:34:30 -0400 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> Message-ID: <20191023193430.GA7095@sinanju.localdomain> On Wed, Oct 23, 2019 at 02:08:38PM -0500, Ghanshyam Mann wrote: > Hello Everyone, > > We are in Ussuri development cycle which is planned to drop the py2.7 support[1]. > > I was holding and planning to support the py2.7 in Tempest due to its branchless model[2]. My main point > to keep py2 support is if any users running the Tempest on py27 env cloud and not in virtual env then they > can keep running in the same way. But this might be just me overthinking on this usage. > > What happens if we drop py2.7 from Tempest: > * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use > the Tempest tag if they do not need latest Tempest. > * No change in users using Tempest on py3 or in py3 virtual env. > * Upstream testing of master or stable branch is no issue as we install the Tempest in virtual env. > Tempest in py3 venv can always test the py2.7 jobs. > * other than that no change in Tempest usage or at least it would not break anything. I have made this exact argument before about venvs and tempest being not actually part of a cloud installation (just in other contexts). In my experience most people don't actually agree with it for whatever reason (I assume because the installer/deployment projects treat tempest as the same thing as other openstack projects). But I still feel that way and even if some people don't agree there are still releases that support python 2.7. In general I'm in favor of just doing this though, as long as we didn't already say that we'd continue supporting 2.7 somewhere in tempest until the Train EM date. If we did advertise that anywhere then we'll have to provide a deprecation period for those people who could have latched onto that. > > Why we cannot keep the support: > There is no big cost of supporting the py2.7 in Tempest itself but that require Temepst dependency > (OpenStack lib like Oslo and non-openstack maintained lib [3]) to keep supporting the py2.7 which is not > feasible. In my experience the list of requirements for tempest is not crazy long and most of them don't have big api divergance (or at least how tempest uses it). 
I'd almost say just set a version cap with python_version==2.7 to a requirement when/if a requirement drops support for 2.7. Of course that probably has g-r implications, not sure how that would work. > > Other solution: > One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from Tempest > master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that then > we can discuss this option in more detail. Branching doesn't actually fix any of the harms you have outlined above. It just increases the complexity of maintainence. > > I have given a second thought on this and now ok to drop the py2.7 from Tempest by considering all the above points. > Please reply if any disagreement on this or add if anything I missed to consider. > > NOTE: Tempest includes its plugins also. I don't actually buy this point, a plugin is independently maintained. If the plugin maintainers do not want to support python 3 they don't have to. Just like any other project that has an upstream dep that supports python 2.7 even if they don't support it. -Matt Treinish > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html > [2] https://review.opendev.org/#/c/681203/ > [3] https://opendev.org/openstack/tempest/src/commit/7fdd39c6dbde37bccd419c4037e1e352a5189c5a/requirements.txt -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From aschultz at redhat.com Wed Oct 23 19:36:48 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 23 Oct 2019 13:36:48 -0600 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> Message-ID: On Wed, Oct 23, 2019 at 1:27 PM Matt Riedemann wrote: > On 10/23/2019 2:08 PM, Ghanshyam Mann wrote: > > What happens if we drop py2.7 from Tempest: > > * Users with the above case have the way to install the latest Tempest > on virtual env of py3. or use > > the Tempest tag if they do not need latest Tempest. > > This seems sufficient to me and testing from a tag is what we're doing > upstream in stable/ocata and stable/pike branches anyway - not because > of python version stuff but because of extended maintenance and backward > incompatible changes since those branches which break testing in ocata > and pike with tempest from master. > > > Other solution: > > One way is to cut the Tempest stable branch and keep the py2.7 support > there with eligible backport from Tempest > > master which is py3 only. But I would say, QA team has no bandwidth to > do so. if anyone wants to maintain that then > > we can discuss this option in more detail. > > I would avoid creating a stable branch for tempest if at all possible > since we have valid options to workaround it (above) and I just don't > think we want toy with that idea and the precedent it could set for > relaxing other rules around how tempest is developed. > > My concern is that in tripleo/puppet we currently rely on centos7/python2 as centos8 is still not yet available. So this pretty much means we likely won't be able to run the latest tempest anymore and there goes our validations. 
We could pin to a version (we've had to do that in the past) but I'd be concerned about things that go untested until we can finally get python3 available. I think it might be beneficial to have a py2-em branch similar to what we do when we create -em branches where folks who still have to have python2 wouldn't be completely blocked. Thanks, -Alex > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed Oct 23 19:40:18 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 23 Oct 2019 21:40:18 +0200 Subject: [ptg][neutron][all] Project onboarding session Message-ID: <76E6F289-2352-48B9-BDD4-E12D7F6CBFF2@redhat.com> Hi, As a Neutron team we are planning to do onboarding session on Wednesday morning during the Shanghai PTG. I’m wondering if there are any ways to reach to as much people as it is possible with information when this onboarding will take place and I’m sure that putting it just in PTG etherpad will not be enough for potential newcomers. So I have a question if there are any other ways to announce when exactly such onboarding session will be? And maybe how other teams are announcing that. Thx in advance for any tips on clues :) — Slawek Kaplonski Senior software engineer Red Hat From mriedemos at gmail.com Wed Oct 23 19:40:38 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 23 Oct 2019 14:40:38 -0500 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> Message-ID: <92b8927d-0401-5b76-a67e-6579bc29d3c7@gmail.com> On 10/23/2019 2:36 PM, Alex Schultz wrote: > My concern is that in tripleo/puppet we currently rely on > centos7/python2 as centos8 is still not yet available.  So this pretty > much means we likely won't be able to run the latest tempest anymore and > there goes our validations. How much stuff do you think is going to land in tempest that is going to be useful validation between the time you could use a tagged version of tempest before py27 support is dropped and when you can roll to centos8? I can't imagine it's much and you're still going to get all of the interop style smoke tests we already have. Also, you can't run tempest from a py3 virtual environment or container? -- Thanks, Matt From aschultz at redhat.com Wed Oct 23 20:06:04 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 23 Oct 2019 14:06:04 -0600 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <92b8927d-0401-5b76-a67e-6579bc29d3c7@gmail.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> <92b8927d-0401-5b76-a67e-6579bc29d3c7@gmail.com> Message-ID: On Wed, Oct 23, 2019 at 1:40 PM Matt Riedemann wrote: > On 10/23/2019 2:36 PM, Alex Schultz wrote: > > My concern is that in tripleo/puppet we currently rely on > > centos7/python2 as centos8 is still not yet available. So this pretty > > much means we likely won't be able to run the latest tempest anymore and > > there goes our validations. > > How much stuff do you think is going to land in tempest that is going to > be useful validation between the time you could use a tagged version of > tempest before py27 support is dropped and when you can roll to centos8? 
> I can't imagine it's much and you're still going to get all of the > interop style smoke tests we already have. > > It's probably not going to cause problems, but it's not like we haven't had issues in previous cycles. The concern is really we don't currently have an ETA when RDO will be fully up on CentOS8. So it could be weeks or months. We're hoping for sooner rather than later but once py2 support is officially dropped, who knows what folks might want to try and start working on. > Also, you can't run tempest from a py3 virtual environment or container? > puppet no, tripleo maybe. It's already a container in tripleo and we do have a rhel8 container but the issue is reallly packaging for both. > > -- > > Thanks, > > Matt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Wed Oct 23 20:07:21 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 23 Oct 2019 13:07:21 -0700 Subject: =?UTF-8?Q?Re:_[qa]_[all]_Opinion_on_dropping_the_py2.7_support_from_Temp?= =?UTF-8?Q?est_&_Tempest_plugin?= In-Reply-To: References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> Message-ID: On Wed, Oct 23, 2019, at 12:36 PM, Alex Schultz wrote: > > > On Wed, Oct 23, 2019 at 1:27 PM Matt Riedemann wrote: > > On 10/23/2019 2:08 PM, Ghanshyam Mann wrote: > > > What happens if we drop py2.7 from Tempest: > > > * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use > > > the Tempest tag if they do not need latest Tempest. > > > > This seems sufficient to me and testing from a tag is what we're doing > > upstream in stable/ocata and stable/pike branches anyway - not because > > of python version stuff but because of extended maintenance and backward > > incompatible changes since those branches which break testing in ocata > > and pike with tempest from master. > > > > > Other solution: > > > One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from Tempest > > > master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that then > > > we can discuss this option in more detail. > > > > I would avoid creating a stable branch for tempest if at all possible > > since we have valid options to workaround it (above) and I just don't > > think we want toy with that idea and the precedent it could set for > > relaxing other rules around how tempest is developed. > > > > My concern is that in tripleo/puppet we currently rely on > centos7/python2 as centos8 is still not yet available. So this pretty > much means we likely won't be able to run the latest tempest anymore > and there goes our validations. We could pin to a version (we've had to > do that in the past) but I'd be concerned about things that go untested > until we can finally get python3 available. I think it might be > beneficial to have a py2-em branch similar to what we do when we create > -em branches where folks who still have to have python2 wouldn't be > completely blocked. The infra team has centos-8 images available now. Another option is to run tempest in a container to host python3. That should work on CentOS 7. 
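The venv variant of that is just the usual few commands; as a rough sketch (the same steps you would type in a shell, expressed with the stdlib, assuming a python3 interpreter is already on the CentOS 7 host and a tempest workspace/tempest.conf already exists; the path is a placeholder):

    import subprocess

    venv = '/opt/tempest-py3'   # placeholder path
    subprocess.check_call(['python3', '-m', 'venv', venv])
    subprocess.check_call([venv + '/bin/pip', 'install', '-U', 'pip', 'tempest'])
    # tempest now runs under python3 regardless of what the rest of the host uses;
    # this assumes an existing workspace/tempest.conf, e.g. created with 'tempest init'
    subprocess.check_call([venv + '/bin/tempest', 'run', '--smoke'])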
> > Thanks, > -Alex From smooney at redhat.com Wed Oct 23 20:56:38 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 23 Oct 2019 21:56:38 +0100 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> Message-ID: <17370b775b98d7bd9837ea160e23147f85d2b919.camel@redhat.com> On Wed, 2019-10-23 at 13:07 -0700, Clark Boylan wrote: > On Wed, Oct 23, 2019, at 12:36 PM, Alex Schultz wrote: > > > > > > On Wed, Oct 23, 2019 at 1:27 PM Matt Riedemann wrote: > > > On 10/23/2019 2:08 PM, Ghanshyam Mann wrote: > > > > What happens if we drop py2.7 from Tempest: > > > > * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use > > > > the Tempest tag if they do not need latest Tempest. > > > > > > This seems sufficient to me and testing from a tag is what we're doing > > > upstream in stable/ocata and stable/pike branches anyway - not because > > > of python version stuff but because of extended maintenance and backward > > > incompatible changes since those branches which break testing in ocata > > > and pike with tempest from master. > > > > > > > Other solution: > > > > One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from > > > Tempest > > > > master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that > > > then > > > > we can discuss this option in more detail. > > > > > > I would avoid creating a stable branch for tempest if at all possible > > > since we have valid options to workaround it (above) and I just don't > > > think we want toy with that idea and the precedent it could set for > > > relaxing other rules around how tempest is developed. > > > > > > > My concern is that in tripleo/puppet we currently rely on > > centos7/python2 as centos8 is still not yet available. So this pretty > > much means we likely won't be able to run the latest tempest anymore > > and there goes our validations. We could pin to a version (we've had to > > do that in the past) but I'd be concerned about things that go untested > > until we can finally get python3 available. I think it might be > > beneficial to have a py2-em branch similar to what we do when we create > > -em branches where folks who still have to have python2 wouldn't be > > completely blocked. > > The infra team has centos-8 images available now. Another option is to run tempest in a container to host python3. > That should work on CentOS 7. you also can install py36 on centos 7 form https://ius.io/ the "inline with upstream stable" repos https://github.com/iusrepo/python36 and its also available in eple so you can do python 3 testing fine on centos 7 if you need too. by the way when i downloaded the centos-8 image form infra glean did not automaticaly pick up my ssh keys form the openstack metadata serivce or config drive i assuem that is fixed/works in the gate? 
> > > > > Thanks, > > -Alex > > From fungi at yuggoth.org Wed Oct 23 21:05:05 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 23 Oct 2019 21:05:05 +0000 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> Message-ID: <20191023210505.hvwut4telzlhi2pa@yuggoth.org> On 2019-10-23 13:07:21 -0700 (-0700), Clark Boylan wrote: > On Wed, Oct 23, 2019, at 12:36 PM, Alex Schultz wrote: [...] > > My concern is that in tripleo/puppet we currently rely on > > centos7/python2 as centos8 is still not yet available. [...] > The infra team has centos-8 images available now. [...] That was also the first thing which jumped to mind for me, but in another E-mail in the thread Alex clarifies that what's not yet available isn't CentOS 8, but rather a working build of RDO for CentOS 8 (and TripleO is dependent on RDO's packages). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From cboylan at sapwetik.org Wed Oct 23 21:06:06 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 23 Oct 2019 14:06:06 -0700 Subject: =?UTF-8?Q?Re:_[qa]_[all]_Opinion_on_dropping_the_py2.7_support_from_Temp?= =?UTF-8?Q?est_&_Tempest_plugin?= In-Reply-To: <17370b775b98d7bd9837ea160e23147f85d2b919.camel@redhat.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> <17370b775b98d7bd9837ea160e23147f85d2b919.camel@redhat.com> Message-ID: <2a93dc97-9150-4336-abcf-9295429cf565@www.fastmail.com> On Wed, Oct 23, 2019, at 1:56 PM, Sean Mooney wrote: > by the way when i downloaded the centos-8 image form infra glean > did not automaticaly pick up my ssh keys form the openstack metadata serivce > or config drive i assuem that is fixed/works in the gate? Glean does not support metadata service it only works with the config drive. It must be working in the gate because the images boot and I am able to ssh in. Note that glean configures the ssh key on the 'root' user and not 'centos' or 'ubuntu' or 'fedora'. Clark From rosmaita.fossdev at gmail.com Wed Oct 23 21:30:56 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 23 Oct 2019 17:30:56 -0400 Subject: [cinder] changing the weekly meeting time Message-ID: (Just to be completely clear -- we're only gathering information at this point. The Cinder weekly meeting is still Wednesdays at 16:00 UTC.) As we discussed at today's meeting [0], a request has been made to hold the weekly meeting earlier so that it would be friendlier for people in Asia time zones. Based on the people in attendance today, it seems that a move to 14:00 UTC is not out of the question. Thus, the point of this email is to solicit comments on whether we should change the meeting time to 14:00 UTC. As you consider the impact on yourself, if you are in a TZ that observes Daylight Savings Time, keep in mind that most TZs go back to standard time over the next few weeks. (I was going to insert an opinion here, but I will wait and respond in this thread like everyone else.) 
cheers, brian [0] http://eavesdrop.openstack.org/meetings/cinder/2019/cinder.2019-10-23-16.00.log.html#l-166 From juliaashleykreger at gmail.com Wed Oct 23 22:15:23 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 23 Oct 2019 15:15:23 -0700 Subject: [ironic]: Timeout reached while waiting for callback for node In-Reply-To: <413811384.587474.1571773550206@mail.yahoo.com> References: <413811384.587474.1571773550206.ref@mail.yahoo.com> <413811384.587474.1571773550206@mail.yahoo.com> Message-ID: Greetings Fred! Reply in-line. On Tue, Oct 22, 2019 at 12:47 PM fsbiz at yahoo.com wrote: [trim] > > > TFTP logs: shows TFTP client timed out (weird). Any pointers here? > Sadly this is one of those things that comes with using TFTP. Issues like this is why the community tends to recommend using ipxe.efi to chainload as you can perform transport over TCP as opposed to UDP where in something might happen mid-transport. > tftpd shows ramdisk_deployed completed. Then, it reports that the client > timed out. > Grub does tend to be very abrupt and not wrap up very final actions. I suspect it may just never be sending the ack back and the transfer may be completing. I'm afraid this is one of those things you really need to see on the console what is going on. My guess would be that your deploy_ramdisk lost a packet in transfer or that it was corrupted in transport. It would be interesting to know if the network card stack is performing checksum validation, but for IPv4 it is optional. [trim] > > This has me stumped here. This exact failure seems to be happening 3 to 4 > times a week on different nodes. > Any pointers appreciated. > > thanks, > Fred. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Oct 24 01:47:29 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 23 Oct 2019 20:47:29 -0500 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> <92b8927d-0401-5b76-a67e-6579bc29d3c7@gmail.com> Message-ID: <16dfb7284b1.117caa3f6139590.3561420008929970262@ghanshyammann.com> ---- On Wed, 23 Oct 2019 15:06:04 -0500 Alex Schultz wrote ---- > > > On Wed, Oct 23, 2019 at 1:40 PM Matt Riedemann wrote: > On 10/23/2019 2:36 PM, Alex Schultz wrote: > > My concern is that in tripleo/puppet we currently rely on > > centos7/python2 as centos8 is still not yet available. So this pretty > > much means we likely won't be able to run the latest tempest anymore and > > there goes our validations. > > How much stuff do you think is going to land in tempest that is going to > be useful validation between the time you could use a tagged version of > tempest before py27 support is dropped and when you can roll to centos8? > I can't imagine it's much and you're still going to get all of the > interop style smoke tests we already have. > > > It's probably not going to cause problems, but it's not like we haven't had issues in previous cycles. The concern is really we don't currently have an ETA when RDO will be fully up on CentOS8. So it could be weeks or months. We're hoping for sooner rather than later but once py2 support is officially dropped, who knows what folks might want to try and start working on. Also, you can't run tempest from a py3 virtual environment or container? A month or so is all fine. 
Dropping py2 from Tempest is going to be the last in phase-2 (Ussuri milestone-2) (final schedule to be discussed tomorrow). - https://etherpad.openstack.org/p/drop-python2-support -gmann > > puppet no, tripleo maybe. It's already a container in tripleo and we do have a rhel8 container but the issue is reallly packaging for both. > -- > > Thanks, > > Matt >
From gmann at ghanshyammann.com Thu Oct 24 01:58:30 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 23 Oct 2019 20:58:30 -0500 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <20191023193430.GA7095@sinanju.localdomain> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <20191023193430.GA7095@sinanju.localdomain> Message-ID: <16dfb7c9ab2.d2d5efd3139625.5923213009987724274@ghanshyammann.com>
---- On Wed, 23 Oct 2019 14:34:30 -0500 Matthew Treinish wrote ---- > On Wed, Oct 23, 2019 at 02:08:38PM -0500, Ghanshyam Mann wrote: > > Hello Everyone, > > > > We are in Ussuri development cycle which is planned to drop the py2.7 support[1]. > > > > I was holding and planning to support the py2.7 in Tempest due to its branchless model[2]. My main point > > to keep py2 support is if any users running the Tempest on py27 env cloud and not in virtual env then they > > can keep running in the same way. But this might be just me overthinking on this usage. > > > > What happens if we drop py2.7 from Tempest: > > * Users with the above case have the way to install the latest Tempest on virtual env of py3. or use > > the Tempest tag if they do not need latest Tempest. > > * No change in users using Tempest on py3 or in py3 virtual env. > > * Upstream testing of master or stable branch is no issue as we install the Tempest in virtual env. > > Tempest in py3 venv can always test the py2.7 jobs. > > * other than that no change in Tempest usage or at least it would not break anything. > > I have made this exact argument before about venvs and tempest being not > actually part of a cloud installation (just in other contexts). In my > experience most people don't actually agree with it for whatever reason > (I assume because the installer/deployment projects treat tempest as the > same thing as other openstack projects). But I still feel that way and > even if some people don't agree there are still releases that support > python 2.7. > > In general I'm in favor of just doing this though, as long as we didn't > already say that we'd continue supporting 2.7 somewhere in tempest > until the Train EM date. If we did advertise that anywhere then we'll > have to provide a deprecation period for those people who could have > latched onto that.
I do not think we have said or documented anywhere that 2.7 is supported till Train EM. But sending a warning about the py2.7 drop now would be a good idea to notify people, because we will be dropping the support during milestone-2, which is in mid-February.
> > > > > Why we cannot keep the support: > > There is no big cost of supporting the py2.7 in Tempest itself but that require Temepst dependency > > (OpenStack lib like Oslo and non-openstack maintained lib [3]) to keep supporting the py2.7 which is not > > feasible. > > In my experience the list of requirements for tempest is not crazy long and > most of them don't have big api divergance (or at least how tempest uses > it). I'd almost say just set a version cap with python_version==2.7 to a > requirement when/if a requirement drops support for 2.7.
Of course that > probably has g-r implications, not sure how that would work. Yeah, we could do that if someone strongly says or convience us not to drop the py2.7 from Tempest. > > > > > Other solution: > > One way is to cut the Tempest stable branch and keep the py2.7 support there with eligible backport from Tempest > > master which is py3 only. But I would say, QA team has no bandwidth to do so. if anyone wants to maintain that then > > we can discuss this option in more detail. > > Branching doesn't actually fix any of the harms you have outlined above. It just > increases the complexity of maintainence. > > > > > I have given a second thought on this and now ok to drop the py2.7 from Tempest by considering all the above points. > > Please reply if any disagreement on this or add if anything I missed to consider. > > > > NOTE: Tempest includes its plugins also. > > I don't actually buy this point, a plugin is independently maintained. If the > plugin maintainers do not want to support python 3 they don't have to. Just like > any other project that has an upstream dep that supports python 2.7 even if they > don't support it. Agree. I added plugin also in this scope because most of the plugins wait for guidelines from the Tempest side and they use lot of interfaces from Tempest. They can drop py2.7 independent of Tempest but keeping py2.7 support is not possible if Tempest drop. -gmann > > -Matt Treinish > > > > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html > > [2] https://review.opendev.org/#/c/681203/ > > [3] https://opendev.org/openstack/tempest/src/commit/7fdd39c6dbde37bccd419c4037e1e352a5189c5a/requirements.txt > From gouthampravi at gmail.com Thu Oct 24 02:08:21 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 23 Oct 2019 19:08:21 -0700 Subject: [manila] Remote-only PTG poll Message-ID: Hey Zorillas* and interested Stackers, (* manila contributors in the wild) A part of our community will be attending the Summit+Forum+PTG in Shanghai in a couple of weeks. However, we haven't figured out a way to enable remote participation during our PTG (Wednesday, 6th Nov 2019 between 9:00 AM and 4:30 PM) there. So, as discussed during our weekly meeting, I'd like to take a poll of when you would like to gather for a remote PTG that would supplement the gathering in Shanghai. Please take this poll and indicate your availability: https://xoyondo.com/dp/qQbsoHtNhi4DCki The prospective date for this meeting is between 2nd Dec and 13th Dec (Ussuri-1 Milestone is on 12th Dec). The meeting time will be 15:00 UTC to 22:00 UTC. This time may not work for our contributors in the Far East (APAC), but I'm hoping we'll address them primarily at the Shanghai get together, and relay their thoughts here. We will also be recording the entire meeting and will share notes on these lists. Thanks, Goutham -------------- next part -------------- An HTML attachment was scrubbed... URL: From feilong at catalyst.net.nz Thu Oct 24 05:56:36 2019 From: feilong at catalyst.net.nz (feilong) Date: Thu, 24 Oct 2019 18:56:36 +1300 Subject: [magnum][kubernetes] Train release updates Message-ID: <65abb275-fc1b-e10a-1deb-fe4815b73f1f@catalyst.net.nz> Hi all, Glad to see the Train just released. I'd like to take this opportunity to highlight some cool features Magnum team have done in this cycle. Please feel free to pop up at #openstack-containers for any feedback. Thank you. 1. 
Fedora CoreOS driver Finally we get this done in this cycle which is quite important and urgent. Because the Fedora Atomic will be end of life soon (at the end of this Nov?). Now with stable/train, user can create k8s cluster based on Fedora CoreOS image. We're welcome for any feedback about this. 2. Node group This is a quite large feature, thanks for the great work from CERN team and StackHPC. With this feature, user can create different node groups with different specs. 3. Rolling upgrade Now user can do rolling upgrade for both k8s version and the node operating system without downtime, it's a quite important feature. 4. Auto healing With auto healing feature, a small service will be deployed on k8s cluster to monitor the health status of the cluster and replace broken node when failure detected. 5. Volume based k8s nodes Now user can create a Kubernetes cluster with nodes based on volume and the volume type is configurable. User can even set the volume type of etcd volume which is very useful for cloud providers who want to use SSD for etcd and k8s nodes. 6. Private cluster As a security best practice, isolating Kubernetes clusters from Internet access is one of the most desired features for enterprise users. Now user has the flexibility to create a private cluster by default, only allow the API exposed on Internet or fully opened. -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ From Tim.Bell at cern.ch Thu Oct 24 06:11:46 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Thu, 24 Oct 2019 06:11:46 +0000 Subject: [magnum][kubernetes] Train release updates In-Reply-To: <65abb275-fc1b-e10a-1deb-fe4815b73f1f@catalyst.net.nz> References: <65abb275-fc1b-e10a-1deb-fe4815b73f1f@catalyst.net.nz> Message-ID: Feilong, Great to see all of these improvements getting into the release. Would it be possible to add these to the release highlights documentation at https://releases.openstack.org/train/highlights.html ? Tim On 24 Oct 2019, at 07:56, feilong > wrote: Hi all, Glad to see the Train just released. I'd like to take this opportunity to highlight some cool features Magnum team have done in this cycle. Please feel free to pop up at #openstack-containers for any feedback. Thank you. 1. Fedora CoreOS driver Finally we get this done in this cycle which is quite important and urgent. Because the Fedora Atomic will be end of life soon (at the end of this Nov?). Now with stable/train, user can create k8s cluster based on Fedora CoreOS image. We're welcome for any feedback about this. 2. Node group This is a quite large feature, thanks for the great work from CERN team and StackHPC. With this feature, user can create different node groups with different specs. 3. Rolling upgrade Now user can do rolling upgrade for both k8s version and the node operating system without downtime, it's a quite important feature. 4. Auto healing With auto healing feature, a small service will be deployed on k8s cluster to monitor the health status of the cluster and replace broken node when failure detected. 5. Volume based k8s nodes Now user can create a Kubernetes cluster with nodes based on volume and the volume type is configurable. 
User can even set the volume type of etcd volume which is very useful for cloud providers who want to use SSD for etcd and k8s nodes. 6. Private cluster As a security best practice, isolating Kubernetes clusters from Internet access is one of the most desired features for enterprise users. Now user has the flexibility to create a private cluster by default, only allow the API exposed on Internet or fully opened. -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremyfreudberg at gmail.com Thu Oct 24 06:49:57 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Thu, 24 Oct 2019 08:49:57 +0200 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: Hi, I have added horizon-core to sahara-dashboard-core. Thanks for your help. Best, Jeremy On Wed, Oct 23, 2019 at 2:44 PM Ivan Kolodyazhny wrote: > > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon team is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. > > > Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. > > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From feilong at catalyst.net.nz Thu Oct 24 08:27:23 2019 From: feilong at catalyst.net.nz (feilong) Date: Thu, 24 Oct 2019 21:27:23 +1300 Subject: [magnum][kubernetes] Train release updates In-Reply-To: References: <65abb275-fc1b-e10a-1deb-fe4815b73f1f@catalyst.net.nz> Message-ID: <17cf6441-008f-710e-7c0c-3cebbc099096@catalyst.net.nz> Thanks Tim. That's good point. Will do when we release the Magnum 9.1.0 version. Cheers. 
On 24/10/19 7:11 PM, Tim Bell wrote: > Feilong, > > Great to see all of these improvements getting into the release. > > Would it be possible to add these to the release highlights > documentation at https://releases.openstack.org/train/highlights.html ? > > Tim > >> On 24 Oct 2019, at 07:56, feilong > > wrote: >> >> Hi all, >> >> Glad to see the Train just released. I'd like to take this opportunity >> to highlight some cool features Magnum team have done in this cycle. >> Please feel free to pop up at #openstack-containers for any feedback. >> Thank you. >> >> 1. Fedora CoreOS driver >> >> Finally we get this done in this cycle which is quite important and >> urgent. Because the Fedora Atomic will be end of life soon (at the end >> of this Nov?). Now with stable/train, user can create k8s cluster based >> on Fedora CoreOS image. We're welcome for any feedback about this. >> >> 2. Node group >> >> This is a quite large feature, thanks for the great work from CERN team >> and StackHPC. With this feature, user can create different node groups >> with different specs. >> >> 3. Rolling upgrade >> >> Now user can do rolling upgrade for both k8s version and the node >> operating system without downtime, it's a quite important feature. >> >> 4. Auto healing >> >> With auto healing feature, a small service will be deployed on k8s >> cluster to monitor the health status of the cluster and replace broken >> node when failure detected. >> >> 5. Volume based k8s nodes >> >> Now user can create a Kubernetes cluster with nodes based on volume and >> the volume type is configurable. User can even set the volume type of >> etcd volume which is very useful for cloud providers who want to use SSD >> for etcd and k8s nodes. >> >> 6. Private cluster >> >> As a security best practice, isolating Kubernetes clusters from Internet >> access is one of the most desired features for enterprise users. Now >> user has the flexibility to create a private cluster by default, only >> allow the API exposed on Internet or fully opened. >> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> ------------------------------------------------------ >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> ------------------------------------------------------ >> >> > -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Thu Oct 24 09:06:45 2019 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 24 Oct 2019 09:06:45 +0000 Subject: [ops] nova wsgi config In-Reply-To: <659657f1-89ba-63b6-f2dc-6d8c42430d08@goirand.fr> References: <20191022101943.GG14827@sync> <659657f1-89ba-63b6-f2dc-6d8c42430d08@goirand.fr> Message-ID: <20191024090645.GH14827@sync> Hey Thomas, Thank you for your example. If I understand well, you are using 4 processes in the uwsgi config. I dont see any number of thread, does it mean the uwsgi is not spawning threads but only processes? ( so there is only 1 thread per process?) 
Thanks, -- Arnaud Morin On 23.10.19 - 13:45, Thomas Goirand wrote: > On 10/22/19 12:19 PM, Arnaud Morin wrote: > > Hey all, > > > > I am trying to configure apache as a WSGI. > > Is there any other documentation than this: > > https://docs.openstack.org/nova/stein/user/wsgi.html > > > > Is there any recommendations? > > > > Thanks in advance! > > Hi Arnaud, > > If you wish, you can have a look at the nova-api package in Debian, > which by default, is set to use uwsgi for both the compute and the > metadata API. This consist of a sysv-rc / systemd startup script, plus a > configuration file. Note that this system also has support over SSL if > you just drop the certificate + key in the right folder. > > Nearly all API package in Debian are configured this way. If it's not > the case, it means that either we didn't have time to do the switch yet > (which is kind of rare, I hope), or there's no such support upstream for > running under a wsgi server. > > Cheers, > > Thomas Goirand (zigo) From arnaud.morin at gmail.com Thu Oct 24 09:11:43 2019 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 24 Oct 2019 09:11:43 +0000 Subject: [ops] nova wsgi config In-Reply-To: References: <20191022101943.GG14827@sync> Message-ID: <20191024091143.GI14827@sync> Hi Chris, That's helping a lot, thanks for all of this! Regards, -- Arnaud Morin On 23.10.19 - 16:26, Chris Dent wrote: > On Tue, 22 Oct 2019, Arnaud Morin wrote: > > > I am trying to configure apache as a WSGI. > > Is there any other documentation than this: > > https://docs.openstack.org/nova/stein/user/wsgi.html > > > > Is there any recommendations? > > There are a lot of options, and which you use can mostly come down > to personal preference and other aspects of your environment. For > example, if you're running in a kubernetes environment, using apache > at all can be overkill: the nova-api container(s) can expose an http > service which are reached through the ingress. Adding apache in that > environment would mean you had proxy -> proxy/apache -> service. If > you're trying save some space, that's overkill. > > However, if what you want is some kind of combination where apache > is in front of the nova-api you have three basic options: > > * Use apache plus mod_proxy to talk to the eventlet driven > `nova-api` process. > > * Use apache plus mod_wsgi to talk to the `nova-api-wsgi` application, > probably using WSGIDaemonProcess. > > * Use apache plus mod_proxy_uwsgi to talk to the `nova-api-wsgi` > application, itself being run by uwsgi, where the uwsgi process > is started and managed by something like systemd or uwsgi emperor > mode. > > If you use either of the latter two you need to be aware of a > potential issue with eventlet as described in the release notes for > stein: https://docs.openstack.org/releasenotes/nova/stein.html#known-issues > > There's some boilerplate documentation for using mod wsgi and uwsgi for > various projects. Here's the one for zun: > > https://docs.openstack.org/zun/train/contributor/mod-wsgi.html > > There's some documentation in placement which has links to the > changes that added placement in devstack, first using mod_wsgi and > then using uwsgi: > > https://docs.openstack.org/placement/latest/install/ > > That can be a useful guide, just remember to replace placement names > with the corresponding nova names. Where placement uses > `placement-api`, nova wants `nova-api-wsgi`. > > There are many options for how to do this, so there's no > straightforward cookiecutter answer. 
The important thing to remember > is that `nova-api-wsgi` is a standard WSGI application and there are > all kinds of resources on the web for how to host a WSGI application > on a variety of web servers in various ways. Things you learn about > handling a WSGI application of one sort can be transferred to others > (with the important caveat about nova and eventlet described above). > > My current way for doing this kind of thing is to run uwsgi in a > container and then have a proxy talk to that container. See > https://github.com/cdent/placedock for how I've done this with > Placement. If there's no container involved, I simply run uwsgi > standalone. > > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent From tony at bakeyournoodle.com Thu Oct 24 09:26:07 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 24 Oct 2019 11:26:07 +0200 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> <20191018160609.eadalm2qwwpjsigc@mthode.org> <20191018232110.GH8065@thor.bakeyournoodle.com> <20191019012824.GJ8065@thor.bakeyournoodle.com> Message-ID: <20191024092607.GA27123@thor.bakeyournoodle.com> On Mon, Oct 21, 2019 at 10:43:39AM -0400, Jim Rollenhagen wrote: > If we need to update sushy in UCA anyway, sounds like we could probably go > straight to 2.0.0 and then go ahead and do the requirements update dance > mentioned upthread? Yup totally. I haven't seen anything from James/Ubuntu that indicates they're cool with this. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From smooney at redhat.com Thu Oct 24 09:44:31 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 24 Oct 2019 10:44:31 +0100 Subject: [qa] [all] Opinion on dropping the py2.7 support from Tempest & Tempest plugin In-Reply-To: <2a93dc97-9150-4336-abcf-9295429cf565@www.fastmail.com> References: <16dfa055dc6.f0715085135906.5868151027614853213@ghanshyammann.com> <47b991f0-d32f-19b2-4ea0-5ce46c2f1227@gmail.com> <17370b775b98d7bd9837ea160e23147f85d2b919.camel@redhat.com> <2a93dc97-9150-4336-abcf-9295429cf565@www.fastmail.com> Message-ID: On Wed, 2019-10-23 at 14:06 -0700, Clark Boylan wrote: > On Wed, Oct 23, 2019, at 1:56 PM, Sean Mooney wrote: > > by the way when i downloaded the centos-8 image form infra glean > > did not automaticaly pick up my ssh keys form the openstack metadata serivce > > or config drive i assuem that is fixed/works in the gate? > > Glean does not support metadata service it only works with the config drive. It must be working in the gate because > the images boot and I am able to ssh in. Note that glean configures the ssh key on the 'root' user and not 'centos' or > 'ubuntu' or 'fedora'. yep that was the issue. if i enable config drive and log in as root it works. the only other issue i have found is the centos-8 vms are not getting ipv6 addresss but my ubunut ones are but that is something to look into after. thanks for the tip on root login. > > Clark > From witold.bedyk at suse.com Thu Oct 24 10:45:25 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 24 Oct 2019 12:45:25 +0200 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: <7c7f4086-096b-5fbd-29eb-61812763af1a@suse.com> Hi Ivan and Horizon team, thanks a lot for your help. It's greatly appreciated. 
I've submitted a change[1] granting +2 permissions for Monasca UI. [1] https://review.opendev.org/690918 Best greetings Witek On 10/23/19 2:41 PM, Ivan Kolodyazhny wrote: > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. > Unfortunately, not all of them are in active development due to the lack > of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our > best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We > raised this topic during the last Horizon weekly meeting [2] and we'll > have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon team > is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the > horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the > heat-dashboard-core team. Also, we've got +2 on other plugins via global > project config [4] and via Gerrit configuration for > (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last > meeting [5]. > > > Of course, it's up to each project to maintain horizon plugins and it's > responsibilities but I would like to raise this topic to the TC too. I > really sure, that it will speed up some critical fixes for Horizon > plugins and makes users and operators experience better. > > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] > http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] > http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From mbooth at redhat.com Thu Oct 24 13:11:07 2019 From: mbooth at redhat.com (Matthew Booth) Date: Thu, 24 Oct 2019 14:11:07 +0100 Subject: [nova] Is config drive data stored in ephemeral storage? In-Reply-To: References: Message-ID: On Tue, 22 Oct 2019 at 21:59, Jean-Philippe Méthot wrote: > > Hi, > > We currently use Ceph for storage on our Openstack cluster. We set up a pool to use for ephemeral nova storage, but we never actually use it, as we prefer to use cinder block devices. However, we notice that objects are being created inside that pool. Could it be that when you use config drives, the actual config drive is created inside nova’s ephemeral storage instead of on the compute node’s disk? Yes. Matt -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From mnaser at vexxhost.com Thu Oct 24 13:45:35 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 24 Oct 2019 09:45:35 -0400 Subject: [openstack-ansible] core updates Message-ID: Hi everyone, I'd like to propose the addition of the following 2 new core members: - Georgina Shippey (BBC R&D, committed continuous contributor and operator) - James Denton (long time contributor, extremely knowledgeable in OSA) If no one opposes to this, I will be adding them to our core list shortly. Thanks, Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
https://vexxhost.com From amy at demarco.com Thu Oct 24 13:52:39 2019 From: amy at demarco.com (Amy) Date: Thu, 24 Oct 2019 08:52:39 -0500 Subject: [openstack-ansible] core updates In-Reply-To: References: Message-ID: <83143D4D-3685-4F2B-8E67-E51834473C19@demarco.com> +2 welcome aboard Amy (spotz) > On Oct 24, 2019, at 8:47 AM, Mohammed Naser wrote: > > Hi everyone, > > I'd like to propose the addition of the following 2 new core members: > > - Georgina Shippey (BBC R&D, committed continuous contributor and operator) > - James Denton (long time contributor, extremely knowledgeable in OSA) > > If no one opposes to this, I will be adding them to our core list shortly. > > Thanks, > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. https://vexxhost.com > From gsteinmuller at vexxhost.com Thu Oct 24 14:02:24 2019 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Thu, 24 Oct 2019 11:02:24 -0300 Subject: [openstack-ansible] core updates In-Reply-To: <83143D4D-3685-4F2B-8E67-E51834473C19@demarco.com> References: <83143D4D-3685-4F2B-8E67-E51834473C19@demarco.com> Message-ID: ++ On Thu, Oct 24, 2019 at 10:55 AM Amy wrote: > +2 welcome aboard > > Amy (spotz) > > > On Oct 24, 2019, at 8:47 AM, Mohammed Naser wrote: > > > > Hi everyone, > > > > I'd like to propose the addition of the following 2 new core members: > > > > - Georgina Shippey (BBC R&D, committed continuous contributor and > operator) > > - James Denton (long time contributor, extremely knowledgeable in OSA) > > > > If no one opposes to this, I will be adding them to our core list > shortly. > > > > Thanks, > > Mohammed > > > > -- > > Mohammed Naser — vexxhost > > ----------------------------------------------------- > > D. 514-316-8872 > > D. 800-910-1726 ext. 200 > > E. mnaser at vexxhost.com > > W. https://vexxhost.com > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Thu Oct 24 14:16:30 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 24 Oct 2019 16:16:30 +0200 Subject: [resource-management-sig] Status of the "Resource Management" SIG Message-ID: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> Hi, I was wondering about the status of the Resource Management SIG... It's been "forming" according to https://wiki.openstack.org/wiki/Res_Mgmt_SIG since January 2018... And I could'nt find a reference or log to any meeting after that. Does anyone have updated status on this one? Should it be removed from the list of active SIGs at https://governance.openstack.org/sigs/ ? -- Thierry Carrez (ttx) From gmann at ghanshyammann.com Thu Oct 24 14:55:58 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 24 Oct 2019 09:55:58 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> Message-ID: <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> Just a reminder, discussion for this is going to start in #openstack-tc channel in another 5 min. -gmann ---- On Tue, 15 Oct 2019 13:18:03 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > Python 2.7 is going to retire in Jan 2020 [1] and we planned to drop the python 2 support from OpenStack > during the start of the Ussuri cycle[2]. 
> > Time has come now to start the planning on dropping the Python2. It needs to be coordinated among various > Projects, libraries, vendors driver, third party CI and testing frameworks. > > * Preparation for the Plan & Schedule: > > Etherpad: https://etherpad.openstack.org/p/drop-python2-support > > We discussed it in TC to come up with the plan, execute it smoothly and avoid breaking any dependent projects. > I have prepared an etherpad[3](mentioned above also) to capture all the points related to this topic and most importantly > the draft schedule about who can drop the support and when. The schedule is in the draft state and not final yet. > The most important points are if you are dropping the support then all your consumers (OpenStack Projects, Vendors drivers etc) > are ready for that. For example, oslo, os-bricks, client lib, testing framework projects will keep the python2 support until we make > sure all the consumers of those projects do not require py2 support. If anyone require then how long they can support py2. > These libraries, testing frameworks will be the last one to drop py2. > > We have planned to have a dedicated discussion in TC office hours on the 24th Thursday #openstack-tc channel. We will > discuss what all need to be done and the schedules. > > You do not have to drop it immediately and keep eyes on this ML thread till we get the consensus on the > community-level plan and schedule. > > Meanwhile, you can always start pre-planning for your projects, for example, stephenfin has started for Nova[4] to > migrate the third party CI etc. Cinder has coordinated with all vendor drivers & their CI to migrate from py2 to py3. > > * Projects want to keep the py2 support? > There is no mandate that projects have to drop the py2 support right now. If you want to keep the support then key things > to discuss are what all you need and does all your dependent projects/libs provide the support of py2. This is something needs to be > discussed case by case. If any project wants to keep the support, add that in the etherpad with a brief reason which will > be helpful to discuss the need and feasibility. > > Feel free to provide feedback or add the missing point on the etherpad. Do not forget to attend the 24th Oct 2019, TC > office hour on Thursday at 1500 UTC in #openstack-tc. > > > [1] https://pythonclock.org/ > [2] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > [3] https://etherpad.openstack.org/p/drop-python2-support > [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html > > -gmann > > > > From elmiko at redhat.com Thu Oct 24 14:21:21 2019 From: elmiko at redhat.com (Michael McCune) Date: Thu, 24 Oct 2019 10:21:21 -0400 Subject: [api sig] office hour cancelled for today Message-ID: due to schedule conflicts we are short on staff for the office hour today and it will be cancelled. peace o/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mthode at mthode.org Thu Oct 24 15:53:44 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 24 Oct 2019 10:53:44 -0500 Subject: [openstack-ansible] core updates In-Reply-To: References: Message-ID: <20191024155344.7xu2onz77lrqgf6u@mthode.org> On 19-10-24 09:45:35, Mohammed Naser wrote: > Hi everyone, > > I'd like to propose the addition of the following 2 new core members: > > - Georgina Shippey (BBC R&D, committed continuous contributor and operator) > - James Denton (long time contributor, extremely knowledgeable in OSA) > > If no one opposes to this, I will be adding them to our core list shortly. > Awesome, welcome. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From gouthampravi at gmail.com Thu Oct 24 16:17:56 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Thu, 24 Oct 2019 09:17:56 -0700 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: On Wed, Oct 23, 2019 at 5:46 AM Ivan Kolodyazhny wrote: > > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon team is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. +1, the horizon team has been extremely helpful in reviewing, finding, fixing issues in manila-ui. We certainly appreciate the proactive approach you folks take, Ivan! It's invaluable to teams that have lesser number of core reviewers/contributors who are familiar with the framework or the dependencies. If anyone needs a testimony, we've had the horizon core team have access to merge changes in manila-ui for several releases now, and they've been nothing but responsible and responsive! > > We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. > > > Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. 
> > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From kristi at nikolla.me Thu Oct 24 16:28:52 2019 From: kristi at nikolla.me (Kristi Nikolla) Date: Thu, 24 Oct 2019 12:28:52 -0400 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Hi Rafael, I have no experience with using multiple identity providers directly in Keystone. Does specifying the access_token_endpoint or discovery_endpoint for the specific provider you are trying to authenticate to work? Kristi On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > Hello Colleen, > Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode > and multiple IdPs configured in Keystone? > > We are currently debugging and discussing on how to enable this support in > the CLI. So far, we were not able to make it work with the current code. > This also happens with Horizon. If one has multiple IdPs in Keystone, the > "discovery" process would happen twice, one in Horizon and another in > Keystone, which is executed by the OIDC plugin in the HTTPD. We already > fixed the Horizon issue, but the CLI we are still investigating, and we > suspect that is probably the same problem. > > On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy > wrote: > >> Hi Jason, >> >> On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: >> > Hi all, >> > >> > I'm in the process of prototyping a federated Keystone using OpenID >> > Connect, which will place ephemeral users in a group that has roles in >> > existing projects. I was testing how it felt from the user's >> > perspective and am confused how I'm supposed to be able to use the >> > openstacksdk with federation. For one thing, the RC files I can >> > download from the "API Access" section of Horizon don't seem like they >> > work; the domain is hard-coded to "Federated", >> >> This should be fixed in the latest version of keystone... >> >> > and it also uses a >> > username/password authentication method. >> >> ...but this is not, horizon only knows about the 'password' >> authentication method and can't provide RC files for other types of auth >> methods (unless you create an application credential). >> >> > >> > I can see that there is a way to use KSA to use an existing OIDC >> > token, which I think is probably the most "user-friendly" way, but the >> > user still has to obtain this token themselves out-of-band, which is >> > not trivial. Has anybody else set this up for users who liked to use >> > the CLI? >> >> All of KSA's auth types are supported by the openstack CLI. Which one you >> use depends on your OpenID Connect provider. 
If your provider supports it, >> you can use the "v3oidcpassword" auth method with the openstack CLI, >> following this example: >> >> https://support.massopen.cloud/kb/faq.php?id=16 >> >> On the other hand if you are using something like Google which only >> supports the authorization_code grant type, then you would have to get the >> authorization code out of band and then use the "v3oidcauthcode" auth type, >> and personally I've never gotten that to work with Google. >> >> > Is the solution to educate users about creating application >> > credentials instead? >> >> This is the best option. It's much easier to manage and horizon provides >> openrc and clouds.yaml files for app creds. >> >> Hope this helps, >> >> Colleen >> >> > >> > Thank you in advance, >> > >> > -- >> > Jason Anderson >> > >> > Chameleon DevOps Lead >> > *Consortium for Advanced Science and Engineering, The University of >> Chicago* >> > *Mathematics & Computer Science Division, Argonne National Laboratory* >> >> > > -- > Rafael Weingärtner > -- Kristi -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Thu Oct 24 16:40:08 2019 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Thu, 24 Oct 2019 19:40:08 +0300 Subject: [openstack-ansible] core updates In-Reply-To: References: Message-ID: <19363181571935208@iva6-161d47f95e63.qloud-c.yandex.net> Great news! Welcome folks! 24.10.2019, 16:50, "Mohammed Naser" : > Hi everyone, > > I'd like to propose the addition of the following 2 new core members: > > - Georgina Shippey (BBC R&D, committed continuous contributor and operator) > - James Denton (long time contributor, extremely knowledgeable in OSA) > > If no one opposes to this, I will be adding them to our core list shortly. > > Thanks, > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. https://vexxhost.com --  Kind Regards, Dmitriy Rabotyagov From rafaelweingartner at gmail.com Thu Oct 24 17:03:37 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 24 Oct 2019 14:03:37 -0300 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: We are using the "access_token_endpoint". The token is retrieved nicely from the IdP. However, the issue starts on Keystone side and the Apache HTTPD mod_auth_openidc. The CLI was not ready to deal with it. It is like Horizon, when we have multiple IdPs. The discovery process happens twice, once in Horizon and another one in Keystone. We already fixed the Horizon issue, and now we are working to fix the CLI. We should have something in the next few days. On Thu, Oct 24, 2019 at 1:29 PM Kristi Nikolla wrote: > Hi Rafael, > > I have no experience with using multiple identity providers directly in > Keystone. Does specifying the access_token_endpoint or discovery_endpoint > for the specific provider you are trying to authenticate to work? > > Kristi > > On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner < > rafaelweingartner at gmail.com> wrote: > >> Hello Colleen, >> Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode >> and multiple IdPs configured in Keystone? >> >> We are currently debugging and discussing on how to enable this support >> in the CLI. So far, we were not able to make it work with the current code. >> This also happens with Horizon. 
If one has multiple IdPs in Keystone, the >> "discovery" process would happen twice, one in Horizon and another in >> Keystone, which is executed by the OIDC plugin in the HTTPD. We already >> fixed the Horizon issue, but the CLI we are still investigating, and we >> suspect that is probably the same problem. >> >> On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy >> wrote: >> >>> Hi Jason, >>> >>> On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: >>> > Hi all, >>> > >>> > I'm in the process of prototyping a federated Keystone using OpenID >>> > Connect, which will place ephemeral users in a group that has roles in >>> > existing projects. I was testing how it felt from the user's >>> > perspective and am confused how I'm supposed to be able to use the >>> > openstacksdk with federation. For one thing, the RC files I can >>> > download from the "API Access" section of Horizon don't seem like they >>> > work; the domain is hard-coded to "Federated", >>> >>> This should be fixed in the latest version of keystone... >>> >>> > and it also uses a >>> > username/password authentication method. >>> >>> ...but this is not, horizon only knows about the 'password' >>> authentication method and can't provide RC files for other types of auth >>> methods (unless you create an application credential). >>> >>> > >>> > I can see that there is a way to use KSA to use an existing OIDC >>> > token, which I think is probably the most "user-friendly" way, but the >>> > user still has to obtain this token themselves out-of-band, which is >>> > not trivial. Has anybody else set this up for users who liked to use >>> > the CLI? >>> >>> All of KSA's auth types are supported by the openstack CLI. Which one >>> you use depends on your OpenID Connect provider. If your provider supports >>> it, you can use the "v3oidcpassword" auth method with the openstack CLI, >>> following this example: >>> >>> https://support.massopen.cloud/kb/faq.php?id=16 >>> >>> On the other hand if you are using something like Google which only >>> supports the authorization_code grant type, then you would have to get the >>> authorization code out of band and then use the "v3oidcauthcode" auth type, >>> and personally I've never gotten that to work with Google. >>> >>> > Is the solution to educate users about creating application >>> > credentials instead? >>> >>> This is the best option. It's much easier to manage and horizon provides >>> openrc and clouds.yaml files for app creds. >>> >>> Hope this helps, >>> >>> Colleen >>> >>> > >>> > Thank you in advance, >>> > >>> > -- >>> > Jason Anderson >>> > >>> > Chameleon DevOps Lead >>> > *Consortium for Advanced Science and Engineering, The University of >>> Chicago* >>> > *Mathematics & Computer Science Division, Argonne National Laboratory* >>> >>> >> >> -- >> Rafael Weingärtner >> > > > -- > Kristi > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Oct 24 17:12:38 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 24 Oct 2019 12:12:38 -0500 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <20191017203152.GA828@sm-workstation> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> Message-ID: <20191024171238.GA25079@sm-workstation> One final last call for stable/queens. Tomorrow I will be proposing patches to mark all deliverables as Extended Maintenance by adding a queens-em tag. 
After this point, there will be no more official releases for any queens deliverables. I took another quick look at unreleased commits today, and there are a few projects that still have some things that might be good to release. As mentioned though, downstream consumers can always pick these up as needed once we are past the point of doing official community releases. Some things do not need to be released. Things like CI job changes, internal documentation, and testing fixes are all good, but not something that needs to be delivered in a "release" of that project. Bug fixes, translations, requirements updates, etc., though are useful things. Thanks, Sean > The date for Queens to transition to Extended Maintenance is next week. Late in > the week we will be proposing a patch to tag all deliverables with a > "queens-em" tag. After this point, no additional releases will be allowed. > > I took a quick look through our stable/queens deliverables, and there are > several that look to have a sizable amount of patches landed that have not been > released. Elod was super nice by including all of that for easy checking in [2] > above. > > As part of Extended Maintenance, bugfixes can (and should) be cherry-picked to > stable/queens. But once we enter Extended Maintenance, there won't be any > official releases and it will be up to downstream consumers to pick up these > fixes locally as they need them. > > So consider this a last call for stable/queens releases. > > Thanks! > Sean > From kristi at nikolla.me Thu Oct 24 17:21:26 2019 From: kristi at nikolla.me (Kristi Nikolla) Date: Thu, 24 Oct 2019 13:21:26 -0400 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Keep us posted! It would be great to have this documented for future reference. On Thu, Oct 24, 2019 at 1:04 PM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > We are using the "access_token_endpoint". The token is retrieved nicely > from the IdP. However, the issue starts on Keystone side and the Apache > HTTPD mod_auth_openidc. The CLI was not ready to deal with it. It is like > Horizon, when we have multiple IdPs. The discovery process happens twice, > once in Horizon and another one in Keystone. We already fixed the Horizon > issue, and now we are working to fix the CLI. We should have something in > the next few days. > > On Thu, Oct 24, 2019 at 1:29 PM Kristi Nikolla wrote: > >> Hi Rafael, >> >> I have no experience with using multiple identity providers directly in >> Keystone. Does specifying the access_token_endpoint or discovery_endpoint >> for the specific provider you are trying to authenticate to work? >> >> Kristi >> >> On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner < >> rafaelweingartner at gmail.com> wrote: >> >>> Hello Colleen, >>> Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode >>> and multiple IdPs configured in Keystone? >>> >>> We are currently debugging and discussing on how to enable this support >>> in the CLI. So far, we were not able to make it work with the current code. >>> This also happens with Horizon. If one has multiple IdPs in Keystone, the >>> "discovery" process would happen twice, one in Horizon and another in >>> Keystone, which is executed by the OIDC plugin in the HTTPD. We already >>> fixed the Horizon issue, but the CLI we are still investigating, and we >>> suspect that is probably the same problem. 
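
For reference, the single-IdP CLI invocation being discussed here, using keystoneauth's
v3oidcpassword plugin, looks roughly like the sketch below. The identity provider name,
protocol, client id/secret, endpoints and credentials are placeholders rather than values
from this thread, and this only covers the simple case that the multi-IdP work aims to fix.

    openstack --os-auth-type v3oidcpassword \
      --os-auth-url https://keystone.example.org:5000/v3 \
      --os-identity-provider myidp \
      --os-protocol openid \
      --os-client-id my-client-id \
      --os-client-secret my-client-secret \
      --os-discovery-endpoint https://idp.example.org/.well-known/openid-configuration \
      --os-openid-scope "openid profile email" \
      --os-username demo --os-password secret \
      --os-project-name demo --os-project-domain-name Default \
      --os-identity-api-version 3 \
      token issue
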
>>> >>> On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy >>> wrote: >>> >>>> Hi Jason, >>>> >>>> On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: >>>> > Hi all, >>>> > >>>> > I'm in the process of prototyping a federated Keystone using OpenID >>>> > Connect, which will place ephemeral users in a group that has roles >>>> in >>>> > existing projects. I was testing how it felt from the user's >>>> > perspective and am confused how I'm supposed to be able to use the >>>> > openstacksdk with federation. For one thing, the RC files I can >>>> > download from the "API Access" section of Horizon don't seem like >>>> they >>>> > work; the domain is hard-coded to "Federated", >>>> >>>> This should be fixed in the latest version of keystone... >>>> >>>> > and it also uses a >>>> > username/password authentication method. >>>> >>>> ...but this is not, horizon only knows about the 'password' >>>> authentication method and can't provide RC files for other types of auth >>>> methods (unless you create an application credential). >>>> >>>> > >>>> > I can see that there is a way to use KSA to use an existing OIDC >>>> > token, which I think is probably the most "user-friendly" way, but >>>> the >>>> > user still has to obtain this token themselves out-of-band, which is >>>> > not trivial. Has anybody else set this up for users who liked to use >>>> > the CLI? >>>> >>>> All of KSA's auth types are supported by the openstack CLI. Which one >>>> you use depends on your OpenID Connect provider. If your provider supports >>>> it, you can use the "v3oidcpassword" auth method with the openstack CLI, >>>> following this example: >>>> >>>> https://support.massopen.cloud/kb/faq.php?id=16 >>>> >>>> On the other hand if you are using something like Google which only >>>> supports the authorization_code grant type, then you would have to get the >>>> authorization code out of band and then use the "v3oidcauthcode" auth type, >>>> and personally I've never gotten that to work with Google. >>>> >>>> > Is the solution to educate users about creating application >>>> > credentials instead? >>>> >>>> This is the best option. It's much easier to manage and horizon >>>> provides openrc and clouds.yaml files for app creds. >>>> >>>> Hope this helps, >>>> >>>> Colleen >>>> >>>> > >>>> > Thank you in advance, >>>> > >>>> > -- >>>> > Jason Anderson >>>> > >>>> > Chameleon DevOps Lead >>>> > *Consortium for Advanced Science and Engineering, The University of >>>> Chicago* >>>> > *Mathematics & Computer Science Division, Argonne National Laboratory* >>>> >>>> >>> >>> -- >>> Rafael Weingärtner >>> >> >> >> -- >> Kristi >> > > > -- > Rafael Weingärtner > -- Kristi -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Oct 24 19:32:03 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 24 Oct 2019 14:32:03 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> Message-ID: <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> Hello Everyone, We had good amount of discussion on the final plan and schedule in today's TC office hour[1]. I captured the agreement on each point in etherpad (you can see the AGREE:). Also summarizing the discussions here. 
Important point: if your projects are planning to keep py2.7 support, then do not delay in telling us. Reply on this ML thread or add your project to the etherpad.

- Projects can start dropping the py2.7 support. Common lib and testing tools need to wait until milestone-2.
** pep8 job to be included in openstack-python3-ussuri-jobs-* templates - https://review.opendev.org/#/c/688997/
** You can drop the openstack-python-jobs template and start using the ussuri template once the 688997 patch is merged.
** Cross-project dependencies (if any) can be synced up among the dependent projects.

- I will add this plan and schedule as a community goal. The goal is more about what all things to do and when.
** If any project keeps py2 support, then that has to be notified explicitly to its consumers.

- Schedule: The schedule is aligned with the Ussuri cycle milestones[2]. I will add the plan in the release schedule also.

Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone
** Projects to start dropping the py2 support along with all the py2 CI jobs.

Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone
** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library.
** This will give projects enough time to drop the py2 support.

Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone
** Final audit on the Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. This is enough time to catch any such breakage or anything extra to do before the Ussuri final release.

Other discussion points and agreements:

- Projects that want to keep python 2 support and need oslo, QA or any other dependent projects/lib support:
** swift. AI: gmann to reach out to the swift team about the plan and the exact things required from its dependencies (the common lib/testing tools).
** List your project if you want to keep the py2 support.
** Action item: TC liaisons to reach out to their projects and make sure they are aware of this plan[3].

- How to test the upgrade from python2 -> python3
** AGREE: let's drop the integrated testing for py2->py3 and oslo can check if they can do functional testing of oslo.messaging - https://bugs.launchpad.net/oslo.messaging/+bug/1792977

- What are our guidelines to backport fixes to stable branches?
** AGREE: This will be a rare case, and if it happens then tweaking for py27 in the backport is fine. The stable branch backport policy does not need any change for this.

- What are the tactics for dropping openstack-tox-py27 in our gate?
** AGREE: on merging the pep8 job into the ussuri job template https://review.opendev.org/#/c/688997/

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2019-10-24.log.html#t2019-10-24T14:49:12
[2] https://releases.openstack.org/ussuri/schedule.html
[3] https://governance.openstack.org/tc/reference/tc-liaisons.html

-gmann

---- On Thu, 24 Oct 2019 09:55:58 -0500 Ghanshyam Mann wrote ----
> Just a reminder, discussion for this is going to start in #openstack-tc channel in another 5 min.
>
> -gmann
>
>
> ---- On Tue, 15 Oct 2019 13:18:03 -0500 Ghanshyam Mann wrote ----
> > Hello Everyone,
> >
> > Python 2.7 is going to retire in Jan 2020 [1] and we planned to drop the python 2 support from OpenStack
> > during the start of the Ussuri cycle[2].
> >
> > Time has come now to start the planning on dropping the Python2. It needs to be coordinated among various
> > Projects, libraries, vendors driver, third party CI and testing frameworks.
> > > > * Preparation for the Plan & Schedule: > > > > Etherpad: https://etherpad.openstack.org/p/drop-python2-support > > > > We discussed it in TC to come up with the plan, execute it smoothly and avoid breaking any dependent projects. > > I have prepared an etherpad[3](mentioned above also) to capture all the points related to this topic and most importantly > > the draft schedule about who can drop the support and when. The schedule is in the draft state and not final yet. > > The most important points are if you are dropping the support then all your consumers (OpenStack Projects, Vendors drivers etc) > > are ready for that. For example, oslo, os-bricks, client lib, testing framework projects will keep the python2 support until we make > > sure all the consumers of those projects do not require py2 support. If anyone require then how long they can support py2. > > These libraries, testing frameworks will be the last one to drop py2. > > > > We have planned to have a dedicated discussion in TC office hours on the 24th Thursday #openstack-tc channel. We will > > discuss what all need to be done and the schedules. > > > > You do not have to drop it immediately and keep eyes on this ML thread till we get the consensus on the > > community-level plan and schedule. > > > > Meanwhile, you can always start pre-planning for your projects, for example, stephenfin has started for Nova[4] to > > migrate the third party CI etc. Cinder has coordinated with all vendor drivers & their CI to migrate from py2 to py3. > > > > * Projects want to keep the py2 support? > > There is no mandate that projects have to drop the py2 support right now. If you want to keep the support then key things > > to discuss are what all you need and does all your dependent projects/libs provide the support of py2. This is something needs to be > > discussed case by case. If any project wants to keep the support, add that in the etherpad with a brief reason which will > > be helpful to discuss the need and feasibility. > > > > Feel free to provide feedback or add the missing point on the etherpad. Do not forget to attend the 24th Oct 2019, TC > > office hour on Thursday at 1500 UTC in #openstack-tc. > > > > > > [1] https://pythonclock.org/ > > [2] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > [3] https://etherpad.openstack.org/p/drop-python2-support > > [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html > > > > -gmann > > > > > > > > > > > From jasonanderson at uchicago.edu Thu Oct 24 19:45:46 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Thu, 24 Oct 2019 19:45:46 +0000 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Hey all, thanks for the helpful replies! I did discover that some of my issues were fixed in Horizon Stein (I'm on Rocky still), which added support for RC file templates. Good to know about some of the client quirks that are being sorted out. One thing to point out, v3oidcpassword requires Resource Owner Password Credential grant support (grant_type=password), which not all IdPs support (for example, the one I am integrating against!) Application credentials are an interesting feature and I'll see how it might make sense to leverage them. Cheers! On 10/24/19 12:21 PM, Kristi Nikolla wrote: Keep us posted! It would be great to have this documented for future reference. 
On Thu, Oct 24, 2019 at 1:04 PM Rafael Weingärtner > wrote: We are using the "access_token_endpoint". The token is retrieved nicely from the IdP. However, the issue starts on Keystone side and the Apache HTTPD mod_auth_openidc. The CLI was not ready to deal with it. It is like Horizon, when we have multiple IdPs. The discovery process happens twice, once in Horizon and another one in Keystone. We already fixed the Horizon issue, and now we are working to fix the CLI. We should have something in the next few days. On Thu, Oct 24, 2019 at 1:29 PM Kristi Nikolla > wrote: Hi Rafael, I have no experience with using multiple identity providers directly in Keystone. Does specifying the access_token_endpoint or discovery_endpoint for the specific provider you are trying to authenticate to work? Kristi On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner > wrote: Hello Colleen, Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode and multiple IdPs configured in Keystone? We are currently debugging and discussing on how to enable this support in the CLI. So far, we were not able to make it work with the current code. This also happens with Horizon. If one has multiple IdPs in Keystone, the "discovery" process would happen twice, one in Horizon and another in Keystone, which is executed by the OIDC plugin in the HTTPD. We already fixed the Horizon issue, but the CLI we are still investigating, and we suspect that is probably the same problem. On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy > wrote: Hi Jason, On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: > Hi all, > > I'm in the process of prototyping a federated Keystone using OpenID > Connect, which will place ephemeral users in a group that has roles in > existing projects. I was testing how it felt from the user's > perspective and am confused how I'm supposed to be able to use the > openstacksdk with federation. For one thing, the RC files I can > download from the "API Access" section of Horizon don't seem like they > work; the domain is hard-coded to "Federated", This should be fixed in the latest version of keystone... > and it also uses a > username/password authentication method. ...but this is not, horizon only knows about the 'password' authentication method and can't provide RC files for other types of auth methods (unless you create an application credential). > > I can see that there is a way to use KSA to use an existing OIDC > token, which I think is probably the most "user-friendly" way, but the > user still has to obtain this token themselves out-of-band, which is > not trivial. Has anybody else set this up for users who liked to use > the CLI? All of KSA's auth types are supported by the openstack CLI. Which one you use depends on your OpenID Connect provider. If your provider supports it, you can use the "v3oidcpassword" auth method with the openstack CLI, following this example: https://support.massopen.cloud/kb/faq.php?id=16 On the other hand if you are using something like Google which only supports the authorization_code grant type, then you would have to get the authorization code out of band and then use the "v3oidcauthcode" auth type, and personally I've never gotten that to work with Google. > Is the solution to educate users about creating application > credentials instead? This is the best option. It's much easier to manage and horizon provides openrc and clouds.yaml files for app creds. 
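
To make the application-credential route just described concrete, a rough sketch (names,
IDs and the auth_url below are placeholders, not values from this thread, and note the
group-assignment caveat Rafael raises elsewhere in the thread): create the credential while
authenticated as the federated user, then point the CLI at it via clouds.yaml.

    # create an application credential as the (federated) user
    openstack application credential create my-cli-cred

    # clouds.yaml entry that reuses the id/secret returned above
    clouds:
      mycloud:
        auth_type: v3applicationcredential
        auth:
          auth_url: https://keystone.example.org:5000/v3
          application_credential_id: "<id from the create output>"
          application_credential_secret: "<secret from the create output>"
        region_name: RegionOne
        identity_api_version: 3

    # then, for example:
    openstack --os-cloud mycloud server list
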
Hope this helps, Colleen > > Thank you in advance, > > -- > Jason Anderson > > Chameleon DevOps Lead > *Consortium for Advanced Science and Engineering, The University of Chicago* > *Mathematics & Computer Science Division, Argonne National Laboratory* -- Rafael Weingärtner -- Kristi -- Rafael Weingärtner -- Kristi -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Thu Oct 24 20:53:38 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 24 Oct 2019 17:53:38 -0300 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Jason, just watch out for another issue, which is the group assignment permissions and app credentials. As soon, as we have some updates, I will ping you guys. On Thu, Oct 24, 2019 at 4:49 PM Jason Anderson wrote: > Hey all, thanks for the helpful replies! > > I did discover that some of my issues were fixed in Horizon Stein (I'm on > Rocky still), which added support for RC file templates. Good to know about > some of the client quirks that are being sorted out. One thing to point > out, v3oidcpassword requires Resource Owner Password Credential grant > support (grant_type=password), which not all IdPs support (for example, the > one I am integrating against!) > > Application credentials are an interesting feature and I'll see how it > might make sense to leverage them. > > Cheers! > > On 10/24/19 12:21 PM, Kristi Nikolla wrote: > > Keep us posted! It would be great to have this documented for > future reference. > > On Thu, Oct 24, 2019 at 1:04 PM Rafael Weingärtner < > rafaelweingartner at gmail.com> wrote: > >> We are using the "access_token_endpoint". The token is retrieved nicely >> from the IdP. However, the issue starts on Keystone side and the Apache >> HTTPD mod_auth_openidc. The CLI was not ready to deal with it. It is like >> Horizon, when we have multiple IdPs. The discovery process happens twice, >> once in Horizon and another one in Keystone. We already fixed the Horizon >> issue, and now we are working to fix the CLI. We should have something in >> the next few days. >> >> On Thu, Oct 24, 2019 at 1:29 PM Kristi Nikolla wrote: >> >>> Hi Rafael, >>> >>> I have no experience with using multiple identity providers directly in >>> Keystone. Does specifying the access_token_endpoint or discovery_endpoint >>> for the specific provider you are trying to authenticate to work? >>> >>> Kristi >>> >>> On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner < >>> rafaelweingartner at gmail.com> wrote: >>> >>>> Hello Colleen, >>>> Have you tested the OpenStack CLI with v3oidcpassword or v3oidcauthcode >>>> and multiple IdPs configured in Keystone? >>>> >>>> We are currently debugging and discussing on how to enable this support >>>> in the CLI. So far, we were not able to make it work with the current code. >>>> This also happens with Horizon. If one has multiple IdPs in Keystone, the >>>> "discovery" process would happen twice, one in Horizon and another in >>>> Keystone, which is executed by the OIDC plugin in the HTTPD. We already >>>> fixed the Horizon issue, but the CLI we are still investigating, and we >>>> suspect that is probably the same problem. 
>>>> >>>> On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy >>>> wrote: >>>> >>>>> Hi Jason, >>>>> >>>>> On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: >>>>> > Hi all, >>>>> > >>>>> > I'm in the process of prototyping a federated Keystone using OpenID >>>>> > Connect, which will place ephemeral users in a group that has roles >>>>> in >>>>> > existing projects. I was testing how it felt from the user's >>>>> > perspective and am confused how I'm supposed to be able to use the >>>>> > openstacksdk with federation. For one thing, the RC files I can >>>>> > download from the "API Access" section of Horizon don't seem like >>>>> they >>>>> > work; the domain is hard-coded to "Federated", >>>>> >>>>> This should be fixed in the latest version of keystone... >>>>> >>>>> > and it also uses a >>>>> > username/password authentication method. >>>>> >>>>> ...but this is not, horizon only knows about the 'password' >>>>> authentication method and can't provide RC files for other types of auth >>>>> methods (unless you create an application credential). >>>>> >>>>> > >>>>> > I can see that there is a way to use KSA to use an existing OIDC >>>>> > token, which I think is probably the most "user-friendly" way, but >>>>> the >>>>> > user still has to obtain this token themselves out-of-band, which is >>>>> > not trivial. Has anybody else set this up for users who liked to use >>>>> > the CLI? >>>>> >>>>> All of KSA's auth types are supported by the openstack CLI. Which one >>>>> you use depends on your OpenID Connect provider. If your provider supports >>>>> it, you can use the "v3oidcpassword" auth method with the openstack CLI, >>>>> following this example: >>>>> >>>>> https://support.massopen.cloud/kb/faq.php?id=16 >>>>> >>>>> On the other hand if you are using something like Google which only >>>>> supports the authorization_code grant type, then you would have to get the >>>>> authorization code out of band and then use the "v3oidcauthcode" auth type, >>>>> and personally I've never gotten that to work with Google. >>>>> >>>>> > Is the solution to educate users about creating application >>>>> > credentials instead? >>>>> >>>>> This is the best option. It's much easier to manage and horizon >>>>> provides openrc and clouds.yaml files for app creds. >>>>> >>>>> Hope this helps, >>>>> >>>>> Colleen >>>>> >>>>> > >>>>> > Thank you in advance, >>>>> > >>>>> > -- >>>>> > Jason Anderson >>>>> > >>>>> > Chameleon DevOps Lead >>>>> > *Consortium for Advanced Science and Engineering, The University of >>>>> Chicago* >>>>> > *Mathematics & Computer Science Division, Argonne National >>>>> Laboratory* >>>>> >>>>> >>>> >>>> -- >>>> Rafael Weingärtner >>>> >>> >>> >>> -- >>> Kristi >>> >> >> >> -- >> Rafael Weingärtner >> > > > -- > Kristi > > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Thu Oct 24 20:54:55 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 24 Oct 2019 20:54:55 +0000 Subject: [cyborg] Nominating Chenke as core reviewer Message-ID: Hello, Chenke has contributed substantially to Cyborg [1], esp. in the Train release, and has been an active participant in the community. I would like to acknowledge his significant past and ongoing contributions [2] and enthusiasm, and nominate him as a core reviewer for Cyborg. Please provide any feedback on his nomination by Oct 31, 2019. If there are no objections, his nomination will be made effective on Nov 1, 2019. 
[1] https://www.stackalytics.com/?module=cyborg-group&user_id=chenker [2] https://review.opendev.org/#/q/project:openstack/cyborg+owner:chenker Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.matulis at canonical.com Thu Oct 24 21:07:58 2019 From: peter.matulis at canonical.com (Peter Matulis) Date: Thu, 24 Oct 2019 17:07:58 -0400 Subject: [charms] OpenStack Charms 19.10 release is now available Message-ID: The OpenStack Charms team is thrilled to announce the 19.10 charms release, introducing support for OpenStack Train and Ceph Nautilus on Ubuntu 18.04 LTS (via UCA) and Ubuntu 19.10. This release also brings several new and valuable features to the existing OpenStack Charms deployments for Queens, Rocky, Stein, and many other stable combinations of Ubuntu + OpenStack. Please see the Release Notes for full details: https://docs.openstack.org/charm-guide/latest/1910.html == Highlights == * OpenStack Train OpenStack Train is now supported on Ubuntu 18.04 LTS (via UCA) and Ubuntu 19.10. * Policy Overrides A new Policy Overrides feature provides operators with a mechanism to override policy defaults on a per-service basis. * Ceph Nautilus The Nautilus release of Ceph is now supported, in conjunction with OpenStack Train. * Ceph placement group autotuning In Ceph Nautilus, the autotuning of placement groups is now supported. * Neutron port forwarding The Neutron port forwarding extension can now be optionally enabled for OpenStack Rocky and later. * Migration to FQDN for agent registration When deploying OpenStack Stein or newer, the Nova Compute agent and Neutron agents will now use a fully qualified domain name (FQDN) when registering with the API services. * Ceph RADOS Gateway tenant namespacing The ceph-radosgw charm now supports deployment with tenant namespaces. * New charm: Placement There is a new charm for the placement API: the 'placement' charm. The new charm must be deployed and related to the nova-cloud-controller charm for OpenStack Train deployments. This will affect Stein to Train upgrades. * New charm: Cinder integration with Pure Storage array There is a new subordinate charm that can be used to integrate Cinder with a Pure Storage array: the 'cinder-purestorage' charm. * nova-cloud-controller: instance migration - DNS caching is now the default The caching of DNS lookups of the nova-compute units is now the default behaviour. == OpenStack Charms team == The OpenStack Charms team can be contacted on the #openstack-charms IRC channel on Freenode. The team will be at the Open Infrastructure Summit and PTG events in Shanghai (November 4-8, 2019). == Thank you == Massive appreciation of the below 59 charm contributors who squashed 41 bugs, enabled support for a new release of OpenStack, improved documentation, and added compelling new functionality! Frode Nordahl Chris MacNaughton Corey Bryant Ryan Beisner Camille Rodriguez David Ames Alex Kavanagh Liam Young James Page Dmitrii Shcherbakov Rodrigo Barbieri Peter Matulis Sahid Orentino Ferdjaoui Edward Hope-Morley Tytus Kurek Ghanshyam Mann Jorge Niedbalski Natalia Litvinova Tiago Pasqualini Nicolas Pochet Narinder Gupta exsdev Andrea Ieri Andreas Jaeger Eduardo Sousa Zachary Zehring Trent Lloyd Mike Wilson David Coronel Stamatis Katsaounis Dan Ackerson Hua Zhang Dongdong Tao Ian Wienand Felipe Reyes Ramon Grullon Joe Guo Erlon R. Cruz George Kraft Alvaro Uria Peter Sabaini Michael Skalka Nikolay Vinogradov Jose Delarosa melissaml Seyeong Kim Mark S. 
Maglana Nobuto Murata Andrew McLeod Frank Kloeker Tim Burke Cory Johns Marian Gasparovic sunnyve Pete Vander Giessen Ryan Farrell Levente Tamas Alexander Litvinov Marcelo Subtil Marcal -- OpenStack Charms Team From jonathan.rosser at rd.bbc.co.uk Thu Oct 24 21:48:43 2019 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Thu, 24 Oct 2019 22:48:43 +0100 Subject: [openstack-ansible] core updates In-Reply-To: <19363181571935208@iva6-161d47f95e63.qloud-c.yandex.net> References: <19363181571935208@iva6-161d47f95e63.qloud-c.yandex.net> Message-ID: <790963f1-8ecf-ec1b-4f69-1983a79c22b4@rd.bbc.co.uk> + to both from me, good stuff. On 24/10/2019 17:40, Dmitriy Rabotyagov wrote: > Great news! Welcome folks! > > 24.10.2019, 16:50, "Mohammed Naser" : >> Hi everyone, >> >> I'd like to propose the addition of the following 2 new core members: >> >> - Georgina Shippey (BBC R&D, committed continuous contributor and operator) >> - James Denton (long time contributor, extremely knowledgeable in OSA) >> >> If no one opposes to this, I will be adding them to our core list shortly. >> >> Thanks, >> Mohammed >> >> -- >> Mohammed Naser — vexxhost >> ----------------------------------------------------- >> D. 514-316-8872 >> D. 800-910-1726 ext. 200 >> E. mnaser at vexxhost.com >> W. https://vexxhost.com > > -- > Kind Regards, > Dmitriy Rabotyagov > > > From fungi at yuggoth.org Thu Oct 24 21:57:55 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 24 Oct 2019 21:57:55 +0000 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> Message-ID: <20191024215755.ckk42r4qqs4ismlc@yuggoth.org> On 2019-10-24 14:32:03 -0500 (-0500), Ghanshyam Mann wrote: [...] > - Projects can start dropping the py2.7 support. Common lib and > testing tools need to wait until milestone-2. [...] This doesn't match the intent behind what I originally suggested nor my subsequent interpretation of what we discussed, and unfortunately the plan on the etherpad is slightly vague here too. I thought what we were agreeing to was that leaf projects (services and the like) had *until* milestone 1 (~2019-12-12) to remove Python 2.7 testing if they depend on shared libraries which are planning to remove support for it in Ussuri. From then until milestone 2 (~2020-02-13) shared libraries could work on dropping support for Python 2.7. If libs are allowed to drop support for it *after* milestone 2 then that doesn't leave much time before they're released at milestone 3 to stabilize or reverse course. > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > ** Project to start dropping the py2 support along with all the > py2 CI jobs. This is a milestone later than I expected, unless you mean they should be done by this point. It's just about removing jobs, so projects should be on the ball and do this quickly. > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > ** This includes Oslo, QA tools (or any other testing tools), > common lib (os-brick), Client library. > ** This will give enough time to projects to drop the py2 > support. 
This leaves less than 2 months where libraries are allowed to complete the necessary work before they get released for Ussuri (remember the final release for libraries is at R-6, the week before milestone 3). > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > ** Final audit on Phase-1 and Phase-2 plan and make sure > everything is done without breaking anything. This is enough > time to measure such break or anything extra to do before ussuri > final release. [...] Libraries are released the week before this, so no, that doesn't really provide any auditing opportunity. I apologize, in retrospect I realize that the distinction between "by" and "at" in English could be too subtle for a lot of folks to pick up on, and I should have been more explicit in my original proposal. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Thu Oct 24 22:28:50 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 24 Oct 2019 17:28:50 -0500 Subject: [nova][ptg] Virtual PTG Message-ID: <4254ccd8-88ca-b21d-29b6-ab4e427f3ee4@fried.cc> Hello nova contributors and other stakeholders. As you are aware, nova maintainers will be sparser than usual at the ussuri PTG. For that reason, and also because it promotes better inclusion anyway, I'd like us to do the majority of decision making via the mailing list. The PTG is still a useful place to talk through design ideas, but this will give those not attending a voice in the final direction. To that end, I call your attention to the etherpad [1]. As usual, list your topics there. And if your topic is something for which you only need (or wish to start with) in-person discussions (e.g. "I'd like to do $thing but could use some help figuring out $how"), you're done. But if what you're shooting for is discussion leading to some kind of decision, like... - My spec has been stalled because we can't decide among N different approaches; we need to reach a consensus. - My feature is really important; can we please prioritize it for ussuri? ...then in addition to putting your topic on the etherpad, please initiate a (separate) thread on this mailing list, including [nova][ptg] in your subject line. Some of these topics may be resolved before the PTG itself. Others may be discussed in Shanghai. However, even if a consensus is reached in person, expect that decision to be tentative pending closure of the ML thread. Thanks, efried [1] https://etherpad.openstack.org/p/nova-shanghai-ptg From openstack at fried.cc Thu Oct 24 22:41:58 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 24 Oct 2019 17:41:58 -0500 Subject: [cyborg] Nominating Chenke as core reviewer In-Reply-To: References: Message-ID: <1696ef41-242e-bedc-b1c7-99855253f9bf@fried.cc> (Non cyborg core weighing in) I've worked with chenker through some reviews and contributions and he's been receptive and responsive to feedback, as well as being increasingly active over the past cycle. I think he would make a good addition to the core team. +1. efried On 10/24/19 3:54 PM, Nadathur, Sundar wrote: > Hello, > >     Chenke has contributed substantially to Cyborg [1], esp. in the > Train release, and has been an active participant in the community. I > would like to acknowledge his significant past and ongoing contributions > [2] and enthusiasm, and nominate him as a core reviewer for Cyborg. 
> >   > > Please provide any feedback on his nomination by Oct 31, 2019. If there > are no objections, his nomination will be made effective on Nov 1, 2019. > >   > > [1] https://www.stackalytics.com/?module=cyborg-group&user_id=chenker > > [2] https://review.opendev.org/#/q/project:openstack/cyborg+owner:chenker > >   > > Regards, > > Sundar > >   > From gmann at ghanshyammann.com Thu Oct 24 22:55:41 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 24 Oct 2019 17:55:41 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <20191024215755.ckk42r4qqs4ismlc@yuggoth.org> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <20191024215755.ckk42r4qqs4ismlc@yuggoth.org> Message-ID: <16dfffb9772.c7fc65eb179764.2147827441212779322@ghanshyammann.com> ---- On Thu, 24 Oct 2019 16:57:55 -0500 Jeremy Stanley wrote ---- > On 2019-10-24 14:32:03 -0500 (-0500), Ghanshyam Mann wrote: > [...] > > - Projects can start dropping the py2.7 support. Common lib and > > testing tools need to wait until milestone-2. > [...] > > This doesn't match the intent behind what I originally suggested nor > my subsequent interpretation of what we discussed, and unfortunately > the plan on the etherpad is slightly vague here too. I thought what > we were agreeing to was that leaf projects (services and the like) > had *until* milestone 1 (~2019-12-12) to remove Python 2.7 testing > if they depend on shared libraries which are planning to remove > support for it in Ussuri. From then until milestone 2 (~2020-02-13) > shared libraries could work on dropping support for Python 2.7. If > libs are allowed to drop support for it *after* milestone 2 then > that doesn't leave much time before they're released at milestone 3 > to stabilize or reverse course. > > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > ** Project to start dropping the py2 support along with all the > > py2 CI jobs. > > This is a milestone later than I expected, unless you mean they > should be done by this point. It's just about removing jobs, so > projects should be on the ball and do this quickly. > > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > ** This includes Oslo, QA tools (or any other testing tools), > > common lib (os-brick), Client library. > > ** This will give enough time to projects to drop the py2 > > support. > > This leaves less than 2 months where libraries are allowed to > complete the necessary work before they get released for Ussuri > (remember the final release for libraries is at R-6, the week before > milestone 3). > > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > ** Final audit on Phase-1 and Phase-2 plan and make sure > > everything is done without breaking anything. This is enough > > time to measure such break or anything extra to do before ussuri > > final release. > [...] > > Libraries are released the week before this, so no, that doesn't > really provide any auditing opportunity. Sorry for the confusion in the schedule. Below one is what I meant. Phase-1: Now -> Ussuri-1 milestone (deadline R-22 ) ** Project to dropping the py2 support along with all the py2 CI jobs. Phase-2: milestone-1 -> milestone-2 ( deadline R-13 ) ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. 
Phase-3: at milestone-2 ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. This is enough time to measure such a break or anything extra to do before Ussuri final release. -gmann > > I apologize, in retrospect I realize that the distinction between > "by" and "at" in English could be too subtle for a lot of folks to > pick up on, and I should have been more explicit in my original > proposal. > -- > Jeremy Stanley > From zigo at debian.org Thu Oct 24 23:15:48 2019 From: zigo at debian.org (Thomas Goirand) Date: Fri, 25 Oct 2019 01:15:48 +0200 Subject: [ops] nova wsgi config In-Reply-To: <20191024090645.GH14827@sync> References: <20191022101943.GG14827@sync> <659657f1-89ba-63b6-f2dc-6d8c42430d08@goirand.fr> <20191024090645.GH14827@sync> Message-ID: <9cbaefe7-fcd0-8c5f-3c6a-b2cda278e01a@debian.org> On 10/24/19 11:06 AM, Arnaud Morin wrote: > Hey Thomas, > > Thank you for your example. > If I understand well, you are using 4 processes in the uwsgi config. > I dont see any number of thread, does it mean the uwsgi is not spawning > threads but only processes? ( so there is only 1 thread per process?) > > Thanks, Hi Arnaud, If you carefully read the notes for nova, they are saying that we should leave the number of thread to 1, otherwise there may be some eventlet reconnection to rabbit issues. It's however fine to increase the number of processes. Cheers, Thomas Goirand (zigo) From aaronzhu1121 at gmail.com Thu Oct 24 23:20:45 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Fri, 25 Oct 2019 07:20:45 +0800 Subject: [cyborg] Nominating Chenke as core reviewer In-Reply-To: <1696ef41-242e-bedc-b1c7-99855253f9bf@fried.cc> References: <1696ef41-242e-bedc-b1c7-99855253f9bf@fried.cc> Message-ID: (Also non cyborg core weighing in) big +1 for this Eric Fried 于2019年10月25日 周五06:44写道: > (Non cyborg core weighing in) > > I've worked with chenker through some reviews and contributions and he's > been receptive and responsive to feedback, as well as being increasingly > active over the past cycle. I think he would make a good addition to the > core team. +1. > > efried > > On 10/24/19 3:54 PM, Nadathur, Sundar wrote: > > Hello, > > > > Chenke has contributed substantially to Cyborg [1], esp. in the > > Train release, and has been an active participant in the community. I > > would like to acknowledge his significant past and ongoing contributions > > [2] and enthusiasm, and nominate him as a core reviewer for Cyborg. > > > > > > > > Please provide any feedback on his nomination by Oct 31, 2019. If there > > are no objections, his nomination will be made effective on Nov 1, 2019. > > > > > > > > [1] https://www.stackalytics.com/?module=cyborg-group&user_id=chenker > > > > [2] > https://review.opendev.org/#/q/project:openstack/cyborg+owner:chenker > > > > > > > > Regards, > > > > Sundar > > > > > > > > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhipengh512 at gmail.com Thu Oct 24 23:28:03 2019 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Fri, 25 Oct 2019 07:28:03 +0800 Subject: [cyborg] Nominating Chenke as core reviewer In-Reply-To: References: <1696ef41-242e-bedc-b1c7-99855253f9bf@fried.cc> Message-ID: big +1 ! 
On Fri, Oct 25, 2019 at 7:23 AM Rong Zhu wrote: > (Also non cyborg core weighing in) > > big +1 for this > > > Eric Fried 于2019年10月25日 周五06:44写道: > >> (Non cyborg core weighing in) >> >> I've worked with chenker through some reviews and contributions and he's >> been receptive and responsive to feedback, as well as being increasingly >> active over the past cycle. I think he would make a good addition to the >> core team. +1. >> >> efried >> >> On 10/24/19 3:54 PM, Nadathur, Sundar wrote: >> > Hello, >> > >> > Chenke has contributed substantially to Cyborg [1], esp. in the >> > Train release, and has been an active participant in the community. I >> > would like to acknowledge his significant past and ongoing contributions >> > [2] and enthusiasm, and nominate him as a core reviewer for Cyborg. >> > >> > >> > >> > Please provide any feedback on his nomination by Oct 31, 2019. If there >> > are no objections, his nomination will be made effective on Nov 1, 2019. >> > >> > >> > >> > [1] https://www.stackalytics.com/?module=cyborg-group&user_id=chenker >> > >> > [2] >> https://review.opendev.org/#/q/project:openstack/cyborg+owner:chenker >> > >> > >> > >> > Regards, >> > >> > Sundar >> > >> > >> > >> >> -- > Thanks, > Rong Zhu > -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Fri Oct 25 02:10:20 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 24 Oct 2019 21:10:20 -0500 Subject: [release] Release countdown for week R-28 Message-ID: <20191025021020.GA12799@sm-workstation> Welcome back to the release countdown emails! These will be sent at major points in the Ussuri development cycle, which should conclude with a final release on May 13, 2020. Development Focus ----------------- At this stage in the release cycle, focus should be on planning the Ussuri development cycle, assessing Ussuri community goals and approving Ussuri specs. General Information ------------------- Please note that the Ussuri cycle will be slightly longer than past cycles (30 weeks). In case you haven't seen it yet, please take a look over the schedule for this release: https://releases.openstack.org/ussuri/schedule.html By default, the team PTL is responsible for handling the release cycle and approving release requests. This task can (and probably should) be delegated to release liaisons. Now is a good time to review release liaison information for your team and make sure it is up to date: https://opendev.org/openstack/releases/src/branch/master/data/release_liaisons.yaml By default, all your team deliverables from the Train release are continued in Ussuri with a similar release model. If you intend to drop a deliverable, or modify its release model, please do so before the ussuri-1 milestone by proposing a change to the deliverable file at: https://opendev.org/openstack/releases/src/branch/master/deliverables/ussuri Upcoming Deadlines & Dates -------------------------- Forum+PTG at Shanghai summit: November 4 Ussuri-1 milestone: December 12 (R-22 week) From logan at protiumit.com Fri Oct 25 03:21:07 2019 From: logan at protiumit.com (Logan V.) 
Date: Thu, 24 Oct 2019 22:21:07 -0500 Subject: [openstack-ansible] core updates In-Reply-To: <790963f1-8ecf-ec1b-4f69-1983a79c22b4@rd.bbc.co.uk> References: <19363181571935208@iva6-161d47f95e63.qloud-c.yandex.net> <790963f1-8ecf-ec1b-4f69-1983a79c22b4@rd.bbc.co.uk> Message-ID: ++ Welcome! On Thu, Oct 24, 2019 at 4:49 PM Jonathan Rosser < jonathan.rosser at rd.bbc.co.uk> wrote: > + to both from me, good stuff. > > On 24/10/2019 17:40, Dmitriy Rabotyagov wrote: > > Great news! Welcome folks! > > > > 24.10.2019, 16:50, "Mohammed Naser" : > >> Hi everyone, > >> > >> I'd like to propose the addition of the following 2 new core members: > >> > >> - Georgina Shippey (BBC R&D, committed continuous contributor and > operator) > >> - James Denton (long time contributor, extremely knowledgeable in OSA) > >> > >> If no one opposes to this, I will be adding them to our core list > shortly. > >> > >> Thanks, > >> Mohammed > >> > >> -- > >> Mohammed Naser — vexxhost > >> ----------------------------------------------------- > >> D. 514-316-8872 > >> D. 800-910-1726 ext. 200 > >> E. mnaser at vexxhost.com > >> W. https://vexxhost.com > > > > -- > > Kind Regards, > > Dmitriy Rabotyagov > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Fri Oct 25 06:38:32 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Thu, 24 Oct 2019 23:38:32 -0700 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16dfffb9772.c7fc65eb179764.2147827441212779322@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <20191024215755.ckk42r4qqs4ismlc@yuggoth.org> <16dfffb9772.c7fc65eb179764.2147827441212779322@ghanshyammann.com> Message-ID: On Thu, Oct 24, 2019 at 3:58 PM Ghanshyam Mann wrote: > ---- On Thu, 24 Oct 2019 16:57:55 -0500 Jeremy Stanley > wrote ---- > > On 2019-10-24 14:32:03 -0500 (-0500), Ghanshyam Mann wrote: > > [...] > > > - Projects can start dropping the py2.7 support. Common lib and > > > testing tools need to wait until milestone-2. > > [...] > > > > This doesn't match the intent behind what I originally suggested nor > > my subsequent interpretation of what we discussed, and unfortunately > > the plan on the etherpad is slightly vague here too. I thought what > > we were agreeing to was that leaf projects (services and the like) > > had *until* milestone 1 (~2019-12-12) to remove Python 2.7 testing > > if they depend on shared libraries which are planning to remove > > support for it in Ussuri. From then until milestone 2 (~2020-02-13) > > shared libraries could work on dropping support for Python 2.7. If > > libs are allowed to drop support for it *after* milestone 2 then > > that doesn't leave much time before they're released at milestone 3 > > to stabilize or reverse course. > > > > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > > ** Project to start dropping the py2 support along with all the > > > py2 CI jobs. > > > > This is a milestone later than I expected, unless you mean they > > should be done by this point. It's just about removing jobs, so > > projects should be on the ball and do this quickly. > > > > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > > ** This includes Oslo, QA tools (or any other testing tools), > > > common lib (os-brick), Client library. 
> > > ** This will give enough time to projects to drop the py2 > > > support. > > > > This leaves less than 2 months where libraries are allowed to > > complete the necessary work before they get released for Ussuri > > (remember the final release for libraries is at R-6, the week before > > milestone 3). > > > > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > > ** Final audit on Phase-1 and Phase-2 plan and make sure > > > everything is done without breaking anything. This is enough > > > time to measure such break or anything extra to do before ussuri > > > final release. > > [...] > > > > Libraries are released the week before this, so no, that doesn't > > really provide any auditing opportunity. > > Sorry for the confusion in the schedule. Below one is what I meant. > > Phase-1: Now -> Ussuri-1 milestone (deadline R-22 ) > ** Project to dropping the py2 support along with all the py2 CI jobs. > > Phase-2: milestone-1 -> milestone-2 ( deadline R-13 ) > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > > Phase-3: at milestone-2 > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > This is enough time to measure such a break or anything extra to do before Ussuri final release. > >

Awesome, thank you for stating this very clearly. For OpenStack manila, I've lined up the patches as you've indicated:

Phase-1: Now -> Ussuri-1 milestone (deadline R-22) ** openstack/manila will drop support for python2.7: https://review.opendev.org/#/c/691134/

Phase-2: milestone-1 -> milestone-2 (deadline R-13) ** Client projects and the tempest plugin will drop support for python2.7: https://review.opendev.org/#/c/691183/ (openstack/python-manilaclient) https://review.opendev.org/#/c/691186/ (openstack/manila-tempest-plugin) https://review.opendev.org/#/c/691184/ (openstack/manila-ui)

Phase-3: at milestone-2 ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything.

I'm using the Gerrit topic "drop-py2" to track these.

> -gmann > > > > > I apologize, in retrospect I realize that the distinction between > > "by" and "at" in English could be too subtle for a lot of folks to > > pick up on, and I should have been more explicit in my original > > proposal. > > -- > > Jeremy Stanley > > >

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From missile0407 at gmail.com Fri Oct 25 07:35:18 2019 From: missile0407 at gmail.com (Eddie Yen) Date: Fri, 25 Oct 2019 15:35:18 +0800 Subject: [ceilometer] The correct way to change metric granularity time? Message-ID:

Hi

We're using Ceilometer with Gnocchi in the Rocky release, and we're using the cpu_util metric to drive auto scaling in Heat.

We know that ceilometer creates its own archive policy in gnocchi, and the default granularity of that policy is 5 minutes. We want to change ceilometer's granularity to one minute. Here's our workaround.

1. Edit pipeline.yaml. Since gnocchi ships a predefined policy with a granularity of 1 second, we change "gnocchi://" to "gnocchi://?archive_policy=high"
2. Edit ceilometer.conf & polling.yaml, changing both evaluation_interval (ceilometer.conf) and interval (polling.yaml) to 60.
3. Restart all ceilometer services.

We tested this workaround but found that it does not always work, and that the first detection is only triggered on the hour ("o'clock") when the auto scaling is first created.

If we want to shrink the granularity, is our workaround correct?
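For reference, the workaround above boils down to edits along these lines (a sketch only -- the file locations, source/sink names and meter list are illustrative, not taken from the actual deployment):

    # /etc/ceilometer/polling.yaml -- poll every 60 seconds instead of the default 300
    sources:
        - name: some_pollsters
          interval: 60
          meters:
              - cpu
              - memory.usage

    # /etc/ceilometer/pipeline.yaml -- publish to a finer-grained Gnocchi archive policy
    sinks:
        - name: meter_sink
          publishers:
              - gnocchi://?archive_policy=high

Note that the predefined "high" policy keeps points at 1-second granularity, far finer than a 60-second polling interval, so a custom archive policy with a 60-second granularity referenced from the publisher URL is a possible alternative.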
And how is the correct way to do this if not? Many thanks, Eddie. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbooth at redhat.com Fri Oct 25 08:30:01 2019 From: mbooth at redhat.com (Matthew Booth) Date: Fri, 25 Oct 2019 09:30:01 +0100 Subject: [nova][ptg] Virtual PTG In-Reply-To: <4254ccd8-88ca-b21d-29b6-ab4e427f3ee4@fried.cc> References: <4254ccd8-88ca-b21d-29b6-ab4e427f3ee4@fried.cc> Message-ID: On Thu, 24 Oct 2019 at 23:32, Eric Fried wrote: > > Hello nova contributors and other stakeholders. > > As you are aware, nova maintainers will be sparser than usual at the > ussuri PTG. For that reason, and also because it promotes better > inclusion anyway, I'd like us to do the majority of decision making via > the mailing list. The PTG is still a useful place to talk through design > ideas, but this will give those not attending a voice in the final > direction. > > To that end, I call your attention to the etherpad [1]. As usual, list > your topics there. And if your topic is something for which you only > need (or wish to start with) in-person discussions (e.g. "I'd like to do > $thing but could use some help figuring out $how"), you're done. > > But if what you're shooting for is discussion leading to some kind of > decision, like... > > - My spec has been stalled because we can't decide among N different > approaches; we need to reach a consensus. > - My feature is really important; can we please prioritize it for ussuri? > > ...then in addition to putting your topic on the etherpad, please > initiate a (separate) thread on this mailing list, including [nova][ptg] > in your subject line. Thanks for doing this, Eric! Could I also ask that we link ML threads in the etherpad as well just to keep everything tied together? Thanks, Matt > [1] https://etherpad.openstack.org/p/nova-shanghai-ptg -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From luka.peschke at objectif-libre.com Fri Oct 25 08:54:50 2019 From: luka.peschke at objectif-libre.com (Luka Peschke) Date: Fri, 25 Oct 2019 10:54:50 +0200 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: <84ea879acf9104061363c35a9e1dba0b@objectif-libre.com> Hi, We really appreciate the help! I've submitted a change adding +2/+1 permissions for horizon-core on cloudkitty-dashboard [1]. Cheers, [1] https://review.opendev.org/#/c/691263/ -- Luka Peschke (peschk_l) Le 2019-10-23 14:41, Ivan Kolodyazhny a écrit : > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. > Unfortunately, not all of them are in active development due to the > lack of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our > best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We: > raised this topic during the last Horizon weekly meeting [2] and we'll > have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon > team is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for > the horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the > heat-dashboard-core team. Also, we've got +2 on other plugins via > global project config [4] and via Gerrit configuration for > (neutron-*aas-dashboard, tuskar-ui). 
> > Vitrage PTL agreed to do the same for vitrage-dashboard during the > last meeting [5]. > > Of course, it's up to each project to maintain horizon plugins and > it's responsibilities but I would like to raise this topic to the TC > too. I really sure, that it will speed up some critical fixes for > Horizon plugins and makes users and operators experience better. > > [1] > https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] > http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] > http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From rico.lin.guanyu at gmail.com Fri Oct 25 09:32:41 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 25 Oct 2019 17:32:41 +0800 Subject: [tc] Weekly update Message-ID: Hello TC members, Here's some update for this week ** For summit update* - No launch time presentation due to launch place limitation - Joint Leadership Meeting will take place on Sunday [1] - TC Dinner update: Dinner is scheduled at 8:00 pm on Wednesday. More detail can be found in [2]. - Remember we now have two meet project leaders events settled during Summit - Monday: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24417/meet-the-project-leaders - Wednesday: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24426/meet-the-project-leaders - Right now, we already have some PTG session proposed in [2], please help to prepare it before PTG. And of course we still allow new suggestions, but be careful and make sure all members notice it. - PTG team photo will take place on Friday 11:50-12:00 ** We now have goal champions [5]:* - tosky for zuulv3 migration - diablo_rojo for `Project Specific New Contributor & PTL Docs` ** Reference to Airship TC confirmation feedbacks* - Airship is official now! here's what feedbacks OpenStack TCs provides [3] ** drop python 2.7 * - Discussion for drop python 2.7 in etherpad[7] and ML[4]. There's a lot of discussion happened in [6]. - Please review the schedule mentioned in [7]. - Also, action required for TC members as TC liaison for projects. ** Release naming status:* As mentioned in [8], we need TC members to vote on roll call. And by not give any vote means you're fine to let others decide if any of them are good. Thank you, everyone! And for who will come to Summit! See you in Shanghai:) [1] https://wiki.openstack.org/wiki/Governance/Foundation/3November2019BoardMeeting [2] https://etherpad.openstack.org/p/PVG-TC-PTG [3] https://etherpad.openstack.org/p/openstack-tc-airship-confirmation-feedback [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html [5] https://etherpad.openstack.org/p/PVG-u-series-goals [6] http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2019-10-24.log.html [7] https://etherpad.openstack.org/p/drop-python2-support [8] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010106.html Regards, JP & Rico -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balazs.gibizer at est.tech Fri Oct 25 10:17:13 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Fri, 25 Oct 2019 10:17:13 +0000 Subject: [nova][ptg] Continue QoS port support Message-ID: <1571998629.31348.8@est.tech> Hi Novas! Here is my summarized plans for Ussuri about continuing the work on supporting qos neutron ports (those that has bandwidth resource request) in nova. What is missing: * Tempest coverage is missing for migrate and resize support that is merged in Train. This work is already underway and bugs has been caught [1][2] * Support evacuate, live migrate, unshelve. The work is described in [3][4] and the first set of patches for the evacuation support is up for review [5] * Support for cross cell resize with qos port needs some work. Matt prepared the cross cell resize code already in a way that no new RPC change will be needed [6] and I have a plan what to do [7]. * InstancePCIRequest persists parent_ifname during migration but the such change is not rolled back if the migration fails. This is ugly but I think it does not cause any issues [8]. I will look into this to remove the ugliness. The bandwidth support for the nova-manage heal_allocation tool was merged in Train. Originally I planned to backport that to Stein but that patch grown so big and incorporated may refactors along the way that I'm not sure any more that it is reasonable to backport it. I'm now thinking about keeping it as-is and suggesting operators to install Train nova in a virtualenv to run heal allocations for bandwidth aware servers if needed in Stein. I do have to run some manual tests to see if it actually works. Any feedback is welcome! cheers, gibi [1] https://bugs.launchpad.net/nova/+bug/1849695 [2] https://bugs.launchpad.net/nova/+bug/1849657 [3] https://blueprints.launchpad.net/nova/+spec/support-move-ops-with-qos-ports-ussuri [4] https://specs.openstack.org/openstack/nova-specs/specs/ussuri/approved/support-move-ops-with-qos-ports-ussuri.html [5] https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/support-move-ops-with-qos-ports-ussuri [6] https://review.opendev.org/#/c/635080/43/nova/compute/manager.py at 5375 [7] https://review.opendev.org/#/c/633293/49/nova/compute/manager.py at 4742 [8] https://review.opendev.org/#/c/688387/6/nova/compute/manager.py at 3404 From arnaud.morin at gmail.com Fri Oct 25 10:58:40 2019 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Fri, 25 Oct 2019 10:58:40 +0000 Subject: [ops] nova wsgi config In-Reply-To: <9cbaefe7-fcd0-8c5f-3c6a-b2cda278e01a@debian.org> References: <20191022101943.GG14827@sync> <659657f1-89ba-63b6-f2dc-6d8c42430d08@goirand.fr> <20191024090645.GH14827@sync> <9cbaefe7-fcd0-8c5f-3c6a-b2cda278e01a@debian.org> Message-ID: <20191025105840.GK14827@sync> That what I figured out after writing my previous mail. Anyway, thanks for your help! See you in Shanghai. -- Arnaud Morin On 25.10.19 - 01:15, Thomas Goirand wrote: > On 10/24/19 11:06 AM, Arnaud Morin wrote: > > Hey Thomas, > > > > Thank you for your example. > > If I understand well, you are using 4 processes in the uwsgi config. > > I dont see any number of thread, does it mean the uwsgi is not spawning > > threads but only processes? ( so there is only 1 thread per process?) > > > > Thanks, > > Hi Arnaud, > > If you carefully read the notes for nova, they are saying that we should > leave the number of thread to 1, otherwise there may be some eventlet > reconnection to rabbit issues. 
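To make the advice above concrete, a uWSGI configuration for the Nova API along these lines keeps a single thread per worker while scaling out with processes (a sketch only -- the wsgi-file path, socket and process count are examples, not taken from this thread):

    [uwsgi]
    # scale with processes, not threads (threads > 1 can trigger the
    # eventlet/RabbitMQ reconnection problems mentioned above)
    processes = 4
    threads = 1
    master = true
    # Nova's WSGI entry point; the installed path may differ per distro
    wsgi-file = /usr/bin/nova-api-wsgi
    http-socket = 127.0.0.1:8774
    plugins = python3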
> > It's however fine to increase the number of processes. > > Cheers, > > Thomas Goirand (zigo) > From mriedemos at gmail.com Fri Oct 25 13:35:46 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 25 Oct 2019 08:35:46 -0500 Subject: [nova][ptg] Continue QoS port support In-Reply-To: <1571998629.31348.8@est.tech> References: <1571998629.31348.8@est.tech> Message-ID: On 10/25/2019 5:17 AM, Balázs Gibizer wrote: > The bandwidth support for the nova-manage heal_allocation tool was > merged in Train. Originally I planned to backport that to Stein but > that patch grown so big and incorporated may refactors along the way > that I'm not sure any more that it is reasonable to backport it. I'm > now thinking about keeping it as-is and suggesting operators to install > Train nova in a virtualenv to run heal allocations for bandwidth aware > servers if needed in Stein. I think that's reasonable. Trying to backport that to stein would be a challenge, both in you doing it and stable cores reviewing it. -- Thanks, Matt From sean.mcginnis at gmx.com Fri Oct 25 16:48:29 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 25 Oct 2019 11:48:29 -0500 Subject: [ptl][release] Re: [stable][EM] Extended Maintenance - Queens In-Reply-To: <20191024171238.GA25079@sm-workstation> References: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> <20191017203152.GA828@sm-workstation> <20191024171238.GA25079@sm-workstation> Message-ID: <20191025164829.GB29562@sm-workstation> On Thu, Oct 24, 2019 at 12:12:38PM -0500, Sean McGinnis wrote: > One final last call for stable/queens. Tomorrow I will be proposing patches to > mark all deliverables as Extended Maintenance by adding a queens-em tag. After > this point, there will be no more official releases for any queens > deliverables. > Patches have now been proposed: https://review.opendev.org/#/q/status:open+project:openstack/releases+branch:master+topic:queens-em If PTLs and/or release liaisons can leave +1, that can help us know which ones we can approve right away to start to clear the queue. Otherwise, for ones with no ack, we will wait until next week and take the silence as implicit approval. We do have a few teams that needed a little more time to wrap up some final things to leave the last release in a good state. We can hold off on a few a little bit longer if they are important. Otherwise, just another reminder, downstream consumers will be able to pick up committed changes during the Extended Maintenance period. They just won't be included in an official community release. Big thanks to Elod for helping get the ball rolling on this and getting reviewers added to the patches! Thanks! 
Sean From gmann at ghanshyammann.com Fri Oct 25 17:15:14 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 25 Oct 2019 12:15:14 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <20191024215755.ckk42r4qqs4ismlc@yuggoth.org> <16dfffb9772.c7fc65eb179764.2147827441212779322@ghanshyammann.com> Message-ID: <16e03ea4452.d698dfc8211156.39695754352130827@ghanshyammann.com> ---- On Fri, 25 Oct 2019 01:38:32 -0500 Goutham Pacha Ravi wrote ---- > > > On Thu, Oct 24, 2019 at 3:58 PM Ghanshyam Mann wrote: > ---- On Thu, 24 Oct 2019 16:57:55 -0500 Jeremy Stanley wrote ---- > > On 2019-10-24 14:32:03 -0500 (-0500), Ghanshyam Mann wrote: > > [...] > > > - Projects can start dropping the py2.7 support. Common lib and > > > testing tools need to wait until milestone-2. > > [...] > > > > This doesn't match the intent behind what I originally suggested nor > > my subsequent interpretation of what we discussed, and unfortunately > > the plan on the etherpad is slightly vague here too. I thought what > > we were agreeing to was that leaf projects (services and the like) > > had *until* milestone 1 (~2019-12-12) to remove Python 2.7 testing > > if they depend on shared libraries which are planning to remove > > support for it in Ussuri. From then until milestone 2 (~2020-02-13) > > shared libraries could work on dropping support for Python 2.7. If > > libs are allowed to drop support for it *after* milestone 2 then > > that doesn't leave much time before they're released at milestone 3 > > to stabilize or reverse course. > > > > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > > ** Project to start dropping the py2 support along with all the > > > py2 CI jobs. > > > > This is a milestone later than I expected, unless you mean they > > should be done by this point. It's just about removing jobs, so > > projects should be on the ball and do this quickly. > > > > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > > ** This includes Oslo, QA tools (or any other testing tools), > > > common lib (os-brick), Client library. > > > ** This will give enough time to projects to drop the py2 > > > support. > > > > This leaves less than 2 months where libraries are allowed to > > complete the necessary work before they get released for Ussuri > > (remember the final release for libraries is at R-6, the week before > > milestone 3). > > > > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > > ** Final audit on Phase-1 and Phase-2 plan and make sure > > > everything is done without breaking anything. This is enough > > > time to measure such break or anything extra to do before ussuri > > > final release. > > [...] > > > > Libraries are released the week before this, so no, that doesn't > > really provide any auditing opportunity. > > Sorry for the confusion in the schedule. Below one is what I meant. > > Phase-1: Now -> Ussuri-1 milestone (deadline R-22 ) > ** Project to dropping the py2 support along with all the py2 CI jobs. > > Phase-2: milestone-1 -> milestone-2 ( deadline R-13 ) > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > > Phase-3: at milestone-2 > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. 
> This is enough time to measure such a break or anything extra to do before Ussuri final release. > > > > Awesome, thank you for stating this very clearly. For OpenStack manila, I've lined up the patches like you've indicated: > Phase-1: Now -> Ussuri-1 milestone (deadline R-22 )** openstack/manila will drop support for python2.7: https://review.opendev.org/#/c/691134/ > Phase-2: milestone-1 -> milestone-2 ( deadline R-13 )** Client projects and tempest plugin will drop support for python2.7: https://review.opendev.org/#/c/691183/ (openstack/python-manilaclient) https://review.opendev.org/#/c/691186/ (openstack/manila-tempest-plugin) https://review.opendev.org/#/c/691184/ (openstack/manila-ui) > Phase-3: at milestone-2** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > I'm using the Gerrit topic "drop-py2" to track these. Perfect things Gautham. I have drafted this as a goal also - https://review.opendev.org/#/c/691178/ -gmann > -gmann > > > > > I apologize, in retrospect I realize that the distinction between > > "by" and "at" in English could be too subtle for a lot of folks to > > pick up on, and I should have been more explicit in my original > > proposal. > > -- > > Jeremy Stanley > > > > > From ignaziocassano at gmail.com Fri Oct 25 19:08:27 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 25 Oct 2019 21:08:27 +0200 Subject: [heat] software deployment doen not work on queens (timed out) Message-ID: Hello All, I created a simple heat stack on queens with SoftwareDeployment but it does not terminate because timed out. The following is my simple stack template: heat_template_version: queens parameters: key_name: type: string default: opstkcsi flavor: type: string default: m1.small image: type: string default: centos7 resources: config: type: OS::Heat::SoftwareConfig properties: inputs: - name: previous default: 'NONE' group: script config: | #!/bin/bash echo "Previous: $previous" echo "${deploy_resource_name} is running on $(hostname) at $(date)" deployment_a: type: OS::Heat::SoftwareDeployment properties: config: get_resource: config server: get_resource: server_a deployment_b: type: OS::Heat::SoftwareDeployment properties: input_values: previous: get_attr: [deployment_a, deploy_stdout] config: get_resource: config server: get_resource: server_b deployment_c: type: OS::Heat::SoftwareDeployment depends_on: deployment_b properties: input_values: previous: 'deployment_b' config: get_resource: config server: get_resource: server_a server_a: type: OS::Nova::Server properties: flavor: get_param: flavor networks: - network: "565" key_name: get_param: key_name block_device_mapping: [{ device_name: "vda", volume_id : { get_resource : volume1 }, delete_on_termination : "false" }] user_data_format: SOFTWARE_CONFIG server_b: type: OS::Nova::Server properties: flavor: get_param: flavor networks: - network: "565" key_name: get_param: key_name block_device_mapping: [{ device_name: "vda", volume_id : { get_resource : volume2 }, delete_on_termination : "false" }] user_data_format: SOFTWARE_CONFIG volume1: type: OS::Cinder::Volume properties: name: "Server-RootDisk" image: { get_param: image } size: 20 volume2: type: OS::Cinder::Volume properties: name: "Server-RootDisk" image: { get_param: image } size: 20 outputs: deployment_a_stdout: value: get_attr: [deployment_a, deploy_stdout] deployment_b_stdout: value: get_attr: [deployment_b, deploy_stdout] deployment_c_stdout: value: get_attr: [deployment_c, deploy_stdout] It works on 
ocata but timed out on queens. I connected on the first virtual machine for debugging and I launched: ystemctl stop os-collect-config.service and then: sudo os-collect-config --force --one-time --debug It reports the following: HTTPConnectionPool(host='10.102.184.83', port=8000): Read timed out. (read timeout=10.0) Source [cfn] Unavailable. /var/lib/os-collect-config/local-data not found. Skipping [2019-10-25 21:02:31,948] (os-refresh-config) [INFO] Starting phase pre-configure dib-run-parts ven 25 ott 2019, 21.02.31, CEST ----------------------- PROFILING ----------------------- dib-run-parts ven 25 ott 2019, 21.02.31, CEST dib-run-parts ven 25 ott 2019, 21.02.31, CEST Target: pre-configure.d dib-run-parts ven 25 ott 2019, 21.02.31, CEST dib-run-parts ven 25 ott 2019, 21.02.31, CEST Script Seconds dib-run-parts ven 25 ott 2019, 21.02.31, CEST --------------------------------------- ---------- dib-run-parts ven 25 ott 2019, 21.02.31, CEST dib-run-parts ven 25 ott 2019, 21.02.31, CEST dib-run-parts ven 25 ott 2019, 21.02.31, CEST --------------------- END PROFILING --------------------- [2019-10-25 21:02:31,987] (os-refresh-config) [INFO] Completed phase pre-configure [2019-10-25 21:02:31,987] (os-refresh-config) [INFO] Starting phase configure dib-run-parts ven 25 ott 2019, 21.02.32, CEST Running /usr/libexec/os-refresh-config/configure.d/20-os-apply-config [2019/10/25 09:02:32 PM] [INFO] writing /var/run/heat-config/heat-config [2019/10/25 09:02:32 PM] [INFO] writing /etc/os-collect-config.conf [2019/10/25 09:02:32 PM] [INFO] success dib-run-parts ven 25 ott 2019, 21.02.32, CEST 20-os-apply-config completed dib-run-parts ven 25 ott 2019, 21.02.32, CEST Running /usr/libexec/os-refresh-config/configure.d/55-heat-config dib-run-parts ven 25 ott 2019, 21.02.32, CEST 55-heat-config completed dib-run-parts ven 25 ott 2019, 21.02.32, CEST ----------------------- PROFILING ----------------------- dib-run-parts ven 25 ott 2019, 21.02.32, CEST dib-run-parts ven 25 ott 2019, 21.02.32, CEST Target: configure.d dib-run-parts ven 25 ott 2019, 21.02.32, CEST dib-run-parts ven 25 ott 2019, 21.02.32, CEST Script Seconds dib-run-parts ven 25 ott 2019, 21.02.32, CEST --------------------------------------- ---------- dib-run-parts ven 25 ott 2019, 21.02.32, CEST dib-run-parts Fri Oct 25 21:02:32 CEST 2019 20-os-apply-config 0.316 dib-run-parts Fri Oct 25 21:02:32 CEST 2019 55-heat-config 0.165 dib-run-parts ven 25 ott 2019, 21.02.32, CEST dib-run-parts ven 25 ott 2019, 21.02.32, CEST --------------------- END PROFILING --------------------- [2019-10-25 21:02:32,530] (os-refresh-config) [INFO] Completed phase configure [2019-10-25 21:02:32,530] (os-refresh-config) [INFO] Starting phase post-configure dib-run-parts ven 25 ott 2019, 21.02.32, CEST Running /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed ++ os-apply-config --key completion-handle --type raw --key-default '' + HANDLE= ++ os-apply-config --key completion-signal --type raw --key-default '' + SIGNAL= ++ os-apply-config --key instance-id --type raw --key-default '' + ID=i-00000049 + '[' -n i-00000049 ']' + '[' -n '' ']' + '[' -n '' ']' ++ os-apply-config --key deployments --type raw --key-default '' ++ jq -r 'map(select(.group == "os-apply-config") | select(.inputs[].name == "deploy_signal_id") | .id + (.inputs | map(select(.name == "deploy_signal_id")) | .[].value)) | .[]' + DEPLOYMENTS= + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed + '[' '!' 
-d /var/lib/os-apply-config-deployments/deployed ']' dib-run-parts ven 25 ott 2019, 21.02.33, CEST 99-refresh-completed completed dib-run-parts ven 25 ott 2019, 21.02.33, CEST ----------------------- PROFILING ----------------------- dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST Target: post-configure.d dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST Script Seconds dib-run-parts ven 25 ott 2019, 21.02.33, CEST --------------------------------------- ---------- dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts Fri Oct 25 21:02:33 CEST 2019 99-refresh-completed 1.231 dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST --------------------- END PROFILING --------------------- [2019-10-25 21:02:33,811] (os-refresh-config) [INFO] Completed phase post-configure [2019-10-25 21:02:33,811] (os-refresh-config) [INFO] Starting phase migration dib-run-parts ven 25 ott 2019, 21.02.33, CEST ----------------------- PROFILING ----------------------- dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST Target: migration.d dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST Script Seconds dib-run-parts ven 25 ott 2019, 21.02.33, CEST --------------------------------------- ---------- dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST dib-run-parts ven 25 ott 2019, 21.02.33, CEST --------------------- END PROFILING --------------------- [2019-10-25 21:02:33,848] (os-refresh-config) [INFO] Completed phase migration The first line: HTTPConnectionPool(host='10.102.184.83', port=8000): Read timed out. (read timeout=10.0) Source [cfn] Unavailable. It is strange: 10.102.184.83 is the vip public endpoint reported in my heat.conf: heat_metadata_server_url = http://10.102.184.83:8000 heat_waitcondition_server_url = http://10.102.184.83:8000/v1/waitcondition The virtual machine can contact the above addres on port 8000. Please, anyone can help me ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Fri Oct 25 19:26:48 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 25 Oct 2019 12:26:48 -0700 Subject: [tc][all] Ussuri community goal candidate 1: 'Project Specific New Contributor & PTL Docs' In-Reply-To: <16df407dbf8.11d25468592036.8156563932102242889@ghanshyammann.com> References: <16df407dbf8.11d25468592036.8156563932102242889@ghanshyammann.com> Message-ID: <9e27d91a-6974-32f6-dd81-ddb7c8eee0eb@gmail.com> On 10/22/19 08:13, Ghanshyam Mann wrote: > Hello Everyone, > > We are starting the next step for the Ussuri Cycle Community Goals. We have four candidates till now as proposed in > etherpad[1]. > > The first candidate is "Project Specific New Contributor & PTL Docs". Kendall (diablo_rojo) volunteered to lead this goal > as Champion. Thanks to her for stepping up for this job. > > This idea was brought up during Train cycle goal discussions also[2]. The idea here is to have a consistent and mandatory > contributors guide in each project which will help new contributors to get onboard in upstream activities. > Also, create PTL duties guide on the project's side. Few projects might have the PTL duties documented and making it > consistent and for all projects is something easy for transferring the knowledge. > Kendall can put up more details and highlights based on queries. 
> > We would like to open this idea to get wider feedback from the community and projects team before we start defining > the goal in Gerrit. What do you think of this as a community goal? Any query or Improvement Feedback? I wrote what I call a "chronological PTL guide" for nova [3], which was originally a local google doc I had used while I was PTL. I converted and published it as an in-tree nova doc in case it would be helpful to others. I think having a PTL guide doc is nice and could potentially make it easier for new or prospective PTLs to learn what is involved in the duties. -melanie [3] https://docs.openstack.org/nova/latest/contributor/ptl-guide.html > [1] https://etherpad.openstack.org/p/PVG-u-series-goals > [2] https://etherpad.openstack.org/p/BER-t-series-goals > > -gmann & diablo_rojo > > From mnaser at vexxhost.com Sun Oct 27 23:48:48 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Sun, 27 Oct 2019 19:48:48 -0400 Subject: [openstack-ansible] core updates In-Reply-To: References: <19363181571935208@iva6-161d47f95e63.qloud-c.yandex.net> <790963f1-8ecf-ec1b-4f69-1983a79c22b4@rd.bbc.co.uk> Message-ID: Due to no objections, I've added both to the core team. Welcome Georgina & James! :) On Thu, Oct 24, 2019 at 11:24 PM Logan V. wrote: > > ++ Welcome! > > On Thu, Oct 24, 2019 at 4:49 PM Jonathan Rosser wrote: >> >> + to both from me, good stuff. >> >> On 24/10/2019 17:40, Dmitriy Rabotyagov wrote: >> > Great news! Welcome folks! >> > >> > 24.10.2019, 16:50, "Mohammed Naser" : >> >> Hi everyone, >> >> >> >> I'd like to propose the addition of the following 2 new core members: >> >> >> >> - Georgina Shippey (BBC R&D, committed continuous contributor and operator) >> >> - James Denton (long time contributor, extremely knowledgeable in OSA) >> >> >> >> If no one opposes to this, I will be adding them to our core list shortly. >> >> >> >> Thanks, >> >> Mohammed >> >> >> >> -- >> >> Mohammed Naser — vexxhost >> >> ----------------------------------------------------- >> >> D. 514-316-8872 >> >> D. 800-910-1726 ext. 200 >> >> E. mnaser at vexxhost.com >> >> W. https://vexxhost.com >> > >> > -- >> > Kind Regards, >> > Dmitriy Rabotyagov >> > >> > >> > >> -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. https://vexxhost.com From dougal at redhat.com Mon Oct 28 09:12:53 2019 From: dougal at redhat.com (Dougal Matthews) Date: Mon, 28 Oct 2019 09:12:53 +0000 Subject: [tripleo] Stable policy and tripleoclient stdout Message-ID: Hey all, As some of you will know there is work going on to replace Mistral with Ansible. There is an initial spec[1] and a few in-flight patches here to create the initial playbooks. In doing this work, I hit an issue I wanted to get some input in. first the tl;dr - what is our policy on changing the output from tripleoclient? It is proving difficult to keep it the same and in some cases changes will be required (we won't have Mistral execution IDs to print for example). As a quick reminder, in the current code we print from tripleoclient and Mistral sends messages via Zaqar to tripleoclient (which it then prints). This allows for "real time" updates from workflows. For example, introspection will print updates as introspection of nodes is completed. With Ansible it is tricky to have this same result. I can think of three options; 1. 
We run Ansible in the background, essentially hiding it and then polling OpenStack services to look for the expected state changes. We can then display this to the user. 2. We go for a "ansible native" approach and stream the ansible output to the user. This will be familiar to anyone familiar with Ansible but it will mean the output completely changes. This is also the easiest option (from an implementation point of view) 3. I have not tested this idea, but I think we could have a custom Ansible module that writes messages to a tempfile. tripleoclient could then consume them and display them to the user. This would be similar to idea 1, but rather than polling services we constantly read a file and display those "messages" from ansible. This would be closest to the current Mistral and Zaqar solution. Personally I am a bit torn. One of the reasons we want to use Ansible is because developers/users are more familiar with debugging it. However, if we hide ansible that might not help very much. So I think in the long run option 2 might be best. However, that completely changes the output, it limits us to what Ansible can output and frankly, in my experience, Ansible output is ugly and often hard to read. I am curious to know what y'all think and hopefully there are some other options too. Thanks, Dougal [1] https://review.opendev.org/#/c/679272/ [2] https://review.opendev.org/#/q/status:open+topic:mistral_to_ansible -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Mon Oct 28 13:22:32 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 28 Oct 2019 09:22:32 -0400 Subject: [cinder][ops][extended-maintenance-sig][public-cloud-sig][enterprise-wg] request for info about stable branch usage Message-ID: Cinder currently has nine (9) branches open, which is a lot. (See [0] for a graphical representation.) We'd like to close some. In release chronological order, the eight stable branches are: 1. driverfixes/mitaka 2. driverfixes/newton The above pre-date the OpenStack-wide extended maintenance phase [1] for stable branches; they were created after mitaka and newton went EOL so that there'd be a place where operators could get backported fixes for Cinder drivers only. 3. stable/ocata 4. stable/pike 5. stable/queens The above are currently in the extended maintenance phase. Vulnerability Management is done on a reasonable effort basis only, and there's no official statement of level of testing (could be extremely limited). This phase also means that there are no longer any official releases of these branches. So like driverfixes, any bug fixes merged must be picked up and applied by someone downstream. 6. stable/rocky 7. stable/stein 8. stable/train The above are in the 'maintained' phase [2] and not really the subject of this email (they're just listed for completeness). Ok, so here's what the Cinder team would like to know. 1. We're assuming that no one is using driverfixes/mitaka anymore and would like to kill it. Are we correct? 2. Is anyone still relying upon driverfixes/newton? It contains 98 commits after the 9.1.4 tag (the final Cinder Newton release), so it's been getting a lot of love from the team. This is a key issue, because if we keep this branch open, the OpenStack stable branch "appropriate fixes" policy (point #4) implies that we must keep stable ocata, pike, and queens open as well. 3. As far as the Extended Maintenance branches go, it seems that ocata and pike could go EOL. 
At the very least, we may need to reduce the gate coverage on stable/ocata as there are rumors that it doesn't play nice with Zuul v3. Is anyone relying on stable/ocata or stable/pike ? Thank you for your attention to this matter. I should mention that the Cinder team doesn't mind keeping branches open (as you can see by the number of commits into driverfixes/newton), we just want to make sure they're being used so we can allocate our resources efficiently. It would be really helpful if you could respond before November 1 (that is, before the end of this week) so we can make a decision at the PTG. [0] https://launchpad.net/cinder/+series [1] https://docs.openstack.org/project-team-guide/stable-branches.html#extended-maintenance [2] https://docs.openstack.org/project-team-guide/stable-branches.html#maintained [3] https://docs.openstack.org/project-team-guide/stable-branches.html#appropriate-fixes From whayutin at redhat.com Mon Oct 28 14:42:10 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Mon, 28 Oct 2019 08:42:10 -0600 Subject: [tripleo] tripleo ptg schedule Message-ID: Greetings, First pass at the TripleO ptg schedule can be found here [1]. Thank you!! [1] https://etherpad.openstack.org/p/tripleo-ptg-ussuri -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Mon Oct 28 14:49:24 2019 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 28 Oct 2019 08:49:24 -0600 Subject: [tripleo] Stable policy and tripleoclient stdout In-Reply-To: References: Message-ID: On Mon, Oct 28, 2019 at 3:18 AM Dougal Matthews wrote: > Hey all, > > As some of you will know there is work going on to replace Mistral with > Ansible. There is an initial spec[1] and a few in-flight patches here to > create the initial playbooks. In doing this work, I hit an issue I wanted > to get some input in. > > first the tl;dr - what is our policy on changing the output from > tripleoclient? It is proving difficult to keep it the same and in some > cases changes will be required (we won't have Mistral execution IDs to > print for example). > > I don't think for the deployment commands that we have any expectation of a specific format. There was a large change during the Rocky timeframe when we switched over to the ansible driven deployment. That being said there have been requests to improve the output logging for the deployment as we currently use print rather than the proper logging module. This means that --log-file doesn't actually capture any of the deployment output (not ideal). > As a quick reminder, in the current code we print from tripleoclient and > Mistral sends messages via Zaqar to tripleoclient (which it then prints). > This allows for "real time" updates from workflows. For example, > introspection will print updates as introspection of nodes is completed. > With Ansible it is tricky to have this same result. > > I can think of three options; > > 1. We run Ansible in the background, essentially hiding it and then > polling OpenStack services to look for the expected state changes. We can > then display this to the user. > Let's not do this since the output is already presented to the user for the undercloud/overcloud today. > > 2. We go for a "ansible native" approach and stream the ansible output to > the user. This will be familiar to anyone familiar with Ansible but it will > mean the output completely changes. 
This is also the easiest option (from > an implementation point of view) > > We already do this for the `openstack tripleo deploy` command so I would use that as it's likely the simplest and more closely resembles what we have today. > 3. I have not tested this idea, but I think we could have a custom Ansible > module that writes messages to a tempfile. tripleoclient could then consume > them and display them to the user. This would be similar to idea 1, but > rather than polling services we constantly read a file and display those > "messages" from ansible. This would be closest to the current Mistral and > Zaqar solution. > > Sounds overly complex and prone to failures. > > Personally I am a bit torn. One of the reasons we want to use Ansible is > because developers/users are more familiar with debugging it. However, if > we hide ansible that might not help very much. So I think in the long run > option 2 might be best. However, that completely changes the output, it > limits us to what Ansible can output and frankly, in my experience, Ansible > output is ugly and often hard to read. > > I am curious to know what y'all think and hopefully there are some other > options too. > > Thanks, > Dougal > > > [1] https://review.opendev.org/#/c/679272/ > [2] https://review.opendev.org/#/q/status:open+topic:mistral_to_ansible > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Oct 28 15:08:58 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 28 Oct 2019 16:08:58 +0100 Subject: [ptg] ptgbot HOWTO Message-ID: <8c97aa34-17bc-4dc3-942c-2fbc73595717@openstack.org> Hi everyone, In a few days, some contributor teams will meet in Shanghai for our 6th Project Teams Gathering. The event is organized around separate 'tracks' (generally tied to a specific team/group). Topics of discussion are loosely scheduled in those tracks, based on the needs of the attendance. This allows to maximize attendee productivity, but the downside is that it can make the event a bit confusing to navigate. To mitigate that issue, we are using an IRC bot to expose what's happening currently at the event at the following page: http://ptg.openstack.org/ptg.html It is therefore useful to have a volunteer in each room who makes use of the PTG bot to communicate what's happening. This is done by joining the #openstack-ptg IRC channel on Freenode and voicing commands to the bot. Usage of the bot is of course optional, but in past editions it was really useful to help attendees successfully navigate this dynamic event. How to keep attendees informed of what's being discussed in your room --------------------------------------------------------------------- To indicate what's currently being discussed, you will use the track name hashtag (found in the "Scheduled tracks" section on the above page), with the 'now' command: #TRACK now Example: #swift now brainstorming improvements to the ring You can also mention other track names to make sure to get people attention when the topic is transverse: #ops-meetup now discussing #cinder pain points There can only be one 'now' entry for a given track at a time. To indicate what will be discussed next, you can enter one or more 'next' commands: #TRACK next Example: #api-sig next at 2pm we'll be discussing pagination woes Note that in order to keep content current, entering a new 'now' command for a track will automatically erase any 'next' entry for that track. 
Finally, if you want to clear all 'now' and 'next' entries for your track, you can issue the 'clean' command: #TRACK clean Example: #ironic clean How to make your track etherpad easily found -------------------------------------------- We traditionally use an etherpad for each track, to plan and document the topics being discussed. In the past we used a wiki to list those etherpads, but now we publish a list of them at: http://ptg.openstack.org/etherpads.html PTGbot generates a default URL for your etherpad. If you already have one, you can update the URL by issuing the following command: #TRACK etherpad How to book reservable rooms ---------------------------- In Shanghai we will have some additional reservable space for extra un-scheduled discussions. The PTG bot page shows which track is allocated to which room, as well as available reservable space, with a slot code (room name - time slot) that you can use to issue a 'book' command to the PTG bot: #TRACK book Example: #release-management book Ocata-WedA1 Any track can book additional space and time using this system. If your topic of discussion does not fall into an existing track, it is easy to add a track on the fly. Just ask PTG bot admins (ttx, diablo_rojo...) on the channel to create a track for you (which they can do by getting op rights and issuing a ~add command). For more information on the bot commands, please see: https://opendev.org/openstack/ptgbot/src/branch/master/README.rst Let me know if you have any additional questions. -- Thierry Carrez (ttx) From openstack at nemebean.com Mon Oct 28 16:00:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 28 Oct 2019 11:00:35 -0500 Subject: [oslo] No meeting next week Message-ID: I and a number of other Oslo contributors will be in Shanghai next week, so we'll skip the meeting. It will resume on Nov. 11 (if I remember the time change...). Thanks. -Ben From openstack at nemebean.com Mon Oct 28 16:05:32 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 28 Oct 2019 11:05:32 -0500 Subject: [oslo] Please do not merge py27 removal patches yet Message-ID: <5862f8cb-ca07-a37b-6058-81e6c60905da@nemebean.com> This is mostly directed at the Oslo cores, some of whom have already been -1'ing these patches (thanks!). There have been some discussions in the tc channel, but I'm not sure we've actually talked about it within the team yet. The main thing is that Oslo needs to wait until all of our consumers have stopped testing py27 before we do. Otherwise we may accidentally merge a py27-incompatible change and break other projects' gate jobs. Some patches to drop py27 jobs have already been proposed, and in lieu of someone procedural -2'ing all of them I'm hoping we can just have an understanding that these can't merge yet. I'll send a followup when we're ready to proceed with the py27 removal work. Thanks. -Ben From johfulto at redhat.com Mon Oct 28 16:52:11 2019 From: johfulto at redhat.com (John Fulton) Date: Mon, 28 Oct 2019 12:52:11 -0400 Subject: [tripleo] Stable policy and tripleoclient stdout In-Reply-To: References: Message-ID: On Mon, Oct 28, 2019 at 10:52 AM Alex Schultz wrote: > On Mon, Oct 28, 2019 at 3:18 AM Dougal Matthews wrote: >> >> Hey all, >> >> As some of you will know there is work going on to replace Mistral with Ansible. There is an initial spec[1] and a few in-flight patches here to create the initial playbooks. In doing this work, I hit an issue I wanted to get some input in. 
>> >> first the tl;dr - what is our policy on changing the output from tripleoclient? It is proving difficult to keep it the same and in some cases changes will be required (we won't have Mistral execution IDs to print for example). >> > > I don't think for the deployment commands that we have any expectation of a specific format. There was a large change during the Rocky timeframe when we switched over to the ansible driven deployment. That being said there have been requests to improve the output logging for the deployment as we currently use print rather than the proper logging module. This means that --log-file doesn't actually capture any of the deployment output (not ideal). > >> >> As a quick reminder, in the current code we print from tripleoclient and Mistral sends messages via Zaqar to tripleoclient (which it then prints). This allows for "real time" updates from workflows. For example, introspection will print updates as introspection of nodes is completed. With Ansible it is tricky to have this same result. >> >> I can think of three options; >> >> 1. We run Ansible in the background, essentially hiding it and then polling OpenStack services to look for the expected state changes. We can then display this to the user. > > > Let's not do this since the output is already presented to the user for the undercloud/overcloud today. > >> >> >> 2. We go for a "ansible native" approach and stream the ansible output to the user. This will be familiar to anyone familiar with Ansible but it will mean the output completely changes. This is also the easiest option (from an implementation point of view) >> > > We already do this for the `openstack tripleo deploy` command so I would use that as it's likely the simplest and more closely resembles what we have today. +1 John > >> >> 3. I have not tested this idea, but I think we could have a custom Ansible module that writes messages to a tempfile. tripleoclient could then consume them and display them to the user. This would be similar to idea 1, but rather than polling services we constantly read a file and display those "messages" from ansible. This would be closest to the current Mistral and Zaqar solution. >> > > Sounds overly complex and prone to failures. > >> >> >> Personally I am a bit torn. One of the reasons we want to use Ansible is because developers/users are more familiar with debugging it. However, if we hide ansible that might not help very much. So I think in the long run option 2 might be best. However, that completely changes the output, it limits us to what Ansible can output and frankly, in my experience, Ansible output is ugly and often hard to read. >> >> I am curious to know what y'all think and hopefully there are some other options too. >> >> Thanks, >> Dougal >> >> >> [1] https://review.opendev.org/#/c/679272/ >> [2] https://review.opendev.org/#/q/status:open+topic:mistral_to_ansible From mdulko at redhat.com Mon Oct 28 17:04:29 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Mon, 28 Oct 2019 18:04:29 +0100 Subject: [Kuryr] Meeting move to office hours on #openstack-kuryr and 1500 UTC Message-ID: Hi, The attendance at the Kuryr meetings was very low recently, making most of them just my monologues (e.g. [1], [2], [3], [4]). Given that I made a decision to switch to office hours model on our #openstack-kuryr channel. If you have anything to discuss, just drop by and ping me (dulek) or ltomasbo on the channel. 
Moreover most of the team is based in Europe and have a conflict after last weekend DST change, so I'm moving the meeting to 1500 UTC. Also please note that next week's meeting falls during OpenInfra Summit that I'm attending, so we'll be skipping that meeting. The commit doing the changes is at [5]. Thanks, Michał [1] http://eavesdrop.openstack.org/meetings/kuryr/2019/kuryr.2019-10-28-14.33.log.html [2] http://eavesdrop.openstack.org/meetings/kuryr/2019/kuryr.2019-10-21-14.00.log.html [3] http://eavesdrop.openstack.org/meetings/kuryr/2019/kuryr.2019-09-23-14.04.log.html [4] http://eavesdrop.openstack.org/meetings/kuryr/2019/kuryr.2019-09-16-14.03.log.html [5] https://review.opendev.org/#/c/691703/ From haleyb.dev at gmail.com Mon Oct 28 17:33:04 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 28 Oct 2019 13:33:04 -0400 Subject: [neutron] Bug deputy report for week of October 25th Message-ID: Hi, I was Neutron bug deputy last week. Below is a short summary about reported bugs. -Brian Critical bugs ------------- None High bugs --------- * https://bugs.launchpad.net/neutron/+bug/1849463 - linuxbridge packet forwarding issue with vlan backed networks - Related to https://bugs.launchpad.net/os-vif/+bug/1837252 - Asked for Rodolfo to add more information from IRC discussion - Marked High simply due to reverence to CVE * https://bugs.launchpad.net/neutron/+bug/1849976 - "networkdhcpagentbindings.binding_index" autoincrement parameter is not working - Caused by https://review.opendev.org/#/c/288271/ - Somehow merged even though DB migration script was in wrong dir - Should it be reverted? - https://review.opendev.org/#/c/691641/ is alternative fix-up Medium bugs ----------- * https://bugs.launchpad.net/neutron/+bug/1849392 - Neutron - Stein - Having two external networks will prevent the router from being created - Could not reproduce, more information requested * https://bugs.launchpad.net/neutron/+bug/1849479 - neutron l2 to dhcp lost when migrating in stable/stein 14.0.2 - Might be related to 1849392 - Assigned to Slawek since triaging other bug might lead to answer * https://bugs.launchpad.net/neutron/+bug/1849449 - get_link_id() traceback in case of non-existing interface is misleading - https://review.opendev.org/#/c/690514/ * https://bugs.launchpad.net/neutron/+bug/1849676 - DHCP agents time out during startup at 60s when there is enough agents - https://review.opendev.org/#/c/682241/ is a possible fix? Low bugs -------- Wishlist bugs ------------- Invalid bugs ------------ Further triage required ----------------------- * https://bugs.launchpad.net/neutron/+bug/1849690 - designate not creating recordsets - Asked mlavalle to triage * https://bugs.launchpad.net/neutron/+bug/1849726 - Routed Networks DHCP agents getting ports in unconnected segments - Does look like a bug, triaging From colleen at gazlene.net Mon Oct 28 17:35:38 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 28 Oct 2019 10:35:38 -0700 Subject: [keystone] Upcoming team meetings and PTGs Message-ID: Hi team, Tomorrow, October 29, we are having our virtual pre-PTG meeting from 14:00-17:30 UTC in lieu of our regular 16:00 IRC meeting. Tuesday, November 5 some of us will be in Shanghai for the Forum and I will not be available to chair the meeting, so I propose we skip it. Tuesday, November 12, we will have our virtual post-PTG meeting from 15:00-17:30 UTC in lieu of our regular 16:00 IRC meeting. 
Details about the PTG sessions can be found on the planning etherpad: https://etherpad.openstack.org/p/keystone-shanghai-ptg Colleen From juliaashleykreger at gmail.com Mon Oct 28 19:58:53 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 28 Oct 2019 12:58:53 -0700 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: While I totally forgot to bring this up during the ironic meeting today, I was able to bring it up with the cores present shortly after the meeting. Everyone was good with granting this access. As such I've added horizon-core to ironic-ui-core. The only ask from the ironic team is that we try to have one core review from the ironic team as well. If we run into any visibility issues, please don't hesitate to reach out to the team in IRC. I hope this is just the beginning of a new age of cross-project collaboration in OpenStack! Thanks again! -Julia On Wed, Oct 23, 2019 at 11:14 AM Julia Kreger wrote: > > I believe this is totally reasonable and will raise it with the ironic > team during our next meeting. > > Thanks for bringing this up! > > -Julia > > On Wed, Oct 23, 2019 at 5:43 AM Ivan Kolodyazhny wrote: > > > > Hi team, > > > > As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. > > > > As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. > > > > That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. > > > > There are a lot of Horizon changes which affect plugins and horizon team is ready to help: > > - new Django versions > > - dependencies updates > > - Horizon API changes > > - etc. > > > > To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. > > > > We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). > > > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. > > > > > > Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. > > > > > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > > [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > > [3] https://etherpad.openstack.org/p/horizon-u-ptg > > [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > > [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > > > Regards, > > Ivan Kolodyazhny, > > http://blog.e0ne.info/ From mthode at mthode.org Mon Oct 28 20:08:50 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 28 Oct 2019 15:08:50 -0500 Subject: [requirements][taskflow] networkx bump is blocked on what looks to be a taskflow issue. Message-ID: <20191028200850.qgwvblgbr23pxgo3@mthode.org> I've made a test review so people can see and test against, but it looks like the bump from 2.3 to 2.4 is causing some issues. 
hits nova,neutron,octavia,glance,cinder -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From aj at suse.com Mon Oct 28 20:29:29 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 28 Oct 2019 21:29:29 +0100 Subject: [docs][oslo] docs-tools repos now fully part of oslo Message-ID: <13963ee5-877e-9608-38fd-79b25cb4ac8a@suse.com> In Juni, the following repos became part of Oslo project [1]: openstack/openstack-doc-tools openstack/os-api-ref openstack/openstackdocstheme openstack/whereto The ACL changes for these have been merged now [2] and IRC notifications happen now in #openstack-oslo [3]. The openstack-doc-tools launchpad project has also been assigned to the Oslo team [4]. Thanks Oslo team for adopting these tools. If there're any open topics or questions, feel free to ask Stephen Finucane or myself, Andreas References: [1] https://review.opendev.org/#/c/657141/ [2] https://review.opendev.org/670269 [3] https://review.opendev.org/670483 [4] https://launchpad.net/openstack-doc-tools -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From mriedemos at gmail.com Mon Oct 28 21:17:03 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 28 Oct 2019 16:17:03 -0500 Subject: [requirements][taskflow] networkx bump is blocked on what looks to be a taskflow issue. In-Reply-To: <20191028200850.qgwvblgbr23pxgo3@mthode.org> References: <20191028200850.qgwvblgbr23pxgo3@mthode.org> Message-ID: <6a2fba34-14e9-d71e-ed97-8e724d0b6b7a@gmail.com> On 10/28/2019 3:08 PM, Matthew Thode wrote: > I've made a test review so people can see and test against, but it looks > like the bump from 2.3 to 2.4 is causing some issues. > > hits nova,neutron,octavia,glance,cinder We already fixed [1] in nova but that was due to not using upper-constraints on transitive deps (nova pulls it in because of the powervm driver code that uses taskflow). I don't think we (nova) are in any rush to adopt the latest networkx/taskflow code anytime soon so the blocker is low priority for us. Having said all that, it looks like there is a patch to make taskflow work with networkx 2.4 [2]. [1] https://bugs.launchpad.net/nova/+bug/1848499 [2] https://review.opendev.org/#/c/689611/ -- Thanks, Matt From johnsomor at gmail.com Mon Oct 28 21:38:34 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Mon, 28 Oct 2019 14:38:34 -0700 Subject: [requirements][taskflow] networkx bump is blocked on what looks to be a taskflow issue. In-Reply-To: <6a2fba34-14e9-d71e-ed97-8e724d0b6b7a@gmail.com> References: <20191028200850.qgwvblgbr23pxgo3@mthode.org> <6a2fba34-14e9-d71e-ed97-8e724d0b6b7a@gmail.com> Message-ID: Yep, my Taskflow patch will need to merge for networkx 2.4 to be ok. Shameless review request: https://review.opendev.org/#/c/689611/ Michael On Mon, Oct 28, 2019 at 2:19 PM Matt Riedemann wrote: > > On 10/28/2019 3:08 PM, Matthew Thode wrote: > > I've made a test review so people can see and test against, but it looks > > like the bump from 2.3 to 2.4 is causing some issues. > > > > hits nova,neutron,octavia,glance,cinder > > We already fixed [1] in nova but that was due to not using > upper-constraints on transitive deps (nova pulls it in because of the > powervm driver code that uses taskflow). 
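Side note for anyone reproducing the failure locally: the breakage is the usual pattern for a networkx minor release -- APIs that were deprecated during the 2.x series (Graph.node and Graph.fresh_copy() among them, if memory serves) are finally removed in 2.4, which is the sort of thing the taskflow patch above has to work around. A tiny, hedged illustration of code that runs on both 2.3 and 2.4 follows; it is not the actual taskflow fix:

# Hedged illustration only: networkx 2.4 drops long-deprecated APIs, so code
# that still touches e.g. Graph.node imports fine but fails with
# AttributeError at runtime.
import networkx as nx

flow = nx.DiGraph()
flow.add_edge("fetch-image", "write-image")

# Portable across 2.3 and 2.4: use the .nodes view rather than .node.
for name in flow.nodes:
    print(name, flow.nodes[name])

# Create an empty graph of the same type without the removed fresh_copy().
subflow = flow.__class__()
print(type(subflow).__name__)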
> > I don't think we (nova) are in any rush to adopt the latest > networkx/taskflow code anytime soon so the blocker is low priority for us. > > Having said all that, it looks like there is a patch to make taskflow > work with networkx 2.4 [2]. > > [1] https://bugs.launchpad.net/nova/+bug/1848499 > [2] https://review.opendev.org/#/c/689611/ > > -- > > Thanks, > > Matt > From fsbiz at yahoo.com Mon Oct 28 22:26:41 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Mon, 28 Oct 2019 22:26:41 +0000 (UTC) Subject: [ironic]: Timeout reached while waiting for callback for node References: <1530284401.3551200.1572301601958.ref@mail.yahoo.com> Message-ID: <1530284401.3551200.1572301601958@mail.yahoo.com> Thanks Julia. In addition to what you mentioned this particular issue seems to have cropped up when we added 100 more baremetal nodes. I've also narrowed down the issue (TFTP timeouts) when 3-4 baremetal nodes are in "deploy" state and downloading the OS via iSCSI.  Each iSCSI transfer takes about 6 Gbps and thus with four transfers we are over our 20Gbps capacity of the leaf-spine links.    We are slowly migrating to iPXE so it should help. That being said is there a document on large scale ironic design architectures?We are looking into a DC design (primarily for baremetals) for upto 2500 nodes. thanks,Fred, On Wednesday, October 23, 2019, 03:19:41 PM PDT, Julia Kreger wrote: Greetings Fred! Reply in-line. On Tue, Oct 22, 2019 at 12:47 PM fsbiz at yahoo.com wrote: [trim]  TFTP logs: shows TFTP client timed out (weird).  Any pointers here? Sadly this is one of those things that comes with using TFTP. Issues like this is why the community tends to recommend using ipxe.efi to chainload as you can perform transport over TCP as opposed to UDP where in something might happen mid-transport.  tftpd shows ramdisk_deployed completed.  Then, it reports that the client timed out. Grub does tend to be very abrupt and not wrap up very final actions. I suspect it may just never be sending the ack back and the transfer may be completing. I'm afraid this is one of those things you really need to see on the console what is going on. My guess would be that your deploy_ramdisk lost a packet in transfer or that it was corrupted in transport. It would be interesting to know if the network card stack is performing checksum validation, but for IPv4 it is optional.  [trim]  This has me stumped here.  This exact failure seems to be happening 3 to 4 times a week on different nodes.Any pointers appreciated. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalyst.net.nz Mon Oct 28 22:48:52 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Tue, 29 Oct 2019 11:48:52 +1300 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: Message-ID: <342983ed-1d22-8f3a-3335-f153512ec2b2@catalyst.net.nz> My apologies I missed this email. Sadly I won't be at the summit this time around. There may be some public cloud focused discussions, and some of those often have this topic come up. Also if Monty from the SDK team is around, I'd suggest finding him and having a chat. I'll help if I can but we are swamped with internal work and I can't dedicate much time to do upstream work that isn't urgent. 
:( On 17/10/19 8:48 am, Adam Harwell wrote: > That's interesting -- we have already started working to add features > and improve ospurge, and it seems like a plenty useful tool for our > needs, but I think I agree that it would be nice to have that > functionality built into the sdk. I might be able to help with both, > since one is immediately useful and we (like everyone) have deadlines > to meet, and the other makes sense to me as a possible future > direction that could be more widely supported. > > Will you or someone else be hosting and discussion about this at the > Shanghai summit? I'll be there and would be happy to join and discuss. > >     --Adam > > On Tue, Oct 15, 2019, 22:04 Adrian Turjak > wrote: > > I tried to get a community goal to do project deletion per > project, but > we ended up deciding that a community goal wasn't ideal unless we did > build a bulk delete API in each service: > https://review.opendev.org/#/c/639010/ > https://etherpad.openstack.org/p/community-goal-project-deletion > https://etherpad.openstack.org/p/DEN-Deletion-of-resources > https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming > > What we decided on, but didn't get a chance to work on, was building > into the OpenstackSDK OS-purge like functionality, as well as > reporting > functionality (of all project resources to be deleted). That way we > could have per project per resource deletion logic, and all of that > defined in the SDK. > > I was up for doing some of the work, but ended up swamped with > internal > work and just didn't drive or push for the deletion work upstream. > > If you want to do something useful, don't pursue OS-Purge, help us add > that official functionality to the SDK, and then we can push for bulk > deletion APIs in each project to make resource deletion more pleasant. > > I'd be happy to help with the work, and Monty on the SDK team will > most > likely be happy to as well. :) > > Cheers, > Adrian > > On 1/10/19 11:48 am, Adam Harwell wrote: > > I haven't seen much activity on this project in a while, and > it's been > > moved to opendev/x since the opendev migration... Who is the current > > owner of this project? Is there anyone who actually is > maintaining it, > > or would mind if others wanted to adopt the project to move it > forward? > > > > Thanks, > >    --Adam Harwell > -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Tue Oct 29 00:11:29 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 28 Oct 2019 17:11:29 -0700 Subject: Important Shanghai PTG Information In-Reply-To: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: On Wed, Oct 9, 2019, at 10:06, Kendall Waters wrote: [snipped] > > Flipcharts > > While we won’t have projection available, we will have some flipcharts > around. Each dedicated room will have one flipchart and the big main > room will have a few to share. Please feel free to grab one when you > need it, but put it back when you are finished so that others can use > it if they need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, > but we will also have a lot of new faces. With this in mind, we have > decided to add project onboarding to the PTG so that the new > contributors can get up to speed with the projects meeting that week. 
> The teams gathering that will be doing onboarding will have that > denoted on the print and digital schedule on site. They have also been > encouraged to promote when they will be doing their onboarding via the > PTGBot and on the mailing lists. > It occurred to me that with the onboarding sessions happening during the PTG and the PTG not having projectors that means the onboarding sessions won't have projectors either...which kinda sucks. Is there anyway around this? Colleen From kennelson11 at gmail.com Tue Oct 29 00:28:22 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 28 Oct 2019 17:28:22 -0700 Subject: Important Shanghai PTG Information In-Reply-To: References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Yeah. It does suck and unfortunately there's not much we can do to get projection. That said, there will be flipcharts if you want to draw anything out. My other suggestion (which you have probably already considered) would be to have an etherpad with links to launchpad or slides that people can easily get at so that you can look at the same thing together. Just a thought, but maybe (time permitting) you could prepare a small lab sort of thing, like how to check out the code, run unit tests for Keystone and how to assign themselves some low hanging fruit. Focus more on a hands on type of thing than a presentation? -Kendall (diablo_rojo) On Mon, Oct 28, 2019 at 5:13 PM Colleen Murphy wrote: > On Wed, Oct 9, 2019, at 10:06, Kendall Waters wrote: > > [snipped] > > > > > Flipcharts > > > > While we won’t have projection available, we will have some flipcharts > > around. Each dedicated room will have one flipchart and the big main > > room will have a few to share. Please feel free to grab one when you > > need it, but put it back when you are finished so that others can use > > it if they need. Again, sharing is caring! :) > > > > Onboarding > > > > A lot of the usual PTG attendees won’t be able to attend this event, > > but we will also have a lot of new faces. With this in mind, we have > > decided to add project onboarding to the PTG so that the new > > contributors can get up to speed with the projects meeting that week. > > The teams gathering that will be doing onboarding will have that > > denoted on the print and digital schedule on site. They have also been > > encouraged to promote when they will be doing their onboarding via the > > PTGBot and on the mailing lists. > > > > It occurred to me that with the onboarding sessions happening during the > PTG and the PTG not having projectors that means the onboarding sessions > won't have projectors either...which kinda sucks. Is there anyway around > this? > > Colleen > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue Oct 29 00:40:52 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 28 Oct 2019 17:40:52 -0700 Subject: [PTG] Unoffical Game Night Message-ID: Hello! While we won't have space in any official capacity, I am planning on bringing games and thinking Thursday would be a good night to play them if anyone wants to bring games and join me. Myself + other OSF staff are planning on hanging out at the Shanghai Marriott Hotel City Centre (Address: 555 Xizang Middle Rd, Huangpu, Shanghai, China, 200003) lobby bar starting at 8:00. For those interested I'm gathering an etherpad of people + games[1]. Hope to see you there! 
-Kendall (diablo_rojo) [1] https://etherpad.openstack.org/p/pvg-game-night -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangyi01 at inspur.com Tue Oct 29 04:08:00 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Tue, 29 Oct 2019 04:08:00 +0000 Subject: =?utf-8?B?562U5aSNOiDnrZTlpI06IFtrdXJ5cl1ba3VyeXIta3ViZXJuZXRlc10gZG9l?= =?utf-8?B?cyBrdXJ5ci1rdWJlcm5ldGVzIHN1cHBvcnQgZHluYW1pYyBzdWJuZXQgYnkg?= =?utf-8?Q?pod_namespace_or_annotation=3F?= In-Reply-To: <2bb02e40e6c7d712eea123f0496ca9c7affb2fb8.camel@redhat.com> References: <5bb1eaa841ad422584fd90e2300e95e8@inspur.com> <2bb02e40e6c7d712eea123f0496ca9c7affb2fb8.camel@redhat.com> Message-ID: <699cbbbe7a0e4385b9aa88a5c7d8a8f5@inspur.com> Hi, Michal I tried it, but it can't work, it is also so even for the network kuryr created by namespace driver, here is some information: I created namespace by "kubectl create namespace kuryrns1" yangyi at cmp001:~$ kubectl get ns NAME STATUS AGE default Active 48d kube-node-lease Active 48d kube-public Active 48d kube-system Active 48d kuryrns1 Active 52m My kuryr conf is below: yangyi at cmp001:~$ grep "^[^#]" /etc/kuryr/kuryr.conf [DEFAULT] bindir = /home/yangyi/kuryr-k8s-controller/env/libexec/kuryr deployment_type = baremetal log_file = /var/log/kuryr.log [binding] [cache_defaults] [cni_daemon] [cni_health_server] [health_server] [ingress] [kubernetes] api_root = https://10.110.21.64:6443 ssl_client_crt_file = /etc/kubernetes/pki/kuryr.crt ssl_client_key_file = /etc/kubernetes/pki/kuryr.key ssl_ca_crt_file = /etc/kubernetes/pki/ca.crt pod_subnets_driver = namespace enabled_handlers = vif,namespace,kuryrnet [kuryr-kubernetes] [namespace_handler_caching] [namespace_sg] [namespace_subnet] pod_router = 46fc6730-a7f9-45f7-b98b-f682c436e85c pod_subnet_pool = 581daf0e-e661-4fb8-b8d6-b7b11d0b43ab [neutron] auth_url = http://10.110.28.20:35357/v3 auth_type = password password = HAOQNs07Ci9c0DvB project_domain_id = default project_name = admin region_name = SDNRegion tenant_name = admin user_domain_id = default username = admin [neutron_defaults] project = 852d281e70b34b5398c1c5534124952e pod_subnet = b1fa2198-2ecd-41ce-bd06-93ddb2742586 pod_security_groups = d89787f5-b892-487f-b682-88742007f49f ovs_bridge = br-int service_subnet = 58b322fd-19e4-47db-b2fe-5cffd528af05 network_device_mtu = 1450 [node_driver_caching] [np_handler_caching] [octavia_defaults] [pod_ip_caching] [pod_vif_nested] [pool_manager] [sriov] [subnet_caching] [vif_handler_caching] [vif_pool] yangyi at cmp001:~$ KuryrNet has been created automatically by namespace creation: yangyi at cmp001:~$ kubectl get KuryrNet/ns-kuryrns1 -o yaml apiVersion: openstack.org/v1 kind: KuryrNet metadata: annotations: namespaceName: kuryrns1 creationTimestamp: "2019-10-29T02:58:01Z" generation: 2 name: ns-kuryrns1 resourceVersion: "5926221" selfLink: /apis/openstack.org/v1/kuryrnets/ns-kuryrns1 uid: df5850a5-dc57-4243-b01e-be1c24d788fc spec: netId: 2dcc6969-7923-460e-8ede-17985cdf2b80 populated: true routerId: 46fc6730-a7f9-45f7-b98b-f682c436e85c subnetCIDR: 10.254.0.0/24 subnetId: a46861d3-eccf-4573-8c22-5412cc9d64f0 yangyi at cmp001:~$ But when I created deployment under kuryrns1 namespace, it never succeeded. I found CNI daemon is broken. It is so before kubectl apply -f deploy.yaml. yangyi at cmp004:~$ sudo ps aux | grep kuryr root 15339 0.0 0.0 51420 3852 pts/9 S 03:35 0:00 sudo -E kuryr-daemon --config-file /etc/kuryr/kuryr.conf -d root 15340 1.2 0.0 271028 101408 ? 
Ssl 03:35 0:01 kuryr-daemon: master process [/home/yangyi/kuryr-k8s-cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] root 15352 0.0 0.0 268948 92016 ? S 03:35 0:00 kuryr-daemon: master process [/home/yangyi/kuryr-k8s-cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] root 15357 0.0 0.0 426944 94624 ? Sl 03:35 0:00 kuryr-daemon: watcher worker(0) root 15362 0.0 0.0 353220 93084 ? Sl 03:35 0:00 kuryr-daemon: server worker(0) root 15366 0.0 0.0 353212 92260 ? Sl 03:35 0:00 kuryr-daemon: health worker(0) It is so after kubectl apply -f deploy.yaml. yangyi at cmp004:~$ sudo ps aux | grep kuryr root 15339 0.0 0.0 51420 3852 pts/9 S 03:35 0:00 sudo -E kuryr-daemon --config-file /etc/kuryr/kuryr.conf -d root 15340 0.2 0.0 271028 101408 ? Ssl 03:35 0:01 kuryr-daemon: master process [/home/yangyi/kuryr-k8s-cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] root 15352 0.0 0.0 342680 92016 ? Sl 03:35 0:00 kuryr-daemon: master process [/home/yangyi/kuryr-k8s-cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] root 15357 0.0 0.0 427200 95028 ? Sl 03:35 0:00 kuryr-daemon: watcher worker(0) root 15362 0.0 0.0 353220 93108 ? Sl 03:35 0:00 kuryr-daemon: server worker(0) root 15366 0.0 0.0 353212 92260 ? Sl 03:35 0:00 kuryr-daemon: health worker(0) root 16426 0.1 0.0 0 0 ? Z 03:39 0:00 [kuryr-daemon: s] root 16729 0.0 0.0 428232 94988 ? S 03:40 0:00 kuryr-daemon: server worker(0) root 16813 0.0 0.0 429768 97480 ? S 03:40 0:00 kuryr-daemon: server worker(0) yangyi 17700 0.0 0.0 12944 1012 pts/0 R+ 03:42 0:00 grep --color=auto kuryr I can see port is indeed created. yangyi at cmp001:~$ openstack port list --network ns/kuryrns1-net +--------------------------------------+------+-------------------+---------------------------------------------------------------------------+--------+ | ID | Name | MAC Address | Fixed IP Addresses | Status | +--------------------------------------+------+-------------------+---------------------------------------------------------------------------+--------+ | 2dd5f11f-5fc6-45ee-8f9b-8037019572cd | | fa:16:3e:af:4e:f1 | ip_address='10.254.0.3', subnet_id='a46861d3-eccf-4573-8c22-5412cc9d64f0' | ACTIVE | | c7ff9e5c-1110-4dfa-983d-2f04bf7d2794 | | fa:16:3e:10:64:c7 | ip_address='10.254.0.1', subnet_id='a46861d3-eccf-4573-8c22-5412cc9d64f0' | ACTIVE | +--------------------------------------+------+-------------------+---------------------------------------------------------------------------+--------+ yangyi at cmp001:~$ kuryr log indicated cni is defunct and is restarted. yangyi at cmp004:~$ grep is_alive /var/log/kuryr.log 2019-10-29 03:40:00.492 16426 DEBUG kuryr_kubernetes.cni.binding.bridge [-] Reporting Driver not healthy. is_alive /home/yangyi/kuryr-k8s-cni/kuryr-kubernetes/kuryr_kubernetes/cni/binding/bridge.py:119 yangyi at cmp004:~$ Can you give me some advice or hints about how I can troubleshoot such an issue? -----邮件原件----- 发件人: Michał Dulko [mailto:mdulko at redhat.com] 发送时间: 2019年10月22日 23:29 收件人: Yi Yang (杨燚)-云服务集团 ; ltomasbo at redhat.com 抄送: openstack-discuss at lists.openstack.org 主题: Re: 答复: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? Oh, I actually should have thought about it. So if you'll precreate the network, subnet and a KuryrNet Custom Resource [1] it should actually work. The definition of KuryrNet can be find here [2], fields are pretty self-explanatory. 
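For illustration, pre-creating such an object with the official Kubernetes Python client could look roughly like the sketch below. The field names and UUIDs are simply lifted from the KuryrNet shown earlier in this thread and stand in for whatever Neutron network, subnet and router you pre-created, so treat this as a hedged example rather than tested code:

# Hedged sketch: pre-create a KuryrNet custom object for namespace "kuryrns1".
# The UUIDs are placeholders for Neutron resources created ahead of time.
from kubernetes import client, config

config.load_kube_config()
crds = client.CustomObjectsApi()

kuryrnet = {
    "apiVersion": "openstack.org/v1",
    "kind": "KuryrNet",
    "metadata": {
        "name": "ns-kuryrns1",
        "annotations": {"namespaceName": "kuryrns1"},
    },
    "spec": {
        "netId": "2dcc6969-7923-460e-8ede-17985cdf2b80",
        "routerId": "46fc6730-a7f9-45f7-b98b-f682c436e85c",
        "subnetCIDR": "10.254.0.0/24",
        "subnetId": "a46861d3-eccf-4573-8c22-5412cc9d64f0",
    },
}

# KuryrNet appears to be cluster-scoped (see the selfLink in the kubectl
# output earlier in the thread), hence the cluster-level call.
crds.create_cluster_custom_object(
    group="openstack.org", version="v1", plural="kuryrnets", body=kuryrnet)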
Please note that you also need to link KuryrNet to the namespace by adding an annotation to the namespace: "openstack.org/kuryr-net-crd": "ns-" Also, just for safety, make sure the KuryrNet itself is named "ns- " - I'm not sure if some code isn't looking it up by name. Please note that this was never tested, so maybe there's something I don't see that might prevent it from working. [1] https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ [2] https://github.com/openstack/kuryr-kubernetes/blob/a85a7bc8b1761eb748ccf16430fe77587bc764c2/kubernetes_crds/kuryrnet.yaml -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From mdulko at redhat.com Tue Oct 29 08:11:29 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Tue, 29 Oct 2019 09:11:29 +0100 Subject: =?UTF-8?Q?=E7=AD=94=E5=A4=8D=3A?= =?UTF-8?Q?_=E7=AD=94=E5=A4=8D=3A?= [kuryr][kuryr-kubernetes] does kuryr-kubernetes support dynamic subnet by pod namespace or annotation? In-Reply-To: <699cbbbe7a0e4385b9aa88a5c7d8a8f5@inspur.com> References: <5bb1eaa841ad422584fd90e2300e95e8@inspur.com> <2bb02e40e6c7d712eea123f0496ca9c7affb2fb8.camel@redhat.com> <699cbbbe7a0e4385b9aa88a5c7d8a8f5@inspur.com> Message-ID: See answers inline. On Tue, 2019-10-29 at 04:08 +0000, Yi Yang (杨燚)-云服务集团 wrote: > Hi, Michal > > I tried it, but it can't work, it is also so even for the network > kuryr created by namespace driver, here is some information: > > I created namespace by "kubectl create namespace kuryrns1" The correct order to have "predefined" subnets would be to start with KuryrNet creation. But okay. > yangyi at cmp001:~$ kubectl get ns > NAME STATUS AGE > default Active 48d > kube-node-lease Active 48d > kube-public Active 48d > kube-system Active 48d > kuryrns1 Active 52m > > My kuryr conf is below: > > yangyi at cmp001:~$ grep "^[^#]" /etc/kuryr/kuryr.conf > [DEFAULT] > bindir = /home/yangyi/kuryr-k8s-controller/env/libexec/kuryr > deployment_type = baremetal > log_file = /var/log/kuryr.log > [binding] > [cache_defaults] > [cni_daemon] > [cni_health_server] > [health_server] > [ingress] > [kubernetes] > api_root = https://10.110.21.64:6443 > ssl_client_crt_file = /etc/kubernetes/pki/kuryr.crt > ssl_client_key_file = /etc/kubernetes/pki/kuryr.key > ssl_ca_crt_file = /etc/kubernetes/pki/ca.crt > pod_subnets_driver = namespace > enabled_handlers = vif,namespace,kuryrnet > [kuryr-kubernetes] > [namespace_handler_caching] > [namespace_sg] > [namespace_subnet] > pod_router = 46fc6730-a7f9-45f7-b98b-f682c436e85c > pod_subnet_pool = 581daf0e-e661-4fb8-b8d6-b7b11d0b43ab > [neutron] > auth_url = http://10.110.28.20:35357/v3 > auth_type = password > password = HAOQNs07Ci9c0DvB > project_domain_id = default > project_name = admin > region_name = SDNRegion > tenant_name = admin > user_domain_id = default > username = admin > [neutron_defaults] > project = 852d281e70b34b5398c1c5534124952e > pod_subnet = b1fa2198-2ecd-41ce-bd06-93ddb2742586 > pod_security_groups = d89787f5-b892-487f-b682-88742007f49f > ovs_bridge = br-int > service_subnet = 58b322fd-19e4-47db-b2fe-5cffd528af05 > network_device_mtu = 1450 > [node_driver_caching] > [np_handler_caching] > [octavia_defaults] > [pod_ip_caching] > [pod_vif_nested] > [pool_manager] > [sriov] > [subnet_caching] > [vif_handler_caching] > [vif_pool] > yangyi at cmp001:~$ > > KuryrNet has been created automatically by namespace creation: > > yangyi at cmp001:~$ 
kubectl get KuryrNet/ns-kuryrns1 -o yaml > apiVersion: openstack.org/v1 > kind: KuryrNet > metadata: > annotations: > namespaceName: kuryrns1 > creationTimestamp: "2019-10-29T02:58:01Z" > generation: 2 > name: ns-kuryrns1 > resourceVersion: "5926221" > selfLink: /apis/openstack.org/v1/kuryrnets/ns-kuryrns1 > uid: df5850a5-dc57-4243-b01e-be1c24d788fc > spec: > netId: 2dcc6969-7923-460e-8ede-17985cdf2b80 > populated: true > routerId: 46fc6730-a7f9-45f7-b98b-f682c436e85c > subnetCIDR: 10.254.0.0/24 > subnetId: a46861d3-eccf-4573-8c22-5412cc9d64f0 > yangyi at cmp001:~$ > > But when I created deployment under kuryrns1 namespace, it never > succeeded. I found CNI daemon is broken. > > It is so before kubectl apply -f deploy.yaml. > > yangyi at cmp004:~$ sudo ps aux | grep kuryr > root 15339 0.0 0.0 51420 3852 pts/9 S 03:35 0:00 sudo > -E kuryr-daemon --config-file /etc/kuryr/kuryr.conf -d > root 15340 1.2 0.0 271028 101408 ? Ssl 03:35 0:01 > kuryr-daemon: master process [/home/yangyi/kuryr-k8s- > cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] > root 15352 0.0 0.0 268948 92016 ? S 03:35 0:00 > kuryr-daemon: master process [/home/yangyi/kuryr-k8s- > cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] > root 15357 0.0 0.0 426944 94624 ? Sl 03:35 0:00 > kuryr-daemon: watcher worker(0) > root 15362 0.0 0.0 353220 93084 ? Sl 03:35 0:00 > kuryr-daemon: server worker(0) > root 15366 0.0 0.0 353212 92260 ? Sl 03:35 0:00 > kuryr-daemon: health worker(0) > > It is so after kubectl apply -f deploy.yaml. > > yangyi at cmp004:~$ sudo ps aux | grep kuryr > root 15339 0.0 0.0 51420 3852 pts/9 S 03:35 0:00 sudo > -E kuryr-daemon --config-file /etc/kuryr/kuryr.conf -d > root 15340 0.2 0.0 271028 101408 ? Ssl 03:35 0:01 > kuryr-daemon: master process [/home/yangyi/kuryr-k8s- > cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] > root 15352 0.0 0.0 342680 92016 ? Sl 03:35 0:00 > kuryr-daemon: master process [/home/yangyi/kuryr-k8s- > cni/env/bin/kuryr-daemon --config-file /etc/kuryr kuryr.conf -d] > root 15357 0.0 0.0 427200 95028 ? Sl 03:35 0:00 > kuryr-daemon: watcher worker(0) > root 15362 0.0 0.0 353220 93108 ? Sl 03:35 0:00 > kuryr-daemon: server worker(0) > root 15366 0.0 0.0 353212 92260 ? Sl 03:35 0:00 > kuryr-daemon: health worker(0) > root 16426 0.1 0.0 0 0 ? Z 03:39 0:00 > [kuryr-daemon: s] > root 16729 0.0 0.0 428232 94988 ? S 03:40 0:00 > kuryr-daemon: server worker(0) > root 16813 0.0 0.0 429768 97480 ? S 03:40 0:00 > kuryr-daemon: server worker(0) > yangyi 17700 0.0 0.0 12944 1012 pts/0 R+ 03:42 0:00 grep > --color=auto kuryr Defunct processes are just zombie processes. I see that as well in our setups, it's probably a bug in pyroute2. It does not seem to affect Kuryr though. > I can see port is indeed created. 
> > yangyi at cmp001:~$ openstack port list --network ns/kuryrns1-net > +--------------------------------------+------+-------------------+ > ------------------------------------------------------------------- > --------+--------+ > > ID | Name | MAC Address | > > Fixed IP Addresses > > | Status | > +--------------------------------------+------+-------------------+ > ------------------------------------------------------------------- > --------+--------+ > > 2dd5f11f-5fc6-45ee-8f9b-8037019572cd | | fa:16:3e:af:4e:f1 | > > ip_address='10.254.0.3', subnet_id='a46861d3-eccf-4573-8c22- > > 5412cc9d64f0' | ACTIVE | > > c7ff9e5c-1110-4dfa-983d-2f04bf7d2794 | | fa:16:3e:10:64:c7 | > > ip_address='10.254.0.1', subnet_id='a46861d3-eccf-4573-8c22- > > 5412cc9d64f0' | ACTIVE | > +--------------------------------------+------+-------------------+ > ------------------------------------------------------------------- > --------+--------+ > yangyi at cmp001:~$ > > kuryr log indicated cni is defunct and is restarted. > > yangyi at cmp004:~$ grep is_alive /var/log/kuryr.log > 2019-10-29 03:40:00.492 16426 DEBUG > kuryr_kubernetes.cni.binding.bridge [-] Reporting Driver not healthy. > is_alive /home/yangyi/kuryr-k8s-cni/kuryr- > kubernetes/kuryr_kubernetes/cni/binding/bridge.py:119 > yangyi at cmp004:~$ > > Can you give me some advice or hints about how I can troubleshoot > such an issue? You'd need longer log, above that message you should see some logs explaining the culprit. > -----邮件原件----- > 发件人: Michał Dulko [mailto:mdulko at redhat.com] > 发送时间: 2019年10月22日 23:29 > 收件人: Yi Yang (杨燚)-云服务集团 ; ltomasbo at redhat.com > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: 答复: [kuryr][kuryr-kubernetes] does kuryr-kubernetes support > dynamic subnet by pod namespace or annotation? > > Oh, I actually should have thought about it. So if you'll precreate > the network, subnet and a KuryrNet Custom Resource [1] it should > actually work. The definition of KuryrNet can be find here [2], > fields are pretty self-explanatory. Please note that you also need to > link KuryrNet to the namespace by adding an annotation to the > namespace: > > "openstack.org/kuryr-net-crd": "ns-" > > Also, just for safety, make sure the KuryrNet itself is named "ns- > " - I'm not sure if some code isn't looking it up by > name. > > Please note that this was never tested, so maybe there's something I > don't see that might prevent it from working. > > [1] > https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ > [2] > https://github.com/openstack/kuryr-kubernetes/blob/a85a7bc8b1761eb748ccf16430fe77587bc764c2/kubernetes_crds/kuryrnet.yaml From dougal at redhat.com Tue Oct 29 08:42:40 2019 From: dougal at redhat.com (Dougal Matthews) Date: Tue, 29 Oct 2019 08:42:40 +0000 Subject: [tripleo] Stable policy and tripleoclient stdout In-Reply-To: References: Message-ID: On Mon, 28 Oct 2019 at 14:50, Alex Schultz wrote: > > > On Mon, Oct 28, 2019 at 3:18 AM Dougal Matthews wrote: > >> Hey all, >> >> As some of you will know there is work going on to replace Mistral with >> Ansible. There is an initial spec[1] and a few in-flight patches here to >> create the initial playbooks. In doing this work, I hit an issue I wanted >> to get some input in. >> >> first the tl;dr - what is our policy on changing the output from >> tripleoclient? It is proving difficult to keep it the same and in some >> cases changes will be required (we won't have Mistral execution IDs to >> print for example). 
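For readers skimming the thread: the "stream the ansible output" idea discussed in the options below boils down to something like this hedged sketch -- run ansible-playbook as a subprocess and forward each line both to the user and to the Python logger, so that --log-file would capture deployment output as well. The command, playbook path and logger name are placeholders, and this is not the actual tripleoclient code:

# Hedged sketch, not tripleoclient code: stream ansible-playbook output
# line by line to the terminal and to a standard logger at the same time.
import logging
import subprocess
import sys

LOG = logging.getLogger("tripleoclient.ansible")  # placeholder logger name


def run_playbook(playbook, inventory):
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    for line in proc.stdout:
        sys.stdout.write(line)       # "ansible native" output for the user
        LOG.info(line.rstrip())      # picked up by --log-file handlers
    return proc.wait()


if __name__ == "__main__":
    logging.basicConfig(filename="deploy.log", level=logging.INFO)
    sys.exit(run_playbook("deploy-steps.yaml", "inventory.yaml"))  # placeholder paths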
>> >> > I don't think for the deployment commands that we have any expectation of > a specific format. There was a large change during the Rocky timeframe when > we switched over to the ansible driven deployment. That being said there > have been requests to improve the output logging for the deployment as we > currently use print rather than the proper logging module. This means that > --log-file doesn't actually capture any of the deployment output (not > ideal). > > >> As a quick reminder, in the current code we print from tripleoclient and >> Mistral sends messages via Zaqar to tripleoclient (which it then prints). >> This allows for "real time" updates from workflows. For example, >> introspection will print updates as introspection of nodes is completed. >> With Ansible it is tricky to have this same result. >> >> I can think of three options; >> >> 1. We run Ansible in the background, essentially hiding it and then >> polling OpenStack services to look for the expected state changes. We can >> then display this to the user. >> > > Let's not do this since the output is already presented to the user for > the undercloud/overcloud today. > > >> >> 2. We go for a "ansible native" approach and stream the ansible output to >> the user. This will be familiar to anyone familiar with Ansible but it will >> mean the output completely changes. This is also the easiest option (from >> an implementation point of view) >> >> > We already do this for the `openstack tripleo deploy` command so I would > use that as it's likely the simplest and more closely resembles what we > have today. > Great. That sounds good to me. I was worried there would be more pushback about completely changing the output but I think this is the best option otherwise. > > >> 3. I have not tested this idea, but I think we could have a custom >> Ansible module that writes messages to a tempfile. tripleoclient could then >> consume them and display them to the user. This would be similar to idea 1, >> but rather than polling services we constantly read a file and display >> those "messages" from ansible. This would be closest to the current Mistral >> and Zaqar solution. >> >> > Sounds overly complex and prone to failures. > > >> >> Personally I am a bit torn. One of the reasons we want to use Ansible is >> because developers/users are more familiar with debugging it. However, if >> we hide ansible that might not help very much. So I think in the long run >> option 2 might be best. However, that completely changes the output, it >> limits us to what Ansible can output and frankly, in my experience, Ansible >> output is ugly and often hard to read. >> >> I am curious to know what y'all think and hopefully there are some other >> options too. >> >> Thanks, >> Dougal >> >> >> [1] https://review.opendev.org/#/c/679272/ >> [2] https://review.opendev.org/#/q/status:open+topic:mistral_to_ansible >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Tue Oct 29 08:52:15 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Tue, 29 Oct 2019 09:52:15 +0100 Subject: [kuryr] [tc] kuryr project mission Message-ID: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> Hi, It's been more than a year after fuxi [1] and fuxi-kubernetes [2] were declared as retired. First one was supposed to integrate Docker with Cinder and Manila. The second was aiming to do the same with Kubernetes. 
I'm not really sure how to serve the first use case, but it seems like there are some alternatives [3], [4]. fuxi-kubernetes use case is now served by Cloud Provider OpenStack in Cinder [5] and Manila [6] CSI plugins, which seem like a much better fit. Currently the only maintained deliverables in kuryr project are kuryr, kuryr-libnetwork and kuryr-kubernetes. All of them are related to networking. Given that I'd like to propose rephrasing Kuryr mission statement from: > Bridge between container framework networking and storage models > to OpenStack networking and storage abstractions. to > Bridge between container framework networking models > to OpenStack networking abstractions. effectively getting storage out of project scope. Are there any thoughts or objections? Maybe someone sees a better phrasing? Thanks, Michał [1] https://opendev.org/openstack/fuxi [2] https://opendev.org/openstack/fuxi-kubernetes [3] https://rexray.readthedocs.io/en/stable/user-guide/schedulers/docker/plug-ins/openstack/ [4] https://github.com/j-griffith/cinder-docker-driver [5] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-cinder-csi-plugin.md [6] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-manila-csi-plugin.md From ltomasbo at redhat.com Tue Oct 29 11:08:08 2019 From: ltomasbo at redhat.com (Luis Tomas Bolivar) Date: Tue, 29 Oct 2019 12:08:08 +0100 Subject: [kuryr] [tc] kuryr project mission In-Reply-To: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> References: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> Message-ID: +1 On Tue, Oct 29, 2019 at 9:55 AM Michał Dulko wrote: > Hi, > > It's been more than a year after fuxi [1] and fuxi-kubernetes [2] were > declared as retired. First one was supposed to integrate Docker with > Cinder and Manila. The second was aiming to do the same with > Kubernetes. > > I'm not really sure how to serve the first use case, but it seems like > there are some alternatives [3], [4]. fuxi-kubernetes use case is now > served by Cloud Provider OpenStack in Cinder [5] and Manila [6] CSI > plugins, which seem like a much better fit. > > Currently the only maintained deliverables in kuryr project are kuryr, > kuryr-libnetwork and kuryr-kubernetes. All of them are related to > networking. Given that I'd like to propose rephrasing Kuryr mission > statement from: > > > Bridge between container framework networking and storage models > > to OpenStack networking and storage abstractions. > > to > > > Bridge between container framework networking models > > to OpenStack networking abstractions. > > effectively getting storage out of project scope. > > Are there any thoughts or objections? Maybe someone sees a better > phrasing? > > Thanks, > Michał > > [1] https://opendev.org/openstack/fuxi > [2] https://opendev.org/openstack/fuxi-kubernetes > [3] > https://rexray.readthedocs.io/en/stable/user-guide/schedulers/docker/plug-ins/openstack/ > [4] https://github.com/j-griffith/cinder-docker-driver > [5] > https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-cinder-csi-plugin.md > [6] > https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-manila-csi-plugin.md > > > -- LUIS TOMÁS BOLÍVAR Senior Software Engineer Red Hat Madrid, Spain ltomasbo at redhat.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dmellado at redhat.com Tue Oct 29 11:49:53 2019 From: dmellado at redhat.com (Daniel Mellado) Date: Tue, 29 Oct 2019 12:49:53 +0100 Subject: [kuryr] [tc] kuryr project mission In-Reply-To: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> References: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> Message-ID: <8b6a9456-2676-f309-c99a-001ef5e04607@redhat.com> Hi, On 10/29/19 9:52 AM, Michał Dulko wrote: > Hi, > > It's been more than a year after fuxi [1] and fuxi-kubernetes [2] were > declared as retired. First one was supposed to integrate Docker with > Cinder and Manila. The second was aiming to do the same with > Kubernetes. > > I'm not really sure how to serve the first use case, but it seems like > there are some alternatives [3], [4]. fuxi-kubernetes use case is now > served by Cloud Provider OpenStack in Cinder [5] and Manila [6] CSI > plugins, which seem like a much better fit. > > Currently the only maintained deliverables in kuryr project are kuryr, > kuryr-libnetwork and kuryr-kubernetes. All of them are related to > networking. Given that I'd like to propose rephrasing Kuryr mission > statement from: > >> Bridge between container framework networking and storage models >> to OpenStack networking and storage abstractions. > > to > >> Bridge between container framework networking models >> to OpenStack networking abstractions. > > effectively getting storage out of project scope. Do totally agree, both fuxi and fuxi-kubernetes are now completely deprecated and overridden by another projects, so tbh it doesn't make sense to keep the 'storage' label around any longer. > > Are there any thoughts or objections? Maybe someone sees a better > phrasing? Let's just drop the storage part for now > > Thanks, > Michał Best! Daniel > > [1] https://opendev.org/openstack/fuxi > [2] https://opendev.org/openstack/fuxi-kubernetes > [3] https://rexray.readthedocs.io/en/stable/user-guide/schedulers/docker/plug-ins/openstack/ > [4] https://github.com/j-griffith/cinder-docker-driver > [5] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-cinder-csi-plugin.md > [6] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-manila-csi-plugin.md > > From mdemaced at redhat.com Tue Oct 29 12:00:44 2019 From: mdemaced at redhat.com (Maysa De Macedo Souza) Date: Tue, 29 Oct 2019 13:00:44 +0100 Subject: [kuryr] [tc] kuryr project mission In-Reply-To: <8b6a9456-2676-f309-c99a-001ef5e04607@redhat.com> References: <94950c5e942e22a4ea1599a4c814eb554d4f2a9b.camel@redhat.com> <8b6a9456-2676-f309-c99a-001ef5e04607@redhat.com> Message-ID: Hi, Makes total sense. Best, Maysa. On Tue, Oct 29, 2019 at 12:53 PM Daniel Mellado wrote: > Hi, > > On 10/29/19 9:52 AM, Michał Dulko wrote: > > Hi, > > > > It's been more than a year after fuxi [1] and fuxi-kubernetes [2] were > > declared as retired. First one was supposed to integrate Docker with > > Cinder and Manila. The second was aiming to do the same with > > Kubernetes. > > > > I'm not really sure how to serve the first use case, but it seems like > > there are some alternatives [3], [4]. fuxi-kubernetes use case is now > > served by Cloud Provider OpenStack in Cinder [5] and Manila [6] CSI > > plugins, which seem like a much better fit. > > > > Currently the only maintained deliverables in kuryr project are kuryr, > > kuryr-libnetwork and kuryr-kubernetes. All of them are related to > > networking. 
Given that I'd like to propose rephrasing Kuryr mission > > statement from: > > > >> Bridge between container framework networking and storage models > >> to OpenStack networking and storage abstractions. > > > > to > > > >> Bridge between container framework networking models > >> to OpenStack networking abstractions. > > > > effectively getting storage out of project scope. > Do totally agree, both fuxi and fuxi-kubernetes are now completely > deprecated and overridden by another projects, so tbh it doesn't make > sense to keep the 'storage' label around any longer. > > > > Are there any thoughts or objections? Maybe someone sees a better > > phrasing? > Let's just drop the storage part for now > > > > Thanks, > > Michał > Best! > > Daniel > > > > [1] https://opendev.org/openstack/fuxi > > [2] https://opendev.org/openstack/fuxi-kubernetes > > [3] > https://rexray.readthedocs.io/en/stable/user-guide/schedulers/docker/plug-ins/openstack/ > > [4] https://github.com/j-griffith/cinder-docker-driver > > [5] > https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-cinder-csi-plugin.md > > [6] > https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-manila-csi-plugin.md > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Tue Oct 29 12:23:13 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Tue, 29 Oct 2019 12:23:13 +0000 (GMT) Subject: [resource-management-sig] Status of the "Resource Management" SIG In-Reply-To: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> References: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> Message-ID: On Thu, 24 Oct 2019, Thierry Carrez wrote: > I was wondering about the status of the Resource Management SIG... It's been > "forming" according to https://wiki.openstack.org/wiki/Res_Mgmt_SIG since > January 2018... And I could'nt find a reference or log to any meeting after > that. I think it is fair to say that this has failed to launch. Jay's moved on to other things, I have maybe a pinkie toe of attention left to community things, and I'm unable to speak for Howard (which says something in itself). > Does anyone have updated status on this one? Should it be removed from the > list of active SIGs at https://governance.openstack.org/sigs/ ? Unless we hear from Howard pretty soon I'd say, for the sake of tidying things up and accurately reflecting reality, it would be a good idea to remove it. Which is unfortunate. I suspect that in mixed VNF and CNF environments or multi-cloud settings where both will be accessed, having a consistent overview and nomenclature for resources would be a good thing. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From zhangbailin at inspur.com Tue Oct 29 12:50:26 2019 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Tue, 29 Oct 2019 12:50:26 +0000 Subject: [nova] spec for feature liasion needed Message-ID: Hi,nova core team: In Ussuri we add the feature liaison to lead the feature to do [1], now there are some specs needed find a feature liaison, as follows: 1. https://review.opendev.org/#/c/580336/ Support re-configure deleted_on_termination in server 2. https://review.opendev.org/#/c/663563/ Add flavor group 3. https://review.opendev.org/#/c/682302/ Allow specify user to reset password 4. https://review.opendev.org/#/c/691651/ Support fuzzy querying instance by tag [1] http://specs.openstack.org/openstack/nova-specs/readme.html#feature-liaison-faq Thanks. 
Brin Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Oct 29 13:44:38 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 29 Oct 2019 13:44:38 +0000 Subject: [resource-management-sig] Status of the "Resource Management" SIG In-Reply-To: References: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> Message-ID: On Tue, 2019-10-29 at 12:23 +0000, Chris Dent wrote: > On Thu, 24 Oct 2019, Thierry Carrez wrote: > > > I was wondering about the status of the Resource Management SIG... It's been > > "forming" according to https://wiki.openstack.org/wiki/Res_Mgmt_SIG since > > January 2018... And I could'nt find a reference or log to any meeting after > > that. > > I think it is fair to say that this has failed to launch. Jay's > moved on to other things, I have maybe a pinkie toe of attention > left to community things, and I'm unable to speak for Howard (which > says something in itself). > > > Does anyone have updated status on this one? Should it be removed from the > > list of active SIGs at https://governance.openstack.org/sigs/ ? > > Unless we hear from Howard pretty soon I'd say, for the sake of > tidying things up and accurately reflecting reality, it would be a > good idea to remove it. > > Which is unfortunate. I suspect that in mixed VNF and CNF > environments or multi-cloud settings where both will be accessed, > having a consistent overview and nomenclature for resources would be > a good thing. well to that point i think working to integrate os-traits and placmenet with kubernetes to allow both opentack and kubernetes to run on the same plathform to support miext vnf cnf deployment would be a great future use of placement and openstack/kubernetes collabaration. we already have touch points with kuryr/neutron, cinder, nova/ironic and keystone integration with k8s i think placment would also be a strong candidate. for example i can see maping traits to kubernetes lables and using placemnt to track pod resouce garentees(i forget the k8s name for that). that said while im moderatly interested in this topic its not what im currently working on so i have not been following the work or lack there of of the sig and im not sure if i would be able to if it was to continue. i will proably bring this topic up on the placment channel again at some point if i have more time to spend on it. > From juliaashleykreger at gmail.com Tue Oct 29 14:29:04 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 29 Oct 2019 07:29:04 -0700 Subject: [ironic]: Timeout reached while waiting for callback for node In-Reply-To: <1530284401.3551200.1572301601958@mail.yahoo.com> References: <1530284401.3551200.1572301601958.ref@mail.yahoo.com> <1530284401.3551200.1572301601958@mail.yahoo.com> Message-ID: That is great news to hear that you've been able to correlate it. We've written some things regarding scaling, but the key really depends on your architecture and how your utilizing the workload. Since you mentioned a spine-leaf architecture, physical locality of conductors will matter as well as having as much efficiency as possible. I believe CERN is running 4-5 conductors to manage ?3000+? physical machines. Naturally you'll need to scale as appropriate to your deployment pattern. If much of your fleet is being redeployed often, you may wish to consider having more conductors to match that overall load. 1) Use the ``direct`` deploy interface. 
This moves the act of unpacking the image files and streaming them to disk to the end node. This generally requires an HTTP(S) download endpoint offered by the conductor OR via Swift. Ironic-Python-Agent downloads the file, unpacks it in memory and directly streams it to disk. With the ``iscsi`` interface, you can end up in situations, depending on image composition and settings being passed to dd, where part of your deploy process is trying to write zeros over the wire in blocks to the remote disk. Naturally this needlessly consumes IO bandwidth.
2) Once you're using the ``direct`` deploy_interface, consider using caching. While we don't use it in CI, ironic does have the capability to pass configuration for caching proxy servers. This is set on a per-node basis, and is worth doing if you have any proxy/caching servers deployed on your spine or in your leafs close to the physical nodes. Some timers are also present to enable ironic to re-use swift URLs if you're deploying the same image to multiple servers concurrently. Swift tempurl usage does negatively impact the gain from using a caching proxy, though, so it is something to consider in your architecture and IO pattern. https://docs.openstack.org/ironic/latest/admin/drivers/ipa.html#using-proxies-for-image-download
3) Consider using ``conductor_groups``. If it would help, you can localize conductors to specific pools of machines. This may be useful if you have pools with different security requirements, or if you have multiple spines and can dedicate some conductors per spine. https://docs.openstack.org/ironic/latest/admin/conductor-groups.html
4) Turn off periodic driver tasks for drivers you're not using. Power sync and sensor data collection are two periodic workers that consume resources when they run, and the periodic tasks of other drivers still consume a worker slot and query the database to see if there is work to be done. You may also want to increase the number of permitted workers. Power sync can be a huge issue on older versions. I believe Stein is where we improved the parallelism of the power sync workers in Ironic, and Train now has power state callback with nova, which will greatly reduce the ironic-api and nova-compute processor overhead.
Hope this helps! -Julia
On Mon, Oct 28, 2019 at 3:26 PM fsbiz at yahoo.com wrote: > > Thanks Julia. > In addition to what you mentioned this particular issue seems to have cropped up when we added 100 more baremetal nodes. > > I've also narrowed down the issue (TFTP timeouts) when 3-4 baremetal nodes are in "deploy" state and downloading the OS via iSCSI. Each iSCSI transfer takes about 6 Gbps and thus with four transfers we are over our 20Gbps capacity of the leaf-spine links. We are slowly migrating to iPXE so it should help. > > That being said is there a document on large scale ironic design architectures? > We are looking into a DC design (primarily for baremetals) for upto 2500 nodes. > > thanks, > Fred, > > > On Wednesday, October 23, 2019, 03:19:41 PM PDT, Julia Kreger wrote: > > > Greetings Fred! > > Reply in-line. > > On Tue, Oct 22, 2019 at 12:47 PM fsbiz at yahoo.com wrote: > > [trim] > > > > TFTP logs: shows TFTP client timed out (weird). Any pointers here? > > > Sadly this is one of those things that comes with using TFTP. Issues like this is why the community tends to recommend using ipxe.efi to chainload as you can perform transport over TCP as opposed to UDP where in something might happen mid-transport. > > > tftpd shows ramdisk_deployed completed.
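To make the conductor-scaling recommendations above a little more concrete, here is a rough sketch of the knobs involved. The node UUID, proxy URL and conductor group name are made up for illustration, and the option names should be double-checked against the ironic docs linked above for the release in use:

    # Switch a node to the direct deploy interface and point it at a
    # caching proxy close to its leaf (driver_info keys per the proxy doc)
    openstack baremetal node set <node-uuid> \
        --deploy-interface direct \
        --driver-info image_http_proxy=http://proxy-leaf1.example.com:3128

    # Pin the node to a conductor group serving that part of the fabric
    openstack baremetal node set <node-uuid> --conductor-group leaf1

    # ironic.conf on the conductors meant to serve that group: a larger
    # worker pool for concurrent deploys, sensor data collection off
    [conductor]
    conductor_group = leaf1
    workers_pool_size = 300
    send_sensor_data = false

image_https_proxy and image_no_proxy can be set the same way on the node if the environment needs them.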
Then, it reports that the client timed out. > > > Grub does tend to be very abrupt and not wrap up very final actions. I suspect it may just never be sending the ack back and the transfer may be completing. I'm afraid this is one of those things you really need to see on the console what is going on. My guess would be that your deploy_ramdisk lost a packet in transfer or that it was corrupted in transport. It would be interesting to know if the network card stack is performing checksum validation, but for IPv4 it is optional. > > > [trim] > > > > This has me stumped here. This exact failure seems to be happening 3 to 4 times a week on different nodes. > Any pointers appreciated. > > thanks, > Fred. > > From whayutin at redhat.com Tue Oct 29 14:35:49 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Tue, 29 Oct 2019 08:35:49 -0600 Subject: [tripleo] October 5th TripleO meeting canceled Message-ID: Greetings, Due to the OpenStack PTG, the TripleO meeting scheduled for October 5th will be cancelled. Thank you!! -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Tue Oct 29 14:45:28 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Tue, 29 Oct 2019 08:45:28 -0600 Subject: [tripleo] November 5th TripleO meeting canceled In-Reply-To: References: Message-ID: Correction, That should read November 5th, not October :) On Tue, Oct 29, 2019 at 8:35 AM Wesley Hayutin wrote: > Greetings, > > Due to the OpenStack PTG, the TripleO meeting scheduled for October 5th > will be cancelled. > > Thank you!! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Oct 29 14:54:07 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 29 Oct 2019 15:54:07 +0100 Subject: [ops][nova] Different quotas for different SLAs ? Message-ID: Dear all I would like to set different overcommitments factors for the compute nodes. In particular I would like to use some compute nodes without overcommitments, and some compute nodes with a cpu_allocation_ratio equals to 2.0. To decide if an instance should go to a compute node with or without overcommitment is easy; e.g. it could be done with host aggregates + setting metadata to the relevant flavors/images. But is it in some way possible to decide that a certain project has a quota of x VCPUs without overcommitment, and y VCPUs with overcommitments ? Or the only option is using 2 different projects for the 2 different SLAs (which is something that I would like to avoid) ? Thanks, Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Tue Oct 29 14:59:41 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Tue, 29 Oct 2019 09:59:41 -0500 Subject: [neutron][performance] Cancelling Neutron performance meeting on November 4th Message-ID: Hi everybody, Due to the Summit/ Forum / PTG in Shanghai, we will cancel the Neutron performance meeting on November 4th. We will resume on the 18th Regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Tue Oct 29 15:13:21 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 29 Oct 2019 10:13:21 -0500 Subject: [tc][all] Ussuri community goal candidate 1: 'Project Specific New Contributor & PTL Docs' In-Reply-To: <9e27d91a-6974-32f6-dd81-ddb7c8eee0eb@gmail.com> References: <16df407dbf8.11d25468592036.8156563932102242889@ghanshyammann.com> <9e27d91a-6974-32f6-dd81-ddb7c8eee0eb@gmail.com> Message-ID: <16e18141b32.ba75338a301818.8284518731678814133@ghanshyammann.com> ---- On Fri, 25 Oct 2019 14:26:48 -0500 melanie witt wrote ---- > On 10/22/19 08:13, Ghanshyam Mann wrote: > > Hello Everyone, > > > > We are starting the next step for the Ussuri Cycle Community Goals. We have four candidates till now as proposed in > > etherpad[1]. > > > > The first candidate is "Project Specific New Contributor & PTL Docs". Kendall (diablo_rojo) volunteered to lead this goal > > as Champion. Thanks to her for stepping up for this job. > > > > This idea was brought up during Train cycle goal discussions also[2]. The idea here is to have a consistent and mandatory > > contributors guide in each project which will help new contributors to get onboard in upstream activities. > > Also, create PTL duties guide on the project's side. Few projects might have the PTL duties documented and making it > > consistent and for all projects is something easy for transferring the knowledge. > > Kendall can put up more details and highlights based on queries. > > > > We would like to open this idea to get wider feedback from the community and projects team before we start defining > > the goal in Gerrit. What do you think of this as a community goal? Any query or Improvement Feedback? > > I wrote what I call a "chronological PTL guide" for nova [3], which was > originally a local google doc I had used while I was PTL. I converted > and published it as an in-tree nova doc in case it would be helpful to > others. > > I think having a PTL guide doc is nice and could potentially make it > easier for new or prospective PTLs to learn what is involved in the duties. Thanks Melanie for link, that is very helpful. -gmann > > -melanie > > [3] https://docs.openstack.org/nova/latest/contributor/ptl-guide.html > > > [1] https://etherpad.openstack.org/p/PVG-u-series-goals > > [2] https://etherpad.openstack.org/p/BER-t-series-goals > > > > -gmann & diablo_rojo > > > > > > > From gmann at ghanshyammann.com Tue Oct 29 15:20:43 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 29 Oct 2019 10:20:43 -0500 Subject: [tc][all] Updates on Ussuri cycle community-wide goals Message-ID: <16e181adbc4.1191b0166302215.2291880664205036921@ghanshyammann.com> Hello Everyone, We have two goals with their champions ready for review. Please review and provide your feedback on Gerrit. 1. Add goal for project specific PTL and contributor guides - Kendall Nelson - https://review.opendev.org/#/c/691737/ 2. Propose a new goal to migrate all legacy zuul jobs - Luigi Toscano - https://review.opendev.org/#/c/691278/ We are still looking for the Champion volunteer for RBAC goal[1]. 
If you have any new ideas for goal, do not hesitate to add in etherpad[2] [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010291.html [2] https://etherpad.openstack.org/p/PVG-u-series-goals -gmann & diablo_rojo From openstack at fried.cc Tue Oct 29 15:40:37 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 29 Oct 2019 10:40:37 -0500 Subject: [ptg][nova][neutron][ironic][cinder][keystone][cyborg] Cross-project sessions Message-ID: All- Time to coordinate cross-project sessions at the PTG. slaweq approached me about a nova/neutron session so I penciled it in on the nova etherpad [1]. If anyone has conflicts/objections with that time, please speak up and we'll try to coordinate. Other teams (including nova): If cross-project meetings are necessary, please suggest times in the etherpad and/or on this thread and/or by grabbing me (efried) on IRC. (Note that stephenfin (I think) will be running the room, but he's on vacation until the event.) Thanks, efried [1] https://etherpad.openstack.org/p/nova-shanghai-ptg From openstack at fried.cc Tue Oct 29 15:45:24 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 29 Oct 2019 10:45:24 -0500 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: References: Message-ID: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> Massimo- > To decide if an instance should go to a compute node with or without > overcommitment is easy; e.g. it could be done with host aggregates + > setting metadata to the relevant flavors/images. You could also use custom traits. > But is it in some  way possible to decide that a certain project has a > quota of  x VCPUs without overcommitment, and y VCPUs with overcommitments ? I'm not sure whether this helps, but it's easy to detect the allocation ratio of a compute node's VCPU resource via placement with GET /resource_providers/$cn_uuid/inventories/VCPU [1]. But breaking down a VCPU quota into different "classes" of VCPU sounds... impossible to me. But since you said > In particular I would like to use some compute nodes without > overcommitments ...perhaps it would help you to use PCPUs instead of VCPUs for these. We started reporting PCPUs in Train [2]. efried [1] https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory [2] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html From smooney at redhat.com Tue Oct 29 16:17:16 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 29 Oct 2019 16:17:16 +0000 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> References: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> Message-ID: <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> the normal way to achive this in the past would have been to create host aggreate and then use the AggregateTypeAffinityFilter to map flavor to specific host aggrates. so you can have a 2xOvercommit and a 4xOvercommit and map them to different host aggrates that have different over commit ratios set on the compute nodes. On Tue, 2019-10-29 at 10:45 -0500, Eric Fried wrote: > Massimo- > > > To decide if an instance should go to a compute node with or without > > overcommitment is easy; e.g. it could be done with host aggregates + > > setting metadata to the relevant flavors/images. ya that basicaly the same as what i said above > > You could also use custom traits. 
traits would work yes it woudl be effectivly the same but would have the advatage of having placment do most of the filtering so it should perform better. > > > But is it in some way possible to decide that a certain project has a > > quota of x VCPUs without overcommitment, and y VCPUs with overcommitments ? > > I'm not sure whether this helps, but it's easy to detect the allocation > ratio of a compute node's VCPU resource via placement with GET > /resource_providers/$cn_uuid/inventories/VCPU [1]. > > But breaking down a VCPU quota into different "classes" of VCPU > sounds... impossible to me. this is something that is not intended to be supported with unified limits at least not initially? ever? > > But since you said > > > In particular I would like to use some compute nodes without > > overcommitments > > ...perhaps it would help you to use PCPUs instead of VCPUs for these. We > started reporting PCPUs in Train [2]. ya pcpus are a good choice for the nova over commit case for cpus. hugepages are the equivalent for memory. idealy you should avoid disk over commit but if you have to do it use cinder when you need over commit and local storage whne you do not. > > efried > > [1] > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory > [2] > http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html > From Tim.Bell at cern.ch Tue Oct 29 17:09:59 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Tue, 29 Oct 2019 17:09:59 +0000 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> References: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> Message-ID: <6788F9F1-F95C-4BAE-8E55-2BE14C321DA6@cern.ch> We’ve had similar difficulties with a need to quota flavours .. cinder has a nice feature for this but with nova, I think we ended up creating two distinct projects and exposing the different flavours to the different projects, each with the related quota… from a user interface perspective, it means they’re switching projects more often than is ideal but it does control the limits. Tim > On 29 Oct 2019, at 17:17, Sean Mooney wrote: > > the normal way to achive this in the past would have been to create host aggreate and then > use the AggregateTypeAffinityFilter to map flavor to specific host aggrates. > > so you can have a 2xOvercommit and a 4xOvercommit and map them to different host aggrates that have different over > commit ratios set on the compute nodes. > > On Tue, 2019-10-29 at 10:45 -0500, Eric Fried wrote: >> Massimo- >> >>> To decide if an instance should go to a compute node with or without >>> overcommitment is easy; e.g. it could be done with host aggregates + >>> setting metadata to the relevant flavors/images. > ya that basicaly the same as what i said above >> >> You could also use custom traits. > traits would work yes it woudl be effectivly the same but would have the advatage of having placment > do most of the filtering so it should perform better. >> >>> But is it in some way possible to decide that a certain project has a >>> quota of x VCPUs without overcommitment, and y VCPUs with overcommitments ? >> >> I'm not sure whether this helps, but it's easy to detect the allocation >> ratio of a compute node's VCPU resource via placement with GET >> /resource_providers/$cn_uuid/inventories/VCPU [1]. 
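To make the aggregate-based split described earlier in this thread a little more concrete, here is a rough sketch using the AggregateInstanceExtraSpecsFilter variant of what Sean describes (rather than AggregateTypeAffinityFilter). The aggregate, flavor and host names are illustrative, and the last command assumes the osc-placement CLI plugin is installed:

    # Group the no-overcommit hosts into their own aggregate and tag it
    openstack aggregate create --property cpu_overcommit=none agg-no-overcommit
    openstack aggregate add host agg-no-overcommit compute-01

    # Tie a flavor to that aggregate; AggregateInstanceExtraSpecsFilter
    # (listed in [filter_scheduler]/enabled_filters) matches flavor extra
    # specs against aggregate metadata
    openstack flavor set m1.large.guaranteed \
        --property aggregate_instance_extra_specs:cpu_overcommit=none

    # nova.conf on the hosts in that aggregate carries the ratio
    [DEFAULT]
    cpu_allocation_ratio = 1.0

    # The effective ratio each compute node reports can be checked in
    # placement, as noted above
    openstack resource provider inventory show <compute-node-uuid> VCPU

Note that this only partitions capacity by flavor and aggregate; it does not give a project a quota of "x VCPUs without overcommitment and y with", which is the part of the question that has no clean answer today.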
>> >> But breaking down a VCPU quota into different "classes" of VCPU >> sounds... impossible to me. > this is something that is not intended to be supported with unified limits at least not initially? ever? >> >> But since you said >> >>> In particular I would like to use some compute nodes without >>> overcommitments >> >> ...perhaps it would help you to use PCPUs instead of VCPUs for these. We >> started reporting PCPUs in Train [2]. > ya pcpus are a good choice for the nova over commit case for cpus. > hugepages are the equivalent for memory. > idealy you should avoid disk over commit but if you have to do it use cinder when you > need over commit and local storage whne you do not. >> >> efried >> >> [1] >> > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory >> [2] >> http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html >> > > From kendall at openstack.org Tue Oct 29 17:22:38 2019 From: kendall at openstack.org (Kendall Waters) Date: Tue, 29 Oct 2019 12:22:38 -0500 Subject: Important Shanghai PTG Information In-Reply-To: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Hello Everyone! The Shanghai PTG is just around the corner! Here are a few reminders for you. Also, please review the previous email for additional information. Registration & Badges Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. Since there is a single registration for the event, there is also one badge for both events. You will pick it up when you check in for the Summit and keep it until the end of the PTG. Registration/Help Desk hours: Sunday, November 3: 1:00pm - 4:30pm Monday, November 4: 7:30am - 5:00pm Tuesday, November 5: 8:30am - 5:00pm Wednesday, November 6: 8:30am - 5:00pm Thursday, November 7: 8:30am - 4:30pm Friday, November 8: 8:30am - 4:30pm Starting Wednesday, there will be a PTG help desk located on level 4 outside of the Blue Hall. If you have any issues, you can ping wendallkaters or diablo_rojo on IRC or email ptg at openstack.org . PTG bot The event is organized around separate 'tracks' (generally tied to a specific team/group). Topics of discussion are loosely scheduled in those tracks, based on the needs of the attendance. This allows to maximize attendee productivity, but the downside is that it can make the event a bit confusing to navigate. To mitigate that issue, we are using an IRC bot to expose what's happening currently at the event at the following page: http://ptg.openstack.org/ptg.html Unofficial Game Night OSF staff are planning on hanging out at the Shanghai Marriott Hotel City Centre (Address: 555 Xizang Middle Rd, Huangpu, Shanghai, China, 200003) lobby bar starting at 8:00pm. For those interested, please enter your name in the etherpad: https://etherpad.openstack.org/p/pvg-game-night Feel free to reach out if you have any questions. Cheers, The Kendalls (wendallkaters & diablo_rojo) > On Oct 9, 2019, at 12:06 PM, Kendall Waters wrote: > > Hello Everyone! > > As I’m sure you already know, the Shanghai PTG is going to be a very different event from PTGs in the past so we wanted to spell out the differences so you can be better prepared. > > Registration & Badges > > Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. 
Since there is a single registration for the event, there is also one badge for both events. You will pick it up when you check in for the Summit and keep it until the end of the PTG. > > The Space > > Rooms > > The space we are contracted to have for the PTG will be laid out differently. We only have a couple dedicated rooms which are allocated to those groups with the largest numbers of people. The rest of the teams will be in a single larger room together. To help people gather teams in an organized fashion, we will be naming the arrangements of tables after OpenStack releases (Austin, Bexar, Cactus, etc). > > Food & Beverage Rules > > Unfortunately, the venue does not allow ANY food or drink in any of the rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in the beautiful pre-function space outside of the Blue Hall. > > Moving Furniture > > You are allowed to! Yay! If the table arrangements your project/team/group lead requested don’t work for you, feel free to move the furniture around. That being said, try to keep the tables marked with their names so that others can find them during their time slots. There will also be extra chairs stacked in the corner if your team needs them. > > Hours > > This venue is particularly strict about the hours we are allowed to be there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the evening. Its reasonably likely that if you try to come early or stay late, security will talk to you. So please be kind and respectfully leave if they ask you to. > > Resources > > Power > > While we have been working with the venue to accomodate our power needs, we won’t have as many power strips as we have had in the past. For this reason, we want to remind everyone to charge all their devices every night and share the power strips we do have during the day. Sharing is caring! > > Flipcharts > > While we won’t have projection available, we will have some flipcharts around. Each dedicated room will have one flipchart and the big main room will have a few to share. Please feel free to grab one when you need it, but put it back when you are finished so that others can use it if they need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, but we will also have a lot of new faces. With this in mind, we have decided to add project onboarding to the PTG so that the new contributors can get up to speed with the projects meeting that week. The teams gathering that will be doing onboarding will have that denoted on the print and digital schedule on site. They have also been encouraged to promote when they will be doing their onboarding via the PTGBot and on the mailing lists. > > If you have any questions, please let us know! > > Cheers, > The Kendalls > (wendallkaters & diablo_rojo) > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Tue Oct 29 17:23:53 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Tue, 29 Oct 2019 10:23:53 -0700 Subject: [keystone] Upcoming team meetings and PTGs In-Reply-To: References: Message-ID: <06575513-3416-40cd-9a5e-8e7bc3a1044e@www.fastmail.com> On Mon, Oct 28, 2019, at 10:35, Colleen Murphy wrote: > Hi team, > > Tomorrow, October 29, we are having our virtual pre-PTG meeting from > 14:00-17:30 UTC in lieu of our regular 16:00 IRC meeting. 
> > Tuesday, November 5 some of us will be in Shanghai for the Forum and I > will not be available to chair the meeting, so I propose we skip it. > > Tuesday, November 12, we will have our virtual post-PTG meeting from > 15:00-17:30 UTC in lieu of our regular 16:00 IRC meeting. Since some of us will still be traveling or recovering the week of the 12th, we decided today to postpone the post-PTG meeting to Tuesday, November 19. We'll hold a regular IRC meeting on the 12th. I also propose we skip the meeting on the 26th since that week is Thanksgiving in the US which most of us will be observing. Colleen > > Details about the PTG sessions can be found on the planning etherpad: > > https://etherpad.openstack.org/p/keystone-shanghai-ptg > > Colleen > > From rleander at redhat.com Tue Oct 29 09:13:57 2019 From: rleander at redhat.com (Rain Leander) Date: Tue, 29 Oct 2019 10:13:57 +0100 Subject: [PTG] Unoffical Game Night In-Reply-To: References: Message-ID: RDO can handle the tab for an hour or so. And I'll bring an escape room or three. See you there! ~Rain. On Tue, Oct 29, 2019 at 1:44 AM Kendall Nelson wrote: > Hello! > > While we won't have space in any official capacity, I am planning on > bringing games and thinking Thursday would be a good night to play them if > anyone wants to bring games and join me. Myself + other OSF staff are > planning on hanging out at the Shanghai Marriott Hotel City Centre (Address: > 555 Xizang Middle Rd, Huangpu, Shanghai, China, 200003) lobby bar > starting at 8:00. > > For those interested I'm gathering an etherpad of people + games[1]. > > Hope to see you there! > > -Kendall (diablo_rojo) > > [1] https://etherpad.openstack.org/p/pvg-game-night > -- K Rain Leander OpenStack Community Liaison Open Source Program Office https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From sagarun at gmail.com Tue Oct 29 17:43:37 2019 From: sagarun at gmail.com (Arun SAG) Date: Tue, 29 Oct 2019 10:43:37 -0700 Subject: WFH Message-ID: I have a doctors appointment around 3:45pm PST. Will WFH today. -- Arun S A G http://zer0c00l.in/ From gmann at ghanshyammann.com Tue Oct 29 19:53:11 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 29 Oct 2019 14:53:11 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> Message-ID: <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> ---- On Thu, 24 Oct 2019 14:32:03 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > We had good amount of discussion on the final plan and schedule in today's TC office hour[1]. > > I captured the agreement on each point in etherpad (you can see the AGREE:). Also summarizing > the discussions here. Imp point is if your projects are planning to keep the py2.7 support then do not delay > to tell us. Reply on this ML thread or add your project in etherpad. > > - Projects can start dropping the py2.7 support. Common lib and testing tools need to wait until milestone-2. 
> ** pepe8 job to be included in openstack-python3-ussuri-jobs-* templates - https://review.opendev.org/#/c/688997/ > ** You can drop openstack-python-jobs template and start using ussuri template once 688997 patch is merged. > ** Cross projects dependency (if any ) can be sync up among dependent projects. > > - I will add this plan and schedule as a community goal. The goal is more about what all things to do and when. > ** If any project keeping the support then it has to be notified explicitly for its consumer. > > - Schedule: > The schedule is aligned with the Ussuri cycle milestone[2]. I will add the plan in the release schedule also. > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > ** Project to start dropping the py2 support along with all the py2 CI jobs. > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > ** This will give enough time to projects to drop the py2 support. > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > This is enough time to measure such break or anything extra to do before ussuri final release. > > Other discussions points and agreement: > - Projects want to keep python 2 support and need oslo, QA or any other dependent projects/lib support: > ** swift. AI: gmann to reach out to swift team about the plan and exact required things from its dependency > (the common lib/testing tool). I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are client lib/middleware swift required for py2 testing. @timburke, feel free to update if any missing point. - devstack. able to keep running swift on py2 and rest all services can be on py3 - keystonemiddleware and its dependency - keystoneclient and openstackclient (dep of keystonemiddleware) - castellan and barbicanclient As those lib/middleware going to drop the py2.7 support in phase-2, we need to cap them for swift. I think capping them for python2.7 in upper constraint file would not affect any other users but Matthew Thode can explain better how that will work from the requirement constraint perspective. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2019-10-28.log.html#t2019-10-28T16:37:33 -gmann > ** List your project if you want to keep the py2 support. > ** Action item: TC liaisons to reach out to their projects and make sure they are aware of this plan[3]. > > - How to test the upgrade from python2 -> python3 > ** AGREE: let's drop the integrated testing for py2->py3 and oslo can check if they can do functional testing of > oslo.messaging - https://bugs.launchpad.net/oslo.messaging/+bug/1792977 > > - What are our guidelines to backport fixes to stable branches? > ** AGREE: This will be rare case and if that happen then tweaking for py27 in the backport is fine. The stable > branch backport policy does not need any changing for this > > - What is the tactics of dropping openstack-tox-py27 in our gate? 
> ** AGREE: on merging the pep8 job in ussuri job template https://review.opendev.org/#/c/688997/ > > [1] http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2019-10-24.log.html#t2019-10-24T14:49:12 > [2] https://releases.openstack.org/ussuri/schedule.html > [3] https://governance.openstack.org/tc/reference/tc-liaisons.html > > -gmann > > > ---- On Thu, 24 Oct 2019 09:55:58 -0500 Ghanshyam Mann wrote ---- > > Just a reminder, discussion for this is going to start in #openstack-tc channel in another 5 min. > > > > -gmann > > > > > > ---- On Tue, 15 Oct 2019 13:18:03 -0500 Ghanshyam Mann wrote ---- > > > Hello Everyone, > > > > > > Python 2.7 is going to retire in Jan 2020 [1] and we planned to drop the python 2 support from OpenStack > > > during the start of the Ussuri cycle[2]. > > > > > > Time has come now to start the planning on dropping the Python2. It needs to be coordinated among various > > > Projects, libraries, vendors driver, third party CI and testing frameworks. > > > > > > * Preparation for the Plan & Schedule: > > > > > > Etherpad: https://etherpad.openstack.org/p/drop-python2-support > > > > > > We discussed it in TC to come up with the plan, execute it smoothly and avoid breaking any dependent projects. > > > I have prepared an etherpad[3](mentioned above also) to capture all the points related to this topic and most importantly > > > the draft schedule about who can drop the support and when. The schedule is in the draft state and not final yet. > > > The most important points are if you are dropping the support then all your consumers (OpenStack Projects, Vendors drivers etc) > > > are ready for that. For example, oslo, os-bricks, client lib, testing framework projects will keep the python2 support until we make > > > sure all the consumers of those projects do not require py2 support. If anyone require then how long they can support py2. > > > These libraries, testing frameworks will be the last one to drop py2. > > > > > > We have planned to have a dedicated discussion in TC office hours on the 24th Thursday #openstack-tc channel. We will > > > discuss what all need to be done and the schedules. > > > > > > You do not have to drop it immediately and keep eyes on this ML thread till we get the consensus on the > > > community-level plan and schedule. > > > > > > Meanwhile, you can always start pre-planning for your projects, for example, stephenfin has started for Nova[4] to > > > migrate the third party CI etc. Cinder has coordinated with all vendor drivers & their CI to migrate from py2 to py3. > > > > > > * Projects want to keep the py2 support? > > > There is no mandate that projects have to drop the py2 support right now. If you want to keep the support then key things > > > to discuss are what all you need and does all your dependent projects/libs provide the support of py2. This is something needs to be > > > discussed case by case. If any project wants to keep the support, add that in the etherpad with a brief reason which will > > > be helpful to discuss the need and feasibility. > > > > > > Feel free to provide feedback or add the missing point on the etherpad. Do not forget to attend the 24th Oct 2019, TC > > > office hour on Thursday at 1500 UTC in #openstack-tc. 
> > > > > > > > > [1] https://pythonclock.org/ > > > [2] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > > [3] https://etherpad.openstack.org/p/drop-python2-support > > > [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html > > > > > > -gmann > > > > > > > > > > > > > > > > > > > > > From gmann at ghanshyammann.com Tue Oct 29 20:23:33 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 29 Oct 2019 15:23:33 -0500 Subject: [nova]review guide for the policy default refresh spec In-Reply-To: <16c956e577d.e430310375433.1480190884000484078@ghanshyammann.com> References: <16c956e577d.e430310375433.1480190884000484078@ghanshyammann.com> Message-ID: <16e19301d34.fae57b18311778.4123684314278393774@ghanshyammann.com> Bumping this email. Spec is merged for Ussuri cycle now. Please refer to the below review guide for this BP work. -gmann ---- On Thu, 15 Aug 2019 08:18:52 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > As many of you might know that in Train, we are doing Nova policy changes to adopt the > keystone's new defaults and scope type[1]. There are multiple changes required per policy as > mentioned in spec. I am writing this review guide for the patch sequence and at the end how > each policy will look like. > > I have prepared the first set of patches. I would like to get the feedback on those so that we > can modify the other policy also on the same line. My plan is to start the other policy work after > we merge the first set of policy changes. > > Patch sequence: Example: os-services API policy: > ------------------------------------------------------------- > 1. Cover/Improve the test coverage for existing policies: > This will be the first patch. We do not have good test coverage of the policy, current tests are > not at all useful and do not perform the real checks. Idea is to add the actual tests coverage for > each policy as the first patch. new tests try to access the API with all possible context and check > for positive and negative cases. > - https://review.opendev.org/#/c/669181/ > > 2. Introduce scope_types: > This will add the scope_type for policy. It will be either 'system', 'project' or 'system and project'. > In the same patch, along with existing test working as it is, new tests of scope type will be added which will > run with [oslo_policy] enforce_scope=True so that we can capture the real scope checks. > - https://review.opendev.org/#/c/645427/ > > 3. Add new default roles: > This will add new defaults which can be SYSTEM_ADMIN, SYSTEM_READER, PROJECT_MEMBER_OR_SYSTEM_ADMIN, PROJECT_READER_OR_SYSTEM_READER etc depends on Policy. > Test coverage of new defaults, as well as deprecated defaults, are covered in same patch. This patch will > add the granularity in policy if needed. Without policy granularity, we cannot add new defaults per rule. > - https://review.opendev.org/#/c/648480/ (I need to add more tests for deprecated rules) > > 4. Pass actual Targets in policy: > This is to pass the actual targets in context.can(). Main goal is to remove the defaults targets which is > nothing but context'user_id,project_id only. It will be {} if no actual target data needed in check_str. > - https://review.opendev.org/#/c/676688/ > > Patch sequence: Example: Admin Action API policy: > 1. https://review.opendev.org/#/c/657698/ > 2. https://review.opendev.org/#/c/657823/ > 3. https://review.opendev.org/#/c/676682/ > 4. 
https://review.opendev.org/#/c/663095/ > > There are other patches I have posted in between for common changes or fix or framework etc. > > [1] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/policy-default-refresh.html > > -gmann > > > From stig.openstack at telfer.org Tue Oct 29 20:49:10 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 29 Oct 2019 20:49:10 +0000 Subject: [scientific-sig] IRC Meeting today (about 10 minutes time) Message-ID: Hi All - We have a Scientific SIG IRC meeting coming up at 2100 UTC in channel #openstack-meeting. Everyone is welcome. Today’s agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_October_29th_2019 Cheers, Stig From mriedemos at gmail.com Tue Oct 29 21:32:59 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 29 Oct 2019 16:32:59 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> Message-ID: <98647208-6a47-8aa5-7835-02a602c44dbe@gmail.com> On 10/29/2019 2:53 PM, Ghanshyam Mann wrote: > I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are > client lib/middleware swift required for py2 testing. > @timburke, feel free to update if any missing point. > > - keystoneclient and openstackclient (dep of keystonemiddleware) openstackclient is not a library so I'm a bit confused about the point about keeping python2 support in python-openstackclient for swift. I mention it because Eric Fried noticed today [1] that CI is blocked for OSC because the osc-functional-devstack* jobs seem to have mysteriously switched over to python3 but the functional tox target is expecting python2. So there is talk about moving OSC forward and dropping python2 support as a result of that. [1] https://review.opendev.org/#/c/691980/ -- Thanks, Matt From gmann at ghanshyammann.com Tue Oct 29 22:29:10 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 29 Oct 2019 17:29:10 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <98647208-6a47-8aa5-7835-02a602c44dbe@gmail.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> <98647208-6a47-8aa5-7835-02a602c44dbe@gmail.com> Message-ID: <16e19a31c63.f085b664313604.5181888794688963432@ghanshyammann.com> ---- On Tue, 29 Oct 2019 16:32:59 -0500 Matt Riedemann wrote ---- > On 10/29/2019 2:53 PM, Ghanshyam Mann wrote: > > I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are > > client lib/middleware swift required for py2 testing. > > @timburke, feel free to update if any missing point. > > > > - keystoneclient and openstackclient (dep of keystonemiddleware) > > openstackclient is not a library so I'm a bit confused about the point > about keeping python2 support in python-openstackclient for swift. 
We do not need to keep the py2 in OSC, we are good to drop it in phase-1 and other client lib (python-*client) are target in phase-2. As swift keeping the py2.7 support, we need to find out how we can make it work for swift and its py2.7 testing. Capping the swift's dependency version for python2.7 is an option where dependency can drop the support and swift use of the older version for py2.7 support/testing. Till now there is no strict requirement from anyone to block anyone to drop the py2.7. If any project like Swift which is the only project in the list of keeping the support has other projects/lib dependency then they need to work with version cap (as long as dependency also keeping the support). -gmann > > I mention it because Eric Fried noticed today [1] that CI is blocked for > OSC because the osc-functional-devstack* jobs seem to have mysteriously > switched over to python3 but the functional tox target is expecting > python2. So there is talk about moving OSC forward and dropping python2 > support as a result of that. > > [1] https://review.opendev.org/#/c/691980/ > > -- > > Thanks, > > Matt > > From colleen at gazlene.net Tue Oct 29 23:53:19 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Tue, 29 Oct 2019 16:53:19 -0700 Subject: =?UTF-8?Q?Re:_[tc][all][goal]:_Ussuri_community_goal_candidate_3:_'Consi?= =?UTF-8?Q?stent_and_secure_default_policies'?= In-Reply-To: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com> References: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com> Message-ID: <68d21e29-0ef8-42b6-bcd1-6d7b5fe2ea79@www.fastmail.com> On Tue, Oct 22, 2019, at 09:50, Ghanshyam Mann wrote: > Hello Everyone, > > This is 3rd proposal candidate for the Ussuri cycle community-wide > goal. The other two are [1]. > > Colleen proposed this idea for the Ussuri cycle community goal. > > Projects implemented/plan to implement this: > *Keystone already implemented this with all necessary support in > oslo.policy with nice documentation. > * We discussed this in nova train PTG to implement it in nova [2]. Nova > spec was merged in Train but > could not implement. I have re-proposed the spec for the Ussuri > cycle [3]. > > This is nice idea as a goal from the user perspective. Colleen has less > bandwidth to drive this goal alone. > We are looking for a champion or co-champions (1-2 people will be much > better) this goal along with Colleen. > > Also, let us know what do you think of this as a community goal? Any > query or Improvement Feedback? [snipped] It's possible that this won't work very well as a community goal. This migration took the keystone team about two cycles to implement, not including all the planning and foundational work that needed to happen first. In our virtual PTG meeting today we discussed the possibility of instead forming a pop-up team around this work so that a group of individuals (across several teams, not just the keystone team) could target the largest projects and make more of a dent in the work over a couple of cycles before using the community goal model to close the gaps in the smaller or more peripheral projects. However, we will still need to get buy-in from the projects so that we can be sure that the work the pop-up team does gets prioritized and reviewed. 
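For anyone new to this effort, here is a rough sketch of what the per-rule change tends to look like, following the pattern keystone used. The rule name, roles and paths are illustrative, and the exact placement of the deprecation keywords differs a little between oslo.policy releases:

    from oslo_policy import policy

    deprecated_get_widget = policy.DeprecatedRule(
        name='example:get_widget',
        check_str='rule:admin_or_owner',
    )

    rules = [
        policy.DocumentedRuleDefault(
            name='example:get_widget',
            # new default: a reader role with an explicit scope
            check_str='role:reader and project_id:%(project_id)s',
            scope_types=['project'],
            description='Show widget details.',
            operations=[{'path': '/v2/widgets/{widget_id}', 'method': 'GET'}],
            # keep the old default working, with a warning, during the
            # deprecation period so operators can migrate at their own pace
            deprecated_rule=deprecated_get_widget,
            deprecated_reason='Replaced by reader-role defaults.',
            deprecated_since='U',
        ),
    ]

    def list_rules():
        return rules

Because oslo.policy keeps honoring the deprecated check string (and logs a warning) until operators opt into the new defaults, this kind of change can land incrementally, project by project, which is what makes spreading the work across a pop-up team practical.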
I've drafted a set of steps that we'd expect projects to follow to get this done: https://etherpad.openstack.org/p/policy-migration-steps And we're holding a forum session on it next week: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24357/next-steps-for-policy-in-openstack Colleen From mthode at mthode.org Wed Oct 30 00:40:35 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 29 Oct 2019 19:40:35 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> Message-ID: <20191030004035.rsuegdsij2eezps3@mthode.org> On 19-10-29 14:53:11, Ghanshyam Mann wrote: > ---- On Thu, 24 Oct 2019 14:32:03 -0500 Ghanshyam Mann wrote ---- > > Hello Everyone, > > > > We had good amount of discussion on the final plan and schedule in today's TC office hour[1]. > > > > I captured the agreement on each point in etherpad (you can see the AGREE:). Also summarizing > > the discussions here. Imp point is if your projects are planning to keep the py2.7 support then do not delay > > to tell us. Reply on this ML thread or add your project in etherpad. > > > > - Projects can start dropping the py2.7 support. Common lib and testing tools need to wait until milestone-2. > > ** pepe8 job to be included in openstack-python3-ussuri-jobs-* templates - https://review.opendev.org/#/c/688997/ > > ** You can drop openstack-python-jobs template and start using ussuri template once 688997 patch is merged. > > ** Cross projects dependency (if any ) can be sync up among dependent projects. > > > > - I will add this plan and schedule as a community goal. The goal is more about what all things to do and when. > > ** If any project keeping the support then it has to be notified explicitly for its consumer. > > > > - Schedule: > > The schedule is aligned with the Ussuri cycle milestone[2]. I will add the plan in the release schedule also. > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > ** Project to start dropping the py2 support along with all the py2 CI jobs. > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > > ** This will give enough time to projects to drop the py2 support. > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > > This is enough time to measure such break or anything extra to do before ussuri final release. > > > > Other discussions points and agreement: > > - Projects want to keep python 2 support and need oslo, QA or any other dependent projects/lib support: > > ** swift. AI: gmann to reach out to swift team about the plan and exact required things from its dependency > > (the common lib/testing tool). > > I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are > client lib/middleware swift required for py2 testing. > @timburke, feel free to update if any missing point. > > - devstack. 
able to keep running swift on py2 and rest all services can be on py3 > - keystonemiddleware and its dependency > - keystoneclient and openstackclient (dep of keystonemiddleware) > - castellan and barbicanclient > > > As those lib/middleware going to drop the py2.7 support in phase-2, we need to cap them for swift. > I think capping them for python2.7 in upper constraint file would not affect any other users but Matthew Thode can > explain better how that will work from the requirement constraint perspective. > > [1] http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2019-10-28.log.html#t2019-10-28T16:37:33 > > -gmann > ya, there are examples already for libs that have dropped py2 support. What you need to do is update global requirements to be something like the following. sphinx!=1.6.6,!=1.6.7,<2.0.0;python_version=='2.7' # BSD sphinx!=1.6.6,!=1.6.7,!=2.1.0;python_version>='3.4' # BSD or keyring<19.0.0;python_version=='2.7' # MIT/PSF keyring;python_version>='3.4' # MIT/PSF -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From massimo.sgaravatto at gmail.com Wed Oct 30 05:36:00 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Wed, 30 Oct 2019 06:36:00 +0100 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: <6788F9F1-F95C-4BAE-8E55-2BE14C321DA6@cern.ch> References: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> <6788F9F1-F95C-4BAE-8E55-2BE14C321DA6@cern.ch> Message-ID: Thanks a lot for your feedbacks The possibility to quota flavours would indeed address also my use case. Cheers, Massimo On Tue, Oct 29, 2019 at 6:14 PM Tim Bell wrote: > We’ve had similar difficulties with a need to quota flavours .. cinder has > a nice feature for this but with nova, I think we ended up creating two > distinct projects and exposing the different flavours to the different > projects, each with the related quota… from a user interface perspective, > it means they’re switching projects more often than is ideal but it does > control the limits. > > Tim > > > On 29 Oct 2019, at 17:17, Sean Mooney wrote: > > > > the normal way to achive this in the past would have been to create host > aggreate and then > > use the AggregateTypeAffinityFilter to map flavor to specific host > aggrates. > > > > so you can have a 2xOvercommit and a 4xOvercommit and map them to > different host aggrates that have different over > > commit ratios set on the compute nodes. > > > > On Tue, 2019-10-29 at 10:45 -0500, Eric Fried wrote: > >> Massimo- > >> > >>> To decide if an instance should go to a compute node with or without > >>> overcommitment is easy; e.g. it could be done with host aggregates + > >>> setting metadata to the relevant flavors/images. > > ya that basicaly the same as what i said above > >> > >> You could also use custom traits. > > traits would work yes it woudl be effectivly the same but would have the > advatage of having placment > > do most of the filtering so it should perform better. > >> > >>> But is it in some way possible to decide that a certain project has a > >>> quota of x VCPUs without overcommitment, and y VCPUs with > overcommitments ? > >> > >> I'm not sure whether this helps, but it's easy to detect the allocation > >> ratio of a compute node's VCPU resource via placement with GET > >> /resource_providers/$cn_uuid/inventories/VCPU [1]. 
> >> > >> But breaking down a VCPU quota into different "classes" of VCPU > >> sounds... impossible to me. > > this is something that is not intended to be supported with unified > limits at least not initially? ever? > >> > >> But since you said > >> > >>> In particular I would like to use some compute nodes without > >>> overcommitments > >> > >> ...perhaps it would help you to use PCPUs instead of VCPUs for these. We > >> started reporting PCPUs in Train [2]. > > ya pcpus are a good choice for the nova over commit case for cpus. > > hugepages are the equivalent for memory. > > idealy you should avoid disk over commit but if you have to do it use > cinder when you > > need over commit and local storage whne you do not. > >> > >> efried > >> > >> [1] > >> > > > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory > >> [2] > >> > http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Wed Oct 30 07:05:12 2019 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 30 Oct 2019 08:05:12 +0100 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: References: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> <6788F9F1-F95C-4BAE-8E55-2BE14C321DA6@cern.ch> Message-ID: <594252EA-C717-4493-8FA6-27EEA7C19B48@cern.ch> Another use case where per flavour quotas would be helpful is bare metal provisioning: since the flavors are tied via the resource class to specific classes of physical machines, the usual instance/cores/RAM quotas do not help the user to see how many instances of which type can still be created. Having per flavor (resource class, h/w type) projects is what we do for larger chunks of identical hardware, but this is less practical for users with access to fewer machines of many different types. Cheers, Arne > On 30 Oct 2019, at 06:36, Massimo Sgaravatto wrote: > > Thanks a lot for your feedbacks > The possibility to quota flavours would indeed address also my use case. > > Cheers, Massimo > > > On Tue, Oct 29, 2019 at 6:14 PM Tim Bell > wrote: > We’ve had similar difficulties with a need to quota flavours .. cinder has a nice feature for this but with nova, I think we ended up creating two distinct projects and exposing the different flavours to the different projects, each with the related quota… from a user interface perspective, it means they’re switching projects more often than is ideal but it does control the limits. > > Tim > > > On 29 Oct 2019, at 17:17, Sean Mooney > wrote: > > > > the normal way to achive this in the past would have been to create host aggreate and then > > use the AggregateTypeAffinityFilter to map flavor to specific host aggrates. > > > > so you can have a 2xOvercommit and a 4xOvercommit and map them to different host aggrates that have different over > > commit ratios set on the compute nodes. > > > > On Tue, 2019-10-29 at 10:45 -0500, Eric Fried wrote: > >> Massimo- > >> > >>> To decide if an instance should go to a compute node with or without > >>> overcommitment is easy; e.g. it could be done with host aggregates + > >>> setting metadata to the relevant flavors/images. > > ya that basicaly the same as what i said above > >> > >> You could also use custom traits. 
> > traits would work yes it woudl be effectivly the same but would have the advatage of having placment > > do most of the filtering so it should perform better. > >> > >>> But is it in some way possible to decide that a certain project has a > >>> quota of x VCPUs without overcommitment, and y VCPUs with overcommitments ? > >> > >> I'm not sure whether this helps, but it's easy to detect the allocation > >> ratio of a compute node's VCPU resource via placement with GET > >> /resource_providers/$cn_uuid/inventories/VCPU [1]. > >> > >> But breaking down a VCPU quota into different "classes" of VCPU > >> sounds... impossible to me. > > this is something that is not intended to be supported with unified limits at least not initially? ever? > >> > >> But since you said > >> > >>> In particular I would like to use some compute nodes without > >>> overcommitments > >> > >> ...perhaps it would help you to use PCPUs instead of VCPUs for these. We > >> started reporting PCPUs in Train [2]. > > ya pcpus are a good choice for the nova over commit case for cpus. > > hugepages are the equivalent for memory. > > idealy you should avoid disk over commit but if you have to do it use cinder when you > > need over commit and local storage whne you do not. > >> > >> efried > >> > >> [1] > >> > > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory > >> [2] > >> http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html > >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Wed Oct 30 07:05:59 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 30 Oct 2019 15:05:59 +0800 Subject: [auto-scaling] SIG PTG schedule. please join us :) Message-ID: Hi all PTG is right next week, so if you're interested in Auto-scaling features, please join us. Our session will be hosted on 11/6 for Half-day from 9:00 am to 12:30 pm in room 431. You can suggest PTG sessions in our etherpad: https://etherpad.openstack.org/p/PVG-auto-scaling-sig Feel free to join our IRC channel as well: #openstack-auto-scaling Also, you can check out our new documents for autoscaling: https://docs.openstack.org/auto-scaling-sig And provide any feedback/ bug report for any Auto-scaling issue related to OpenStack in https://storyboard.openstack.org/#!/project/openstack/auto-scaling-sig Here's my Wechat ID: RicoLinCloudGeek See you all in Shanghai! -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Wed Oct 30 07:23:14 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 30 Oct 2019 15:23:14 +0800 Subject: [meta-sig] SIG PTG schedule. feel free to join! Message-ID: Hi all PTG is right next week, so if you're interested in SIG governance or you like to walk by and say hello, please join us! Both TC and UC will have people there as well! Our session will be hosted on 11/6 for a quarter-day from 3:20 pm to 5:00 pm in room 431. You can suggest PTG sessions in our etherpad: https://etherpad.openstack.org/p/PVG-meta-sig Here's my Wechat ID, if you like to find me there: RicoLinCloudGeek See you all in Shanghai! -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rico.lin.guanyu at gmail.com Wed Oct 30 07:26:38 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 30 Oct 2019 15:26:38 +0800 Subject: [meta-sig] SIG PTG schedule. feel free to join! In-Reply-To: References: Message-ID: Sorry about this mistake, it should be 11/5 instead of 6th On Wed, Oct 30, 2019 at 3:23 PM Rico Lin wrote: > Hi all > > PTG is right next week, so if you're interested in SIG governance or you > like to walk by and say hello, please join us! Both TC and UC will have > people there as well! > Our session will be hosted on 11/6 for a quarter-day from 3:20 pm to 5:00 > pm in room 431. > You can suggest PTG sessions in our etherpad: > https://etherpad.openstack.org/p/PVG-meta-sig > > Here's my Wechat ID, if you like to find me there: RicoLinCloudGeek > See you all in Shanghai! > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Wed Oct 30 07:26:49 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 30 Oct 2019 15:26:49 +0800 Subject: [auto-scaling] SIG PTG schedule. please join us :) In-Reply-To: References: Message-ID: Sorry about this mistake, it should be 11/5 instead of 6th On Wed, Oct 30, 2019 at 3:05 PM Rico Lin wrote: > Hi all > > PTG is right next week, so if you're interested in Auto-scaling features, > please join us. > Our session will be hosted on 11/6 for Half-day from 9:00 am to 12:30 pm > in room 431. > You can suggest PTG sessions in our etherpad: > https://etherpad.openstack.org/p/PVG-auto-scaling-sig > Feel free to join our IRC channel as well: #openstack-auto-scaling > > Also, you can check out our new documents for autoscaling: > https://docs.openstack.org/auto-scaling-sig > And provide any feedback/ bug report for any Auto-scaling issue related > to OpenStack in > https://storyboard.openstack.org/#!/project/openstack/auto-scaling-sig > > Here's my Wechat ID: RicoLinCloudGeek > See you all in Shanghai! > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Wed Oct 30 07:41:42 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 30 Oct 2019 15:41:42 +0800 Subject: [heat] Heat PTG Schedule. Please join us! Message-ID: Dear all, As Summit and PTG is right next week. would like to update our schedule with you. Our PTG session will be hosted on 11/7 full-day from 9:00 am to 4:30 pm in room Grizzly. 
You can suggest PTG sessions in our etherpad: https://etherpad.openstack.org/p/PVG-heat We also have project update session on Monday, November 4, 2:35pm-2:50pm at Shanghai Expo Center - 5th Floor - 515 And Some sessions with heat tags on for your consideration: - Quick, Solid, and Automatic - OpenStack Bare-metal Orchestration - Monday, November 4, 1:20pm-2:00pm, on Shanghai Expo Center - 6th Floor - 619 - Forum: Project Resource Cleanup - followup - Monday, November 4, 11:40am-12:20pm in Shanghai Expo Center - 4th Floor - 431 - Leveraging OpenStack IaaS to run a Cloud CI/CD pipeline - Monday, November 4, 2:05pm-2:15pm in Shanghai Expo Center - 5th Floor - Marketplace Theater - The Little Bag O'Tricks: 10 things you might not know you can do with OpenStack - Tuesday, November 5, 11:40am-12:20pm in Shanghai Expo Center - 6th Floor - 617 - Enabling a Secure Industry 4.0transition for ShangHai Electric with OpenStack based Edge Clouds - Wednesday, November 6, 3:20pm-4:00pm in Shanghai Expo Center - 6th Floor - 617 Here's my Wechat ID if you would like to find me there: RicoLinCloudGeek See you all in Shanghai! -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Wed Oct 30 08:33:17 2019 From: rleander at redhat.com (Rain Leander) Date: Wed, 30 Oct 2019 09:33:17 +0100 Subject: [PTG][interviews] We Want To Talk With YOU! Message-ID: I am attending OIS and PTG and will conduct interviews on Friday 08 November in room Liberty. If you're keen and able, I'd love to talk with you. [0] These interviews have several purposes. Please consider all of the following when thinking about what you might want to say in your interview: * Tell the users/customers/press what you've been working on in Rocky * Give them some idea of what's (what might be?) coming in Stein * Put a human face on the OpenStack project and encourage new participants to join us * You're welcome to promote your company's involvement in OpenStack but we ask that you avoid any kind of product pitches or job recruitment In the interview I'll ask some leading questions and it'll go easier if you've given some thought to them ahead of time: * Who are you? (Your name, your employer, and the project(s) on which you are active.) * What did you accomplish in Rocky? (Focus on the 2-3 things that will be most interesting to cloud operators) * What do you expect to be the focus in Stein? (At the time of your interview, it's likely that the meetings will not yet have decided anything firm. That's ok.) * Anything further about the project(s) you work on or the OpenStack community in general. Finally, note that there are only 33 interview slots available, so please consider coordinating with your project to designate the people that you want to represent the project, so that we don't end up with 12 interview about Neutron, or whatever. It's fine to have multiple people in one interview - Maximum 3, probably. Interview slots are 30 minutes, in which time we hope to capture somewhere between 10 and 20 minutes of content. It's fine to run shorter, but 15 minutes is probably an ideal length. [0] https://docs.google.com/spreadsheets/d/1XwRC_nHm2hZs37Hlrb5fUl4zGtOGgekj1oHaGn3TBxA/edit?usp=sharing -- K Rain Leander OpenStack Community Liaison Open Source Program Office https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Wed Oct 30 09:07:44 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 10:07:44 +0100 Subject: [PTG][interviews] We Want To Talk With YOU! In-Reply-To: References: Message-ID: <20191030090744.rgej6yfdfbwwmhoe@skaplons-mac> Hi, On Wed, Oct 30, 2019 at 09:33:17AM +0100, Rain Leander wrote: > I am attending OIS and PTG and will conduct interviews on Friday 08 > November in room Liberty. If you're keen and able, I'd love to talk with > you. [0] > > These interviews have several purposes. Please consider all of the > following when thinking about what you might want to say in your interview: > > * Tell the users/customers/press what you've been working on in Rocky > * Give them some idea of what's (what might be?) coming in Stein I think that those versions above are here just by mistake and it should be Train and Ussuri now, right? :) > * Put a human face on the OpenStack project and encourage new participants > to join us > * You're welcome to promote your company's involvement in OpenStack but we > ask that you avoid any kind of product pitches or job recruitment > > In the interview I'll ask some leading questions and it'll go easier if > you've given some thought to them ahead of time: > > * Who are you? (Your name, your employer, and the project(s) on which you > are active.) > * What did you accomplish in Rocky? (Focus on the 2-3 things that will be > most interesting to cloud operators) > * What do you expect to be the focus in Stein? (At the time of your > interview, it's likely that the meetings will not yet have decided anything > firm. That's ok.) > * Anything further about the project(s) you work on or the OpenStack > community in general. > > Finally, note that there are only 33 interview slots available, so please > consider coordinating with your project to designate the people that you > want to represent the project, so that we don't end up with 12 interview > about Neutron, or whatever. It's fine to have multiple people in one > interview - Maximum 3, probably. > > Interview slots are 30 minutes, in which time we hope to capture somewhere > between 10 and 20 minutes of content. It's fine to run shorter, but 15 > minutes is probably an ideal length. > > [0] > https://docs.google.com/spreadsheets/d/1XwRC_nHm2hZs37Hlrb5fUl4zGtOGgekj1oHaGn3TBxA/edit?usp=sharing > > -- > K Rain Leander > OpenStack Community Liaison > Open Source Program Office > https://www.rdoproject.org/ > http://community.redhat.com -- Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Wed Oct 30 10:00:20 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 11:00:20 +0100 Subject: [tc][all][goal]: Ussuri community goal candidate 3: 'Consistent and secure default policies' In-Reply-To: <68d21e29-0ef8-42b6-bcd1-6d7b5fe2ea79@www.fastmail.com> References: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com> <68d21e29-0ef8-42b6-bcd1-6d7b5fe2ea79@www.fastmail.com> Message-ID: <20191030100020.epbuiov4jcufxktr@skaplons-mac> Hi Colleen, We have this as one of the topic to discuss during PTG in Neutron team. I don't have full agenda ready yet, but I can ping You when it will be exactly if You would like to join. On Tue, Oct 29, 2019 at 04:53:19PM -0700, Colleen Murphy wrote: > On Tue, Oct 22, 2019, at 09:50, Ghanshyam Mann wrote: > > Hello Everyone, > > > > This is 3rd proposal candidate for the Ussuri cycle community-wide > > goal. The other two are [1]. 
> > > > Colleen proposed this idea for the Ussuri cycle community goal. > > > > Projects implemented/plan to implement this: > > *Keystone already implemented this with all necessary support in > > oslo.policy with nice documentation. > > * We discussed this in nova train PTG to implement it in nova [2]. Nova > > spec was merged in Train but > > could not implement. I have re-proposed the spec for the Ussuri > > cycle [3]. > > > > This is nice idea as a goal from the user perspective. Colleen has less > > bandwidth to drive this goal alone. > > We are looking for a champion or co-champions (1-2 people will be much > > better) this goal along with Colleen. > > > > Also, let us know what do you think of this as a community goal? Any > > query or Improvement Feedback? > > [snipped] > > It's possible that this won't work very well as a community goal. This migration took the keystone team about two cycles to implement, not including all the planning and foundational work that needed to happen first. In our virtual PTG meeting today we discussed the possibility of instead forming a pop-up team around this work so that a group of individuals (across several teams, not just the keystone team) could target the largest projects and make more of a dent in the work over a couple of cycles before using the community goal model to close the gaps in the smaller or more peripheral projects. However, we will still need to get buy-in from the projects so that we can be sure that the work the pop-up team does gets prioritized and reviewed. > > I've drafted a set of steps that we'd expect projects to follow to get this done: > > https://etherpad.openstack.org/p/policy-migration-steps > > And we're holding a forum session on it next week: > > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24357/next-steps-for-policy-in-openstack > > Colleen > -- Slawek Kaplonski Senior software engineer Red Hat From zhipengh512 at gmail.com Wed Oct 30 10:12:46 2019 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Wed, 30 Oct 2019 18:12:46 +0800 Subject: [resource-management-sig] Status of the "Resource Management" SIG In-Reply-To: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> References: <5a84404f-0e10-9010-61ed-29aff08b5ec6@openstack.org> Message-ID: Hi Thierry, Intend to launch the activity during Shanghai PTG via Cyborg sessions, for those you are interested in topics like Kubernetes, OCP OAI, RISC-V you are more than welcomed to reach out to me. On Thu, Oct 24, 2019 at 10:19 PM Thierry Carrez wrote: > Hi, > > I was wondering about the status of the Resource Management SIG... It's > been "forming" according to https://wiki.openstack.org/wiki/Res_Mgmt_SIG > since January 2018... And I could'nt find a reference or log to any > meeting after that. > > Does anyone have updated status on this one? Should it be removed from > the list of active SIGs at https://governance.openstack.org/sigs/ ? > > -- > Thierry Carrez (ttx) > > -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhipengh512 at gmail.com Wed Oct 30 10:14:00 2019 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Wed, 30 Oct 2019 18:14:00 +0800 Subject: [PTG][interviews] We Want To Talk With YOU! 
In-Reply-To: References: Message-ID: Hi Rain, The Cyborg team did a great interview at the Denver PTG but it seems there is no recording of that session available, is it still possible to make it available on YouTube ? On Wed, Oct 30, 2019 at 4:39 PM Rain Leander wrote: > I am attending OIS and PTG and will conduct interviews on Friday 08 > November in room Liberty. If you're keen and able, I'd love to talk with > you. [0] > > These interviews have several purposes. Please consider all of the > following when thinking about what you might want to say in your interview: > > * Tell the users/customers/press what you've been working on in Rocky > * Give them some idea of what's (what might be?) coming in Stein > * Put a human face on the OpenStack project and encourage new participants > to join us > * You're welcome to promote your company's involvement in OpenStack but we > ask that you avoid any kind of product pitches or job recruitment > > In the interview I'll ask some leading questions and it'll go easier if > you've given some thought to them ahead of time: > > * Who are you? (Your name, your employer, and the project(s) on which you > are active.) > * What did you accomplish in Rocky? (Focus on the 2-3 things that will be > most interesting to cloud operators) > * What do you expect to be the focus in Stein? (At the time of your > interview, it's likely that the meetings will not yet have decided anything > firm. That's ok.) > * Anything further about the project(s) you work on or the OpenStack > community in general. > > Finally, note that there are only 33 interview slots available, so please > consider coordinating with your project to designate the people that you > want to represent the project, so that we don't end up with 12 interview > about Neutron, or whatever. It's fine to have multiple people in one > interview - Maximum 3, probably. > > Interview slots are 30 minutes, in which time we hope to capture somewhere > between 10 and 20 minutes of content. It's fine to run shorter, but 15 > minutes is probably an ideal length. > > [0] > https://docs.google.com/spreadsheets/d/1XwRC_nHm2hZs37Hlrb5fUl4zGtOGgekj1oHaGn3TBxA/edit?usp=sharing > > -- > K Rain Leander > OpenStack Community Liaison > Open Source Program Office > https://www.rdoproject.org/ > http://community.redhat.com > -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From i at liuyulong.me Wed Oct 30 11:57:54 2019 From: i at liuyulong.me (=?utf-8?B?TElVIFl1bG9uZw==?=) Date: Wed, 30 Oct 2019 19:57:54 +0800 Subject: [Neutron] cancel the L3 meeting today Message-ID: Hi all, Due to the Summit and PTG next week, I will cancel the Neutron L3 meeting today. So see you there, and travel safe. Thanks. LIU Yulong -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Wed Oct 30 11:59:19 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Oct 2019 11:59:19 +0000 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: <20191030004035.rsuegdsij2eezps3@mthode.org> References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> <20191030004035.rsuegdsij2eezps3@mthode.org> Message-ID: On Tue, 2019-10-29 at 19:40 -0500, Matthew Thode wrote: > On 19-10-29 14:53:11, Ghanshyam Mann wrote: > > ---- On Thu, 24 Oct 2019 14:32:03 -0500 Ghanshyam Mann wrote ---- > > > Hello Everyone, > > > > > > We had good amount of discussion on the final plan and schedule in today's TC office hour[1]. > > > > > > I captured the agreement on each point in etherpad (you can see the AGREE:). Also summarizing > > > the discussions here. Imp point is if your projects are planning to keep the py2.7 support then do not delay > > > to tell us. Reply on this ML thread or add your project in etherpad. > > > > > > - Projects can start dropping the py2.7 support. Common lib and testing tools need to wait until milestone-2. > > > ** pepe8 job to be included in openstack-python3-ussuri-jobs-* templates - > > https://review.opendev.org/#/c/688997/ > > > ** You can drop openstack-python-jobs template and start using ussuri template once 688997 patch is merged. > > > ** Cross projects dependency (if any ) can be sync up among dependent projects. > > > > > > - I will add this plan and schedule as a community goal. The goal is more about what all things to do and when. > > > ** If any project keeping the support then it has to be notified explicitly for its consumer. > > > > > > - Schedule: > > > The schedule is aligned with the Ussuri cycle milestone[2]. I will add the plan in the release schedule also. > > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > > ** Project to start dropping the py2 support along with all the py2 CI jobs. > > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > > > ** This will give enough time to projects to drop the py2 support. > > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > > > This is enough time to measure such break or anything extra to do before ussuri final release. > > > > > > Other discussions points and agreement: > > > - Projects want to keep python 2 support and need oslo, QA or any other dependent projects/lib support: > > > ** swift. AI: gmann to reach out to swift team about the plan and exact required things from its dependency > > > (the common lib/testing tool). > > > > I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are > > client lib/middleware swift required for py2 testing. > > @timburke, feel free to update if any missing point. > > > > - devstack. able to keep running swift on py2 and rest all services can be on py3 > > - keystonemiddleware and its dependency > > - keystoneclient and openstackclient (dep of keystonemiddleware) > > - castellan and barbicanclient > > > > > > As those lib/middleware going to drop the py2.7 support in phase-2, we need to cap them for swift. 
> > I think capping them for python2.7 in upper constraint file would not affect any other users but Matthew Thode can > > explain better how that will work from the requirement constraint perspective. > > > > [1] > > http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2019-10-28.log.html#t2019-10-28T16:37:33 > > > > -gmann > > > > ya, there are examples already for libs that have dropped py2 support. > What you need to do is update global requirements to be something like > the following. > > sphinx!=1.6.6,!=1.6.7,<2.0.0;python_version=='2.7' # BSD > sphinx!=1.6.6,!=1.6.7,!=2.1.0;python_version>='3.4' # BSD > > or > > keyring<19.0.0;python_version=='2.7' # MIT/PSF > keyring;python_version>='3.4' # MIT/PSF on a related note os-vif is blocked form running tempest jobs under python 3 until https://review.opendev.org/#/c/681029/ is merged due to https://zuul.opendev.org/t/openstack/build/4ff60d6bd2f24782abeb12cc7bdb8013/log/controller/logs/screen-q-agt.txt.gz#308-318 i think this issue will affect any job that install proejcts that use privsep using the required-proejcts section of the zuul job definition. adding a project to required-proejcts sechtion adds it to the LIBS_FROM_GIT varible in devstack. this inturn istalls it twice due to https://review.opendev.org/#/c/418135/ . the side effect of this is that the privsep helper script gets installed under python2 and the neutron ageint in this case gets install under python 3 so when it trys to spawn the privsep deamon and invoke commands it typically expodes due to dependcy issues or in this case because it failed to drop privileges correctly. so as part of phase 1 we need to merge https://review.opendev.org/#/c/681029/ so that lib project that use required- projects to run with master of project that comsume it and support depends-on can move to python 3 tempest jobs. > From smooney at redhat.com Wed Oct 30 12:07:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Oct 2019 12:07:54 +0000 Subject: [ops][nova] Different quotas for different SLAs ? In-Reply-To: <594252EA-C717-4493-8FA6-27EEA7C19B48@cern.ch> References: <84eee336-e79e-47a6-d9e6-ba66904f2465@fried.cc> <6da605b1f8391faafc1eda171fed326eea1d9147.camel@redhat.com> <6788F9F1-F95C-4BAE-8E55-2BE14C321DA6@cern.ch> <594252EA-C717-4493-8FA6-27EEA7C19B48@cern.ch> Message-ID: On Wed, 2019-10-30 at 08:05 +0100, Arne Wiebalck wrote: > Another use case where per flavour quotas would be helpful is bare metal provisioning: > since the flavors are tied via the resource class to specific classes of physical machines, > the usual instance/cores/RAM quotas do not help the user to see how many instances > of which type can still be created. > > Having per flavor (resource class, h/w type) projects is what we do for larger chunks of > identical hardware, but this is less practical for users with access to fewer machines of > many different types. flavor quota are not the direction we are currently persuing with quoats and unifed limits at present it has been discussed in the past but we are actully moving in the direction of allowing quota based on placmenet resouce classes. https://review.opendev.org/#/c/602201/ i personally think in the long run the unified limits apporch based on placment resouce classes is a better approch then flavor quotas so i woudl prefer to expend our energy completing that work then designing a flavor based quota mechanism that nova would have to maintain. 
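(For a feel of what that direction looks like: with unified limits the
per-resource-class values would live in keystone rather than in nova's quota
tables. Very roughly, and only as an illustration, since the spec above is
still under review and the exact resource names and CLI may differ:

    openstack registered limit create --service nova --region RegionOne \
        --default-limit 1000 class:VCPU
    openstack limit create --service nova --project <project-id> \
        --resource-limit 50 class:CUSTOM_BAREMETAL_GOLD

nova would then enforce those values through oslo.limit at server-create
time.)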
that said i would encourage you to review that spec and consider if it addresses your usecases. for the ironic case i think it does quite nicely. for the sla case i dont think it does but there may be a way to extended it after the inital version is complete to allow that. e.g. by allow int the quota to be placed on a resouce class + trait in stead of just a resource class. that would complicate this however so i think that would be best left out of scope of v1 of unifed limits. > > Cheers, > Arne > > > > On 30 Oct 2019, at 06:36, Massimo Sgaravatto wrote: > > > > Thanks a lot for your feedbacks > > The possibility to quota flavours would indeed address also my use case. > > > > Cheers, Massimo > > > > > > On Tue, Oct 29, 2019 at 6:14 PM Tim Bell > wrote: > > We’ve had similar difficulties with a need to quota flavours .. cinder has a nice feature for this but with nova, I > > think we ended up creating two distinct projects and exposing the different flavours to the different projects, each > > with the related quota… from a user interface perspective, it means they’re switching projects more often than is > > ideal but it does control the limits. > > > > Tim > > > > > On 29 Oct 2019, at 17:17, Sean Mooney > wrote: > > > > > > the normal way to achive this in the past would have been to create host aggreate and then > > > use the AggregateTypeAffinityFilter to map flavor to specific host aggrates. > > > > > > so you can have a 2xOvercommit and a 4xOvercommit and map them to different host aggrates that have different over > > > commit ratios set on the compute nodes. > > > > > > On Tue, 2019-10-29 at 10:45 -0500, Eric Fried wrote: > > > > Massimo- > > > > > > > > > To decide if an instance should go to a compute node with or without > > > > > overcommitment is easy; e.g. it could be done with host aggregates + > > > > > setting metadata to the relevant flavors/images. > > > > > > ya that basicaly the same as what i said above > > > > > > > > You could also use custom traits. > > > > > > traits would work yes it woudl be effectivly the same but would have the advatage of having placment > > > do most of the filtering so it should perform better. > > > > > > > > > But is it in some way possible to decide that a certain project has a > > > > > quota of x VCPUs without overcommitment, and y VCPUs with overcommitments ? > > > > > > > > I'm not sure whether this helps, but it's easy to detect the allocation > > > > ratio of a compute node's VCPU resource via placement with GET > > > > /resource_providers/$cn_uuid/inventories/VCPU [1]. > > > > > > > > But breaking down a VCPU quota into different "classes" of VCPU > > > > sounds... impossible to me. > > > > > > this is something that is not intended to be supported with unified limits at least not initially? ever? > > > > > > > > But since you said > > > > > > > > > In particular I would like to use some compute nodes without > > > > > overcommitments > > > > > > > > ...perhaps it would help you to use PCPUs instead of VCPUs for these. We > > > > started reporting PCPUs in Train [2]. > > > > > > ya pcpus are a good choice for the nova over commit case for cpus. > > > hugepages are the equivalent for memory. > > > idealy you should avoid disk over commit but if you have to do it use cinder when you > > > need over commit and local storage whne you do not. 
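(To make the allocation-ratio check mentioned above concrete, a minimal
sketch, assuming the osc-placement plugin is installed and using whatever
UUID "openstack resource provider list" reports for the compute node:

    openstack resource provider inventory show <compute-node-rp-uuid> VCPU

or, against the raw placement API,

    GET /resource_providers/<compute-node-rp-uuid>/inventories/VCPU

Either form returns the VCPU inventory's allocation_ratio along with total
and reserved, so overcommitted hosts are easy to tell apart from dedicated
ones.)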
> > > > > > > > efried > > > > > > > > [1] > > > > > > > > > > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory > > > < > > > https://docs.openstack.org/api-ref/placement/?expanded=show-resource-provider-inventory-detail#show-resource-provider-inventory > > > > > > > > [2] > > > > http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html < > > > > http://specs.openstack.org/openstack/nova-specs/specs/train/approved/cpu-resources.html> > > > > > > > > > > > > From colleen at gazlene.net Wed Oct 30 12:50:08 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Wed, 30 Oct 2019 05:50:08 -0700 Subject: =?UTF-8?Q?Re:_[tc][all][goal]:_Ussuri_community_goal_candidate_3:_'Consi?= =?UTF-8?Q?stent_and_secure_default_policies'?= In-Reply-To: <20191030100020.epbuiov4jcufxktr@skaplons-mac> References: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com> <68d21e29-0ef8-42b6-bcd1-6d7b5fe2ea79@www.fastmail.com> <20191030100020.epbuiov4jcufxktr@skaplons-mac> Message-ID: <87d0e6f5-baf1-4e42-91c0-82aceb4b1a7b@www.fastmail.com> On Wed, Oct 30, 2019, at 03:00, Slawek Kaplonski wrote: > Hi Colleen, > > We have this as one of the topic to discuss during PTG in Neutron team. > I don't have full agenda ready yet, but I can ping You when it will be exactly > if You would like to join. I'd be happy to join, please do let me know. Colleen > > On Tue, Oct 29, 2019 at 04:53:19PM -0700, Colleen Murphy wrote: > > On Tue, Oct 22, 2019, at 09:50, Ghanshyam Mann wrote: > > > Hello Everyone, > > > > > > This is 3rd proposal candidate for the Ussuri cycle community-wide > > > goal. The other two are [1]. > > > > > > Colleen proposed this idea for the Ussuri cycle community goal. > > > > > > Projects implemented/plan to implement this: > > > *Keystone already implemented this with all necessary support in > > > oslo.policy with nice documentation. > > > * We discussed this in nova train PTG to implement it in nova [2]. Nova > > > spec was merged in Train but > > > could not implement. I have re-proposed the spec for the Ussuri > > > cycle [3]. > > > > > > This is nice idea as a goal from the user perspective. Colleen has less > > > bandwidth to drive this goal alone. > > > We are looking for a champion or co-champions (1-2 people will be much > > > better) this goal along with Colleen. > > > > > > Also, let us know what do you think of this as a community goal? Any > > > query or Improvement Feedback? > > > > [snipped] > > > > It's possible that this won't work very well as a community goal. This migration took the keystone team about two cycles to implement, not including all the planning and foundational work that needed to happen first. In our virtual PTG meeting today we discussed the possibility of instead forming a pop-up team around this work so that a group of individuals (across several teams, not just the keystone team) could target the largest projects and make more of a dent in the work over a couple of cycles before using the community goal model to close the gaps in the smaller or more peripheral projects. However, we will still need to get buy-in from the projects so that we can be sure that the work the pop-up team does gets prioritized and reviewed. 
> > > > I've drafted a set of steps that we'd expect projects to follow to get this done: > > > > https://etherpad.openstack.org/p/policy-migration-steps > > > > And we're holding a forum session on it next week: > > > > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24357/next-steps-for-policy-in-openstack > > > > Colleen > > > > -- > Slawek Kaplonski > Senior software engineer > Red Hat > > From flux.adam at gmail.com Wed Oct 30 13:26:10 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Wed, 30 Oct 2019 22:26:10 +0900 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: <342983ed-1d22-8f3a-3335-f153512ec2b2@catalyst.net.nz> References: <342983ed-1d22-8f3a-3335-f153512ec2b2@catalyst.net.nz> Message-ID: That's too bad that you won't be at the summit, but I think there may still be some discussion planned about this topic. Yeah, I understand completely about priorities and such internally. Same for me... It just happens that this IS priority work for us right now. :) On Tue, Oct 29, 2019, 07:48 Adrian Turjak wrote: > My apologies I missed this email. > > Sadly I won't be at the summit this time around. There may be some public > cloud focused discussions, and some of those often have this topic come up. > Also if Monty from the SDK team is around, I'd suggest finding him and > having a chat. > > I'll help if I can but we are swamped with internal work and I can't > dedicate much time to do upstream work that isn't urgent. :( > On 17/10/19 8:48 am, Adam Harwell wrote: > > That's interesting -- we have already started working to add features and > improve ospurge, and it seems like a plenty useful tool for our needs, but > I think I agree that it would be nice to have that functionality built into > the sdk. I might be able to help with both, since one is immediately useful > and we (like everyone) have deadlines to meet, and the other makes sense to > me as a possible future direction that could be more widely supported. > > Will you or someone else be hosting and discussion about this at the > Shanghai summit? I'll be there and would be happy to join and discuss. > > --Adam > > On Tue, Oct 15, 2019, 22:04 Adrian Turjak wrote: > >> I tried to get a community goal to do project deletion per project, but >> we ended up deciding that a community goal wasn't ideal unless we did >> build a bulk delete API in each service: >> https://review.opendev.org/#/c/639010/ >> https://etherpad.openstack.org/p/community-goal-project-deletion >> https://etherpad.openstack.org/p/DEN-Deletion-of-resources >> https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming >> >> What we decided on, but didn't get a chance to work on, was building >> into the OpenstackSDK OS-purge like functionality, as well as reporting >> functionality (of all project resources to be deleted). That way we >> could have per project per resource deletion logic, and all of that >> defined in the SDK. >> >> I was up for doing some of the work, but ended up swamped with internal >> work and just didn't drive or push for the deletion work upstream. >> >> If you want to do something useful, don't pursue OS-Purge, help us add >> that official functionality to the SDK, and then we can push for bulk >> deletion APIs in each project to make resource deletion more pleasant. >> >> I'd be happy to help with the work, and Monty on the SDK team will most >> likely be happy to as well. 
:) >> >> Cheers, >> Adrian >> >> On 1/10/19 11:48 am, Adam Harwell wrote: >> > I haven't seen much activity on this project in a while, and it's been >> > moved to opendev/x since the opendev migration... Who is the current >> > owner of this project? Is there anyone who actually is maintaining it, >> > or would mind if others wanted to adopt the project to move it forward? >> > >> > Thanks, >> > --Adam Harwell >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed Oct 30 13:49:23 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 14:49:23 +0100 Subject: [neutron][ptg] Agenda for PTG Message-ID: <20191030134923.hztu6lkfy7y5m2jb@skaplons-mac> Hi neutrinos, I prepared agenda for our PTG sessions in Shanghai. It's available at [1]. Please check that and let me know if You need to add/change anything. [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning -- Slawek Kaplonski Senior software engineer Red Hat From artem.goncharov at gmail.com Wed Oct 30 14:43:30 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Wed, 30 Oct 2019 15:43:30 +0100 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: <342983ed-1d22-8f3a-3335-f153512ec2b2@catalyst.net.nz> Message-ID: <576E74EB-ED80-497F-9706-482FE0433208@gmail.com> Hi Adam, Since I need this now as well I will start working on implementation how it was agreed (in SDK and in OSC) during last summit by mid of November. There is no need for discussing this further, it just need to be implemented. Sad that we got no progress in half a year. Regards, Artem (gtema). > On 30. Oct 2019, at 14:26, Adam Harwell wrote: > > That's too bad that you won't be at the summit, but I think there may still be some discussion planned about this topic. > > Yeah, I understand completely about priorities and such internally. Same for me... It just happens that this IS priority work for us right now. :) > > > On Tue, Oct 29, 2019, 07:48 Adrian Turjak > wrote: > My apologies I missed this email. > > Sadly I won't be at the summit this time around. There may be some public cloud focused discussions, and some of those often have this topic come up. Also if Monty from the SDK team is around, I'd suggest finding him and having a chat. > > I'll help if I can but we are swamped with internal work and I can't dedicate much time to do upstream work that isn't urgent. :( > > On 17/10/19 8:48 am, Adam Harwell wrote: >> That's interesting -- we have already started working to add features and improve ospurge, and it seems like a plenty useful tool for our needs, but I think I agree that it would be nice to have that functionality built into the sdk. I might be able to help with both, since one is immediately useful and we (like everyone) have deadlines to meet, and the other makes sense to me as a possible future direction that could be more widely supported. >> >> Will you or someone else be hosting and discussion about this at the Shanghai summit? I'll be there and would be happy to join and discuss. 
>> >> --Adam >> >> On Tue, Oct 15, 2019, 22:04 Adrian Turjak > wrote: >> I tried to get a community goal to do project deletion per project, but >> we ended up deciding that a community goal wasn't ideal unless we did >> build a bulk delete API in each service: >> https://review.opendev.org/#/c/639010/ >> https://etherpad.openstack.org/p/community-goal-project-deletion >> https://etherpad.openstack.org/p/DEN-Deletion-of-resources >> https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming >> >> What we decided on, but didn't get a chance to work on, was building >> into the OpenstackSDK OS-purge like functionality, as well as reporting >> functionality (of all project resources to be deleted). That way we >> could have per project per resource deletion logic, and all of that >> defined in the SDK. >> >> I was up for doing some of the work, but ended up swamped with internal >> work and just didn't drive or push for the deletion work upstream. >> >> If you want to do something useful, don't pursue OS-Purge, help us add >> that official functionality to the SDK, and then we can push for bulk >> deletion APIs in each project to make resource deletion more pleasant. >> >> I'd be happy to help with the work, and Monty on the SDK team will most >> likely be happy to as well. :) >> >> Cheers, >> Adrian >> >> On 1/10/19 11:48 am, Adam Harwell wrote: >> > I haven't seen much activity on this project in a while, and it's been >> > moved to opendev/x since the opendev migration... Who is the current >> > owner of this project? Is there anyone who actually is maintaining it, >> > or would mind if others wanted to adopt the project to move it forward? >> > >> > Thanks, >> > --Adam Harwell -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Wed Oct 30 16:24:08 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 30 Oct 2019 09:24:08 -0700 Subject: [tc][horizon][all] Horizon plugins maintenance In-Reply-To: References: Message-ID: Hi Ivan, Thank you for your offer and support. I raised this topic at the weekly Octavia meeting and we agreed to add the horizon core team to octavia-dashboard-core. I have updated the access on gerrit. Michael On Wed, Oct 23, 2019 at 5:44 AM Ivan Kolodyazhny wrote: > > Hi team, > > As you may know, we've got a pretty big list of Horizon Plugins [1]. Unfortunately, not all of them are in active development due to the lack of resources in projects teams. > > As a Horizon team, we understand all the reasons, and we're doing our best to help other teams to maintain plugins. > > That's why we're proposing our help to maintain horizon plugins. We raised this topic during the last Horizon weekly meeting [2] and we'll have some discussion during the PTG [3] too. > > There are a lot of Horizon changes which affect plugins and horizon team is ready to help: > - new Django versions > - dependencies updates > - Horizon API changes > - etc. > > To get faster fixes in, it would be good to have +2 permissions for the horizon-core team for each plugin. > > We helped Heat team during the last cycle adding horizon-core to the heat-dashboard-core team. Also, we've got +2 on other plugins via global project config [4] and via Gerrit configuration for (neutron-*aas-dashboard, tuskar-ui). > > Vitrage PTL agreed to do the same for vitrage-dashboard during the last meeting [5]. 
> > > Of course, it's up to each project to maintain horizon plugins and it's responsibilities but I would like to raise this topic to the TC too. I really sure, that it will speed up some critical fixes for Horizon plugins and makes users and operators experience better. > > > [1] https://docs.openstack.org/horizon/latest/install/plugin-registry.html > [2] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-10-16-15.02.log.html#l-128 > [3] https://etherpad.openstack.org/p/horizon-u-ptg > [4] http://codesearch.openstack.org/?q=horizon-core&i=nope&files=&repos=openstack/project-config > [5] http://eavesdrop.openstack.org/meetings/vitrage/2019/vitrage.2019-10-23-08.03.log.html#l-21 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From dougal at redhat.com Wed Oct 30 16:29:49 2019 From: dougal at redhat.com (Dougal Matthews) Date: Wed, 30 Oct 2019 16:29:49 +0000 Subject: Where do we support running tripleoclient Message-ID: Hi all, Pretty much as the subject says. Do we specify where we support running tripleoclient? In theory it should run anywhere with the correct environment variables. However, as far as I understand most users run it on the undercloud itself. Is this something we have a position on? To give a little more context, this is relevant when replacing Mistral with Ansible. Can we assume that we are running ansible only against localhost or do we need to create an inventory for the undercloud? Cheers, Dougal -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajatdhasmana at gmail.com Wed Oct 30 16:54:16 2019 From: rajatdhasmana at gmail.com (Rajat Dhasmana) Date: Wed, 30 Oct 2019 22:24:16 +0530 Subject: [cinder] changing the weekly meeting time In-Reply-To: References: Message-ID: Hi Brian, It's great that the change in weekly meeting time is considered, here are my opinions on the same from the perspective of Asian countries (having active upstream developers) Current meeting time (16:00 - 17:00 UTC) INDIA : is 9:30 - 10:30 PM IST (UTC+05:30) is a little late but manageable. CHINA : is 12:00 - 01:00 AM CST (UTC+08:00) is almost impossible to attend. JAPAN : is 01:00 - 02:00 AM JST (UTC+09:00) similar to the case as China. IMO shifting the meeting time 2 hours earlier (UTC 14:00) might bring more participation and would ease out timings for some (including me) but these are just my thoughts. Thanks and Regards Rajat Dhasmana On Thu, Oct 24, 2019 at 3:05 AM Brian Rosmaita wrote: > (Just to be completely clear -- we're only gathering information at this > point. The Cinder weekly meeting is still Wednesdays at 16:00 UTC.) > > As we discussed at today's meeting [0], a request has been made to hold > the weekly meeting earlier so that it would be friendlier for people in > Asia time zones. > > Based on the people in attendance today, it seems that a move to 14:00 > UTC is not out of the question. > > Thus, the point of this email is to solicit comments on whether we > should change the meeting time to 14:00 UTC. As you consider the impact > on yourself, if you are in a TZ that observes Daylight Savings Time, > keep in mind that most TZs go back to standard time over the next few > weeks. > > (I was going to insert an opinion here, but I will wait and respond in > this thread like everyone else.) > > cheers, > brian > > > [0] > > http://eavesdrop.openstack.org/meetings/cinder/2019/cinder.2019-10-23-16.00.log.html#l-166 > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aschultz at redhat.com Wed Oct 30 16:59:56 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 30 Oct 2019 10:59:56 -0600 Subject: [tripleo] Re: Where do we support running tripleoclient In-Reply-To: References: Message-ID: fixed the subject On Wed, Oct 30, 2019 at 10:36 AM Dougal Matthews wrote: > Hi all, > > Pretty much as the subject says. Do we specify where we support running > tripleoclient? In theory it should run anywhere with the correct > environment variables. However, as far as I understand most users run it on > the undercloud itself. Is this something we have a position on? > > To give a little more context, this is relevant when replacing Mistral > with Ansible. Can we assume that we are running ansible only against > localhost or do we need to create an inventory for the undercloud? > > Where do we currently assume it's run from? the undercloud. Where should we assume it can be run from going forward? anywhere. For the mistral converted actions, it's likely that we might want to run them using localconnection rather than a specific host depending on the action (e.g. openstack api actions can be run using localconnection because they are just api calls). However yes I would think that we might want to construct an inventory file for remote actions. > Cheers, > Dougal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Wed Oct 30 17:01:06 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 30 Oct 2019 10:01:06 -0700 Subject: [octavia] Octavia weekly IRC meeting cancelled November 6th. Message-ID: As many of the cores are travelling to Shanghai and our meeting time would be at midnight, we are cancelling the November 6th Octavia IRC meeting. Safe travels everyone, Michael From gmann at ghanshyammann.com Wed Oct 30 17:03:33 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 30 Oct 2019 12:03:33 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack In-Reply-To: References: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> <16dfe4467a4.db6f72ec168733.7542022367023887408@ghanshyammann.com> <16dff41292e.11b7e81b1177136.7669214833037569841@ghanshyammann.com> <16e19144cf0.f6b07849311271.7773306777497055114@ghanshyammann.com> <20191030004035.rsuegdsij2eezps3@mthode.org> Message-ID: <16e1d9f5df9.e9dfc74911451.4806654031763681992@ghanshyammann.com> ---- On Wed, 30 Oct 2019 06:59:19 -0500 Sean Mooney wrote ---- > On Tue, 2019-10-29 at 19:40 -0500, Matthew Thode wrote: > > On 19-10-29 14:53:11, Ghanshyam Mann wrote: > > > ---- On Thu, 24 Oct 2019 14:32:03 -0500 Ghanshyam Mann wrote ---- > > > > Hello Everyone, > > > > > > > > We had good amount of discussion on the final plan and schedule in today's TC office hour[1]. > > > > > > > > I captured the agreement on each point in etherpad (you can see the AGREE:). Also summarizing > > > > the discussions here. Imp point is if your projects are planning to keep the py2.7 support then do not delay > > > > to tell us. Reply on this ML thread or add your project in etherpad. > > > > > > > > - Projects can start dropping the py2.7 support. Common lib and testing tools need to wait until milestone-2. > > > > ** pepe8 job to be included in openstack-python3-ussuri-jobs-* templates - > > > https://review.opendev.org/#/c/688997/ > > > > ** You can drop openstack-python-jobs template and start using ussuri template once 688997 patch is merged. 
> > > > ** Cross projects dependency (if any ) can be sync up among dependent projects. > > > > > > > > - I will add this plan and schedule as a community goal. The goal is more about what all things to do and when. > > > > ** If any project keeping the support then it has to be notified explicitly for its consumer. > > > > > > > > - Schedule: > > > > The schedule is aligned with the Ussuri cycle milestone[2]. I will add the plan in the release schedule also. > > > > Phase-1: Dec 09 - Dec 13 R-22 Ussuri-1 milestone > > > > ** Project to start dropping the py2 support along with all the py2 CI jobs. > > > > Phase-2: Feb 10 - Feb 14 R-13 Ussuri-2 milestone > > > > ** This includes Oslo, QA tools (or any other testing tools), common lib (os-brick), Client library. > > > > ** This will give enough time to projects to drop the py2 support. > > > > Phase-3: Apr 06 - Apr 10 R-5 Ussuri-3 milestone > > > > ** Final audit on Phase-1 and Phase-2 plan and make sure everything is done without breaking anything. > > > > This is enough time to measure such break or anything extra to do before ussuri final release. > > > > > > > > Other discussions points and agreement: > > > > - Projects want to keep python 2 support and need oslo, QA or any other dependent projects/lib support: > > > > ** swift. AI: gmann to reach out to swift team about the plan and exact required things from its dependency > > > > (the common lib/testing tool). > > > > > > I chated with timburke on IRC about things required by swift to keep the py2.7 support[1]. Below are > > > client lib/middleware swift required for py2 testing. > > > @timburke, feel free to update if any missing point. > > > > > > - devstack. able to keep running swift on py2 and rest all services can be on py3 > > > - keystonemiddleware and its dependency > > > - keystoneclient and openstackclient (dep of keystonemiddleware) > > > - castellan and barbicanclient > > > > > > > > > As those lib/middleware going to drop the py2.7 support in phase-2, we need to cap them for swift. > > > I think capping them for python2.7 in upper constraint file would not affect any other users but Matthew Thode can > > > explain better how that will work from the requirement constraint perspective. > > > > > > [1] > > > http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2019-10-28.log.html#t2019-10-28T16:37:33 > > > > > > -gmann > > > > > > > ya, there are examples already for libs that have dropped py2 support. > > What you need to do is update global requirements to be something like > > the following. > > > > sphinx!=1.6.6,!=1.6.7,<2.0.0;python_version=='2.7' # BSD > > sphinx!=1.6.6,!=1.6.7,!=2.1.0;python_version>='3.4' # BSD > > > > or > > > > keyring<19.0.0;python_version=='2.7' # MIT/PSF > > keyring;python_version>='3.4' # MIT/PSF > on a related note os-vif is blocked form running tempest jobs under python 3 > until https://review.opendev.org/#/c/681029/ is merged due to > https://zuul.opendev.org/t/openstack/build/4ff60d6bd2f24782abeb12cc7bdb8013/log/controller/logs/screen-q-agt.txt.gz#308-318 > > i think this issue will affect any job that install proejcts that use privsep using the required-proejcts section of the > zuul job definition. adding a project to required-proejcts sechtion adds it to the LIBS_FROM_GIT varible in devstack. > this inturn istalls it twice due to https://review.opendev.org/#/c/418135/ . 
the side effect of this is that the > privsep helper script gets installed under python2 and the neutron ageint in this case gets install under python 3 so > when it trys to spawn the privsep deamon and invoke commands it typically expodes due to dependcy issues or in this case > because it failed to drop privileges correctly. > > so as part of phase 1 we need to merge https://review.opendev.org/#/c/681029/ so that lib project that use required- > projects to run with master of project that comsume it and support depends-on can move to python 3 tempest jobs. Thanks for raising this. I agree on not falling back to py2 in Ussuri, I approved 681029. -gmann > > > > > > > From rosmaita.fossdev at gmail.com Wed Oct 30 17:21:55 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 30 Oct 2019 13:21:55 -0400 Subject: [ptg][nova][neutron][ironic][cinder][keystone][cyborg] Cross-project sessions In-Reply-To: References: Message-ID: On 10/29/19 11:40 AM, Eric Fried wrote: > All- > > Time to coordinate cross-project sessions at the PTG. > > slaweq approached me about a nova/neutron session so I penciled it in on > the nova etherpad [1]. If anyone has conflicts/objections with that > time, please speak up and we'll try to coordinate. > > Other teams (including nova): If cross-project meetings are necessary, > please suggest times in the etherpad and/or on this thread and/or by > grabbing me (efried) on IRC. I took the liberty of proposing a cinder-nova session on improving replication at 4:00-4:30 on Thursday; details on the nova etherpad [1]. > > (Note that stephenfin (I think) will be running the room, but he's on > vacation until the event.) > > Thanks, > efried > > [1] https://etherpad.openstack.org/p/nova-shanghai-ptg > From radoslaw.piliszek at gmail.com Wed Oct 30 17:26:04 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 30 Oct 2019 18:26:04 +0100 Subject: [kolla][ptg] Kolla Ussuri PVPTG (Partially Virtual PTG) Message-ID: Hello Everyone, As you may already know, Kolla core team is mostly not present on summit in Shanghai. Instead we are organizing a PTG next week, 7-8th Nov (Thu-Fri), in Białystok, Poland. Please let me know this week if you are interested in coming in person. We invite operators, contributors and contributors-to-be to join us for the virtual PTG online. The time schedule will be advertised later. Please fill yourself in on the whiteboard [1]. New ideas are welcome. [1] https://etherpad.openstack.org/p/kolla-ussuri-ptg Kind regards, Radek aka yoctozepto -------------- next part -------------- An HTML attachment was scrubbed... URL: From swamireddy at gmail.com Wed Oct 30 17:35:45 2019 From: swamireddy at gmail.com (M Ranga Swami Reddy) Date: Wed, 30 Oct 2019 23:05:45 +0530 Subject: Cinder multi backend quota update Message-ID: Hello, We use 2 types of volume, like volumes and volumes_ceph. I can update the quota for volumes quota using "cinder quota-update --volumes=20 project-id" But for volumes_ceph, the above CLI failed with volumes_ceph un recognised option.. Any suggestions here? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Wed Oct 30 17:48:16 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 30 Oct 2019 13:48:16 -0400 Subject: Cinder multi backend quota update In-Reply-To: References: Message-ID: I didn't try this but.. openstack quota set --volume-type ceph --volumes 20 project-id should do the trick. 
Bonne chance On Wed, Oct 30, 2019 at 1:38 PM M Ranga Swami Reddy wrote: > > Hello, > We use 2 types of volume, like volumes and volumes_ceph. > I can update the quota for volumes quota using "cinder quota-update --volumes=20 project-id" > > But for volumes_ceph, the above CLI failed with volumes_ceph un recognised option.. > Any suggestions here? -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. https://vexxhost.com From arne.wiebalck at cern.ch Wed Oct 30 20:58:51 2019 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 30 Oct 2019 21:58:51 +0100 Subject: [ironic]: Timeout reached while waiting for callback for node In-Reply-To: References: <1530284401.3551200.1572301601958.ref@mail.yahoo.com> <1530284401.3551200.1572301601958@mail.yahoo.com> Message-ID: <1A784864-3B47-4DE4-AD1B-F4614FB6ADC9@cern.ch> Hi Fred, To confirm what Julia said: We currently have ~3700 physical nodes in Ironic, managed by 3 controllers (16GB VMs running httpd, conductor, and inspector). We recently moved to larger nodes for these controllers due to the "thundering image” problem Julia was mentioning: when we deployed ~100 nodes in parallel, the conductors were running out of memory. We have yet to see if that change has the desired effect, though: we will add another 1000 nodes or so over the coming weeks. As for you, this is all with iscsi deploy. We didn’t set up things up with ‘direct' initially as we didn’t have a swift endpoint, but if this problem persists we will look into this as ‘direct' will clearly scale better. The recently added parallelism in Ironic's power sync in Ironic sped up this sync loop significantly: while the loops were running into each other before, the conductors can now check each of their 1000+ servers in <60 seconds. Cheers, Arne > On 29 Oct 2019, at 15:29, Julia Kreger wrote: > > That is great news to hear that you've been able to correlate it. > We've written some things regarding scaling, but the key really > depends on your architecture and how your utilizing the workload. > Since you mentioned a spine-leaf architecture, physical locality of > conductors will matter as well as having as much efficiency as > possible. I believe CERN is running 4-5 conductors to manage ?3000+? > physical machines. Naturally you'll need to scale as appropriate to > your deployment pattern. If much of your fleet is being redeployed > often, you may wish to consider having more conductors to match that > overall load. > > 1) Use the ``direct`` deploy interface. This moves the act of > unpacking the image files and streaming them to disk to the end node. > This generally requires an HTTP(S) download endpoint offered by the > conductor OR via Swift. Ironic-Python-Agent downloads the file, and > unpacks it in memory and directly streams it to disk. With the > ``iscsi`` interface, you can end up in situations, depending on image > composition and settings being passed to dd, where part of your deploy > process is trying to write zeros over the wire in blocks to the remote > disk. Naturally this needlessly consumes IO Bandwidth. > 2) Once your using the ``direct`` deploy_interface, Consider using > caching. While we don't use it in CI, ironic does have the capability > to pass configuration for caching proxy servers. This is set on a > per-node basis. If you have any deployed proxy/caching servers on your > spine or in your leafs close to physical nodes. 
Some timers are also > present to enable ironic to re-use swift URLs if you're deploying the > same image to multiple servers concurrently. Swift tempurl usage does > negatively impact the gain over using a caching proxy though, but it > is something to consider in your architecture and IO pattern. > https://docs.openstack.org/ironic/latest/admin/drivers/ipa.html#using-proxies-for-image-download > 3) Consider using ``conductor_groups``. If it would help, you can > localize conductors to specific pools of machines. This > may be useful if you have pools with different security requirements, > or if you have multiple spines and can dedicate some conductors per > spine. https://docs.openstack.org/ironic/latest/admin/conductor-groups.html > 4) Turn off periodic driver tasks for drivers you're not using. Power > sync and sensor data collection are two periodic workers that consume > resources when they run, and the periodic tasks of other drivers still > consume a worker slot and query the database to see if there is work > to be done. You may also want to increase the number of permitted > workers. > > Power sync can be a huge issue on older versions. I believe Stein is > where we improved the parallelism of the power sync workers in Ironic, > and Train now has the power state callback with nova, which will greatly > reduce the ironic-api and nova-compute processor overhead. > > Hope this helps! > > -Julia > > On Mon, Oct 28, 2019 at 3:26 PM fsbiz at yahoo.com wrote: >> >> Thanks Julia. >> In addition to what you mentioned, this particular issue seems to have cropped up when we added 100 more baremetal nodes. >> >> I've also narrowed down the issue (TFTP timeouts) to when 3-4 baremetal nodes are in "deploy" state and downloading the OS via iSCSI. Each iSCSI transfer takes about 6 Gbps and thus with four transfers we are over the 20Gbps capacity of our leaf-spine links. We are slowly migrating to iPXE so it should help. >> >> That being said, is there a document on large scale ironic design architectures? >> We are looking into a DC design (primarily for baremetals) for up to 2500 nodes. >> >> thanks, >> Fred, >> >> >> On Wednesday, October 23, 2019, 03:19:41 PM PDT, Julia Kreger wrote: >> >> >> Greetings Fred! >> >> Reply in-line. >> >> On Tue, Oct 22, 2019 at 12:47 PM fsbiz at yahoo.com wrote: >> >> [trim] >> >> >> >> TFTP logs: shows TFTP client timed out (weird). Any pointers here? >> >> >> Sadly this is one of those things that comes with using TFTP. Issues like this are why the community tends to recommend using ipxe.efi to chainload, as you can perform transport over TCP as opposed to UDP, where something might happen mid-transport. >> >> >> tftpd shows ramdisk_deployed completed. Then, it reports that the client timed out. >> >> >> Grub does tend to be very abrupt and not wrap up its very final actions. I suspect it may just never be sending the ack back and the transfer may be completing. I'm afraid this is one of those things where you really need to see on the console what is going on. My guess would be that your deploy_ramdisk lost a packet in transfer or that it was corrupted in transport. It would be interesting to know if the network card stack is performing checksum validation, but for IPv4 it is optional. >> >> >> [trim] >> >> >> >> This has me stumped here. This exact failure seems to be happening 3 to 4 times a week on different nodes. >> Any pointers appreciated. >> >> thanks, >> Fred.
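To make the direct-deploy, proxy and conductor-group suggestions above a bit more concrete, here is a rough sketch of the knobs involved; the node UUID, group name and proxy URL are placeholders, and the exact option names should be checked against the docs linked above for the release in use:

    # ironic.conf on the conductors: allow the direct deploy interface and,
    # optionally, pin this conductor to a group ("rack-a" is a placeholder)
    [DEFAULT]
    enabled_deploy_interfaces = direct,iscsi

    [conductor]
    conductor_group = rack-a

    # per-node settings (node UUID, group and proxy are placeholders)
    openstack baremetal node set <node-uuid> --deploy-interface direct
    openstack baremetal node set <node-uuid> --conductor-group rack-a
    openstack baremetal node set <node-uuid> \
        --driver-info image_http_proxy=http://proxy.example.com:3128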
>> >> > From skaplons at redhat.com Wed Oct 30 21:15:37 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 22:15:37 +0100 Subject: [neutron][ptg] Team dinner Message-ID: <20191030211537.trgnve7df27g3jh4@skaplons-mac> Hi neutrinos, Thanks to LIU Yulong who helped me a lot to choose and book some restaurant, we have now booked restaurant: Expo source B2, No.168, Shangnan Road, Pudong New Area, Shanghai, TEL: +86 21 58882117 书院人家(世博源店) 上海市浦东新区上南路168号世博源B2 The Dianping page: http://www.dianping.com/shop/20877292 Dinner is scheduled to Tuesday, 5th Nov at 6pm. Restaurant is close to the Expo center. It's about 15 minutes walk according to the Google maps: https://tinyurl.com/y2rc83ej -- Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Wed Oct 30 21:18:44 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 22:18:44 +0100 Subject: [neutron] Neutron team meeting cancelled Message-ID: <20191030211844.5igwwaz7wqlgc4wn@skaplons-mac> Hi, Due to the PTG/Summit next week, and public holiday in Poland on 11th Nov, lets cancel neutron team meetings in next 2 weeks. We will get back to this meeting as usuall on Tuesday, 19th Nov. See You in Shanghai :) -- Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Wed Oct 30 21:20:37 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 22:20:37 +0100 Subject: [neutron] CI meeting cencelled November 5th Message-ID: <20191030212037.5xff5gjcspeh53m4@skaplons-mac> Hi, Due to summit next week, lets cancel our CI meeting on November 5th. We will resume this meeting after the PTG on November 12th. -- Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Wed Oct 30 21:22:03 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 30 Oct 2019 22:22:03 +0100 Subject: [neutron] Drivers meetings cancelled Message-ID: <20191030212203.d7dibgwvsw3i2e7x@skaplons-mac> Hi, Due to public holiday on November 1st and PTG on November 8th, lets cancel drivers meetings at those days. Next drivers meeting will take place at November 15th. -- Slawek Kaplonski Senior software engineer Red Hat From kennelson11 at gmail.com Wed Oct 30 22:23:35 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 30 Oct 2019 15:23:35 -0700 Subject: [kolla][ptg] Kolla Ussuri PVPTG (Partially Virtual PTG) In-Reply-To: References: Message-ID: Hello :) If people were going to be in Shanghai for the Summit (or live in China) they wouldn't be able to participate because of the firewall. Can you (or someone else present in Poland) provide an alternative solution to Google meet so that everyone interested could join? -Kendall (diablo_rojo) On Wed, 30 Oct 2019, 10:27 am Radosław Piliszek, < radoslaw.piliszek at gmail.com> wrote: > Hello Everyone, > > As you may already know, Kolla core team is mostly not present on summit > in Shanghai. > Instead we are organizing a PTG next week, 7-8th Nov (Thu-Fri), in > Białystok, Poland. > Please let me know this week if you are interested in coming in person. > > We invite operators, contributors and contributors-to-be to join us for > the virtual PTG online. > The time schedule will be advertised later. > > Please fill yourself in on the whiteboard [1]. > New ideas are welcome. > > [1] https://etherpad.openstack.org/p/kolla-ussuri-ptg > > Kind regards, > Radek aka yoctozepto > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From liang.a.fang at intel.com Thu Oct 31 02:45:13 2019 From: liang.a.fang at intel.com (Fang, Liang A) Date: Thu, 31 Oct 2019 02:45:13 +0000 Subject: [cinder] changing the weekly meeting time In-Reply-To: References: Message-ID: Hi Brian, I agree with Rajat. Meeting happened in mid-night will prevent people to attend, except they have topics that must to talk with team. Openstack is widely used in Asia, but there’s no much core from Asia (in countries that the meeting in mid-night). Meeting time is one of the reason😊 Nova have two meetings, two time zone friendly. But I don’t like to divide the meeting to two. Regards Liang From: Rajat Dhasmana Sent: Thursday, October 31, 2019 12:54 AM To: Brian Rosmaita Cc: openstack-discuss at lists.openstack.org Subject: Re: [cinder] changing the weekly meeting time Hi Brian, It's great that the change in weekly meeting time is considered, here are my opinions on the same from the perspective of Asian countries (having active upstream developers) Current meeting time (16:00 - 17:00 UTC) INDIA : is 9:30 - 10:30 PM IST (UTC+05:30) is a little late but manageable. CHINA : is 12:00 - 01:00 AM CST (UTC+08:00) is almost impossible to attend. JAPAN : is 01:00 - 02:00 AM JST (UTC+09:00) similar to the case as China. IMO shifting the meeting time 2 hours earlier (UTC 14:00) might bring more participation and would ease out timings for some (including me) but these are just my thoughts. Thanks and Regards Rajat Dhasmana On Thu, Oct 24, 2019 at 3:05 AM Brian Rosmaita > wrote: (Just to be completely clear -- we're only gathering information at this point. The Cinder weekly meeting is still Wednesdays at 16:00 UTC.) As we discussed at today's meeting [0], a request has been made to hold the weekly meeting earlier so that it would be friendlier for people in Asia time zones. Based on the people in attendance today, it seems that a move to 14:00 UTC is not out of the question. Thus, the point of this email is to solicit comments on whether we should change the meeting time to 14:00 UTC. As you consider the impact on yourself, if you are in a TZ that observes Daylight Savings Time, keep in mind that most TZs go back to standard time over the next few weeks. (I was going to insert an opinion here, but I will wait and respond in this thread like everyone else.) cheers, brian [0] http://eavesdrop.openstack.org/meetings/cinder/2019/cinder.2019-10-23-16.00.log.html#l-166 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.luna.valero at gmail.com Thu Oct 31 07:25:41 2019 From: sebastian.luna.valero at gmail.com (Sebastian Luna Valero) Date: Thu, 31 Oct 2019 08:25:41 +0100 Subject: [scientific-sig] OpenStack maintenance planning Message-ID: Hi All, At our Institute we have secured funding to buy a small infrastructure. We have surveyed the computational requirements of researchers and we came to the conclusion that we need to deploy OpenStack on-prem. No one at our Institute has worked with OpenStack before so there is no local expertise. I have been looking around at the available options. Given the complexity of OpenStack and the tight time-frame that we have to get the infrastructure up and running, we would like to go with one of the OpenStack distributions offered by vendors to minimize the time it takes to go into production. Over the initial period, we would need to learn from the vendor how to operate and troubleshoot OpenStack properly, but in the long run we would like to be autonomous. 
On the other hand, funding in academia is complex and there is no guarantee that we'll be successful in all future funding calls. So in the scenario where we start off with an OpenStack distribution offered by a vendor, and 3 years down the line we run out of budget, can we continue operating/upgrading our OpenStack deployment without issues? Best regards, Sebastian -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Thu Oct 31 07:52:17 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 31 Oct 2019 08:52:17 +0100 Subject: [kolla][ptg] Kolla Ussuri PVPTG (Partially Virtual PTG) In-Reply-To: References: Message-ID: Hello Kendall, good question. Thanks for raising the issue. Nothing comes to my mind right now but will do research. If someone from the ML is aware of a stable China-compatible solution, I am all ears (and eyes). :-) Radek aka yoctozepto śr., 30 paź 2019 o 23:23 Kendall Nelson napisał(a): > Hello :) > > If people were going to be in Shanghai for the Summit (or live in China) > they wouldn't be able to participate because of the firewall. Can you (or > someone else present in Poland) provide an alternative solution to Google > meet so that everyone interested could join? > > -Kendall (diablo_rojo) > > > > On Wed, 30 Oct 2019, 10:27 am Radosław Piliszek, < > radoslaw.piliszek at gmail.com> wrote: > >> Hello Everyone, >> >> As you may already know, Kolla core team is mostly not present on summit >> in Shanghai. >> Instead we are organizing a PTG next week, 7-8th Nov (Thu-Fri), in >> Białystok, Poland. >> Please let me know this week if you are interested in coming in person. >> >> We invite operators, contributors and contributors-to-be to join us for >> the virtual PTG online. >> The time schedule will be advertised later. >> >> Please fill yourself in on the whiteboard [1]. >> New ideas are welcome. >> >> [1] https://etherpad.openstack.org/p/kolla-ussuri-ptg >> >> Kind regards, >> Radek aka yoctozepto >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From veeraready at yahoo.co.in Thu Oct 31 08:16:25 2019 From: veeraready at yahoo.co.in (VeeraReddy) Date: Thu, 31 Oct 2019 08:16:25 +0000 (UTC) Subject: [openstack-dev][kuryr] Error in creating LB References: <1979762272.3801941.1572509785865.ref@mail.yahoo.com> Message-ID: <1979762272.3801941.1572509785865@mail.yahoo.com> Hi, I am trying to install openstack & Kubernetes using devstack https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html  Error Log             : http://paste.openstack.org/show/785670/ Local.conf          : Paste #785671 | LodgeIt! Regards, Veera. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdemaced at redhat.com Thu Oct 31 08:37:00 2019 From: mdemaced at redhat.com (Maysa De Macedo Souza) Date: Thu, 31 Oct 2019 09:37:00 +0100 Subject: [openstack-dev][kuryr] Error in creating LB In-Reply-To: <1979762272.3801941.1572509785865@mail.yahoo.com> References: <1979762272.3801941.1572509785865.ref@mail.yahoo.com> <1979762272.3801941.1572509785865@mail.yahoo.com> Message-ID: Hi VeeraReddy, You could try using a different Amphora image. Download the centos image [1] and modify these settings OCTAVIA_AMP_IMAGE_FILE and OCTAVIA_AMP_IMAGE_NAME to point to it. Hope this helps. [1] https://tarballs.openstack.org/octavia/test-images/ Cheers, Maysa. 
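Using the variable names mentioned above, that would translate into something like the following in local.conf; the exact qcow2 filename under test-images/ changes over time, so treat the path and image name below as placeholders:

    # local.conf, [[local|localrc]] section: point devstack's Octavia plugin
    # at a pre-downloaded amphora image instead of building one
    OCTAVIA_AMP_IMAGE_FILE=/opt/stack/amphora-haproxy-centos.qcow2
    OCTAVIA_AMP_IMAGE_NAME=amphora-haproxy-centos

    # related knob from later in this thread: give the LB more time to
    # become ACTIVE on slow hosts
    KURYR_WAIT_TIMEOUT=1200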
On Thu, Oct 31, 2019 at 9:19 AM VeeraReddy wrote: > Hi, > > I am trying to install openstack & Kubernetes using devstack > > > https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html > Error Log : http://paste.openstack.org/show/785670/ > Local.conf : Paste #785671 | LodgeIt! > > > > Regards, > Veera. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Thu Oct 31 08:43:21 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Thu, 31 Oct 2019 09:43:21 +0100 Subject: [openstack-dev][kuryr] Error in creating LB In-Reply-To: <1979762272.3801941.1572509785865@mail.yahoo.com> References: <1979762272.3801941.1572509785865.ref@mail.yahoo.com> <1979762272.3801941.1572509785865@mail.yahoo.com> Message-ID: <2ea50863deddb3bd158ccab869536c3c4f9693d5.camel@redhat.com> On Thu, 2019-10-31 at 08:16 +0000, VeeraReddy wrote: > Hi, > > I am trying to install openstack & Kubernetes using devstack > > https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html > > Error Log : http://paste.openstack.org/show/785670/ > Local.conf : Paste #785671 | LodgeIt! This error happens when Octavia is unable to create a load balancer in 5 minutes. Seems like your LB is still PENDING_CREATE, so this seems to be just unusually slow Octavia. This might happen if e.g. your host has no nested virtualization enabled. Try increasing KURYR_WAIT_TIMEOUT in local.conf. In the gate we use up to 20 minutes (value of 1200). > Regards, > Veera. > From marcin.juszkiewicz at linaro.org Thu Oct 31 08:55:26 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 31 Oct 2019 09:55:26 +0100 Subject: [kolla][ptg] Kolla Ussuri PVPTG (Partially Virtual PTG) In-Reply-To: References: Message-ID: <196d5a99-53f4-68c7-d28d-b6962abb8b3b@linaro.org> W dniu 30.10.2019 o 23:23, Kendall Nelson pisze: > If people were going to be in Shanghai for the Summit (or live in > China) they wouldn't be able to participate because of the firewall. > Can you (or someone else present in Poland) provide an alternative > solution to Google meet so that everyone interested could join? Tell us which of them work for you: - Bluejeans - Zoom As I have access to both platforms at work. From rico.lin.guanyu at gmail.com Thu Oct 31 09:07:43 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Thu, 31 Oct 2019 17:07:43 +0800 Subject: [SIGs][meta-sig] Encourage SIG Chairs to join `Meet the project leaders` opportunities in Shanghai Message-ID: Dear SIG chairs There will be two `Meet the project leaders` event in Shanghai summit. And SIG chairs are definitely part of it. I'm here to encourage you to sign up [1] and join that event so people might get more chances to meet with you. They're scheduled on Monday marketplace mixer (5:30pm-7:00pm) and Wednesday lunch (12:20pm-1:40pm). [1] https://etherpad.openstack.org/p/meet-the-project-leaders -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhangbailin at inspur.com Thu Oct 31 11:35:27 2019 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Thu, 31 Oct 2019 11:35:27 +0000 Subject: [nova] spec is ready to review and for feature liasion needed Message-ID: <7620140da67f4b38bfbf0e88ab212874@inspur.com> Hi everyone: This spec comment by melwitt, and already updated, wait for review and need a feature liaison. 
Proposal for a safer noVNC console with password authentication https://review.opendev.org/#/c/623120 Brin Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From veera.b at nxp.com Thu Oct 31 06:18:43 2019 From: veera.b at nxp.com (Veera.reddy B) Date: Thu, 31 Oct 2019 06:18:43 +0000 Subject: [openstack-dev][kuryr] Error in creating LB Message-ID: Hi, I am trying to install openstack & Kubernetes using devstack https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html Error Log : http://paste.openstack.org/show/785670/ Local.conf : http://paste.openstack.org/show/785671/ Thanks, Veera. -------------- next part -------------- An HTML attachment was scrubbed... URL: From veeraready at yahoo.co.in Thu Oct 31 07:28:57 2019 From: veeraready at yahoo.co.in (VeeraReddy) Date: Thu, 31 Oct 2019 07:28:57 +0000 (UTC) Subject: [openstack-dev][kuryr] Error in creating LB References: <1872069241.3772135.1572506937290.ref@mail.yahoo.com> Message-ID: <1872069241.3772135.1572506937290@mail.yahoo.com> Hi, I am trying to install openstack & Kubernetes usingdevstack https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html       ErrorLog             : http://paste.openstack.org/show/785670/   Local.conf         : http://paste.openstack.org/show/785671/ Regards, Veera. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Oct 31 13:01:29 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 31 Oct 2019 14:01:29 +0100 Subject: [tc][all][goal]: Ussuri community goal candidate 3: 'Consistent and secure default policies' In-Reply-To: <87d0e6f5-baf1-4e42-91c0-82aceb4b1a7b@www.fastmail.com> References: <16df4609432.e8a5d43f95570.8265671559696188212@ghanshyammann.com> <68d21e29-0ef8-42b6-bcd1-6d7b5fe2ea79@www.fastmail.com> <20191030100020.epbuiov4jcufxktr@skaplons-mac> <87d0e6f5-baf1-4e42-91c0-82aceb4b1a7b@www.fastmail.com> Message-ID: <20191031130129.foph5tet2yzlywbl@skaplons-mac> On Wed, Oct 30, 2019 at 05:50:08AM -0700, Colleen Murphy wrote: > On Wed, Oct 30, 2019, at 03:00, Slawek Kaplonski wrote: > > Hi Colleen, > > > > We have this as one of the topic to discuss during PTG in Neutron team. > > I don't have full agenda ready yet, but I can ping You when it will be exactly > > if You would like to join. > > I'd be happy to join, please do let me know. Great to see that. It is scheduled to be on Thursday at 14:40 - 16:00, see https://etherpad.openstack.org/p/Shanghai-Neutron-Planning line 109 > > Colleen > > > > > On Tue, Oct 29, 2019 at 04:53:19PM -0700, Colleen Murphy wrote: > > > On Tue, Oct 22, 2019, at 09:50, Ghanshyam Mann wrote: > > > > Hello Everyone, > > > > > > > > This is 3rd proposal candidate for the Ussuri cycle community-wide > > > > goal. The other two are [1]. > > > > > > > > Colleen proposed this idea for the Ussuri cycle community goal. > > > > > > > > Projects implemented/plan to implement this: > > > > *Keystone already implemented this with all necessary support in > > > > oslo.policy with nice documentation. > > > > * We discussed this in nova train PTG to implement it in nova [2]. Nova > > > > spec was merged in Train but > > > > could not implement. I have re-proposed the spec for the Ussuri > > > > cycle [3]. > > > > > > > > This is nice idea as a goal from the user perspective. Colleen has less > > > > bandwidth to drive this goal alone. 
> > > > We are looking for a champion or co-champions (1-2 people will be much > > > > better) this goal along with Colleen. > > > > > > > > Also, let us know what do you think of this as a community goal? Any > > > > query or Improvement Feedback? > > > > > > [snipped] > > > > > > It's possible that this won't work very well as a community goal. This migration took the keystone team about two cycles to implement, not including all the planning and foundational work that needed to happen first. In our virtual PTG meeting today we discussed the possibility of instead forming a pop-up team around this work so that a group of individuals (across several teams, not just the keystone team) could target the largest projects and make more of a dent in the work over a couple of cycles before using the community goal model to close the gaps in the smaller or more peripheral projects. However, we will still need to get buy-in from the projects so that we can be sure that the work the pop-up team does gets prioritized and reviewed. > > > > > > I've drafted a set of steps that we'd expect projects to follow to get this done: > > > > > > https://etherpad.openstack.org/p/policy-migration-steps > > > > > > And we're holding a forum session on it next week: > > > > > > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24357/next-steps-for-policy-in-openstack > > > > > > Colleen > > > > > > > -- > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > > -- Slawek Kaplonski Senior software engineer Red Hat From mriedemos at gmail.com Thu Oct 31 15:23:01 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 31 Oct 2019 10:23:01 -0500 Subject: State of the Gate Message-ID: Things are great! Surprise! I just wanted to let everyone know. Later! . . . . . Now that you've been tricked, on Halloween no less, I'm here to tell you that things suck right now. This is your periodic digest of issues. Grab some fun-sized candy bars and read on. I think right now we have three major issues. 1. http://status.openstack.org/elastic-recheck/index.html#1763070 This has resurfaced and I'm not sure why, nor do I think we ever had a great handle on what is causing this or how to work around it so if anyone has new ideas please chip in. 2. http://status.openstack.org/elastic-recheck/index.html#1844929 I've done some digging into this one and my notes are in the bug report. It mostly affects grenade jobs but is not entirely restricted to grenade jobs. It's also mostly on OVH and FortNebula nodes but not totally. From looking at the mysql logs in the grenade jobs, mysqld is (re)started three times. I think (1) for initial package install, (2) for stacking devstack on the old side, and (3) for stacking devstack on the new side. After the last restart, there are a lot of aborted connection messages in the msyql error log. It's around then that grenade is running post-upgrade smoke tests to create a server and the nova-scheduler times out communicating with the nova_cell1 database. I have a few patches up to grenade/devstack [1] to try some things and get more msyql logs but so far they aren't really helpful. We need someone with some more mysql debugging experience to help here, maybe zzzeek or mordred? 3. 
CirrOS guest SSH issues There are several (some might be duplicates): http://status.openstack.org/elastic-recheck/index.html#1848078 http://status.openstack.org/elastic-recheck/index.html#1808010 (most hits) http://status.openstack.org/elastic-recheck/index.html#1463631 http://status.openstack.org/elastic-recheck/index.html#1849857 http://status.openstack.org/elastic-recheck/index.html#1737039 http://status.openstack.org/elastic-recheck/index.html#1840355 http://status.openstack.org/elastic-recheck/index.html#1843610 A few notes here. a) We're still using CirrOS 0.4.0 since Stein: https://review.opendev.org/#/c/521825/ And that image was published nearly 2 years ago and there are no newer versions on the CirrOS download site so we can't try a newer image to see if that fixes things. b) Some of the issues above are related to running out of disk in the guest. I'm not sure what is causing that, but I have posted a devstack patch that is related: https://review.opendev.org/#/c/690991 tl;dr before Stein the tempest flavors we used had disk=0 so nova would fit the root disk to the size of the image. Since Stein the tempest flavors specify root disk size (1GiB for the CirrOS images). My patch pads an extra 1GiB to the root disk on the tempest flavors. One side effect of that is the volumes tempest creates will go from 1GiB to 2GiB which could be a problem if a lot of tempest volume tests run at the same time, though we do have a volume group size of 24GB in gate runs so I think we're OK for now. I'm not sure my patch would help, but it's an idea. As for the other key generation and dhcp lease failures, I don't know what to do about those. We need more eyes on these issues to generate some ideas or see if we're doing something wrong in our tests, e.g. generating too much data for the config drive? Not using config drive in some cases? Metadata API server is too slow (note we cache the metadata since [2])? Other ideas on injecting logs for debug? [1] https://review.opendev.org/#/q/topic:bug/1844929+status:open [2] https://review.opendev.org/#/q/I9082be077b59acd3a39910fa64e29147cb5c2dd7 -- Thanks, Matt From gagehugo at gmail.com Thu Oct 31 15:40:42 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Thu, 31 Oct 2019 10:40:42 -0500 Subject: [security] Security SIG Newsletter - Oct 31st 2019 Message-ID: Hello everyone, Due to a hectic October, this newsletter will have the updates for the month. The SIG meetings for the 07th, 21st, and 28th in November will be cancelled as well due to Summit/PTG & Holidays. Have a Happy Halloween! 
#Date: 31 Oct 2019 - Security SIG Meeting Info: http://eavesdrop.openstack.org/#Security_SIG_meeting - Weekly on Thursday at 1500 UTC in #openstack-meeting - Agenda: https://etherpad.openstack.org/p/security-agenda - https://security.openstack.org/ - https://wiki.openstack.org/wiki/Security-SIG #Meeting Notes (October) - fungi volunteers to be nova spec liaison for ussuri image encryption spec in nova: http://eavesdrop.openstack.org/meetings/image_encryption/2019/image_encryption.2019-10-21-13.00.log.html - The Security SIG won't be meeting on the following dates in November due to the Summit/PTG & Holidays - Nov 07th, Nov 21st, Nov 28th #VMT Reports - A full list of publicly marked security issues can be found here: https://bugs.launchpad.net/ossa/ - OSSA-2019-005 was released this month: https://security.openstack.org/ossa/OSSA-2019-005.html - ceph backend, secret key leak: https://bugs.launchpad.net/cinder/+bug/1849624 - CSV Injection Possible in Compute Usage History: https://bugs.launchpad.net/horizon/+bug/1842749 -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Thu Oct 31 15:57:12 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Thu, 31 Oct 2019 08:57:12 -0700 Subject: [manila] No IRC meeting on Nov 7 and Nov 21 2019 Message-ID: Hello Zorillas and interested stackers, Due to a part of our community attending the Open Infrastructure Summit+PTG (Nov 4-8, 2019) and KubeCon+CloudNativeCon (Nov 18-21, 2019), I propose that we cancel the weekly IRC meetings on Nov 7th and Nov 21st. If you'd like to discuss anything during these weeks, please chime in on freenode/#openstack-manila, or post to this mailing list. Thanks, Goutham From zhangbailin at inspur.com Thu Oct 31 16:03:31 2019 From: zhangbailin at inspur.com (=?utf-8?B?QnJpbiBaaGFuZyjlvKDnmb7mnpcp?=) Date: Thu, 31 Oct 2019 16:03:31 +0000 Subject: [qa] required rabbitMQ materials References: <7620140da67f4b38bfbf0e88ab212874@inspur.com> Message-ID: Hi all Can anyone provide me with some materials about RabbitMQ? like its implementation mechanisms, scenarios, etc. Thanks anyway. Brin Zhang From openstack at fried.cc Thu Oct 31 16:24:59 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 31 Oct 2019 11:24:59 -0500 Subject: [nova] No meeting during summit/PTG week Message-ID: <688e9e99-90e0-2424-acb7-b8f24266e3bb@fried.cc> And that's all I have to say about that. https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting From pawel.konczalski at everyware.ch Thu Oct 31 17:21:37 2019 From: pawel.konczalski at everyware.ch (Pawel Konczalski) Date: Thu, 31 Oct 2019 18:21:37 +0100 Subject: Permamently delete double / wrong openflow entry from br-tun bridge in OpenvSwitch Message-ID: Hi, i have a VM with ARP entries in both OpenVswitch bridges br-tun and br-int on the compute node where is it executed:     ovs-ofctl dump-flows br-int | egrep "arp.*10.20.0.34"     cookie=0x117c25cd8a3f96, duration=47029.368s, table=24, n_packets=20, n_bytes=840, priority=2,arp,in_port="qvod9cdff27-9c",arp_spa=10.20.0.34 actions=resubmit(,25)     ovs-ofctl dump-flows br-tun | egrep "arp.*10.20.0.34"     cookie=0x1597c76aa2fd74f2, duration=47015.961s, table=21, n_packets=0, n_bytes=0, priority=1,arp,dl_vlan=27,arp_tpa=10.20.0.34 ... 
3e:42:af:1d,IN_PORT The unnecassary / faulty entry can by deleted manually from br-tun with:     ovs-ofctl --strict del-flows br-tun "priority=1,arp,dl_vlan=27,arp_tpa=10.20.0.34" but after the port is shutdown and up again the entry will by recreated again. When the VM is moved to another compute node, only the br-int entry is created, moving back to the initial host creates both again. Any idea how to remove permamenty this br-tun entry? Thanks Pawel -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5227 bytes Desc: not available URL: From Albert.Braden at synopsys.com Thu Oct 31 17:49:54 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 31 Oct 2019 17:49:54 +0000 Subject: CPU pinning blues Message-ID: I'm following this document to setup CPU pinning on Rocky: https://www.redhat.com/en/blog/driving-fast-lane-cpu-pinning-and-numa-topology-awareness-openstack-compute I followed all of the steps except for modifying non-pinned flavors and I have one aggregate containing a single NUMA-capable host: root at us01odc-dev1-ctrl1:/var/log/nova# os aggregate list +----+-------+-------------------+ | ID | Name | Availability Zone | +----+-------+-------------------+ | 4 | perf3 | None | +----+-------+-------------------+ root at us01odc-dev1-ctrl1:/var/log/nova# os aggregate show 4 +-------------------+----------------------------+ | Field | Value | +-------------------+----------------------------+ | availability_zone | None | | created_at | 2019-10-30T23:05:41.000000 | | deleted | False | | deleted_at | None | | hosts | [u'us01odc-dev1-hv003'] | | id | 4 | | name | perf3 | | properties | pinned='true' | | updated_at | None | +-------------------+----------------------------+ I have a flavor with the NUMA properties: root at us01odc-dev1-ctrl1:/var/log/nova# os flavor show s1.perf3 +----------------------------+-------------------------------------------------------------------------+ | Field | Value | +----------------------------+-------------------------------------------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | disk | 35 | | id | be3d21c4-7e91-42a2-b832-47f42fdd3907 | | name | s1.perf3 | | os-flavor-access:is_public | True | | properties | aggregate_instance_extra_specs:pinned='true', hw:cpu_policy='dedicated' | | ram | 30720 | | rxtx_factor | 1.0 | | swap | 7168 | | vcpus | 4 | +----------------------------+-------------------------------------------------------------------------+ I create a VM with that flavor: openstack server create --flavor s1.perf3 --image NOT-QSC-CentOS6.10-19P1-v4 --network it-network alberttest4 but it goes to error status, and I see this in the logs: 2019-10-30 16:17:55.590 3248800 INFO nova.virt.hardware [req-d0c2de13-db23-41bd-8da3-34c68ff1d998 2cb6757679d54a69803a5b6e317b3a93 474ae347d8ad426f8118e55eee47dcfd - default 7d3a4deab35b434bba403100a6729c81] Computed NUMA topology CPU pinning: usable pCPUs: [[4], [5], [6], [7]], vCPUs mapping: [(0, 4), (1, 5), (2, 6), (3, 7)] 2019-10-30 16:17:55.595 3248800 INFO nova.virt.hardware [req-d0c2de13-db23-41bd-8da3-34c68ff1d998 2cb6757679d54a69803a5b6e317b3a93 474ae347d8ad426f8118e55eee47dcfd - default 7d3a4deab35b434bba403100a6729c81] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1], [2], [3], [4], [5], [6], [7]], vCPUs mapping: [(0, 0), (1, 1), (2, 2), (3, 3)] 2019-10-30 16:17:55.595 3248800 INFO nova.filters 
[req-d0c2de13-db23-41bd-8da3-34c68ff1d998 2cb6757679d54a69803a5b6e317b3a93 474ae347d8ad426f8118e55eee47dcfd - default 7d3a4deab35b434bba403100a6729c81] Filter AggregateInstanceExtraSpecsFilter returned 0 hosts 2019-10-30 16:17:55.596 3248800 INFO nova.filters [req-d0c2de13-db23-41bd-8da3-34c68ff1d998 2cb6757679d54a69803a5b6e317b3a93 474ae347d8ad426f8118e55eee47dcfd - default 7d3a4deab35b434bba403100a6729c81] Filtering removed all hosts for the request with instance ID '73b1e584-0ce4-478c-a706-c5892609dc3f'. Filter results: ['RetryFilter: (start: 3, end: 3)', 'AvailabilityZoneFilter: (start: 3, end: 3)', 'CoreFilter: (start: 3, end: 2)', 'RamFilter: (start: 2, end: 2)', 'ComputeFilter: (start: 2, end: 2)', 'ComputeCapabilitiesFilter: (start: 2, end: 2)', 'ImagePropertiesFilter: (start: 2, end: 2)', 'ServerGroupAntiAffinityFilter: (start: 2, end: 2)', 'ServerGroupAffinityFilter: (start: 2, end: 2)', 'DifferentHostFilter: (start: 2, end: 2)', 'SameHostFilter: (start: 2, end: 2)', 'NUMATopologyFilter: (start: 2, end: 2)', 'AggregateInstanceExtraSpecsFilter: (start: 2, end: 0)'] It looks like my hypervisor is passing the hw:cpu_policy='dedicated' requirement but it is failing on "pinned=true" The interesting part of the problem is that if I add a second apparently identical hypervisor to the aggregate it starts working. I create s1.perf3 VMs and they land on us01odc-dev1-hv002 and the XML shows that they are correctly pinned. When us01odc-dev1-hv002 is full then they start failing again. What should I be looking for here? What could cause one apparently identical hypervisor to fail AggregateInstanceExtraSpecsFilter while another one passes? In the nova-compute log of the failing hypervisor I see this: 2019-10-31 10:43:01.147 1103 INFO nova.compute.resource_tracker [req-dda65a9c-9d0a-4888-b4cb-0bf4423dd2f3 - - - - -] Instance 4856d505-c220-4873-b881-836b5b75f7bb has allocations against this compute host but is not found in the database. 2019-10-31 10:43:01.148 1103 INFO nova.compute.resource_tracker [req-dda65a9c-9d0a-4888-b4cb-0bf4423dd2f3 - - - - -] Final resource view: name=us01odc-dev1-hv003.internal.synopsys.com phys_ram=128888MB used_ram=38912MB phys_disk=1208GB used_disk=297GB total_vcpus=8 used_vcpus=6 pci_stats=[] Openstack can't find a VM with UUID 4856d505-c220-4873-b881-836b5b75f7bb. There are no VMs on hv003 but I can create a non-pinned VM there and it works. Do I have a "phantom" VM that is consuming resources on hv003? How can I fix that? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Thu Oct 31 18:46:57 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 31 Oct 2019 15:46:57 -0300 Subject: [keystone] Federated users who wish to use CLI In-Reply-To: References: <8f3bc525-451e-a677-8dcb-c43770ff3d2d@uchicago.edu> Message-ID: Hey guys. Here is our fix for the issue. 
https://review.opendev.org/#/c/692140/1 When that PR gets merged, the CLI will be able to use federated users and authenticate them via OIDCv3password One example of configurations to use with that PR is the following: > export OS_AUTH_URL=http://keystone:35357/v3export OS_INTERFACE=internalexport OS_IDENTITY_API_VERSION=3export OS_REGION_NAME=Z1export OS_AUTH_PLUGIN=openidexport OS_AUTH_TYPE=v3oidcpasswordexport OS_IDENTITY_PROVIDER=IDPexport OS_CLIENT_ID=IDP_CLIENT_IDexport OS_CLIENT_SECRET=IDP_CLIENT_SECRETexport OS_OPENID_SCOPE="openid address email profile phone offline_access"export OS_PROTOCOL=openidexport OS_ACCESS_TOKEN_ENDPOINT=https://IDP_SERVER_NAME:PORT/openid-connect/tokenexport OS_ACCESS_TOKEN_TYPE=access_tokenexport OS_DISCOVERY_ENDPOINT=https://IDP_SERVER_NAME:PORT/.well-known/openid-configurationexport OS_PROJECT_ID=OPENSTACK_PROJECT_IDexport OS_PROJECT_NAME="OPENSTACK_PROJECT_NAME"export OS_PROJECT_DOMAIN_ID="OPENSTACK_PROJECT_DOMAIN_ID"export OS_USERNAME=federation-testexport OS_PASSWORD=federation-test-password > > On Thu, Oct 24, 2019 at 5:53 PM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > Jason, just watch out for another issue, which is the group assignment > permissions and app credentials. > As soon, as we have some updates, I will ping you guys. > > > On Thu, Oct 24, 2019 at 4:49 PM Jason Anderson > wrote: > >> Hey all, thanks for the helpful replies! >> >> I did discover that some of my issues were fixed in Horizon Stein (I'm on >> Rocky still), which added support for RC file templates. Good to know about >> some of the client quirks that are being sorted out. One thing to point >> out, v3oidcpassword requires Resource Owner Password Credential grant >> support (grant_type=password), which not all IdPs support (for example, the >> one I am integrating against!) >> >> Application credentials are an interesting feature and I'll see how it >> might make sense to leverage them. >> >> Cheers! >> >> On 10/24/19 12:21 PM, Kristi Nikolla wrote: >> >> Keep us posted! It would be great to have this documented for >> future reference. >> >> On Thu, Oct 24, 2019 at 1:04 PM Rafael Weingärtner < >> rafaelweingartner at gmail.com> wrote: >> >>> We are using the "access_token_endpoint". The token is retrieved nicely >>> from the IdP. However, the issue starts on Keystone side and the Apache >>> HTTPD mod_auth_openidc. The CLI was not ready to deal with it. It is like >>> Horizon, when we have multiple IdPs. The discovery process happens twice, >>> once in Horizon and another one in Keystone. We already fixed the Horizon >>> issue, and now we are working to fix the CLI. We should have something in >>> the next few days. >>> >>> On Thu, Oct 24, 2019 at 1:29 PM Kristi Nikolla >>> wrote: >>> >>>> Hi Rafael, >>>> >>>> I have no experience with using multiple identity providers directly in >>>> Keystone. Does specifying the access_token_endpoint or discovery_endpoint >>>> for the specific provider you are trying to authenticate to work? >>>> >>>> Kristi >>>> >>>> On Wed, Oct 23, 2019 at 2:06 PM Rafael Weingärtner < >>>> rafaelweingartner at gmail.com> wrote: >>>> >>>>> Hello Colleen, >>>>> Have you tested the OpenStack CLI with v3oidcpassword or >>>>> v3oidcauthcode and multiple IdPs configured in Keystone? >>>>> >>>>> We are currently debugging and discussing on how to enable this >>>>> support in the CLI. So far, we were not able to make it work with the >>>>> current code. This also happens with Horizon. 
If one has multiple IdPs in >>>>> Keystone, the "discovery" process would happen twice, one in Horizon and >>>>> another in Keystone, which is executed by the OIDC plugin in the HTTPD. We >>>>> already fixed the Horizon issue, but the CLI we are still investigating, >>>>> and we suspect that is probably the same problem. >>>>> >>>>> On Wed, Oct 23, 2019 at 1:56 PM Colleen Murphy >>>>> wrote: >>>>> >>>>>> Hi Jason, >>>>>> >>>>>> On Mon, Oct 21, 2019, at 14:35, Jason Anderson wrote: >>>>>> > Hi all, >>>>>> > >>>>>> > I'm in the process of prototyping a federated Keystone using >>>>>> OpenID >>>>>> > Connect, which will place ephemeral users in a group that has roles >>>>>> in >>>>>> > existing projects. I was testing how it felt from the user's >>>>>> > perspective and am confused how I'm supposed to be able to use the >>>>>> > openstacksdk with federation. For one thing, the RC files I can >>>>>> > download from the "API Access" section of Horizon don't seem like >>>>>> they >>>>>> > work; the domain is hard-coded to "Federated", >>>>>> >>>>>> This should be fixed in the latest version of keystone... >>>>>> >>>>>> > and it also uses a >>>>>> > username/password authentication method. >>>>>> >>>>>> ...but this is not, horizon only knows about the 'password' >>>>>> authentication method and can't provide RC files for other types of auth >>>>>> methods (unless you create an application credential). >>>>>> >>>>>> > >>>>>> > I can see that there is a way to use KSA to use an existing OIDC >>>>>> > token, which I think is probably the most "user-friendly" way, but >>>>>> the >>>>>> > user still has to obtain this token themselves out-of-band, which >>>>>> is >>>>>> > not trivial. Has anybody else set this up for users who liked to >>>>>> use >>>>>> > the CLI? >>>>>> >>>>>> All of KSA's auth types are supported by the openstack CLI. Which one >>>>>> you use depends on your OpenID Connect provider. If your provider supports >>>>>> it, you can use the "v3oidcpassword" auth method with the openstack CLI, >>>>>> following this example: >>>>>> >>>>>> https://support.massopen.cloud/kb/faq.php?id=16 >>>>>> >>>>>> On the other hand if you are using something like Google which only >>>>>> supports the authorization_code grant type, then you would have to get the >>>>>> authorization code out of band and then use the "v3oidcauthcode" auth type, >>>>>> and personally I've never gotten that to work with Google. >>>>>> >>>>>> > Is the solution to educate users about creating application >>>>>> > credentials instead? >>>>>> >>>>>> This is the best option. It's much easier to manage and horizon >>>>>> provides openrc and clouds.yaml files for app creds. >>>>>> >>>>>> Hope this helps, >>>>>> >>>>>> Colleen >>>>>> >>>>>> > >>>>>> > Thank you in advance, >>>>>> > >>>>>> > -- >>>>>> > Jason Anderson >>>>>> > >>>>>> > Chameleon DevOps Lead >>>>>> > *Consortium for Advanced Science and Engineering, The University of >>>>>> Chicago* >>>>>> > *Mathematics & Computer Science Division, Argonne National >>>>>> Laboratory* >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Rafael Weingärtner >>>>> >>>> >>>> >>>> -- >>>> Kristi >>>> >>> >>> >>> -- >>> Rafael Weingärtner >>> >> >> >> -- >> Kristi >> >> >> > > -- > Rafael Weingärtner > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Thu Oct 31 21:15:35 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 31 Oct 2019 22:15:35 +0100 Subject: State of the Gate In-Reply-To: References: Message-ID: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> Hi, On Thu, Oct 31, 2019 at 10:23:01AM -0500, Matt Riedemann wrote: > Things are great! Surprise! I just wanted to let everyone know. Later! > . > . > . > . > . > Now that you've been tricked, on Halloween no less, I'm here to tell you > that things suck right now. This is your periodic digest of issues. Grab > some fun-sized candy bars and read on. > > I think right now we have three major issues. > > 1. http://status.openstack.org/elastic-recheck/index.html#1763070 > > This has resurfaced and I'm not sure why, nor do I think we ever had a great > handle on what is causing this or how to work around it so if anyone has new > ideas please chip in. I think that this is "just" some slowdown of node on which job is running. I noticed it too in some neutron jobs and I checked some. It seems that one API request is processed for very long time. For example in one of fresh examples: https://13cf3dd11b8f009809dc-97cb3b32849366f5bed744685e46b266.ssl.cf5.rackcdn.com/692206/3/check/tempest-integrated-compute/35ecb4a/job-output.txt it was request to nova which caused very long time: Oct 31 16:55:08.632162 ubuntu-bionic-inap-mtl01-0012620879 devstack at n-api.service[7191]: INFO nova.api.openstack.requestlog [None req-275af2df-bd4e-4e64-b46e-6582e8de5148 tempest-ServerDiskConfigTestJSON-1598674508 tempest-ServerDiskConfigTestJSON-1598674508] 198.72.124.104 "POST /compute/v2.1/servers/d15d2033-b29b-44f7-b619-ed7ef83fe477/action" status: 500 len: 216 microversion: 2.1 time: 161.951140 > > 2. http://status.openstack.org/elastic-recheck/index.html#1844929 > > I've done some digging into this one and my notes are in the bug report. It > mostly affects grenade jobs but is not entirely restricted to grenade jobs. > It's also mostly on OVH and FortNebula nodes but not totally. > > From looking at the mysql logs in the grenade jobs, mysqld is (re)started > three times. I think (1) for initial package install, (2) for stacking > devstack on the old side, and (3) for stacking devstack on the new side. > After the last restart, there are a lot of aborted connection messages in > the msyql error log. It's around then that grenade is running post-upgrade > smoke tests to create a server and the nova-scheduler times out > communicating with the nova_cell1 database. > > I have a few patches up to grenade/devstack [1] to try some things and get > more msyql logs but so far they aren't really helpful. We need someone with > some more mysql debugging experience to help here, maybe zzzeek or mordred? > > 3. CirrOS guest SSH issues > > There are several (some might be duplicates): > > http://status.openstack.org/elastic-recheck/index.html#1848078 This one is I think the same as we have reported in https://bugs.launchpad.net/neutron/+bug/1850557 Basically we noticed issues with dhcp after resize/migration/shelve of instance but I didn't have time to investigate it yet. 
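One quick way to surface such outliers, given the requestlog format shown above, is to filter on the trailing "time:" field; a rough sketch only, with the threshold and log file name being arbitrary:

    # list API requests that took more than 60 seconds in a devstack service log
    awk -F'time: ' '/requestlog/ && $2+0 > 60' screen-n-api.log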
> http://status.openstack.org/elastic-recheck/index.html#1808010 (most hits) > http://status.openstack.org/elastic-recheck/index.html#1463631 > http://status.openstack.org/elastic-recheck/index.html#1849857 > http://status.openstack.org/elastic-recheck/index.html#1737039 > http://status.openstack.org/elastic-recheck/index.html#1840355 > http://status.openstack.org/elastic-recheck/index.html#1843610 > > A few notes here. > > a) We're still using CirrOS 0.4.0 since Stein: > > https://review.opendev.org/#/c/521825/ > > And that image was published nearly 2 years ago and there are no newer > versions on the CirrOS download site so we can't try a newer image to see if > that fixes things. > > b) Some of the issues above are related to running out of disk in the guest. > I'm not sure what is causing that, but I have posted a devstack patch that > is related: > > https://review.opendev.org/#/c/690991 > > tl;dr before Stein the tempest flavors we used had disk=0 so nova would fit > the root disk to the size of the image. Since Stein the tempest flavors > specify root disk size (1GiB for the CirrOS images). My patch pads an extra > 1GiB to the root disk on the tempest flavors. One side effect of that is the > volumes tempest creates will go from 1GiB to 2GiB which could be a problem > if a lot of tempest volume tests run at the same time, though we do have a > volume group size of 24GB in gate runs so I think we're OK for now. I'm not > sure my patch would help, but it's an idea. > > As for the other key generation and dhcp lease failures, I don't know what > to do about those. We need more eyes on these issues to generate some ideas > or see if we're doing something wrong in our tests, e.g. generating too much > data for the config drive? Not using config drive in some cases? Metadata > API server is too slow (note we cache the metadata since [2])? Other ideas > on injecting logs for debug? > > [1] https://review.opendev.org/#/q/topic:bug/1844929+status:open > [2] https://review.opendev.org/#/q/I9082be077b59acd3a39910fa64e29147cb5c2dd7 > > -- > > Thanks, > > Matt > -- Slawek Kaplonski Senior software engineer Red Hat From johnsomor at gmail.com Thu Oct 31 21:42:46 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Thu, 31 Oct 2019 14:42:46 -0700 Subject: [openstack-dev][kuryr] Error in creating LB In-Reply-To: <2ea50863deddb3bd158ccab869536c3c4f9693d5.camel@redhat.com> References: <1979762272.3801941.1572509785865.ref@mail.yahoo.com> <1979762272.3801941.1572509785865@mail.yahoo.com> <2ea50863deddb3bd158ccab869536c3c4f9693d5.camel@redhat.com> Message-ID: Yeah, this likely means something is wrong with your nova setup. Either it is too slow to boot a VM or there is some other error. Try looking for the "amphora" instances in nova (openstack server list) then do a show on them (openstack server show ). There is an error field from nova that may contain the error. Michael On Thu, Oct 31, 2019 at 1:46 AM Michał Dulko wrote: > > On Thu, 2019-10-31 at 08:16 +0000, VeeraReddy wrote: > > Hi, > > > > I am trying to install openstack & Kubernetes using devstack > > > > https://docs.openstack.org/kuryr-kubernetes/latest/installation/devstack/basic.html > > > > Error Log : http://paste.openstack.org/show/785670/ > > Local.conf : Paste #785671 | LodgeIt! > > This error happens when Octavia is unable to create a load balancer in > 5 minutes. Seems like your LB is still PENDING_CREATE, so this seems to > be just unusually slow Octavia. This might happen if e.g. 
your host has > no nested virtualization enabled. > > Try increasing KURYR_WAIT_TIMEOUT in local.conf. In the gate we use up > to 20 minutes (value of 1200). > > > Regards, > > Veera. > > > > > From mriedemos at gmail.com Thu Oct 31 21:52:39 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 31 Oct 2019 16:52:39 -0500 Subject: State of the Gate (placement?) In-Reply-To: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> References: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> Message-ID: On 10/31/2019 4:15 PM, Slawek Kaplonski wrote: >> 1.http://status.openstack.org/elastic-recheck/index.html#1763070 >> >> This has resurfaced and I'm not sure why, nor do I think we ever had a great >> handle on what is causing this or how to work around it so if anyone has new >> ideas please chip in. > I think that this is "just" some slowdown of node on which job is running. I > noticed it too in some neutron jobs and I checked some. It seems that one API > request is processed for very long time. For example in one of fresh examples: > https://13cf3dd11b8f009809dc-97cb3b32849366f5bed744685e46b266.ssl.cf5.rackcdn.com/692206/3/check/tempest-integrated-compute/35ecb4a/job-output.txt > it was request to nova which caused very long time: > > Oct 31 16:55:08.632162 ubuntu-bionic-inap-mtl01-0012620879devstack at n-api.service[7191]: INFO nova.api.openstack.requestlog [None req-275af2df-bd4e-4e64-b46e-6582e8de5148 tempest-ServerDiskConfigTestJSON-1598674508 tempest-ServerDiskConfigTestJSON-1598674508] 198.72.124.104 "POST /compute/v2.1/servers/d15d2033-b29b-44f7-b619-ed7ef83fe477/action" status: 500 len: 216 microversion: 2.1 time: 161.951140 > > That's really interesting, I hadn't noticed the MessagingTimeout in n-api linked to this. I dug into that one and it's really odd, but there is one known and one unknown issue here. The known issue is resize/cold migrate is synchronous through API, conductor and scheduler while picking a host and then we finally return to the API caller when we RPC cast to the selected destination host. If we RPC cast from API to conductor we'd avoid that MessagingTimeout issue. The unknown weirdness that made this fail is it took ~3 minutes to move allocations from the instance to the migration record. We hit conductor about here: Oct 31 16:52:24.300732 ubuntu-bionic-inap-mtl01-0012620879 nova-conductor[14891]: DEBUG nova.conductor.tasks.migrate [None req-275af2df-bd4e-4e64-b46e-6582e8de5148 tempest-ServerDiskConfigTestJSON-1598674508 tempest-ServerDiskConfigTestJSON-1598674508] [instance: d15d2033-b29b-44f7-b619-ed7ef83fe477] Requesting cell 374faad6-7a6f-40c7-b2f4-9138e0dde5d5(cell1) while migrating {{(pid=16132) _restrict_request_spec_to_cell /opt/stack/nova/nova/conductor/tasks/migrate.py:172}} And then finally get the message that we moved allocations prior to scheduling: Oct 31 16:55:08.634009 ubuntu-bionic-inap-mtl01-0012620879 nova-conductor[14891]: DEBUG nova.conductor.tasks.migrate [None req-275af2df-bd4e-4e64-b46e-6582e8de5148 tempest-ServerDiskConfigTestJSON-1598674508 tempest-ServerDiskConfigTestJSON-1598674508] Created allocations for migration 86ccf6f2-a2b0-4ae0-9c31-8bbdac4ceda9 on eccff5f8-3175-4170-b87e-cb556865fde0 {{(pid=16132) replace_allocation_with_migration /opt/stack/nova/nova/conductor/tasks/migrate.py:86}} After that we call the scheduler to find a host and that only takes about 1 second. That POST /allocations call to placement shouldn't be taking around 3 minutes so something crazy is going on there. 
If I'm reading the placement log correctly, that POST starts here: Oct 31 16:52:24.721346 ubuntu-bionic-inap-mtl01-0012620879 devstack at placement-api.service[8591]: DEBUG placement.requestlog [req-275af2df-bd4e-4e64-b46e-6582e8de5148 req-295f7350-2f0b-4f85-8cdf-d76801637221 None None] Starting request: 198.72.124.104 "POST /placement/allocations" {{(pid=8593) __call__ /opt/stack/placement/placement/requestlog.py:61}} And returns here: Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 devstack at placement-api.service[8591]: DEBUG placement.handlers.allocation [req-275af2df-bd4e-4e64-b46e-6582e8de5148 req-295f7350-2f0b-4f85-8cdf-d76801637221 service placement] Successfully wrote allocations [, , , , , ] {{(pid=8593) _create_allocations /opt/stack/placement/placement/handlers/allocation.py:524}} Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 devstack at placement-api.service[8591]: INFO placement.requestlog [req-275af2df-bd4e-4e64-b46e-6582e8de5148 req-295f7350-2f0b-4f85-8cdf-d76801637221 service placement] 198.72.124.104 "POST /placement/allocations" status: 204 len: 0 microversion: 1.28 Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 devstack at placement-api.service[8591]: [pid: 8593|app: 0|req: 304/617] 198.72.124.104 () {72 vars in 1478 bytes} [Thu Oct 31 16:52:24 2019] POST /placement/allocations => generated 0 bytes in 163883 msecs (HTTP/1.1 204) 5 headers in 199 bytes (1 switches on core 0) I know Chris Dent has done a lot of profiling on placement recently but I'm not sure if much has been done around profiling the POST /allocations call to move allocations from one consumer to another. That also looks like some kind of anomaly because looking at that log at other POST /allocations calls, many take only 50-70 milliseconds. I wonder if there is a potential lock somewhere in placement slowing things down in that path. -- Thanks, Matt From mriedemos at gmail.com Thu Oct 31 22:03:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 31 Oct 2019 17:03:17 -0500 Subject: State of the Gate In-Reply-To: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> References: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> Message-ID: On 10/31/2019 4:15 PM, Slawek Kaplonski wrote: >> 3. CirrOS guest SSH issues >> >> There are several (some might be duplicates): >> >> http://status.openstack.org/elastic-recheck/index.html#1848078 > This one is I think the same as we have reported in > https://bugs.launchpad.net/neutron/+bug/1850557 > > Basically we noticed issues with dhcp after resize/migration/shelve of instance > but I didn't have time to investigate it yet. > Hmm, https://review.opendev.org/#/c/670591/ is new to devstack in Train and was backported to stable/stein. I wonder if that is too aggressive and is causing issues with operations where the guest is stopped and started, though for resize/migrate/shelve/unshelve the guest is destroyed on one host and re-spawned on another, so I would think that having a graceful shutdown for the guest wouldn't matter in those cases, unless it has to do with leaving the guest "dirty" somehow before transferring the root disk / creating a snapshot (in the case of shelve). Maybe we should bump that back up to 10 seconds? 
-- Thanks, Matt From skaplons at redhat.com Thu Oct 31 22:34:51 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 31 Oct 2019 23:34:51 +0100 Subject: [ptg][neutron] Onboarding during the PTG Message-ID: <20191031223451.yljtzrp2zulwpzoc@skaplons-mac> Hi all new (and existing) Neutrinos, During the PTG in Shanghai we are planning to organize onboarding session. It will take place on Wednesday in morning sessions. It is planned to be started at 9:00 am and finished just before the lunch on 12:30. See on [1] for details. All people who wants to learn about Neutron and contribution to it are welcome on this session. Also all existing team members are welcome to be there to show to new contributors and help them with onboarding process :) See You all in Shanghai! [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning -- Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Thu Oct 31 22:38:36 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 31 Oct 2019 23:38:36 +0100 Subject: State of the Gate (placement?) In-Reply-To: References: <20191031211535.vk7rtiq3pvsb6j2t@skaplons-mac> Message-ID: <20191031223836.pnq62mu25j64tohh@skaplons-mac> On Thu, Oct 31, 2019 at 04:52:39PM -0500, Matt Riedemann wrote: > On 10/31/2019 4:15 PM, Slawek Kaplonski wrote: > > > 1.http://status.openstack.org/elastic-recheck/index.html#1763070 > > > > > > This has resurfaced and I'm not sure why, nor do I think we ever had a great > > > handle on what is causing this or how to work around it so if anyone has new > > > ideas please chip in. > > I think that this is "just" some slowdown of node on which job is running. I > > noticed it too in some neutron jobs and I checked some. It seems that one API > > request is processed for very long time. For example in one of fresh examples: > > https://13cf3dd11b8f009809dc-97cb3b32849366f5bed744685e46b266.ssl.cf5.rackcdn.com/692206/3/check/tempest-integrated-compute/35ecb4a/job-output.txt > > it was request to nova which caused very long time: > > > > Oct 31 16:55:08.632162 ubuntu-bionic-inap-mtl01-0012620879devstack at n-api.service[7191]: INFO nova.api.openstack.requestlog [None req-275af2df-bd4e-4e64-b46e-6582e8de5148 tempest-ServerDiskConfigTestJSON-1598674508 tempest-ServerDiskConfigTestJSON-1598674508] 198.72.124.104 "POST /compute/v2.1/servers/d15d2033-b29b-44f7-b619-ed7ef83fe477/action" status: 500 len: 216 microversion: 2.1 time: 161.951140 > > > > > > That's really interesting, I hadn't noticed the MessagingTimeout in n-api > linked to this. I dug into that one and it's really odd, but there is one > known and one unknown issue here. I point to this one only as an example. I saw similar timeouts on calls to neutron and it was e.g. POST /v2.0/networks which took about 90 seconds. In same test run all other similar POST calls took less than a second. That's why it's hard to really debug what happens there and I think it is "some" slowdown of VM on which job is running. I simply don't have any better explanation for now :/ > > The known issue is resize/cold migrate is synchronous through API, conductor > and scheduler while picking a host and then we finally return to the API > caller when we RPC cast to the selected destination host. If we RPC cast > from API to conductor we'd avoid that MessagingTimeout issue. > > The unknown weirdness that made this fail is it took ~3 minutes to move > allocations from the instance to the migration record. 
We hit conductor > about here: > > Oct 31 16:52:24.300732 ubuntu-bionic-inap-mtl01-0012620879 > nova-conductor[14891]: DEBUG nova.conductor.tasks.migrate [None > req-275af2df-bd4e-4e64-b46e-6582e8de5148 > tempest-ServerDiskConfigTestJSON-1598674508 > tempest-ServerDiskConfigTestJSON-1598674508] [instance: > d15d2033-b29b-44f7-b619-ed7ef83fe477] Requesting cell > 374faad6-7a6f-40c7-b2f4-9138e0dde5d5(cell1) while migrating {{(pid=16132) > _restrict_request_spec_to_cell > /opt/stack/nova/nova/conductor/tasks/migrate.py:172}} > > And then finally get the message that we moved allocations prior to > scheduling: > > Oct 31 16:55:08.634009 ubuntu-bionic-inap-mtl01-0012620879 > nova-conductor[14891]: DEBUG nova.conductor.tasks.migrate [None > req-275af2df-bd4e-4e64-b46e-6582e8de5148 > tempest-ServerDiskConfigTestJSON-1598674508 > tempest-ServerDiskConfigTestJSON-1598674508] Created allocations for > migration 86ccf6f2-a2b0-4ae0-9c31-8bbdac4ceda9 on > eccff5f8-3175-4170-b87e-cb556865fde0 {{(pid=16132) > replace_allocation_with_migration > /opt/stack/nova/nova/conductor/tasks/migrate.py:86}} > > After that we call the scheduler to find a host and that only takes about 1 > second. That POST /allocations call to placement shouldn't be taking around > 3 minutes so something crazy is going on there. > > If I'm reading the placement log correctly, that POST starts here: > > Oct 31 16:52:24.721346 ubuntu-bionic-inap-mtl01-0012620879 > devstack at placement-api.service[8591]: DEBUG placement.requestlog > [req-275af2df-bd4e-4e64-b46e-6582e8de5148 > req-295f7350-2f0b-4f85-8cdf-d76801637221 None None] Starting request: > 198.72.124.104 "POST /placement/allocations" {{(pid=8593) __call__ > /opt/stack/placement/placement/requestlog.py:61}} > > And returns here: > > Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 > devstack at placement-api.service[8591]: DEBUG placement.handlers.allocation > [req-275af2df-bd4e-4e64-b46e-6582e8de5148 > req-295f7350-2f0b-4f85-8cdf-d76801637221 service placement] Successfully > wrote allocations [ 0x7f023c9bcfd0>, 0x7f023c9bc6d8>, 0x7f023c9bcc18>, 0x7f023c9eb7f0>, 0x7f023c9ebdd8>, 0x7f023c9eb630>] {{(pid=8593) _create_allocations > /opt/stack/placement/placement/handlers/allocation.py:524}} > Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 > devstack at placement-api.service[8591]: INFO placement.requestlog > [req-275af2df-bd4e-4e64-b46e-6582e8de5148 > req-295f7350-2f0b-4f85-8cdf-d76801637221 service placement] 198.72.124.104 > "POST /placement/allocations" status: 204 len: 0 microversion: 1.28 > Oct 31 16:55:08.629708 ubuntu-bionic-inap-mtl01-0012620879 > devstack at placement-api.service[8591]: [pid: 8593|app: 0|req: 304/617] > 198.72.124.104 () {72 vars in 1478 bytes} [Thu Oct 31 16:52:24 2019] POST > /placement/allocations => generated 0 bytes in 163883 msecs (HTTP/1.1 204) 5 > headers in 199 bytes (1 switches on core 0) > > I know Chris Dent has done a lot of profiling on placement recently but I'm > not sure if much has been done around profiling the POST /allocations call > to move allocations from one consumer to another. > > That also looks like some kind of anomaly because looking at that log at > other POST /allocations calls, many take only 50-70 milliseconds. I wonder > if there is a potential lock somewhere in placement slowing things down in > that path. > > -- > > Thanks, > > Matt > -- Slawek Kaplonski Senior software engineer Red Hat