From melwittt at gmail.com Tue Oct 1 03:14:31 2019 From: melwittt at gmail.com (melanie witt) Date: Mon, 30 Sep 2019 20:14:31 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> On 9/30/19 12:08 PM, Matt Riedemann wrote: > On 9/30/2019 12:27 PM, Dan Smith wrote: >>> 2. Do console proxies need to live in the cells? This is what devstack >>> does in superconductor mode. I did some digging through nova code, and >>> it looks that way. Testing with novncproxy agrees. This suggests we >>> need to expose a unique proxy endpoint for each cell, and configure >>> all computes to use the right one via e.g. novncproxy_base_url, >>> correct? >> I'll punt this to Melanie, as she's the console expert at this point, >> but I imagine you're right. >> > > Based on the Rocky spec [1] which says: > > "instead we will resolve the cell database issue by running console > proxies per cell instead of global to a deployment, such that the cell > database is local to the console proxy" > > Yes it's per-cell. There was stuff in the Rock release notes about this > [2] and a lot of confusion around the deprecation of the > nova-consoleauth service for which Mel knows the details, but it looks > like we really should have something documented about this too, here [3] > and/or here [4]. To echo, yes, console proxies need to run per cell. This used to be mentioned in our docs and I looked and found it got removed by the following commit: https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f so, we just need to add back the bit about running console proxies per cell. -melanie > [1] > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > > [2] https://docs.openstack.org/releasenotes/nova/rocky.html > [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html > [4] https://docs.openstack.org/nova/latest/admin/remote-console-access.html > From balazs.gibizer at est.tech Tue Oct 1 07:30:57 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 07:30:57 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <1569915055.26355.1@smtp.office365.com> On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried wrote: > Nova developers and maintainers- > > Every cycle we approve some number of blueprints and then complete a > low > percentage [1] of them. Which blueprints go unfinished seems to be > completely random (notably, it appears to have nothing to do with our > declared cycle priorities). This is especially frustrating for > consumers > of a feature, who (understandably) interpret blueprint/spec approval > as > a signal that they can reasonably expect the feature to land [2]. > > The cause for non-completion usually seems to fall into one of several > broad categories: > > == Inadequate *developer* attention == > - There's not much to be done about the subset of these where the > contributor actually walks away. > > - The real problem is where the developer thinks they're ready for > reviewers to look, but reviewers don't. Even things that seem obvious > to > experienced reviewers, like failing CI or "WIP" in the commit title, > will cause patches to be completely ignored -- but unseasoned > contributors don't necessarily understand even that, let alone more > subtle issues. 
Consequently, patches will languish, with each side > expecting the other to take the next action. This is a problem of > culture: contributors don't understand nova reviewer procedures and > psychology. > > == Inadequate *reviewer* attention == > - Upstream maintainer time is limited. > > - We always seem to have low review activity until the last two or > three > weeks before feature freeze, when there's a frantic uptick and lots > gets > done. > > - But there's a cultural rift here as well. Getting maintainers to > care > about a blueprint is hard if they don't already have a stake in it. > The > "squeaky wheel" concept is not well understood by unseasoned > contributors. The best way to get reviews is to lurk in IRC and beg. > Aside from not being intuitive, this can also be difficult > logistically > (time zone pain, knowing which nicks to ping and how) as well as > interpersonally (how much begging is enough? too much? when is it > appropriate?). When I joined I was taught that instead of begging go and review open patches which a) helps the review load of dev team b) makes you known in the community. Both helps getting reviews on your patches. Does it always work? No. Do I like begging for review? No. Do I like to get repatedly pinged to review? No. So I would suggest not to declare that the only way to get review is to go and beg. > > == Multi-release efforts that we knew were going to be multi-release > == > These may often drag on far longer than they perhaps should, but I'm > not > going to try to address that here. > > ======== > > There's nothing new or surprising about the above. We've tried to > address these issues in various ways in the past, with varying degrees > of effectiveness. > > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's > limit > Ussuri to 25 blueprints [4]. If this turns out to be too few, what's > the > worst thing that happens? We finish everything, early, and wish we had > done more. If that happens, drinks are on me, and we can bump the > number > for V. I support the ide that we limit our scope. But it is pretty hard to select which 25 (or whathever amount we agree on) bp we approve out of possible ~50ish. What will be the method of selection? > > (B) Require a core to commit to "caring about" a spec before we > approve > it. The point of this "core liaison" is to act as a mentor to mitigate > the cultural issues noted above [5], and to be a first point of > contact > for reviews. I've proposed this to the spec template here [6]. I proposed this before and I still think this could help. And partially answer my question above, this could be one of the way to limit the approved bps. If each core only commits to "care about" the implementation of 2 bps, then we already have a limit for the number of approved bps. Cheers, gibi > > Thoughts? > > efried > > [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware > of > a good way to go back and mine actual data. > [2] Stuff happens, sure, and nobody expects 100%, but 60%? Come on, we > have to be able to do better than that. > [3] https://blueprints.launchpad.net/nova/train > [4] Recognizing of course that not all blueprints are created equal, > this is more an attempt at a reasonable heuristic than an actual > expectation of total size/LOC/person-hours/etc. 
The theory being that > constraining to an actual number, whatever the number may be, is > better > than not constraining at all. > [5] If you're a core, you can be your own liaison, because presumably > you don't need further cultural indoctrination or help begging for > reviews. > [6] https://review.opendev.org/685857 > From amotoki at gmail.com Tue Oct 1 07:33:24 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Tue, 1 Oct 2019 16:33:24 +0900 Subject: [all][PTG] Strawman Schedule In-Reply-To: <20190930214215.GQ10891@t440s> References: <20190929101340.GM10891@t440s> <20190930214215.GQ10891@t440s> Message-ID: Thanks Slawek for taking care of this. I am really fine with your plan discussed in the team meeting. Akihiro On Tue, Oct 1, 2019 at 6:44 AM Slawek Kaplonski wrote: > > Hi, > > After today's discussion on Neutron's meeting I think that we will "keep" full 3 > days for Neutron. I will than try to plan Neutron's sessions in such way that on > Wednesday morning we will have things which Akihiro will not be interested much. > We will sync about it later. > > On Sun, Sep 29, 2019 at 12:13:40PM +0200, Slawek Kaplonski wrote: > > Hi, > > > > I think that 2.5 days for Neutron should be fine too so we can start on > > Wednesday after the lunch. > > Or, if we should do onboarding session during PTG (I heard something like that > > but I'm not actually sure that it's true), maybe we can do it on > > Wednesday morning and than start PTG discussions after lunch when You will be > > ready Akihiro. > > What do You think about it? > > > > On Sat, Sep 28, 2019 at 02:56:24AM +0900, Akihiro Motoki wrote: > > > Hi Kendall, > > > > > > Looking at the updated version of the schedule, neutron has 2.5 days > > > but actually 3 days are assigned to neutron. > > > As horizon PTL and neutron core, hopefully neutron session starts from > > > Wednesday afternoon (and horizon has Wed morning session). > > > > > > In addition, I see "1.5 or 3.5" or "2 or 3.5" in several projects. I > > > guess they are the number of days assigned, but two numbers are very > > > different so I wonder what this means. > > > > > > Thanks, > > > Akihiro Motoki (irc: amotoki) > > > > > > On Sat, Sep 28, 2019 at 1:59 AM Kendall Nelson wrote: > > > > > > > > Hello Everyone! > > > > > > > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > > > > > > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > > > > > > > Please let me know if there are any conflicts! > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > >> > > > >> Hello Everyone! > > > >> > > > >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > > >> > > > >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > > >> > > > >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! 
> > > >> > > > >> -Kendall (diablo_rojo) > > > >> > > > >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > > > > > -- > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > -- > Slawek Kaplonski > Senior software engineer > Red Hat > > From balazs.gibizer at est.tech Tue Oct 1 08:19:45 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 08:19:45 +0000 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <1569857750.5848.0@smtp.office365.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> Message-ID: <1569917983.26355.2@smtp.office365.com> On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer wrote: > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > wrote: >> Hi, >> >> I've just proposed https://review.opendev.org/#/c/685724/ which >> reverts a change that recently went in to make the fake driver in >> oslo.messaging use jsonutils for message serialization instead of >> json.dumps. >> >> As explained in the commit message on the revert, this is >> problematic >> because the rabbit driver uses kombu's default serialization method, >> which is json.dumps. By changing the fake driver to use jsonutils >> we've made it more lenient than the most used real driver which >> opens >> us up to merging broken changes in consumers of oslo.messaging. >> >> We did have some discussion of whether we should try to override the >> kombu default and tell it to use jsonutils too, as a number of other >> drivers do. The concern with this was that the jsonutils handler for >> things like datetime objects is not tz-aware, which means if you >> send >> a datetime object over RPC and don't explicitly handle it you could >> lose important information. >> >> I'm open to being persuaded otherwise, but at the moment I'm leaning >> toward less magic happening at the RPC layer and requiring projects >> to explicitly handle types that aren't serializable by the standard >> library json module. If you have a different preference, please >> share >> it here. > > Hi, > > I might me totally wrong here and please help me understand how the > RabbitDriver works. What I did when I created the original patch that > I > looked at each drivers how they handle sending messages. The > oslo_messaging._drivers.base.BaseDriver defines the interface with a > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > implements the BaseDriver interface's send() method to call _send(). > Then _send() calls rpc_commom.serialize_msg which then calls > jsonutils.dumps. > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > from AMQPDriverBase and does not override send() or _send() so I think > the AMQPDriverBase ._send() is called that therefore jsonutils is used > during sending a message with RabbitDriver. I did some tracing in devstack to prove my point. See the result in https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 Cheers, gibi > > Cheers, > gibi > > > [1] > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > >> >> Thanks. >> >> -Ben >> > > From mark at stackhpc.com Tue Oct 1 10:00:49 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 1 Oct 2019 11:00:49 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: Thanks all for your responses. Replies to Dan inline. 
On Mon, 30 Sep 2019 at 18:27, Dan Smith wrote: > > > 1. Is there any benefit to not having a superconductor? Presumably > > it's a little more efficient in the single cell case? Also IIUC it > > only requires a single message queue so is a little simpler? > > In a multi-cell case you need it, but you're asking about the case where > there's only one (real) cell yeah? > > If the deployment is really small, then the overhead of having one is > probably measurable and undesirable. I dunno what to tell you about > where that cut-off is, unfortunately. However, once you're over a > certain number of nodes, that probably shakes out a bit. The > superconductor does things that the cell-specific ones won't have to do, > so there's about the same amount of total load, just a potentially > larger memory footprint for running extra services, which would be > measurable at small scales. For a tiny deployment there's also overhead > just in the complexity, but one of the goals of v2 has always been to > get everyone on the same architecture, so having a "small mode" and a > "large mode" brings with it its own complexity. Thanks for the explanation. We've built in a switch for single or super mode, and single mode keeps us compatible with existing deployments, so I guess we'll keep the switch. > > > 2. Do console proxies need to live in the cells? This is what devstack > > does in superconductor mode. I did some digging through nova code, and > > it looks that way. Testing with novncproxy agrees. This suggests we > > need to expose a unique proxy endpoint for each cell, and configure > > all computes to use the right one via e.g. novncproxy_base_url, > > correct? > > I'll punt this to Melanie, as she's the console expert at this point, > but I imagine you're right. > > > 3. Should I upgrade the superconductor or conductor service first? > > Superconductor first, although they all kinda have to go around the same > time. Superconductor, like the regular conductors, needs to look at the > cell database directly, so if you were to upgrade superconductor before > the cell database you'd likely have issues. I think probably the ideal > would be to upgrade the db schema everywhere (which you can do without > rolling code), then upgrade the top-level services (conductor, > scheduler, api) and then you could probably get away with doing > conductor in the cell along with computes, or whatever. If possible > rolling the cell conductors with the top-level services would be ideal. I should have included my strawman deploy and upgrade flow for context, but I'm still honing it. All DB schema changes will be done up front in both cases. In terms of ordering, the API-level services (superconductor, API scheduler) are grouped together and will be rolled first - agreeing with what you've said. I think between Ansible's tags and limiting actions to specific hosts, the code can be written to support upgrading all cell conductors together, or at the same time as (well, immediately before) the cell's computes. The thinking behind upgrading one cell at a time is to limit the blast radius if something goes wrong. You suggest it would be better to roll all cell conductors at the same time though - do you think it's safer to run with the version disparity between conductor and computes rather than super- and cell- conductors? > > > 4. Does the cell conductor need access to the API DB? > > Technically it should not be allowed to talk to the API DB for > "separation of concerns" reasons. 
However, there are a couple of > features that still rely on the cell conductor being able to upcall to > the API database, such as the late affinity check. If you can only > choose one, then I'd say configure the cell conductors to talk to the > API DB, but if there's a knob for "isolate them" it'd be better. Knobs are easy to make, and difficult to keep working in all positions :) It seems worthwhile in this case. > > > 5. What DB configuration should be used in nova.conf when running > > online data migrations? I can see some migrations that seem to need > > the API DB, and others that need a cell DB. If I just give it the API > > DB, will it use the cell mappings to get to each cell DB, or do I need > > to run it once for each cell? > > The API DB has its own set of migrations, so you obviously need API DB > connection info to make that happen. There is no fanout to all the rest > of the cells (currently), so you need to run it with a conf file > pointing to the cell, for each cell you have. The latest attempt > at making this fan out was abanoned in July with no explanation, so it > dropped off my radar at least. That makes sense. The rolling upgrade docs could be a little clearer for multi-cell deployments here. > > > 6. After an upgrade, when can we restart services to unpin the compute > > RPC version? Looking at the compute RPC API, it looks like the super > > conductor will remain pinned until all computes have been upgraded. > > For a cell conductor, it looks like I could restart it to unpin after > > upgrading all computes in that cell, correct? > > Yeah. > > > 7. Which services require policy.{yml,json}? I can see policy > > referenced in API, conductor and compute. > > That's a good question. I would have thought it was just API, so maybe > someone else can chime in here, although it's not specific to cells. Yeah, unrelated to cells, just something I wondered while digging through our nova Ansible role. Here is the line that made me think policies are required in conductors: https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. I guess this is only required for cell conductors though? > > --Dan From dtantsur at redhat.com Tue Oct 1 10:05:52 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 1 Oct 2019 12:05:52 +0200 Subject: Release Cycle Observations In-Reply-To: <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand wrote: > On 9/26/19 9:51 PM, Sean McGinnis wrote: > >> I know we'd like to have everyone CD'ing master > > > > Watch who you're lumping in with the "we" statement. ;) > > You've pinpointed what the problem is. > > Everyone but OpenStack upstream would like to stop having to upgrade > every 6 months. Yep, but the same "everyone" want to have features now or better yesterday, not in 2-3 years ;) > The only way this could be resolved would be an > OpenStack LTS release let's say every 2 years, and allowing upgrade > between them, though that's probably too much effort upstream. We have > different groups wishing for the opposite thing to happen. > > I don't see this problem going away anytime soon. > > Cheers, > > Thomas Goirand (zigo) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pierre at stackhpc.com Tue Oct 1 10:53:07 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Tue, 1 Oct 2019 12:53:07 +0200 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hi Kendall, Friday works for all who have replied so far, but I am still expecting answers from two people. Is there a room available for our Project Onboarding session that day? Probably in the morning, though I will confirm depending on availability of participants. We've never run one, so I don't know how many people to expect. Thanks, Pierre On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: > > Hi Pierre, > > Apologies for the oversight on Blazar. Would all day Friday work for your team? > > Thanks, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: > > Hi Kendall, > > I couldn't see Blazar anywhere on the schedule. We had requested time > for a Project Onboarding session. > > Additionally, there are more people travelling than initially planned, > so we may want to allocate a half day for technical discussions as > well (probably in the shared space, since we don't expect a huge > turnout). > > Would it be possible to update the schedule accordingly? > > Thanks, > Pierre > > On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: > > > Hello Everyone! > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > Please let me know if there are any conflicts! > > -Kendall (diablo_rojo) > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > From thierry at openstack.org Tue Oct 1 11:18:29 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 1 Oct 2019 13:18:29 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <0464fd02-d393-5cc1-f03d-c2638d8fdd1f@openstack.org> Eric Fried wrote: > [...] > There's nothing new or surprising about the above. We've tried to > address these issues in various ways in the past, with varying degrees > of effectiveness. > > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's limit > Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the > worst thing that happens? We finish everything, early, and wish we had > done more. If that happens, drinks are on me, and we can bump the number > for V. > > (B) Require a core to commit to "caring about" a spec before we approve > it. 
The point of this "core liaison" is to act as a mentor to mitigate > the cultural issues noted above [5], and to be a first point of contact > for reviews. I've proposed this to the spec template here [6]. > > Thoughts? Setting expectations more reasonably is key to grow a healthy long-term environment, so I completely support your efforts here. However I suspect there will always be blueprints that fail to be completed. If it were purely a question of reviewer resources, then I agree that capping the number of blueprints to the reviewer team's throughput is the right approach. But as you noted, incomplete blueprints come from a few different reasons, sometimes not related to reviewers efforts at all. So if out of 50 blueprints, say 5 are incomplete due to lack of reviewers attention, 5 due to lack of developer attention, and 15 fail due to reviewers also being developers and having to make a hard choice... Targeting 30-35 might be better (expecting 5-10 of them to fail anyway, and not due to constrained resources). The other comment I have is that I suspect all blueprints do not have the same weight, so assigning them complexity points could help avoid under/overshooting. -- Thierry Carrez (ttx) From jean-philippe at evrard.me Tue Oct 1 12:29:07 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 01 Oct 2019 14:29:07 +0200 Subject: [tc] Weekly update Message-ID: Hello friends, Here's what need attention for the OpenStack TC this week: 1. We should ensure we have two TC members focusing on next cycle goal selection process. Only Ghanshyam is dealing with this, and we must help him on the way! Any volunteers? Thanks again gmann for working on that. 2. Jimmy McArthur sent us the results of the OpenStack User survey on the ML [1]. We currently haven't analyzed the information yet. Any volunteer to analyse the information (in order to extract action items) is welcomed. It would be great if we could discuss this at our next official meeting, or at least discuss the next steps. 3. Our next meeting date will be the Thursday 10 October. I will be travelling that day, so it would be nice to have a volunteer to host the meeting. For that, our next meeting agenda needs clarifications. It would be great if you could update the agenda (please also write if your absent) on the wiki [2], so that I can send the invite to the ML. I will send the invite on Thursday. 4. We still haven't finished the conversationg about naming releases. There are a few new ideas floated around, so we should maybe drop the current process to take count of the newly proposed ideas (The large cities lists proposed by Nate, the movie quotes proposed by Thierry [9])? Alternatively, if we can't find consensus, should we just entrust the release naming process to the release team? 5. We should decide to deprecate or not the PowerVMStackers team [3] and move it as a SIG. The votes don't reflect this. Thank you everyone! 
[1]: http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [2]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [3]: https://review.opendev.org/680438 [4]: https://review.opendev.org/680985 [5]: https://review.opendev.org/681260 [6]: https://review.opendev.org/681480 [7]: https://review.opendev.org/681924 [8]: https://review.opendev.org/682380 [9]: https://review.opendev.org/684688 From tpb at dyncloud.net Tue Oct 1 12:38:50 2019 From: tpb at dyncloud.net (Tom Barron) Date: Tue, 1 Oct 2019 08:38:50 -0400 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <1569915055.26355.1@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> Message-ID: <20191001123850.f7h4wmupoo3oyzta@barron.net> On 01/10/19 07:30 +0000, Balázs Gibizer wrote: > > >On Tue, Oct 1, 2019 at 1:09 AM, Eric Fried wrote: >> Nova developers and maintainers- >> >> Every cycle we approve some number of blueprints and then complete a >> low >> percentage [1] of them. Which blueprints go unfinished seems to be >> completely random (notably, it appears to have nothing to do with our >> declared cycle priorities). This is especially frustrating for >> consumers >> of a feature, who (understandably) interpret blueprint/spec approval >> as >> a signal that they can reasonably expect the feature to land [2]. >> >> The cause for non-completion usually seems to fall into one of several >> broad categories: >> >> == Inadequate *developer* attention == >> - There's not much to be done about the subset of these where the >> contributor actually walks away. >> >> - The real problem is where the developer thinks they're ready for >> reviewers to look, but reviewers don't. Even things that seem obvious >> to >> experienced reviewers, like failing CI or "WIP" in the commit title, >> will cause patches to be completely ignored -- but unseasoned >> contributors don't necessarily understand even that, let alone more >> subtle issues. Consequently, patches will languish, with each side >> expecting the other to take the next action. This is a problem of >> culture: contributors don't understand nova reviewer procedures and >> psychology. >> >> == Inadequate *reviewer* attention == >> - Upstream maintainer time is limited. >> >> - We always seem to have low review activity until the last two or >> three >> weeks before feature freeze, when there's a frantic uptick and lots >> gets >> done. >> >> - But there's a cultural rift here as well. Getting maintainers to >> care >> about a blueprint is hard if they don't already have a stake in it. >> The >> "squeaky wheel" concept is not well understood by unseasoned >> contributors. The best way to get reviews is to lurk in IRC and beg. >> Aside from not being intuitive, this can also be difficult >> logistically >> (time zone pain, knowing which nicks to ping and how) as well as >> interpersonally (how much begging is enough? too much? when is it >> appropriate?). > >When I joined I was taught that instead of begging go and review open >patches which a) helps the review load of dev team b) makes you known >in the community. Both helps getting reviews on your patches. Does it >always work? No. Do I like begging for review? No. Do I like to get >repatedly pinged to review? No. So I would suggest not to declare that >the only way to get review is to go and beg. 
+1 In projects I have worked on there is no need to encourage extra begging and squeaky wheel prioritization has IMO not been a healthy thing. There is no better way to get ones reviews stalled than to beg for reviews with patches that are not close to ready for review and at the same time contribute no useful reviews oneself. There is nothing wrong with pinging to get attention to a review if it is ready and languishing, or if it solves an urgent issue, but even in these cases a ping from someone who doesn't "cry wolf" and who has built a reputation as a contributor carries more weight. > >> >> == Multi-release efforts that we knew were going to be multi-release >> == >> These may often drag on far longer than they perhaps should, but I'm >> not >> going to try to address that here. >> >> ======== >> >> There's nothing new or surprising about the above. We've tried to >> address these issues in various ways in the past, with varying degrees >> of effectiveness. >> >> I'd like to try a couple more. >> >> (A) Constrain scope, drastically. We marked 25 blueprints complete in >> Train [3]. Since there has been no change to the core team, let's >> limit >> Ussuri to 25 blueprints [4]. If this turns out to be too few, what's >> the >> worst thing that happens? We finish everything, early, and wish we had >> done more. If that happens, drinks are on me, and we can bump the >> number >> for V. > >I support the ide that we limit our scope. But it is pretty hard to >select which 25 (or whathever amount we agree on) bp we approve out of >possible ~50ish. What will be the method of selection? > >> >> (B) Require a core to commit to "caring about" a spec before we >> approve >> it. The point of this "core liaison" is to act as a mentor to mitigate >> the cultural issues noted above [5], and to be a first point of >> contact >> for reviews. I've proposed this to the spec template here [6]. > >I proposed this before and I still think this could help. And partially >answer my question above, this could be one of the way to limit the >approved bps. If each core only commits to "care about" the >implementation of 2 bps, then we already have a limit for the number of >approved bps. > >Cheers, >gibi > >> >> Thoughts? >> >> efried >> >> [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware >> of >> a good way to go back and mine actual data. >> [2] Stuff happens, sure, and nobody expects 100%, but 60%? Come on, we >> have to be able to do better than that. >> [3] https://blueprints.launchpad.net/nova/train >> [4] Recognizing of course that not all blueprints are created equal, >> this is more an attempt at a reasonable heuristic than an actual >> expectation of total size/LOC/person-hours/etc. The theory being that >> constraining to an actual number, whatever the number may be, is >> better >> than not constraining at all. >> [5] If you're a core, you can be your own liaison, because presumably >> you don't need further cultural indoctrination or help begging for >> reviews. >> [6] https://review.opendev.org/685857 >> > > From a.settle at outlook.com Tue Oct 1 12:49:03 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Tue, 1 Oct 2019 12:49:03 +0000 Subject: [all] [tc] [ptls] PDF Goal Aftermath Message-ID: Hi all, Thanks to all those who worked super hard to achieve the PDF Enablement goal for Train. Things are looking great, and I've already received feedback from downstream about how useful the PDFs are! So that's fantastic. 
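For anyone who wants to sanity-check their own project's output locally, the goal work wired PDF builds up as a tox environment, so, assuming your repository has adopted the common pdf-docs target from the goal, the build should just be:

    tox -e pdf-docs    # the generated PDF typically ends up under doc/build/pdf/

If that target doesn't exist in your tox.ini yet, that is worth flagging as well.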
For the sake of the goal, the intention was to have potentially imperfect PDFs and to ensure they're building. In some cases, this does leave room for improvement. In openstack-doc, we've been approached a few times regarding moving forward when there are issues with no known workarounds. So, asking all those that have been working on/building their PDFs and are experiencing issues that can be fixed post-mortem to ensure everything is documented in our Common Problems etherpad [1]. Ensuring everything is documented, means we can work together to identify appropriate workarounds or fixes in the future. We will review all items without a workaround in the short term. Cheers, Alex [1] https://etherpad.openstack.org/p/pdf-goal-train-common-problems -- Alexandra Settle IRC: asettle From fungi at yuggoth.org Tue Oct 1 13:00:36 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 1 Oct 2019 13:00:36 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191001123850.f7h4wmupoo3oyzta@barron.net> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> Message-ID: <20191001130035.hm2alc63eab4cpek@yuggoth.org> On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: [...] > In projects I have worked on there is no need to encourage extra > begging and squeaky wheel prioritization has IMO not been a > healthy thing. > > There is no better way to get ones reviews stalled than to beg for > reviews with patches that are not close to ready for review and at > the same time contribute no useful reviews oneself. > > There is nothing wrong with pinging to get attention to a review > if it is ready and languishing, or if it solves an urgent issue, > but even in these cases a ping from someone who doesn't "cry wolf" > and who has built a reputation as a contributor carries more > weight. [...] Agreed, it drives back to Eric's comment about familiarity with the team's reviewer culture. Just saying "hey I pushed these patches can someone look" is often far less effective for a newcomer than "I reported a bug in subsystem X which is really aggravating and review 654321 fixes it if anyone has a moment to look" or "tbarron: I addressed your comments on review 654321 when you get a chance to revisit your -1, thanks!" My cardinal rules of begging: Don't mention the nicks of random people who have not been involved in the change unless you happen to actually know it's one they'll personally be interested in. Provide as much context as possible (within reason) to attract the actual interest of potential reviewers. Be polite, thank people, and don't assume your change is important to anyone nor that there's someone who has time to look at it. And most important, as you noted too, if you're waiting around then take a few minutes and go review something to pass the time! ;) -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dms at danplanet.com Tue Oct 1 13:37:42 2019 From: dms at danplanet.com (Dan Smith) Date: Tue, 01 Oct 2019 06:37:42 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: (Mark Goddard's message of "Tue, 1 Oct 2019 11:00:49 +0100") References: Message-ID: Mark Goddard writes: > The thinking behind upgrading one cell at a time is to limit the blast > radius if something goes wrong. 
You suggest it would be better to roll > all cell conductors at the same time though - do you think it's safer > to run with the version disparity between conductor and computes > rather than super- and cell- conductors? Yes, the conductors and computes are built to work at different versions. Conductors, not so much. While you can pin the conductor RPC version to *technically* make them talk, they will do things like migrate data to new formats in the cell databases and since they *are* the insulation layer against such changes, older conductors are going to be unhappy if new conductors move data underneath them before they're ready. > Here is the line that made me think policies are required in > conductors: > https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? No, actually more likely to be the superconductors I think. However, it could technically be called at the cell level so you probably need to make sure it's there. That might be something left-over from a check that moved to the API and could now be removed (or ignored). --Dan From jungleboyj at gmail.com Tue Oct 1 13:52:56 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 1 Oct 2019 08:52:56 -0500 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: On 10/1/2019 7:29 AM, Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week: > > 1. We should ensure we have two TC members focusing on next cycle goal > selection process. Only Ghanshyam is dealing with this, and we must > help him on the way! Any volunteers? Thanks again gmann for working on > that. > > 2. Jimmy McArthur sent us the results of the OpenStack User survey on > the ML [1]. We currently haven't analyzed the information yet. > Any volunteer to analyse the information (in order to extract action > items) is welcomed. It would be great if we could discuss this at our > next official meeting, or at least discuss the next steps. JP, I generally go through this for the Cinder team.  Since I will be in there I can review the comments and create an overview/action items before our next meeting. Jay > 3. Our next meeting date will be the Thursday 10 October. I will be > travelling that day, so it would be nice to have a volunteer to host > the meeting. For that, our next meeting agenda needs clarifications. > It would be great if you could update the agenda (please also write if > your absent) on the wiki [2], so that I can send the invite to the ML. > I will send the invite on Thursday. > > 4. We still haven't finished the conversationg about naming releases. > There are a few new ideas floated around, so we should maybe drop the > current process to take count of the newly proposed ideas (The large > cities lists proposed by Nate, the movie quotes proposed by Thierry > [9])? Alternatively, if we can't find consensus, should we just entrust > the release naming process to the release team? > > 5. We should decide to deprecate or not the PowerVMStackers team [3] > and move it as a SIG. The votes don't reflect this. > > Thank you everyone! 
> > [1]: > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [2]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [3]: https://review.opendev.org/680438 > [4]: https://review.opendev.org/680985 > [5]: https://review.opendev.org/681260 > [6]: https://review.opendev.org/681480 > [7]: https://review.opendev.org/681924 > [8]: https://review.opendev.org/682380 > [9]: https://review.opendev.org/684688 > > From openstack at fried.cc Tue Oct 1 14:15:22 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 09:15:22 -0500 Subject: [nova][ops] Removing Debug middleware Message-ID: <8659945a-91d3-2057-b089-92002508e188@fried.cc> Deployers- There's a Debug middleware in nova that's not used in the codebase since 2010, so we're removing it [1]. BUT Theoretically, deployments could be making use of it by injecting it into the paste pipeline (a ~3LOC edit to your local api-paste.ini). If this is you, and you're really relying on this behavior, let us know. We can either revert the change or show you how to carry it locally (which would be really easy to do). Thanks, efried [1] https://review.opendev.org/#/c/662506/ From no-reply at openstack.org Tue Oct 1 14:34:49 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Tue, 01 Oct 2019 14:34:49 -0000 Subject: networking-midonet 9.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for networking-midonet for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/networking-midonet/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/networking-midonet/src/branch/stable/train Release notes for networking-midonet can be found at: https://docs.openstack.org/releasenotes/networking-midonet/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/networking-midonet/+bugs and tag it *train-rc-potential* to bring it to the networking-midonet release crew's attention. From openstack at fried.cc Tue Oct 1 15:00:20 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 10:00:20 -0500 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <20191001130035.hm2alc63eab4cpek@yuggoth.org> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> Message-ID: <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> Thanks for the responses, all. This subthread is becoming tangential to my original purpose, so I'm renaming it. >> The best way to get reviews is to lurk in IRC and beg. > When I joined I was taught that instead of begging go and review open > patches which a) helps the review load of dev team b) makes you known > in the community. Both helps getting reviews on your patches. Does it > always work? No. Do I like begging for review? No. Do I like to get > repatedly pinged to review? No. So I would suggest not to declare that > the only way to get review is to go and beg. I recognize I was generalizing; begging isn't really "the best way" to get reviews. 
Doing reviews and becoming known (and *then* begging :) is far more effective -- but is literally impossible for many contributors. Even if they have the time (percentage of work week) to dedicate upstream, it takes massive effort and time (calendar) to get there. We can not and should not expect this of every contributor. More... On 10/1/19 8:00 AM, Jeremy Stanley wrote: > On 2019-10-01 08:38:50 -0400 (-0400), Tom Barron wrote: > [...] >> In projects I have worked on there is no need to encourage extra >> begging and squeaky wheel prioritization has IMO not been a >> healthy thing. >> >> There is no better way to get ones reviews stalled than to beg for >> reviews with patches that are not close to ready for review and at >> the same time contribute no useful reviews oneself. >> >> There is nothing wrong with pinging to get attention to a review >> if it is ready and languishing, or if it solves an urgent issue, >> but even in these cases a ping from someone who doesn't "cry wolf" >> and who has built a reputation as a contributor carries more >> weight. > [...] > > Agreed, it drives back to Eric's comment about familiarity with the > team's reviewer culture. Just saying "hey I pushed these patches can > someone look" is often far less effective for a newcomer than "I > reported a bug in subsystem X which is really aggravating and review > 654321 fixes it if anyone has a moment to look" or "tbarron: I > addressed your comments on review 654321 when you get a chance to > revisit your -1, thanks!" > > My cardinal rules of begging: Don't mention the nicks of random > people who have not been involved in the change unless you happen to > actually know it's one they'll personally be interested in. Provide > as much context as possible (within reason) to attract the actual > interest of potential reviewers. Be polite, thank people, and don't > assume your change is important to anyone nor that there's someone > who has time to look at it. And most important, as you noted too, if > you're waiting around then take a few minutes and go review > something to pass the time! ;) > This is *precisely* the kind of culture that we cannot expect inexperienced contributors to understand. We can write it down [1], but then we have to get people to read what's written. To tie back to the original thread, this is where it would help to have a core (or experienced dev) as a mentor/liaison to be the first point of contact for questions, guidance, etc. Putting it in the spec process ensures it doesn't get missed (like a doc sitting "out there" somewhere). efried [1] though I fear that would end up being a long-winded and wandering tome, difficult to read and grok, assuming we could even agree on what it should say (frankly, there are some aspects we should be embarrassed to admit in writing) From moreira.belmiro.email.lists at gmail.com Tue Oct 1 15:12:55 2019 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 1 Oct 2019 17:12:55 +0200 Subject: [nova][kolla] questions on cells In-Reply-To: <0c65b9eb-63af-6daa-c82b-61034ca52440@gmail.com> References: <0c65b9eb-63af-6daa-c82b-61034ca52440@gmail.com> Message-ID: Hi, just to clarify, CERN runs the superconductor. Yes, affinity check is an issue. We plan work on it in the next cycle. The metadata API runs per cell. The main reason is that we still run nova-network in few cells. cheers, Belmiro On Mon, Sep 30, 2019 at 8:56 PM Matt Riedemann wrote: > On 9/30/2019 12:27 PM, Dan Smith wrote: > >> 4. 
Does the cell conductor need access to the API DB? > > Technically it should not be allowed to talk to the API DB for > > "separation of concerns" reasons. However, there are a couple of > > features that still rely on the cell conductor being able to upcall to > > the API database, such as the late affinity check. > > In case you haven't seen this yet, we have a list of operations > requiring "up-calls" from compute/cell-conductor to the API DB in the > docs here: > > > https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls > > Some have been fixed for awhile and some are still open because they are > not default configuration we normally deal with (cross_az_attach=False) > or hit in CI* runs (reschedules). > > I think the biggest/hardest problem there to solve is the late affinity > check which long-term should be solved with placement but no one is > working on that. The reschedule stuff related to getting AZ/aggregate > info is simpler but involves some RPC changes so it's not trivial and > again no one is working on fixing that. > > I think for those reasons CERN is running without a superconductor mode > and can hit the API DB from the cells. Devstack superconductor mode is > the ideal though for the separation of concerns Dan pointed out. > > *Note we do hit the reschedule issue sometimes in multi-cell jobs: > > > http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22CantStartEngineError%3A%20No%20sql_connection%20parameter%20is%20established%5C%22%20AND%20tags%3A%5C%22screen-n-cond-cell1.txt%5C%22&from=7d > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kendall at openstack.org Tue Oct 1 15:37:07 2019 From: kendall at openstack.org (Kendall Waters) Date: Tue, 1 Oct 2019 10:37:07 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Hi Pierre, Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. Best, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: > > Hi Kendall, > > Friday works for all who have replied so far, but I am still expecting > answers from two people. > > Is there a room available for our Project Onboarding session that day? > Probably in the morning, though I will confirm depending on > availability of participants. > We've never run one, so I don't know how many people to expect. > > Thanks, > Pierre > > On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: >> >> Hi Pierre, >> >> Apologies for the oversight on Blazar. Would all day Friday work for your team? >> >> Thanks, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> I couldn't see Blazar anywhere on the schedule. We had requested time >> for a Project Onboarding session. >> >> Additionally, there are more people travelling than initially planned, >> so we may want to allocate a half day for technical discussions as >> well (probably in the shared space, since we don't expect a huge >> turnout). >> >> Would it be possible to update the schedule accordingly? 
>> >> Thanks, >> Pierre >> >> On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 >> >> The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. >> >> Please let me know if there are any conflicts! >> >> -Kendall (diablo_rojo) >> >> On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. >> >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue Oct 1 15:47:07 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Tue, 1 Oct 2019 11:47:07 -0400 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: Jean-Philippe, I'd be happy to run the meeting for you. Thanks, Nate On Tue, Oct 1, 2019 at 8:34 AM Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week: > > 1. We should ensure we have two TC members focusing on next cycle goal > selection process. Only Ghanshyam is dealing with this, and we must > help him on the way! Any volunteers? Thanks again gmann for working on > that. > > 2. Jimmy McArthur sent us the results of the OpenStack User survey on > the ML [1]. We currently haven't analyzed the information yet. > Any volunteer to analyse the information (in order to extract action > items) is welcomed. It would be great if we could discuss this at our > next official meeting, or at least discuss the next steps. > > 3. Our next meeting date will be the Thursday 10 October. I will be > travelling that day, so it would be nice to have a volunteer to host > the meeting. For that, our next meeting agenda needs clarifications. > It would be great if you could update the agenda (please also write if > your absent) on the wiki [2], so that I can send the invite to the ML. > I will send the invite on Thursday. > > 4. We still haven't finished the conversationg about naming releases. > There are a few new ideas floated around, so we should maybe drop the > current process to take count of the newly proposed ideas (The large > cities lists proposed by Nate, the movie quotes proposed by Thierry > [9])? Alternatively, if we can't find consensus, should we just entrust > the release naming process to the release team? > > 5. We should decide to deprecate or not the PowerVMStackers team [3] > and move it as a SIG. The votes don't reflect this. > > Thank you everyone! 
> > [1]: > > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [2]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [3]: https://review.opendev.org/680438 > [4]: https://review.opendev.org/680985 > [5]: https://review.opendev.org/681260 > [6]: https://review.opendev.org/681480 > [7]: https://review.opendev.org/681924 > [8]: https://review.opendev.org/682380 > [9]: https://review.opendev.org/684688 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue Oct 1 15:48:15 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Tue, 1 Oct 2019 11:48:15 -0400 Subject: [tc] Weekly update In-Reply-To: References: Message-ID: Ah, never mind, I did not notice that asettle already volunteered. Apologies! Nate On Tue, Oct 1, 2019 at 11:47 AM Nate Johnston wrote: > Jean-Philippe, > > I'd be happy to run the meeting for you. > > Thanks, > > Nate > > On Tue, Oct 1, 2019 at 8:34 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> Hello friends, >> >> Here's what need attention for the OpenStack TC this week: >> >> 1. We should ensure we have two TC members focusing on next cycle goal >> selection process. Only Ghanshyam is dealing with this, and we must >> help him on the way! Any volunteers? Thanks again gmann for working on >> that. >> >> 2. Jimmy McArthur sent us the results of the OpenStack User survey on >> the ML [1]. We currently haven't analyzed the information yet. >> Any volunteer to analyse the information (in order to extract action >> items) is welcomed. It would be great if we could discuss this at our >> next official meeting, or at least discuss the next steps. >> >> 3. Our next meeting date will be the Thursday 10 October. I will be >> travelling that day, so it would be nice to have a volunteer to host >> the meeting. For that, our next meeting agenda needs clarifications. >> It would be great if you could update the agenda (please also write if >> your absent) on the wiki [2], so that I can send the invite to the ML. >> I will send the invite on Thursday. >> >> 4. We still haven't finished the conversationg about naming releases. >> There are a few new ideas floated around, so we should maybe drop the >> current process to take count of the newly proposed ideas (The large >> cities lists proposed by Nate, the movie quotes proposed by Thierry >> [9])? Alternatively, if we can't find consensus, should we just entrust >> the release naming process to the release team? >> >> 5. We should decide to deprecate or not the PowerVMStackers team [3] >> and move it as a SIG. The votes don't reflect this. >> >> Thank you everyone! >> >> [1]: >> >> http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html >> >> [2]: >> https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting >> [3]: https://review.opendev.org/680438 >> [4]: https://review.opendev.org/680985 >> [5]: https://review.opendev.org/681260 >> [6]: https://review.opendev.org/681480 >> [7]: https://review.opendev.org/681924 >> [8]: https://review.opendev.org/682380 >> [9]: https://review.opendev.org/684688 >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balazs.gibizer at est.tech Tue Oct 1 16:19:44 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 1 Oct 2019 16:19:44 +0000 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> Message-ID: <1569946782.31568.0@smtp.office365.com> On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried wrote: > Thanks for the responses, all. > > This subthread is becoming tangential to my original purpose, so I'm > renaming it. > >>> The best way to get reviews is to lurk in IRC and beg. > >> When I joined I was taught that instead of begging go and review >> open >> patches which a) helps the review load of dev team b) makes you >> known >> in the community. Both helps getting reviews on your patches. Does >> it >> always work? No. Do I like begging for review? No. Do I like to get >> repatedly pinged to review? No. So I would suggest not to declare >> that >> the only way to get review is to go and beg. > > I recognize I was generalizing; begging isn't really "the best way" to > get reviews. Doing reviews and becoming known (and *then* begging :) > is > far more effective -- but is literally impossible for many > contributors. > Even if they have the time (percentage of work week) to dedicate > upstream, it takes massive effort and time (calendar) to get there. We > can not and should not expect this of every contributor. > Sure, it is not easy for a new commer to read a random nova patch. But I think we should encourage them to do so. As that is one of the way how a newcomer will learn how nova (as software) works. I don't expect from a newcommer to point out in a nova review that I made a mistake about an obscure nova specific construct. But I think a newcommer still can give us valuable feedback about the code readability, about generic python usage, about English grammar... gibi From openstack at fried.cc Tue Oct 1 18:33:27 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 13:33:27 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <1569915055.26355.1@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> Message-ID: Thanks all for the feedback, refinements, suggestions. Please keep them coming! > If each core only commits to "care about" the > implementation of 2 bps, then we already have a limit for the number of > approved bps. I'd like to not try to prescribe this level of detail. [all blueprints are not created equal] x [all cores are not created equal] = [too many variables]. Different cores will have different amounts of time, effort, and willingness to be liaisons. > I support the ide that we limit our scope. But it is pretty hard to > select which 25 (or whathever amount we agree on) bp we approve out of > possible ~50ish. What will be the method of selection? Basically have a meeting and decide what should fall above or below the line, like you would in a corporate setting. It's not vastly different than how we already decide whether to approve a blueprint; it's just based on resource rather than technical criteria. 
(It's a hard thing to have to tell somebody their feature is denied despite having technical merit, but my main driver here is that they would rather know that up front than it be basically a coin toss whose result they don't know until feature freeze.) > So if out of 50 blueprints, say 5 are incomplete due to lack of > reviewers attention, 5 due to lack of developer attention, and 15 fail > due to reviewers also being developers and having to make a hard > choice... Targeting 30-35 might be better (expecting 5-10 of them to > fail anyway, and not due to constrained resources). Yup, your math makes sense. It pains me to phrase it this way, but it's more realistic: (A) Let's aim to complete 25 blueprints in Ussuri; so we'll approve 30, expecting 5 to fail. And the goal of this manifesto is to ensure that ~zero of the 5 incompletes are due to (A) overcommitment and (B) cultural disconnects. > The other comment I have is that I suspect all blueprints do not have > the same weight, so assigning them complexity points could help avoid > under/overshooting. Yeah, that's a legit suggestion, but I really didn't want to go there [1]. I want to try to keep this conceptually as simple as possible, at least the first time around. (I also really don't see the team trying to subvert the process by e.g. making sure we pick the 30 biggest blueprints.) efried [1] I have long-lasting scars from my experiences with "story points" and other "agile" planning techniques. From satish.txt at gmail.com Tue Oct 1 18:39:28 2019 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 1 Oct 2019 14:39:28 -0400 Subject: issues creating a second vm with numa affinity In-Reply-To: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> Message-ID: did you try to removing "hw:numa_nodes=1" ? On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros wrote: > > Dear Openstack user community, > > > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node through numa affinity with cpu, memory and nvme pci devices. 
> > > > pci passthrough whitelist > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > [notifications] > > > > [filter_scheduler] > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > available_filters = nova.scheduler.filters.all_filters > > > > [pci] > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} ] > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > Openstack flavor > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property pci_passthrough:alias='nvme:4' > > > > The first vm is successfully created > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > However the second vm fails > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > Errors in nova compute node > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback (most recent call last): > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield resources > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] block_device_info=block_device_info) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] destroy_disks_on_failure=True) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] destroy_disks_on_failure) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self.force_reraise() > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] post_xml_callback=post_xml_callback) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] guest.launch(pause=pause) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self._encoded_xml, errors='ignore') > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] self.force_reraise() > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return self._domain.createWithFlags(flags) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = proxy_call(self._autowrap, f, *args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = execute(f, *args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: 
ebe4e78c-501e-4535-ae15-948301cbf1ae] six.reraise(c, e, tb) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = meth(*args, **kwargs) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. NOTE: below is the information relevant I could think of that shows resources available after creating the second vm. > > > > [root at zeus-53 ~]# numactl -H > > available: 2 nodes (0-1) > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > node 0 size: 262029 MB > > node 0 free: 52787 MB > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > node 1 size: 262144 MB > > node 1 free: 250624 MB > > node distances: > > node 0 1 > > 0: 10 21 > > 1: 21 10 > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 is assigned to cell 1) > > > > [root at zeus-53 ~]# df -h > > Filesystem Size Used Avail Use% Mounted on > > /dev/md127 3.7T 9.1G 3.7T 1% / > > ... 
> > NOTE: vm disk files goes to root (/) partition > > > > [root at zeus-53 ~]# lsblk > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT > > sda 8:0 0 59.6G 0 disk > > ├─sda1 8:1 0 1G 0 part /boot > > └─sda2 8:2 0 16G 0 part [SWAP] > > loop0 7:0 0 100G 0 loop > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > loop1 7:1 0 2G 0 loop > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > nvme0n1 259:8 0 1.8T 0 disk > > └─nvme0n1p1 259:9 0 1.8T 0 part > > └─md127 9:127 0 3.7T 0 raid0 / > > nvme1n1 259:6 0 1.8T 0 disk > > └─nvme1n1p1 259:7 0 1.8T 0 part > > └─md127 9:127 0 3.7T 0 raid0 / > > nvme2n1 259:2 0 1.8T 0 disk > > nvme3n1 259:1 0 1.8T 0 disk > > nvme4n1 259:0 0 1.8T 0 disk > > nvme5n1 259:3 0 1.8T 0 disk > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context? > > > > Thank you very much > > NOTICE > Please consider the environment before printing this email. 
This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. From sean.mcginnis at gmx.com Tue Oct 1 18:57:07 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 1 Oct 2019 13:57:07 -0500 Subject: [all] Planned Ussuri release schedule published Message-ID: <20191001185707.GA17150@sm-workstation> Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean From Arkady.Kanevsky at dell.com Tue Oct 1 19:25:51 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 1 Oct 2019 19:25:51 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191001185707.GA17150@sm-workstation> References: <20191001185707.GA17150@sm-workstation> Message-ID: <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Why do we have requirements freeze after feature freeze? -----Original Message----- From: Sean McGinnis Sent: Tuesday, October 1, 2019 1:57 PM To: openstack-discuss at lists.openstack.org Subject: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean From gouthampravi at gmail.com Tue Oct 1 19:40:06 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Tue, 1 Oct 2019 12:40:06 -0700 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Message-ID: On Tue, Oct 1, 2019 at 12:30 PM wrote: > Why do we have requirements freeze after feature freeze? > It isn't after the feature freeze - it is alongside. As I understand it, requirements freeze is the same week as feature freeze - it's been the case with Train and past cycles too. > > -----Original Message----- > From: Sean McGinnis > Sent: Tuesday, October 1, 2019 1:57 PM > To: openstack-discuss at lists.openstack.org > Subject: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > Hey everyone, > > The proposed release schedule for Ussuri was up for a few weeks with only > cosmetic issues to address. The proposed schedule has now been merged and > published to: > > https://releases.openstack.org/ussuri/schedule.html > > Barring any new issues with the schedule being raised, this should be our > schedule for the Ussuri development cycle. The planned Ussuri release date > is May 13, 2020. > > Thanks! 
> Sean > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.rosser at rd.bbc.co.uk Tue Oct 1 20:08:42 2019 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Tue, 1 Oct 2019 21:08:42 +0100 Subject: [OSA][openstack-ansible] Stepping down from core reviewer In-Reply-To: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> References: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> Message-ID: <28cde9c7-15ea-a644-d776-cd1a063ce134@rd.bbc.co.uk> Agreed this is a sad day - you've been super cool helping me grapple with all that comes with OpenStack, and I wholeheartedly agree that the OSA community has been a special place where deployers have "got on with it" in a very user oriented way. Hopefully we can maintain that DNA you describe... your description is spot on, and thanks for being part of creating it :) On 27/09/2019 15:20, Jean-Philippe Evrard wrote: > Hello OSA friends, > > It's with great sadness that announcing I will be stepping down from OpenStack-Ansible's core role. > OSA has been the place where I grew from contributor to OpenStack core for the first time, so it will always keep a special place in my mind :) It's also where I met contributors I can now consider personal friends. It's a project I've helped grow and prosper the last 4 years. I am very happy of what we have all achieved. > > My last goals in OSA were to simplify it further and a focus on bare metal. Those efforts are either merged or advanced enough nowadays. I consider I have achieved what I wanted to: OSA is now easier than ever to manage, contribute, and deliver. > > With more experienced people leaving and new people joining, I sure hope the DNA of the project will stay the same: An always welcoming and friendly community, with a no-bullshit and not-too-serious attitude. A project focusing on operator issues and use cases, mentoring contributors to be great members of the OpenStack community. > > Again, I want to thank you for being an amazing community, and it's been great working with all of you. > I think there are still plenty of things OSA can achieve. If you want my opinion, you shouldn't hesitate to contact me. I just don't have enough time to keep up with the reviews, nor am I actively contributing enough to stay at core. > > All the best, > Jean-Philippe Evrard (evrardjp) > > From kgiusti at gmail.com Tue Oct 1 20:35:27 2019 From: kgiusti at gmail.com (Ken Giusti) Date: Tue, 1 Oct 2019 16:35:27 -0400 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <1569917983.26355.2@smtp.office365.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> <1569917983.26355.2@smtp.office365.com> Message-ID: Sorry I'm late to the party.... At the risk of stating the obvious I wouldn't put much faith in the fact that the Kafka and Amqp1 drivers use jsonutils. The use of jsonutils in these drivers is simply a cut-n-paste from the way old qpidd driver. Why jsonutils was used there... I dunno. IMHO the RabbitMQ driver is the authoritative source for correct driver implementation - the Fake driver (and the others) should use the same serialization as the rabbitmq driver if possible. 
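To make the difference concrete, here is a quick check anyone can run from a node that has oslo.serialization installed (purely illustrative, the payload key is made up):

  $ python -c "import json, datetime; json.dumps({'when': datetime.datetime.utcnow()})"
  # stdlib json - what the rabbit driver ends up using via kombu's default - raises
  # TypeError (datetime objects are not JSON serializable)

  $ python -c "import datetime; from oslo_serialization import jsonutils; print(jsonutils.dumps({'when': datetime.datetime.utcnow()}))"
  # jsonutils quietly converts the datetime to a plain string, with no timezone info attached

So a message that happily round-trips through a jsonutils-based fake driver can still fail, or silently lose tz information, once it hits the real rabbit driver.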
-K On Tue, Oct 1, 2019 at 4:30 AM Balázs Gibizer wrote: > > > On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer > wrote: > > > > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > > wrote: > >> Hi, > >> > >> I've just proposed https://review.opendev.org/#/c/685724/ which > >> reverts a change that recently went in to make the fake driver in > >> oslo.messaging use jsonutils for message serialization instead of > >> json.dumps. > >> > >> As explained in the commit message on the revert, this is > >> problematic > >> because the rabbit driver uses kombu's default serialization method, > >> which is json.dumps. By changing the fake driver to use jsonutils > >> we've made it more lenient than the most used real driver which > >> opens > >> us up to merging broken changes in consumers of oslo.messaging. > >> > >> We did have some discussion of whether we should try to override the > >> kombu default and tell it to use jsonutils too, as a number of other > >> drivers do. The concern with this was that the jsonutils handler for > >> things like datetime objects is not tz-aware, which means if you > >> send > >> a datetime object over RPC and don't explicitly handle it you could > >> lose important information. > >> > >> I'm open to being persuaded otherwise, but at the moment I'm leaning > >> toward less magic happening at the RPC layer and requiring projects > >> to explicitly handle types that aren't serializable by the standard > >> library json module. If you have a different preference, please > >> share > >> it here. > > > > Hi, > > > > I might me totally wrong here and please help me understand how the > > RabbitDriver works. What I did when I created the original patch that > > I > > looked at each drivers how they handle sending messages. The > > oslo_messaging._drivers.base.BaseDriver defines the interface with a > > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > > implements the BaseDriver interface's send() method to call _send(). > > Then _send() calls rpc_commom.serialize_msg which then calls > > jsonutils.dumps. > > > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > > from AMQPDriverBase and does not override send() or _send() so I think > > the AMQPDriverBase ._send() is called that therefore jsonutils is used > > during sending a message with RabbitDriver. > > I did some tracing in devstack to prove my point. See the result in > https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 > > Cheers, > gibi > > > > > Cheers, > > gibi > > > > > > [1] > > > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > > > >> > >> Thanks. > >> > >> -Ben > >> > > > > > > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsteinmuller at vexxhost.com Tue Oct 1 21:00:59 2019 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Tue, 1 Oct 2019 18:00:59 -0300 Subject: [OSA][openstack-ansible] Stepping down from core reviewer In-Reply-To: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> References: <99083b43-54a3-4fc3-a5c8-fec01907756d@www.fastmail.com> Message-ID: You've done an enormous contribution, evrard! Not only to me as a contributor but to the whole project! I wish you success! 
On Fri, Sep 27, 2019 at 11:26 AM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > Hello OSA friends, > > It's with great sadness that announcing I will be stepping down from > OpenStack-Ansible's core role. > OSA has been the place where I grew from contributor to OpenStack core for > the first time, so it will always keep a special place in my mind :) It's > also where I met contributors I can now consider personal friends. It's a > project I've helped grow and prosper the last 4 years. I am very happy of > what we have all achieved. > > My last goals in OSA were to simplify it further and a focus on bare > metal. Those efforts are either merged or advanced enough nowadays. I > consider I have achieved what I wanted to: OSA is now easier than ever to > manage, contribute, and deliver. > > With more experienced people leaving and new people joining, I sure hope > the DNA of the project will stay the same: An always welcoming and friendly > community, with a no-bullshit and not-too-serious attitude. A project > focusing on operator issues and use cases, mentoring contributors to be > great members of the OpenStack community. > > Again, I want to thank you for being an amazing community, and it's been > great working with all of you. > I think there are still plenty of things OSA can achieve. If you want my > opinion, you shouldn't hesitate to contact me. I just don't have enough > time to keep up with the reviews, nor am I actively contributing enough to > stay at core. > > All the best, > Jean-Philippe Evrard (evrardjp) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Oct 1 21:33:29 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 16:33:29 -0500 Subject: [qa][stable] tempest.api.volume.test_versions.VersionsTest.test_show_version fails on stable/pike In-Reply-To: <20190926070920.GA26051@sm-workstation> References: <423b48c2-ef1c-bf66-92f5-0d52007076c9@gmail.com> <5f21eadc-8ae3-934b-e354-e326aedba0b5@gmail.com> <20190926070920.GA26051@sm-workstation> Message-ID: <16d893e11b2.118dd8987130088.5272906037496078@ghanshyammann.com> ---- On Thu, 26 Sep 2019 02:09:20 -0500 Sean McGinnis wrote ---- > On Wed, Sep 25, 2019 at 10:00:30AM -0500, Matt Riedemann wrote: > > On 9/25/2019 9:51 AM, Matt Riedemann wrote: > > > Anyway, it sounds like this is another case where we're going to have to > > > pin tempest to a tag in devstack on stable/pike to continue running > > > tempest jobs against stable/pike changes, similar to what recently > > > happened with stable/ocata [3]. > > > > Here is the devstack patch to pin tempest to 21.0.0 in stable/pike: > > > > https://review.opendev.org/#/c/684769/ > > > > -- > > > > Thanks, > > > > Matt > > > > We should be seeing this in queens too. We will need to get this patch merged > there first, then into pike. We can either pin tempest, or get this fixed. > > https://review.opendev.org/#/c/684954/ > > It was a long standing issue that disabled API versions were still listed. This > can probably be backported back to ocata. I do not think the cinder backport will fix the issue. In my test patch, its v1 version which causing the issue and v1 should not be returned in GET / as per cinder pike code. 684954 is only taking care for v2 and v3 things if those are disabled. - https://zuul.opendev.org/t/openstack/build/e13e8a408f214e1b9d03b41c23955c7e/log/controller/logs/tempest_log.txt.gz#70152 Something else is causing this issue. 
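If anyone wants to poke at this on a stable/pike devstack, the quickest sanity check is to hit the unversioned volume endpoint directly and compare what the service actually lists with what the test expects, e.g.:

  $ curl -s http://<controller-ip>:8776/ | python -m json.tool

(8776 being the usual cinder api port on devstack - adjust host/port for your own deployment). If v1 shows up in that listing on pike, that would explain the failure independent of the v2/v3 handling in 684954.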
-gmann > > Sean > > From ken1ohmichi at gmail.com Tue Oct 1 21:40:23 2019 From: ken1ohmichi at gmail.com (Kenichi Omichi) Date: Tue, 1 Oct 2019 14:40:23 -0700 Subject: [nova] Stepping down from core reviewer Message-ID: Hello, Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project. So I need to step down from the core reviewer. I spend 6 years in the project, the experience is amazing. OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country. I'd like to say thank you for everyone in the community :-) My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. Thanks Kenichi Omichi --- -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arkady.Kanevsky at dell.com Tue Oct 1 22:03:18 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 1 Oct 2019 22:03:18 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> Message-ID: <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> On the plan it is one week after feature freeze From: Goutham Pacha Ravi Sent: Tuesday, October 1, 2019 2:40 PM To: Kanevsky, Arkady Cc: sean.mcginnis at gmx.com; OpenStack Discuss Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] On Tue, Oct 1, 2019 at 12:30 PM > wrote: Why do we have requirements freeze after feature freeze? It isn't after the feature freeze - it is alongside. As I understand it, requirements freeze is the same week as feature freeze - it's been the case with Train and past cycles too. -----Original Message----- From: Sean McGinnis > Sent: Tuesday, October 1, 2019 1:57 PM To: openstack-discuss at lists.openstack.org Subject: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] Hey everyone, The proposed release schedule for Ussuri was up for a few weeks with only cosmetic issues to address. The proposed schedule has now been merged and published to: https://releases.openstack.org/ussuri/schedule.html Barring any new issues with the schedule being raised, this should be our schedule for the Ussuri development cycle. The planned Ussuri release date is May 13, 2020. Thanks! Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Oct 1 22:10:22 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 17:10:22 -0500 Subject: [nova][ptg] Review culture (was: Ussuri scope containment) In-Reply-To: <1569946782.31568.0@smtp.office365.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> <20191001130035.hm2alc63eab4cpek@yuggoth.org> <72a5c7e7-58a5-187d-3422-44fb110e0f10@fried.cc> <1569946782.31568.0@smtp.office365.com> Message-ID: <16d895fd668.f4f02bc6130462.7929159786020541256@ghanshyammann.com> ---- On Tue, 01 Oct 2019 11:19:44 -0500 Balázs Gibizer wrote ---- > > > On Tue, Oct 1, 2019 at 5:00 PM, Eric Fried wrote: > > Thanks for the responses, all. > > > > This subthread is becoming tangential to my original purpose, so I'm > > renaming it. >> (A) Constrain scope, drastically. We marked 25 blueprints complete in >> Train [3]. 
Since there has been no change to the core team, let's limit >> Ussuri to 25 blueprints [4]. If this turns out to be too few, what's the >> worst thing that happens? We finish everything, early, and wish we had >> do ne more. If that happens, drinks are on me, and we can bump the number >> for V. I like the idea here and be more practical than theoretical ways to handle such situation especially in Nova case. If the operator complains about less accepted BP then, we can ask them to invest developers in upstream which can avoid such cap. But my question is same as gibi, what will be the selection criteria (when we have a large number of ready specs)? >> (B) Require a core to commit to "caring about" a spec before we approve >> it. The point of this "core liaison" is to act as a mentor to mitigate >> the cultural issues noted above [5], and to be a first point of contact >> for reviews. I've proposed this to the spec template here [6]. +100 for this. I am sure this way we can burn more approved BP. -gmann > > > >>> The best way to get reviews is to lurk in IRC and beg. > > > >> When I joined I was taught that instead of begging go and review > >> open > >> patches which a) helps the review load of dev team b) makes you > >> known > >> in the community. Both helps getting reviews on your patches. Does > >> it > >> always work? No. Do I like begging for review? No. Do I like to get > >> repatedly pinged to review? No. So I would suggest not to declare > >> that > >> the only way to get review is to go and beg. > > > > I recognize I was generalizing; begging isn't really "the best way" to > > get reviews. Doing reviews and becoming known (and *then* begging :) > > is > > far more effective -- but is literally impossible for many > > contributors. > > Even if they have the time (percentage of work week) to dedicate > > upstream, it takes massive effort and time (calendar) to get there. We > > can not and should not expect this of every contributor. > > > > Sure, it is not easy for a new commer to read a random nova patch. But > I think we should encourage them to do so. As that is one of the way > how a newcomer will learn how nova (as software) works. I don't expect > from a newcommer to point out in a nova review that I made a mistake > about an obscure nova specific construct. But I think a newcommer still > can give us valuable feedback about the code readability, about generic > python usage, about English grammar... > > gibi > > > From openstack at fried.cc Tue Oct 1 22:11:21 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 1 Oct 2019 17:11:21 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: Kenichi- Thank you for all of your contributions over the years. efried On 10/1/19 4:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. 
> > Thanks > Kenichi Omichi > > --- From gmann at ghanshyammann.com Tue Oct 1 22:46:25 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 01 Oct 2019 17:46:25 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <16d8980d8fe.b046ac8e130820.667075030702112040@ghanshyammann.com> ---- On Tue, 01 Oct 2019 16:40:23 -0500 Kenichi Omichi wrote ---- > Hello, > Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project.So I need to step down from the core reviewer. > I spend 6 years in the project, the experience is amazing.OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country.I'd like to say thank you for everyone in the community :-) Thanks a lot, kenichi for your valuable contribution over the years. I have learnt a lot from you and thanks for being so helpful and humble always. You have done a lot for making Nova better and OpenStack more stable while serving as a QA developer in parallel. -gmann > > My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. > > ThanksKenichi Omichi > --- > From smooney at redhat.com Tue Oct 1 23:24:58 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 02 Oct 2019 00:24:58 +0100 Subject: issues creating a second vm with numa affinity In-Reply-To: References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> Message-ID: <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> On Tue, 2019-10-01 at 14:39 -0400, Satish Patel wrote: > did you try to removing "hw:numa_nodes=1" ? that will have no effect the vm implcitly has a numa toplogy of 1 node due to usei cpu pinning. so hw:numa_nodes=1 is identical to what will be setting hw:cpu_policy=dedicated openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property pci_passthrough:alias='nvme:4' looking at the numa info that was provided. node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > node 0 size: 262029 MB > > > > node 0 free: 52787 MB > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > it looks like you have a dual socket host with 14 cores per secket and hyper threading enabled. looking at the flaovr hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate enables pinning and allocate 1 addtional pinned for the emulator thread. since hw:cpu_treads_policy is not defien the behavior will be determined by the numa of cores requrested. by default if the flavor.vcpu is even it would default to the require policy and try to use hyper tread siblibngs if flavor.vcpu was odd it would defualt to isolate policy and try to isolate individual cores. this was originally done to prevent a class of timing based attach that can be executed if two vms were pinned differnet hyperthread on the same core. i say be defualt as you are also useing hw:emulator_threads_policy=isolate which actully means you are askign for 21 cores and im not sure of the top of my head which policy will take effect. strictly speacking the prefer policy is used but it behavior is subtle. 
anway form the numa info above reaange the data to show the tread siblings node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 if you have look at this long enough you will know after kernel 4.0 enumates in a prediable way that is different form the predicable way that older kernels used to enuamrete cores in. if we boot 1 vm say on node0 which is the socket 0 in this case as well with the above flaovr i would expect the free cores to look like this node 0 cpus: - - - - - - - - - - - 11 12 13 - - - - - - - - - - 38 39 40 41 node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 looking at the pci white list there is something else that you can see passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > {"address":"0000:87:00.0"} ] for all the devies the first 2 bytes are 0000 this is the pci domain. on a multi socket sytems, or at least on any 2 socket system new enouglsht to processor wth 14 cores and hypertreading you will have 2 different pci roots. 1 pci route complex per phyical processor. in a system with multiple pci root complex 1 becomes the primary pci root and is assigned the 0000 domain and the second is can be asigned a different domain adress but that depend on your kernel commandline option and the number of devices. form my experince when only a single domain is create the second numa node device start with 0000:80:xx.y or higher so {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"},{"address":"0000:09:00.0"} shoudl be on numa node 0 and shuld be assinged to the vm that leave {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} and node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 so from an openstack point of view there are enough core free to pin the second vm and there are devices free on the same numa node. node 0 free: 52787 MB node 1 free: 250624 MB 200G is being used on node 0 by the first vm and there is 250 is free on node 1. as kashyap pointed out in an earlier reply the most likely cause of the "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" error is a libvirt interaction with a kvm kernel bug that was fix in kernel 4.19 (4.19 fixes a lot of kvm bugs and enabled nestexd virt by default so you shoudl use 4.19+ if you can) kashyap submitted https://review.opendev.org/#/c/684375/ as a possible way to workaround to the kernel issue by relaxing the requirement nova places on the memory assgined to a guest that is not used for guest ram. effectivly we belive that the root case is on the host if you run "grep DMA32 /proc/zoneinfo" the DMA32 zone will only exist on 1 nuam node. e.g. sean at workstation:~$ grep DMA32 /proc/zoneinfo Node 0, zone DMA32 Node 1, zone DMA32 https://review.opendev.org/#/c/684375/ we belive would allow the second vm to booth with numa affined guest ram but non numa affined DMA memroy however that could have negative performace implication in some cases. nova connot contol how where DMA memroy is allocated by the kernel so this cannot be fully adress by nova. 
ideally the best way to fix this would be to some how force your kenel to allocate DMA32 zones per numa node but i am not aware of a way to do that. so to ansewr the orginal question 'What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context?' my understanding is that it mean qemu could not allcate memory form a DMA32 zone on the same numa node as the cpus and guest ram for the PCI passthough devices which would be required when is defiend. we always require strict mode when we have a vm with a numa toplogy to ensure that the guest memroy is allocated form the node we requested but if you are using pci passtouhg and do not have DMA32 zones. it is my understanding that on newewr kernels the kvm modules allows non local DMA zones to be used. with all that said it is very uncommon to have hardware that dose not have a DMA and DMA32 zone per numa node so most peopel will never have this problem. > On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros > wrote: > > > > Dear Openstack user community, > > > > > > > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node > > through numa affinity with cpu, memory and nvme pci devices. > > > > > > > > pci passthrough whitelist > > > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > > > [notifications] > > > > > > > > [filter_scheduler] > > > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, > > ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > > > available_filters = nova.scheduler.filters.all_filters > > > > > > > > [pci] > > > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > {"address":"0000:87:00.0"} ] > > > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > > > > > Openstack flavor > > > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > > pci_passthrough:alias='nvme:4' > > > > > > > > The first vm is successfully created > > > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone > > nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > > > > > However the second vm fails > > > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone > > nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > > > > > Errors in nova compute node > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa > > 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535- > > ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: > > 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback > > (most recent call last): > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield > > resources > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] block_device_info=block_device_info) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] destroy_disks_on_failure=True) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] destroy_disks_on_failure) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self.force_reraise() > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] post_xml_callback=post_xml_callback) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] guest.launch(pause=pause) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self._encoded_xml, errors='ignore') > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] self.force_reraise() > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return > > self._domain.createWithFlags(flags) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = > > proxy_call(self._autowrap, f, *args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > execute(f, *args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > 948301cbf1ae] six.reraise(c, e, tb) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > meth(*args, **kwargs) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == > > -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: > > internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: > > Cannot allocate memory > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. > > NOTE: below is the information relevant I could think of that shows resources available after creating the second > > vm. 
> > > > > > > > [root at zeus-53 ~]# numactl -H > > > > available: 2 nodes (0-1) > > > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > node 0 size: 262029 MB > > > > node 0 free: 52787 MB > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > node 1 size: 262144 MB > > > > node 1 free: 250624 MB > > > > node distances: > > > > node 0 1 > > > > 0: 10 21 > > > > 1: 21 10 > > > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 > > is assigned to cell 1) > > > > > > > > [root at zeus-53 ~]# df -h > > > > Filesystem Size Used Avail Use% Mounted on > > > > /dev/md127 3.7T 9.1G 3.7T 1% / > > > > ... > > > > NOTE: vm disk files goes to root (/) partition > > > > > > > > [root at zeus-53 ~]# lsblk > > > > NAME MAJ:MIN RM SIZE RO > > TYPE MOUNTPOINT > > > > sda 8:0 0 59.6G 0 > > disk > > > > ├─sda1 8:1 0 1G 0 > > part /boot > > > > └─sda2 8:2 0 16G 0 > > part [SWAP] > > > > loop0 7:0 0 100G 0 > > loop > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > loop1 7:1 0 2G 0 > > loop > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > 
> > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > nvme0n1 259:8 0 1.8T 0 > > disk > > > > └─nvme0n1p1 259:9 0 1.8T 0 > > part > > > > └─md127 9:127 0 3.7T 0 > > raid0 / > > > > nvme1n1 259:6 0 1.8T 0 > > disk > > > > └─nvme1n1p1 259:7 0 1.8T 0 > > part > > > > └─md127 9:127 0 3.7T 0 > > raid0 / > > > > nvme2n1 259:2 0 1.8T 0 > > disk > > > > nvme3n1 259:1 0 1.8T 0 > > disk > > > > nvme4n1 259:0 0 1.8T 0 > > disk > > > > nvme5n1 259:3 0 1.8T 0 > > disk > > > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > > > > > What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context? > > > > > > > > Thank you very much > > > > NOTICE > > Please consider the environment before printing this email. This message and any attachments are intended for the > > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > > message in error please notify us at once by return email and then delete both messages. We accept no liability for > > the distribution of viruses or similar in electronic communications. This notice should not be removed. > > From satish.txt at gmail.com Wed Oct 2 02:24:26 2019 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 1 Oct 2019 22:24:26 -0400 Subject: issues creating a second vm with numa affinity In-Reply-To: <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> References: <9D8A2486E35F0941A60430473E29F15B017EB7B8AE@MXDB1.ad.garvan.unsw.edu.au> <0a702d26811856186130e5ed28c908665026821b.camel@redhat.com> Message-ID: Sean good to hear from you, amazing reply, i took some notes from it. On Tue, Oct 1, 2019 at 7:25 PM Sean Mooney wrote: > > On Tue, 2019-10-01 at 14:39 -0400, Satish Patel wrote: > > did you try to removing "hw:numa_nodes=1" ? > that will have no effect > the vm implcitly has a numa toplogy of 1 node due to usei cpu pinning. > so hw:numa_nodes=1 is identical to what will be setting hw:cpu_policy=dedicated > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > pci_passthrough:alias='nvme:4' > > looking at the numa info that was provided. > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > > > node 0 size: 262029 MB > > > > > > node 0 free: 52787 MB > > > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > > it looks like you have a dual socket host with 14 cores per secket and hyper threading enabled. > > looking at the flaovr > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate > enables pinning and allocate 1 addtional pinned for the emulator thread. > > since hw:cpu_treads_policy is not defien the behavior will be determined by the numa of cores requrested. > > by default if the flavor.vcpu is even it would default to the require policy and try to use hyper tread siblibngs > if flavor.vcpu was odd it would defualt to isolate policy and try to isolate individual cores. 
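As an aside: the thread policy discussed above can also be set explicitly on the flavor rather than left to the
even/odd default. A rough sketch only, reusing the flavor name from the original post (valid values are prefer,
isolate and require; check them against the nova docs for your release):

    # hypothetical example: pin the thread policy explicitly so the default heuristic does not apply
    openstack flavor set xlarge.numa.perf.test --property hw:cpu_thread_policy=prefer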
> this defaulting behaviour was originally introduced to prevent a class of timing-based attacks that can be executed
> if two vms are pinned to different hyperthreads of the same core. i say "by default" because you are also using
> hw:emulator_threads_policy=isolate, which actually means you are asking for 21 cores, and i'm not sure off the top of
> my head which policy will take effect. strictly speaking the prefer policy is used, but its behaviour is subtle.
>
> anyway, from the numa info above, rearranging the data to show the thread siblings:
>
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13
>              28 29 30 31 32 33 34 35 36 37 38 39 40 41
>
> node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>              42 43 44 45 46 47 48 49 50 51 52 53 54 55
>
> if you have looked at this long enough you will know that kernels after 4.0 enumerate cores in a predictable way that
> is different from the (also predictable) way older kernels used to enumerate them.
>
> if we boot 1 vm, say on node 0 (which is also socket 0 in this case) with the above flavor, i would expect the free
> cores to look like this:
>
> node 0 cpus: - - - - - - - - - - - 11 12 13
>              - - - - - - - - - - 38 39 40 41
>
> node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>              42 43 44 45 46 47 48 49 50 51 52 53 54 55
>
> looking at the pci whitelist there is something else you can see:
>
> passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"},
> {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"},
> {"address":"0000:87:00.0"} ]
>
> for all the devices the first 2 bytes are 0000; this is the pci domain.
>
> on a multi-socket system, or at least on any 2-socket system new enough to have processors with 14 cores and
> hyperthreading, you will have 2 different pci roots: 1 pci root complex per physical processor. in a system with
> multiple pci root complexes, one becomes the primary pci root and is assigned the 0000 domain, and the second can be
> assigned a different domain address, but that depends on your kernel command line options and the number of devices.
> from my experience, when only a single domain is created, the devices on the second numa node start at 0000:80:xx.y
> or higher, so {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"} and
> {"address":"0000:09:00.0"} should be on numa node 0 and should have been assigned to the first vm.
>
> that leaves
> {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"}
> and
> node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27
>              42 43 44 45 46 47 48 49 50 51 52 53 54 55
>
> so from an openstack point of view there are enough cores free to pin the second vm and there are devices free on the
> same numa node.
>
> node 0 free: 52787 MB
> node 1 free: 250624 MB
>
> 200G is being used on node 0 by the first vm and there is ~250G free on node 1.
>
> as kashyap pointed out in an earlier reply, the most likely cause of the
> "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" error is a libvirt interaction with a kvm kernel bug that
> was fixed in kernel 4.19 (4.19 fixes a lot of kvm bugs and enables nested virt by default, so you should use 4.19+
> if you can).
> kashyap submitted https://review.opendev.org/#/c/684375/ as a possible workaround for the kernel issue, relaxing
> the requirement nova places on the memory assigned to a guest that is not used for guest ram.
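To sanity-check this kind of analysis on the host itself, one could look at where the first guest's vCPUs were
actually pinned and which NUMA node each whitelisted device reports. A sketch only; the libvirt domain name is
hypothetical and the PCI addresses are the ones from the whitelist above:

    # where did libvirt pin the first guest's vCPUs? (domain name is illustrative)
    virsh vcpupin instance-00000001
    # which NUMA node does each whitelisted NVMe device sit on? (-1 means unknown)
    cat /sys/bus/pci/devices/0000:06:00.0/numa_node
    cat /sys/bus/pci/devices/0000:84:00.0/numa_node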
>
> effectively we believe the root cause is on the host: if you run "grep DMA32 /proc/zoneinfo", the DMA32 zone will
> only exist on 1 numa node.
>
> e.g. sean at workstation:~$ grep DMA32 /proc/zoneinfo
> Node 0, zone DMA32
> Node 1, zone DMA32
>
> we believe https://review.opendev.org/#/c/684375/ would allow the second vm to boot with numa-affined guest ram
> but non-numa-affined DMA memory; however, that could have negative performance implications in some cases.
> nova cannot control where DMA memory is allocated by the kernel, so this cannot be fully addressed by nova.
>
> ideally the best way to fix this would be to somehow force your kernel to allocate DMA32 zones per numa node,
> but i am not aware of a way to do that.
>
> so to answer the original question
> 'What "qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context?'
> my understanding is that it means qemu could not allocate memory from a DMA32 zone on the same numa node as the cpus
> and guest ram for the PCI passthrough devices, which is required because we always use strict memory mode when a vm
> has a numa topology, to ensure that the guest memory is allocated from the node we requested. if you are using pci
> passthrough and do not have a DMA32 zone on that node, it is my understanding that on newer kernels the kvm module
> allows non-local DMA zones to be used. with all that said, it is very uncommon to have hardware that does not have a
> DMA and DMA32 zone per numa node, so most people will never have this problem.
> >
> > On Tue, Oct 1, 2019 at 2:16 PM Manuel Sopena Ballesteros
> > wrote:
> > >
> > > Dear Openstack user community,
> > >
> > >
> > > I have a compute node with 2 numa nodes and I would like to create 2 vms, each one using a different numa node
> > > through numa affinity with cpu, memory and nvme pci devices.
> > > > > > > > > > > > pci passthrough whitelist > > > > > > [root at zeus-53 ~]# tail /etc/kolla/nova-compute/nova.conf > > > > > > [notifications] > > > > > > > > > > > > [filter_scheduler] > > > > > > enabled_filters = enabled_filters = RetryFilter, AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, > > > ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter > > > > > > available_filters = nova.scheduler.filters.all_filters > > > > > > > > > > > > [pci] > > > > > > passthrough_whitelist = [ {"address":"0000:06:00.0"}, {"address":"0000:07:00.0"}, {"address":"0000:08:00.0"}, > > > {"address":"0000:09:00.0"}, {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, > > > {"address":"0000:87:00.0"} ] > > > > > > alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} > > > > > > > > > > > > Openstack flavor > > > > > > openstack flavor create --public xlarge.numa.perf.test --ram 200000 --disk 700 --vcpus 20 --property > > > hw:cpu_policy=dedicated --property hw:emulator_threads_policy=isolate --property hw:numa_nodes='1' --property > > > pci_passthrough:alias='nvme:4' > > > > > > > > > > > > The first vm is successfully created > > > > > > openstack server create --network hpc --flavor xlarge.numa.perf.test --image centos7.6-image --availability-zone > > > nova:zeus-53.localdomain --key-name mykey kudu-1 > > > > > > > > > > > > However the second vm fails > > > > > > openstack server create --network hpc --flavor xlarge.numa.perf --image centos7.6-kudu-image --availability-zone > > > nova:zeus-53.localdomain --key-name mykey kudu-4 > > > > > > > > > > > > Errors in nova compute node > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [req-b5a25c73-8c7d-466c-8128-71f29e7ae8aa > > > 91e83343e9834c8ba0172ff369c8acac b91520cff5bd45c59a8de07c38641582 - default default] [instance: ebe4e78c-501e-4535- > > > ae15-948301cbf1ae] Instance failed to spawn: libvirtError: internal error: qemu unexpectedly closed the monitor: > > > 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: Cannot allocate memory > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] Traceback > > > (most recent call last): > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2369, in _build_resources > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] yield > > > resources > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2133, in _build_and_run_instance > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] block_device_info=block_device_info) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3142, in spawn > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] destroy_disks_on_failure=True) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5705, in _create_domain_and_network > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] destroy_disks_on_failure) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self.force_reraise() > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5674, in _create_domain_and_network > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] post_xml_callback=post_xml_callback) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5608, in _create_domain > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] guest.launch(pause=pause) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self._encoded_xml, errors='ignore') > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] self.force_reraise() > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(self.type_, self.value, self.tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] return > > > self._domain.createWithFlags(flags) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] result = > > > proxy_call(self._autowrap, f, *args, 
**kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > > execute(f, *args, **kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15- > > > 948301cbf1ae] six.reraise(c, e, tb) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] rv = > > > meth(*args, **kwargs) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] File > > > "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] if ret == > > > -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] libvirtError: > > > internal error: qemu unexpectedly closed the monitor: 2019-09-27T06:45:19.118089Z qemu-kvm: kvm_init_vcpu failed: > > > Cannot allocate memory > > > > > > 2019-09-27 16:45:19.785 7 ERROR nova.compute.manager [instance: ebe4e78c-501e-4535-ae15-948301cbf1ae] > > > > > > > > > > > > Numa cell/node 1 (the one assigned on kudu-4) has enough cpu, memory, pci devices and disk capacity to fit this vm. > > > NOTE: below is the information relevant I could think of that shows resources available after creating the second > > > vm. > > > > > > > > > > > > [root at zeus-53 ~]# numactl -H > > > > > > available: 2 nodes (0-1) > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41 > > > > > > node 0 size: 262029 MB > > > > > > node 0 free: 52787 MB > > > > > > node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55 > > > > > > node 1 size: 262144 MB > > > > > > node 1 free: 250624 MB > > > > > > node distances: > > > > > > node 0 1 > > > > > > 0: 10 21 > > > > > > 1: 21 10 > > > > > > NOTE: this is to show that numa node/cell 1 has enough resources available (also nova-compute logs shows that kudu-4 > > > is assigned to cell 1) > > > > > > > > > > > > [root at zeus-53 ~]# df -h > > > > > > Filesystem Size Used Avail Use% Mounted on > > > > > > /dev/md127 3.7T 9.1G 3.7T 1% / > > > > > > ... 
> > > > > > NOTE: vm disk files goes to root (/) partition > > > > > > > > > > > > [root at zeus-53 ~]# lsblk > > > > > > NAME MAJ:MIN RM SIZE RO > > > TYPE MOUNTPOINT > > > > > > sda 8:0 0 59.6G 0 > > > disk > > > > > > ├─sda1 8:1 0 1G 0 > > > part /boot > > > > > > └─sda2 8:2 0 16G 0 > > > part [SWAP] > > > > > > loop0 7:0 0 100G 0 > > > loop > > > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > > > loop1 7:1 0 2G 0 > > > loop > > > > > > └─docker-9:127-6979358884-pool 253:0 0 100G 0 dm > > > > > > ├─docker-9:127-6979358884-4301cee8d0433729cd6332ca2b6111afc85f14c48d4ce2d888a1da0ef9b5ca01 253:1 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-d59208adcb7cee3418f810f24e6c3a55d39281f713c8e76141fc61a8deba8a2b 253:2 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-106bc0838e37442eca84eb9ab17aa7a45308b7e3a38be3fb21a4fa00366fe306 253:3 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-7e16b5d012ab8744739b671fcdc8e47db5cc64e6c3d5a5fe423bfd68cfb07b20 253:4 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-f1c2545b4edbfd7b42d2a492eda8224fcf7cefc3e3a41e65d307c585acffe6a8 253:5 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-e7fd6c7b3f624f387bdb3746a7944a30c92d8ee5395e75c76288b281bd009d90 253:6 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-95a818cc7afd9867385bb9a9ea750d4cc6e162916c6ae3a157097af74578e1e4 253:7 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-9a7f28d396c149119556f382bf5c19f5925eed5d18b94407649244c7adabb4b3 253:8 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-b25941b6f115300caea977911e2d7fd3541ef187c9aa5736fe10fad638ecd0d1 253:9 0 10G 0 dm > > > > > > ├─docker-9:127-6979358884-122b201c6ad24896a205f8db4a64759ba8fbd5bbe245d0f98984268a01e6a0c4 253:10 0 10G 0 dm > > > > > > └─docker-9:127-6979358884-bc04120ba59a1b393f338a1cef64b16d920cf4e73400198e4b999bb72a42ff90 253:11 0 10G 0 dm > > > > > > nvme0n1 259:8 0 1.8T 0 > > > disk > > > > > > └─nvme0n1p1 259:9 0 1.8T 0 > > > part > > > > > > └─md127 9:127 0 3.7T 0 > > > raid0 / > > > > > > nvme1n1 259:6 0 1.8T 0 > > > disk > > > > > > └─nvme1n1p1 259:7 0 1.8T 0 > > > part > > > > > > └─md127 9:127 0 3.7T 0 > > > raid0 / > > > > > > nvme2n1 259:2 0 1.8T 0 > > > disk > > > > > > nvme3n1 259:1 0 1.8T 0 > > > disk > > > > > > nvme4n1 
259:0 0 1.8T 0 > > > disk > > > > > > nvme5n1 259:3 0 1.8T 0 > > > disk > > > > > > NOTE: this is to show that there are 4 nvme disks (nvme2n1, nvme3n1, nvme4n1, nvme5n1) available for the second vm > > > > > > > > > > > > What "emu-kvm: kvm_init_vcpu failed: Cannot allocate memory" means in this context? > > > > > > > > > > > > Thank you very much > > > > > > NOTICE > > > Please consider the environment before printing this email. This message and any attachments are intended for the > > > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > > > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > > > message in error please notify us at once by return email and then delete both messages. We accept no liability for > > > the distribution of viruses or similar in electronic communications. This notice should not be removed. > > > > > From zhangbailin at inspur.com Wed Oct 2 03:21:02 2019 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Wed, 2 Oct 2019 03:21:02 +0000 Subject: =?gb2312?B?tPC4tDogW2xpc3RzLm9wZW5zdGFjay5vcme0+reiXVtub3ZhXSBTdGVwcGlu?= =?gb2312?Q?g_down_from_core_reviewer?= In-Reply-To: References: <15310a972a58e9d50f9a255fcae249b4@sslemail.net> Message-ID: <052032cc89884b3d8c645aef4d43f308@inspur.com> Kenichi- Thank you for your contribution to nova and for the help of these newcomers, and I hope to see you often in the community. brinzhang item: [lists.openstack.org代发][nova] Stepping down from core reviewer Hello, Today my job description is changed and I cannot have enough time for regular reviewing work of Nova project. So I need to step down from the core reviewer. I spend 6 years in the project, the experience is amazing. OpenStack gave me a lot of chances to learn technical things deeply, make friends in the world and bring me and my family to foreign country from our home country. I'd like to say thank you for everyone in the community :-) My personal private cloud is based on OpenStack, so I'd like to still keep contributing for the project if I find bugs or idea. Thanks Kenichi Omichi --- -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Wed Oct 2 07:29:50 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 2 Oct 2019 09:29:50 +0200 Subject: [all][PTG] Strawman Schedule In-Reply-To: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> References: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Message-ID: Hi Kendall, I got confirmation from all participants that they will be available all day on Friday. Thanks for adding us to the schedule. Best wishes, Pierre On Tue, 1 Oct 2019 at 17:37, Kendall Waters wrote: > > Hi Pierre, > > Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. > > Best, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: > > Hi Kendall, > > Friday works for all who have replied so far, but I am still expecting > answers from two people. > > Is there a room available for our Project Onboarding session that day? > Probably in the morning, though I will confirm depending on > availability of participants. > We've never run one, so I don't know how many people to expect. 
> > Thanks, > Pierre > > On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: > > > Hi Pierre, > > Apologies for the oversight on Blazar. Would all day Friday work for your team? > > Thanks, > Kendall > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: > > Hi Kendall, > > I couldn't see Blazar anywhere on the schedule. We had requested time > for a Project Onboarding session. > > Additionally, there are more people travelling than initially planned, > so we may want to allocate a half day for technical discussions as > well (probably in the shared space, since we don't expect a huge > turnout). > > Would it be possible to update the schedule accordingly? > > Thanks, > Pierre > > On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: > > > Hello Everyone! > > Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 > > The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. > > Please let me know if there are any conflicts! > > -Kendall (diablo_rojo) > > On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: > > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. > > If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png > > > > From pierre at stackhpc.com Wed Oct 2 07:39:12 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 2 Oct 2019 09:39:12 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: Hi everyone, I hope you don't mind me reviving this thread, to let you know I wrote an article after we successfully completed the migration of a running OpenStack deployment to Kolla: http://www.stackhpc.com/migrating-to-kolla.html Don't hesitate to contact me if you have more questions about how this type of migration can be performed. Pierre On Mon, 1 Jul 2019 at 14:02, Ignazio Cassano wrote: > > I checked them and I modified for fitting to new installation > thanks > Ignazio > > Il giorno lun 1 lug 2019 alle ore 13:36 Mohammed Naser ha scritto: >> >> You should check your cell mapping records inside Nova. They're probably not right of you moved your database and rabbit >> >> Sorry for top posting this is from a phone. >> >> On Mon., Jul. 1, 2019, 5:46 a.m. Ignazio Cassano, wrote: >>> >>> PS >>> I presume the problem is neutron, because instances on new kvm nodes remain in building state e do not aquire address. >>> Probably the netron db imported from old openstack installation has some difrrences ....probably I must check defferences from old and new neutron services configuration files. >>> Ignazio >>> >>> Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard ha scritto: >>>> >>>> It sounds like you got quite close to having this working. I'd suggest >>>> debugging this instance build failure. One difference with kolla is >>>> that we run libvirt inside a container. 
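One practical check here, on a compute host, is whether a host-level libvirt is still running alongside the
containerised one. Sketched with the container name kolla-ansible typically uses (nova_libvirt); adjust to your
deployment:

    # any host-level libvirt still running alongside the containerised one?
    systemctl is-active libvirtd
    # the containerised libvirt, if deployed
    docker ps --filter name=nova_libvirt --format '{{.Names}}: {{.Status}}'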
Have you stopped libvirt from >>>> running on the host? >>>> Mark >>>> >>>> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano wrote: >>>> > >>>> > Hi Mark, >>>> > let me to explain what I am trying. >>>> > I have a queens installation based on centos and pacemaker with some instances and heat stacks. >>>> > I would like to have another installation with same instances, projects, stacks ....I'd like to have same uuid for all objects (users,projects instances and so on, because it is controlled by a cloud management platform we wrote. >>>> > >>>> > I stopped controllers on old queens installation backupping the openstack database. >>>> > I installed the new kolla openstack queens on new three controllers with same addresses of the old intallation , vip as well. >>>> > One of the three controllers is also a kvm node on queens. >>>> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and mariadb. >>>> > I deleted al openstack db on mariadb container and I imported the old tables, changing the address of rabbit for pointing to the new rabbit cluster. >>>> > I restarded containers. >>>> > Changing the rabbit address on old kvm nodes, I can see the old virtual machines and I can open console on them. >>>> > I can see all networks (tenant and provider) of al installation, but when I try to create a new instance on the new kvm, it remains in buiding state. >>>> > Seems it cannot aquire an address. >>>> > Storage between old and new installation are shred on nfs NETAPP, so I can see cinder volumes. >>>> > I suppose db structure is different between a kolla installation and a manual instaltion !? >>>> > What is wrong ? >>>> > Thanks >>>> > Ignazio >>>> > >>>> > >>>> > >>>> > >>>> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard ha scritto: >>>> >> >>>> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano wrote: >>>> >> > >>>> >> > Sorry, for my question. >>>> >> > It does not need to change anything because endpoints refer to haproxy vips. >>>> >> > So if your new glance works fine you change haproxy backends for glance. >>>> >> > Regards >>>> >> > Ignazio >>>> >> >>>> >> That's correct - only the haproxy backend needs to be updated. >>>> >> >>>> >> > >>>> >> > >>>> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano ha scritto: >>>> >> >> >>>> >> >> Hello Mark, >>>> >> >> let me to verify if I understood your method. >>>> >> >> >>>> >> >> You have old controllers,haproxy,mariadb and nova computes. >>>> >> >> You installed three new controllers but kolla.ansible inventory contains old mariadb and old rabbit servers. >>>> >> >> You are deployng single service on new controllers staring with glance. >>>> >> >> When you deploy glance on new controllers, it changes the glance endpoint on old mariadb db ? >>>> >> >> Regards >>>> >> >> Ignazio >>>> >> >> >>>> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard ha scritto: >>>> >> >>> >>>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano wrote: >>>> >> >>> > >>>> >> >>> > Hello, >>>> >> >>> > Anyone have tried to migrate an existing openstack installation to kolla containers? >>>> >> >>> >>>> >> >>> Hi, >>>> >> >>> >>>> >> >>> I'm aware of two people currently working on that. Gregory Orange and >>>> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I >>>> >> >>> hope he doesn't mind me quoting him from an email to Gregory. 
>>>> >> >>> >>>> >> >>> Mark >>>> >> >>> >>>> >> >>> "I am indeed working on a similar migration using Kolla Ansible with >>>> >> >>> Kayobe, starting from a non-containerised OpenStack deployment based >>>> >> >>> on CentOS RPMs. >>>> >> >>> Existing OpenStack services are deployed across several controller >>>> >> >>> nodes and all sit behind HAProxy, including for internal endpoints. >>>> >> >>> We have additional controller nodes that we use to deploy >>>> >> >>> containerised services. If you don't have the luxury of additional >>>> >> >>> nodes, it will be more difficult as you will need to avoid processes >>>> >> >>> clashing when listening on the same port. >>>> >> >>> >>>> >> >>> The method I am using resembles your second suggestion, however I am >>>> >> >>> deploying only one containerised service at a time, in order to >>>> >> >>> validate each of them independently. >>>> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to >>>> >> >>> specific roles, and when I am happy with the resulting configuration I >>>> >> >>> update HAProxy to point to the new controllers. >>>> >> >>> >>>> >> >>> As long as the configuration matches, this should be completely >>>> >> >>> transparent for purely HTTP-based services like Glance. You need to be >>>> >> >>> more careful with services that include components listening for RPC, >>>> >> >>> such as Nova: if the new nova.conf is incorrect and you've deployed a >>>> >> >>> nova-conductor that uses it, you could get failed instances launches. >>>> >> >>> Some roles depend on others: if you are deploying the >>>> >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as >>>> >> >>> well. >>>> >> >>> >>>> >> >>> I suggest starting with migrating Glance as it doesn't have any >>>> >> >>> internal services and is easy to validate. Note that properly >>>> >> >>> migrating Keystone requires keeping existing Fernet keys around, so >>>> >> >>> any token stays valid until the time it is expected to stop working >>>> >> >>> (which is fairly complex, see >>>> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). >>>> >> >>> >>>> >> >>> While initially I was using an approach similar to your first >>>> >> >>> suggestion, it can have side effects since Kolla Ansible uses these >>>> >> >>> variables when templating configuration. As an example, most services >>>> >> >>> will only have notifications enabled if enable_ceilometer is true. >>>> >> >>> >>>> >> >>> I've added existing control plane nodes to the Kolla Ansible inventory >>>> >> >>> as separate groups, which allows me to use the existing database and >>>> >> >>> RabbitMQ for the containerised services. >>>> >> >>> For example, instead of: >>>> >> >>> >>>> >> >>> [mariadb:children] >>>> >> >>> control >>>> >> >>> >>>> >> >>> you may have: >>>> >> >>> >>>> >> >>> [mariadb:children] >>>> >> >>> oldcontrol_db >>>> >> >>> >>>> >> >>> I still have to perform the migration of these underlying services to >>>> >> >>> the new control plane, I will let you know if there is any hurdle. >>>> >> >>> >>>> >> >>> A few random things to note: >>>> >> >>> >>>> >> >>> - if run on existing control plane hosts, the baremetal role removes >>>> >> >>> some packages listed in `redhat_pkg_removals` which can trigger the >>>> >> >>> removal of OpenStack dependencies using them! I've changed this >>>> >> >>> variable to an empty list. 
>>>> >> >>> - compare your existing deployment with a Kolla Ansible one to check >>>> >> >>> for differences in endpoints, configuration files, database users, >>>> >> >>> service users, etc. For Heat, Kolla uses the domain heat_user_domain, >>>> >> >>> while your existing deployment may use another one (and this is >>>> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the "service" >>>> >> >>> project while a couple of deployments I worked with were using >>>> >> >>> "services". This shouldn't matter, except there was a bug in Kolla >>>> >> >>> which prevented it from setting the roles correctly: >>>> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest >>>> >> >>> Rocky and Queens images) >>>> >> >>> - the ml2_conf.ini generated for Neutron generates physical network >>>> >> >>> names like physnet1, physnet2… you may want to override >>>> >> >>> bridge_mappings completely. >>>> >> >>> - although sometimes it could be easier to change your existing >>>> >> >>> deployment to match Kolla Ansible settings, rather than configure >>>> >> >>> Kolla Ansible to match your deployment." >>>> >> >>> >>>> >> >>> > Thanks >>>> >> >>> > Ignazio >>>> >> >>> > From renat.akhmerov at gmail.com Wed Oct 2 07:57:24 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 2 Oct 2019 14:57:24 +0700 Subject: [requirements][mistral][amqp] Failing =?utf-8?Q?=E2=80=9Cdocs=E2=80=9D_?=job due to the upper constraint conflict for amqp In-Reply-To: References: Message-ID: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> Hi, We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 [3] https://review.opendev.org/#/c/681382 Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.seetharaman9 at gmail.com Wed Oct 2 08:25:07 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Wed, 2 Oct 2019 10:25:07 +0200 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On Tue, Oct 1, 2019 at 11:42 PM Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- > Thanks Kenichi for all your contributions. I wish you all the best for your future endeavors. Cheers, Surya. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zigo at debian.org Wed Oct 2 08:29:40 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 2 Oct 2019 10:29:40 +0200 Subject: Release Cycle Observations In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > wrote: > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > >> I know we'd like to have everyone CD'ing master > > > > Watch who you're lumping in with the "we" statement. ;) > > You've pinpointed what the problem is. > > Everyone but OpenStack upstream would like to stop having to upgrade > every 6 months. > > > Yep, but the same "everyone" want to have features now or better > yesterday, not in 2-3 years ;) This probably was the case a few years ago, when OpenStack was young. Now that it has matured, and has all the needed features, things have changed a lot. Thomas From no-reply at openstack.org Wed Oct 2 10:19:46 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Wed, 02 Oct 2019 10:19:46 -0000 Subject: glance 19.0.0.0rc1 (train) Message-ID: Hello everyone, A new release candidate for glance for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/glance/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/glance/src/branch/stable/train Release notes for glance can be found at: https://docs.openstack.org/releasenotes/glance/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/glance/+bugs and tag it *train-rc-potential* to bring it to the glance release crew's attention. From jesse at odyssey4.me Wed Oct 2 11:07:19 2019 From: jesse at odyssey4.me (Jesse Pretorius) Date: Wed, 2 Oct 2019 11:07:19 +0000 Subject: [openstack-ansible] Stepping down as core reviewer Message-ID: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> Hi everyone, While I had hoped to manage keeping up with OSA reviews and some contributions, unfortunately there is too much on my plate in my new role to allow me to give OSA sufficient time and I feel that it's important to not give any false promises. I am therefore stepping down as a core reviewer for OSA. My journey with OpenStack-Ansible started with initial contributions before it was an official OpenStack project, went on to helping lead the project to becoming an official project in the big tent, then on to becoming a successful project with diverse contributors of which I was proud to be a part. Over time I learned a heck of a lot about building and leading an Open Source community, about developing Ansible playbooks and roles at significant scale, and about building, packaging and deploying python software. It has been a very valuable experience through which I have grown personally and professionally. This community's strengths are in its leadership by operators, its readiness to assist newcomers and in striving to maintain a deployment system which is easy to understand and use (while somehow also being ridiculously flexible). As Jean-Philippe Evrard has recently expressed, this is the DNA which makes the community special. 
As you should all be aware, I am always ready to help when asked and I can also share historical context if there is a need for that so please feel free to ping me on IRC or add me to a review and I'll do my best. My journey onward is working with TripleO in the upgrades team, so you'll still find me contributing to OpenStack as a whole. I'll be hanging out in #tripleo and #openstack-dev on IRC if you're looking for me. All the best, Jesse (odyssey4me) From mdulko at redhat.com Wed Oct 2 11:18:57 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 02 Oct 2019 13:18:57 +0200 Subject: [kuryr][kuryr-libnetwork] Nominating Hongbin Lu to kuryr-libnetwork and kuryr core Message-ID: <3b89b976c17cfda617cf68b0c9308f97ae013b78.camel@redhat.com> Hi, I'd like to nominate Hongbin Lu to be core reviewer in both kuryr- libnetwork and kuryr projects. Besides saying that he's doing a great job maintaining kuryr-libnetwork I'm simply surprised he don't have +2/-2 rights there and that should definitely get fixed. As there isn't a lot of people maintaining those projects anymore, I'll just skip the voting part and add Hongbin to core teams immediately. Thanks, Michał From sfinucan at redhat.com Wed Oct 2 12:15:30 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Wed, 02 Oct 2019 13:15:30 +0100 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <9dc4ada9e1690d7da75422c5fcb3037cb28e2125.camel@redhat.com> On Tue, 2019-10-01 at 14:40 -0700, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign > country from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi Thanks for all the help over the years. You shall be missed :( Stephen From sean.mcginnis at gmx.com Wed Oct 2 12:41:48 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 07:41:48 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002124148.GA16684@sm-workstation> On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: > On the plan it is one week after feature freeze > No, Goutham is correct, it is the same week: https://releases.openstack.org/ussuri/schedule.html This is how it has been for as long as I've been aware of our release schedule. By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. 
Sean From Arkady.Kanevsky at dell.com Wed Oct 2 13:43:55 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 2 Oct 2019 13:43:55 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002124148.GA16684@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> Message-ID: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Sean, On https://releases.openstack.org/ussuri/schedule.html Feature freeze is R6 but Requirements freeze is R5. Thanks, Arkady -----Original Message----- From: Sean McGinnis Sent: Wednesday, October 2, 2019 7:42 AM To: Kanevsky, Arkady Cc: gouthampravi at gmail.com; openstack-discuss at lists.openstack.org Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: > On the plan it is one week after feature freeze > No, Goutham is correct, it is the same week: https://releases.openstack.org/ussuri/schedule.html This is how it has been for as long as I've been aware of our release schedule. By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. Sean From mriedemos at gmail.com Wed Oct 2 13:48:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 08:48:08 -0500 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 10/1/2019 5:00 AM, Mark Goddard wrote: >>> 5. What DB configuration should be used in nova.conf when running >>> online data migrations? I can see some migrations that seem to need >>> the API DB, and others that need a cell DB. If I just give it the API >>> DB, will it use the cell mappings to get to each cell DB, or do I need >>> to run it once for each cell? >> The API DB has its own set of migrations, so you obviously need API DB >> connection info to make that happen. There is no fanout to all the rest >> of the cells (currently), so you need to run it with a conf file >> pointing to the cell, for each cell you have. The latest attempt >> at making this fan out was abanoned in July with no explanation, so it >> dropped off my radar at least. > That makes sense. The rolling upgrade docs could be a little clearer > for multi-cell deployments here. > This recently merged, hopefully it helps clarify: https://review.opendev.org/#/c/671298/ >>> 6. After an upgrade, when can we restart services to unpin the compute >>> RPC version? Looking at the compute RPC API, it looks like the super >>> conductor will remain pinned until all computes have been upgraded. >>> For a cell conductor, it looks like I could restart it to unpin after >>> upgrading all computes in that cell, correct? >> Yeah. >> >>> 7. Which services require policy.{yml,json}? I can see policy >>> referenced in API, conductor and compute. >> That's a good question. I would have thought it was just API, so maybe >> someone else can chime in here, although it's not specific to cells. > Yeah, unrelated to cells, just something I wondered while digging > through our nova Ansible role. > > Here is the line that made me think policies are required in > conductors:https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? 
> That is not the conductor service, it's the API. -- Thanks, Matt From fungi at yuggoth.org Wed Oct 2 14:14:11 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 2 Oct 2019 14:14:11 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002141411.pisvn7okkmxbhx3y@yuggoth.org> On 2019-10-02 13:43:55 +0000 (+0000), Arkady.Kanevsky at dell.com wrote: > Sean, > On https://releases.openstack.org/ussuri/schedule.html > Feature freeze is R6 but > Requirements freeze is R5. [...] Could it be a local rendering or interpretation problem? When I load that same URL it tells me they're both in R5. The shaded grey band which has R5 vertically centered in the left column contains 6 ordered list entries, of which those are two. The only thing I see for the R6 week is "Final release for non-client libraries." -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Wed Oct 2 14:18:42 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Oct 2019 09:18:42 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> Message-ID: <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > Sean, > On https://releases.openstack.org/ussuri/schedule.html > Feature freeze is R6 but > Requirements freeze is R5. Is your browser dropping the background color for the table cells? There are actually six bullet points in the R-5 one, but because it's vertically centered some of them may appear to be under R-6. The only thing that's in R-6 though is the final non-client library release. > Thanks, > Arkady > > -----Original Message----- > From: Sean McGinnis > Sent: Wednesday, October 2, 2019 7:42 AM > To: Kanevsky, Arkady > Cc: gouthampravi at gmail.com; openstack-discuss at lists.openstack.org > Subject: Re: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > On Tue, Oct 01, 2019 at 10:03:18PM +0000, Arkady.Kanevsky at dell.com wrote: >> On the plan it is one week after feature freeze >> > > No, Goutham is correct, it is the same week: > > https://releases.openstack.org/ussuri/schedule.html > > This is how it has been for as long as I've been aware of our release schedule. > By milestone 3 we want to start locking down the changes that could introduce instability and start preparing for the final release. 
> > Sean > From sean.mcginnis at gmx.com Wed Oct 2 14:57:23 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 09:57:23 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> Message-ID: <20191002145723.GA27063@sm-workstation> > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > Sean, > > On https://releases.openstack.org/ussuri/schedule.html > > Feature freeze is R6 but > > Requirements freeze is R5. > > Is your browser dropping the background color for the table cells? There are > actually six bullet points in the R-5 one, but because it's vertically > centered some of them may appear to be under R-6. The only thing that's in > R-6 though is the final non-client library release. > That's what I see and how the schedule is defined. I'm assuming this has to be some sort of local rendering problem. Maybe openstackdocstheme needs to bring back table cell borders? Looks fine from my view though. Sean From cems at ebi.ac.uk Wed Oct 2 15:39:16 2019 From: cems at ebi.ac.uk (Charles) Date: Wed, 2 Oct 2019 16:39:16 +0100 Subject: OOK,Airship Message-ID: <62a00fb3-ea17-1cd1-fb9f-e4b6f3434047@ebi.ac.uk> Hi, We are interested in OOK and Openstack Helm. Has anyone any experience with Airship (now that 1.0 is out)? Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) Is this a signal that there is momentum around Openstack Helm? Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? Thoughts? Charles From Arkady.Kanevsky at dell.com Wed Oct 2 16:01:22 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 2 Oct 2019 16:01:22 +0000 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002145723.GA27063@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> Message-ID: <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> -----Original Message----- From: Sean McGinnis Sent: Wednesday, October 2, 2019 9:57 AM To: Ben Nemec Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org Subject: Re: [all] Planned Ussuri release schedule published [EXTERNAL EMAIL] > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > Sean, > > On https://releases.openstack.org/ussuri/schedule.html > > Feature freeze is R6 but > > Requirements freeze is R5. > > Is your browser dropping the background color for the table cells? > There are actually six bullet points in the R-5 one, but because it's > vertically centered some of them may appear to be under R-6. The only > thing that's in > R-6 though is the final non-client library release. 
> That's what I see and how the schedule is defined. I'm assuming this has to be some sort of local rendering problem. Maybe openstackdocstheme needs to bring back table cell borders? Looks fine from my view though. Sean -------------- next part -------------- A non-text attachment was scrubbed... Name: U-timeline.PNG Type: image/png Size: 26795 bytes Desc: U-timeline.PNG URL: From sean.mcginnis at gmx.com Wed Oct 2 16:05:35 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 2 Oct 2019 11:05:35 -0500 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20191002160535.GA29937@sm-workstation> On Wed, Oct 02, 2019 at 04:01:22PM +0000, Arkady.Kanevsky at dell.com wrote: > > > -----Original Message----- > From: Sean McGinnis > Sent: Wednesday, October 2, 2019 9:57 AM > To: Ben Nemec > Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org > Subject: Re: [all] Planned Ussuri release schedule published > > > [EXTERNAL EMAIL] > > > > > On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: > > > Sean, > > > On https://releases.openstack.org/ussuri/schedule.html > > > Feature freeze is R6 but > > > Requirements freeze is R5. > > > > Is your browser dropping the background color for the table cells? > > There are actually six bullet points in the R-5 one, but because it's > > vertically centered some of them may appear to be under R-6. The only > > thing that's in > > R-6 though is the final non-client library release. > > Looks like you fixed it? Any idea what you changed in case someone else has the same issue? From mthode at mthode.org Wed Oct 2 16:34:15 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 2 Oct 2019 11:34:15 -0500 Subject: [FFE][requirements][mistral][amqp] Failing =?utf-8?B?4oCcZG9j?= =?utf-8?B?c+KAnQ==?= job due to the upper constraint conflict for amqp In-Reply-To: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> Message-ID: <20191002163415.nu7okcn5de44txoz@mthode.org> On 19-10-02 14:57:24, Renat Akhmerov wrote: > Hi, > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > [3] https://review.opendev.org/#/c/681382 > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 to 2.5.2. It looks like a bugfix only release so I'm fine with it. 
As long as we don't need to mask 2.5.1 in global-requirements (which would
cause a re-release for openstack/oslo.messaging).

https://github.com/celery/py-amqp/compare/2.5.1...2.5.2

So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I
approve.

--
Matthew Thode
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL:

From fsbiz at yahoo.com  Wed Oct 2 16:41:42 2019
From: fsbiz at yahoo.com (fsbiz at yahoo.com)
Date: Wed, 2 Oct 2019 16:41:42 +0000 (UTC)
Subject: Port creation times out for some VMs in large group
In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>
References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>
Message-ID: <1226029673.2675287.1570034502180@mail.yahoo.com>

Thanks. This definitely helps.

I am running a stable release of Queens. Even after this change I still see
10-15 failures when I create 100 VMs in our cluster.

I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs
caused by DNSMASQ reloads every time a new MAC entry is added, deleted or
updated.

It seems to be related to https://bugs.launchpad.net/neutron/+bug/1598078

The fix for the above bug was abandoned: https://review.opendev.org/#/c/336462/

Any further fine tuning that can be done?

Thanks,
Fred.

On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote:

Albert,

Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/

The problem will go away.  I'm of the opinion that daemon mode for rootwrap
should be the default since the performance improvement is an order of
magnitude, but privsep may obviate that concern once its fully implemented.

Either way, that should solve your problem.

r

Chris Apsey

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, September 27, 2019 12:17 PM, Albert Braden wrote:

When I create 100 VMs in our prod cluster:

openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest

Most of them build successfully in about a minute. 5 or 10 will stay in BUILD
status for 5 minutes and then fail with “ BuildAbortException: Build of
instance aborted: Failed to allocate the network(s), not rescheduling.”

If I build smaller numbers, I see less failures, and no failures if I build
one at a time. This does not happen in dev or QA; it appears that we are
exhausting a resource in prod. I tried reducing various config values in dev
but am not able to duplicate the issue. The neutron servers don’t appear to be
overloaded during the failure.

What config variables should I be looking at?
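(For reference, the knobs discussed in this thread map onto a handful of
config options. The section and option names below are the Queens-era
defaults and should be double-checked against your release; the values shown
are the shipped defaults, not a recommendation:

    # nova.conf on the hypervisors -- the 300 second timeout in the log
    # below is this setting; raising it only hides slow port binding
    [DEFAULT]
    vif_plugging_timeout = 300
    vif_plugging_is_fatal = true

    # neutron agent configs (dhcp_agent.ini, linuxbridge_agent.ini, ...) --
    # daemon-mode rootwrap as described in the SWITCH article linked above
    [agent]
    root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

A sketch of where the options live, not a tuned configuration.)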
Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Oct 2 16:45:52 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 2 Oct 2019 09:45:52 -0700 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On 10/1/19 2:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. Kenichi, Thank you for all of your work in nova throughout the years. I have enjoyed working with you and I wish you all the best for the future. Hope to see you around again in nova some time down the road. :) Cheers, -melanie From ianyrchoi at gmail.com Wed Oct 2 17:10:55 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Thu, 3 Oct 2019 02:10:55 +0900 Subject: [i18n] Request to be added as Vietnamese translation group coordinators In-Reply-To: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> References: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> Message-ID: Hello, Sorry for replying here late (I was travelling by the end of last week and have been following-up many things which I couldn't take care of). Yesterday, I approved all the open requests including requests mentioned below :) With many thanks, /Ian Andreas Jaeger wrote on 9/26/2019 10:14 PM: > On 26/09/2019 13.59, Trinh Nguyen wrote: >> Hi i18n team, >> >> Dai and I would like to volunteer as the coordinators of the >> Vietnamese translation group. If you find us qualified, please let us >> know. >> > > Looking at translate.openstack.org: > > I saw that Dai asked to be a translator and approved his request as an > admin, I do not see you in Vietnamese, please apply as translator for > Vietnamese first. > > Ian, will you reach out to the current coordinator? > > Ian, a couple of language teams have open requests, could you check > those and whether the coordinators are still alive, please? 
> > Andreas From satish.txt at gmail.com Wed Oct 2 17:34:12 2019 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 2 Oct 2019 13:34:12 -0400 Subject: [openstack-ansible] Stepping down as core reviewer In-Reply-To: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> References: <3f149abe04bc915fff4aa460eb07e1f0b2a44071.camel@odyssey4.me> Message-ID: Jesse, Damn!!! one more sad news :( I talked to you couple of time when i was building my openstack cloud using OSA and you truly encourage me to step up and what i am running multiple big cloud using OSA :) Thank you for your support and contribution. Good luck for your future projects. On Wed, Oct 2, 2019 at 7:17 AM Jesse Pretorius wrote: > > Hi everyone, > > While I had hoped to manage keeping up with OSA reviews and some > contributions, unfortunately there is too much on my plate in my new > role to allow me to give OSA sufficient time and I feel that it's > important to not give any false promises. I am therefore stepping down > as a core reviewer for OSA. > > My journey with OpenStack-Ansible started with initial contributions > before it was an official OpenStack project, went on to helping lead > the project to becoming an official project in the big tent, then on to > becoming a successful project with diverse contributors of which I was > proud to be a part. > > Over time I learned a heck of a lot about building and leading an Open > Source community, about developing Ansible playbooks and roles at > significant scale, and about building, packaging and deploying python > software. It has been a very valuable experience through which I have > grown personally and professionally. > > This community's strengths are in its leadership by operators, its > readiness to assist newcomers and in striving to maintain a deployment > system which is easy to understand and use (while somehow also being > ridiculously flexible). As Jean-Philippe Evrard has recently expressed, > this is the DNA which makes the community special. > > As you should all be aware, I am always ready to help when asked and I > can also share historical context if there is a need for that so please > feel free to ping me on IRC or add me to a review and I'll do my best. > > My journey onward is working with TripleO in the upgrades team, so > you'll still find me contributing to OpenStack as a whole. I'll be > hanging out in #tripleo and #openstack-dev on IRC if you're looking for > me. > > All the best, > > Jesse (odyssey4me) From ignaziocassano at gmail.com Wed Oct 2 17:36:04 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 2 Oct 2019 19:36:04 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: Many tHanks Ignazio Il Mer 2 Ott 2019, 09:44 Pierre Riteau ha scritto: > Hi everyone, > > I hope you don't mind me reviving this thread, to let you know I wrote > an article after we successfully completed the migration of a running > OpenStack deployment to Kolla: > http://www.stackhpc.com/migrating-to-kolla.html > > Don't hesitate to contact me if you have more questions about how this > type of migration can be performed. > > Pierre > > On Mon, 1 Jul 2019 at 14:02, Ignazio Cassano > wrote: > > > > I checked them and I modified for fitting to new installation > > thanks > > Ignazio > > > > Il giorno lun 1 lug 2019 alle ore 13:36 Mohammed Naser < > mnaser at vexxhost.com> ha scritto: > >> > >> You should check your cell mapping records inside Nova. 
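(For anyone following along, checking and repointing those records is a
nova-manage job; the commands below are a sketch from memory and the exact
flag spellings may vary between releases:

    # list the cell mappings and the transport/database URLs they point at
    nova-manage cell_v2 list_cells --verbose

    # repoint a mapping at the new rabbit and database endpoints
    nova-manage cell_v2 update_cell --cell_uuid <uuid> \
        --transport-url rabbit://user:password@new-rabbit:5672/ \
        --database_connection mysql+pymysql://nova:password@new-db/nova

The URLs stored in these mappings are what the conductors actually use when
targeting a cell, whatever nova.conf happens to say.)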
They're > probably not right of you moved your database and rabbit > >> > >> Sorry for top posting this is from a phone. > >> > >> On Mon., Jul. 1, 2019, 5:46 a.m. Ignazio Cassano, < > ignaziocassano at gmail.com> wrote: > >>> > >>> PS > >>> I presume the problem is neutron, because instances on new kvm nodes > remain in building state e do not aquire address. > >>> Probably the netron db imported from old openstack installation has > some difrrences ....probably I must check defferences from old and new > neutron services configuration files. > >>> Ignazio > >>> > >>> Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> > >>>> It sounds like you got quite close to having this working. I'd suggest > >>>> debugging this instance build failure. One difference with kolla is > >>>> that we run libvirt inside a container. Have you stopped libvirt from > >>>> running on the host? > >>>> Mark > >>>> > >>>> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> > > >>>> > Hi Mark, > >>>> > let me to explain what I am trying. > >>>> > I have a queens installation based on centos and pacemaker with > some instances and heat stacks. > >>>> > I would like to have another installation with same instances, > projects, stacks ....I'd like to have same uuid for all objects > (users,projects instances and so on, because it is controlled by a cloud > management platform we wrote. > >>>> > > >>>> > I stopped controllers on old queens installation backupping the > openstack database. > >>>> > I installed the new kolla openstack queens on new three controllers > with same addresses of the old intallation , vip as well. > >>>> > One of the three controllers is also a kvm node on queens. > >>>> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy > and mariadb. > >>>> > I deleted al openstack db on mariadb container and I imported the > old tables, changing the address of rabbit for pointing to the new rabbit > cluster. > >>>> > I restarded containers. > >>>> > Changing the rabbit address on old kvm nodes, I can see the old > virtual machines and I can open console on them. > >>>> > I can see all networks (tenant and provider) of al installation, > but when I try to create a new instance on the new kvm, it remains in > buiding state. > >>>> > Seems it cannot aquire an address. > >>>> > Storage between old and new installation are shred on nfs NETAPP, > so I can see cinder volumes. > >>>> > I suppose db structure is different between a kolla installation > and a manual instaltion !? > >>>> > What is wrong ? > >>>> > Thanks > >>>> > Ignazio > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> >> > >>>> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> >> > > >>>> >> > Sorry, for my question. > >>>> >> > It does not need to change anything because endpoints refer to > haproxy vips. > >>>> >> > So if your new glance works fine you change haproxy backends for > glance. > >>>> >> > Regards > >>>> >> > Ignazio > >>>> >> > >>>> >> That's correct - only the haproxy backend needs to be updated. > >>>> >> > >>>> >> > > >>>> >> > > >>>> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >>>> >> >> > >>>> >> >> Hello Mark, > >>>> >> >> let me to verify if I understood your method. 
> >>>> >> >> > >>>> >> >> You have old controllers,haproxy,mariadb and nova computes. > >>>> >> >> You installed three new controllers but kolla.ansible inventory > contains old mariadb and old rabbit servers. > >>>> >> >> You are deployng single service on new controllers staring with > glance. > >>>> >> >> When you deploy glance on new controllers, it changes the > glance endpoint on old mariadb db ? > >>>> >> >> Regards > >>>> >> >> Ignazio > >>>> >> >> > >>>> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < > mark at stackhpc.com> ha scritto: > >>>> >> >>> > >>>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >>>> >> >>> > > >>>> >> >>> > Hello, > >>>> >> >>> > Anyone have tried to migrate an existing openstack > installation to kolla containers? > >>>> >> >>> > >>>> >> >>> Hi, > >>>> >> >>> > >>>> >> >>> I'm aware of two people currently working on that. Gregory > Orange and > >>>> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, > so I > >>>> >> >>> hope he doesn't mind me quoting him from an email to Gregory. > >>>> >> >>> > >>>> >> >>> Mark > >>>> >> >>> > >>>> >> >>> "I am indeed working on a similar migration using Kolla > Ansible with > >>>> >> >>> Kayobe, starting from a non-containerised OpenStack deployment > based > >>>> >> >>> on CentOS RPMs. > >>>> >> >>> Existing OpenStack services are deployed across several > controller > >>>> >> >>> nodes and all sit behind HAProxy, including for internal > endpoints. > >>>> >> >>> We have additional controller nodes that we use to deploy > >>>> >> >>> containerised services. If you don't have the luxury of > additional > >>>> >> >>> nodes, it will be more difficult as you will need to avoid > processes > >>>> >> >>> clashing when listening on the same port. > >>>> >> >>> > >>>> >> >>> The method I am using resembles your second suggestion, > however I am > >>>> >> >>> deploying only one containerised service at a time, in order to > >>>> >> >>> validate each of them independently. > >>>> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to > >>>> >> >>> specific roles, and when I am happy with the resulting > configuration I > >>>> >> >>> update HAProxy to point to the new controllers. > >>>> >> >>> > >>>> >> >>> As long as the configuration matches, this should be completely > >>>> >> >>> transparent for purely HTTP-based services like Glance. You > need to be > >>>> >> >>> more careful with services that include components listening > for RPC, > >>>> >> >>> such as Nova: if the new nova.conf is incorrect and you've > deployed a > >>>> >> >>> nova-conductor that uses it, you could get failed instances > launches. > >>>> >> >>> Some roles depend on others: if you are deploying the > >>>> >> >>> neutron-openvswitch-agent, you need to run the openvswitch > role as > >>>> >> >>> well. > >>>> >> >>> > >>>> >> >>> I suggest starting with migrating Glance as it doesn't have any > >>>> >> >>> internal services and is easy to validate. Note that properly > >>>> >> >>> migrating Keystone requires keeping existing Fernet keys > around, so > >>>> >> >>> any token stays valid until the time it is expected to stop > working > >>>> >> >>> (which is fairly complex, see > >>>> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). 
> >>>> >> >>> > >>>> >> >>> While initially I was using an approach similar to your first > >>>> >> >>> suggestion, it can have side effects since Kolla Ansible uses > these > >>>> >> >>> variables when templating configuration. As an example, most > services > >>>> >> >>> will only have notifications enabled if enable_ceilometer is > true. > >>>> >> >>> > >>>> >> >>> I've added existing control plane nodes to the Kolla Ansible > inventory > >>>> >> >>> as separate groups, which allows me to use the existing > database and > >>>> >> >>> RabbitMQ for the containerised services. > >>>> >> >>> For example, instead of: > >>>> >> >>> > >>>> >> >>> [mariadb:children] > >>>> >> >>> control > >>>> >> >>> > >>>> >> >>> you may have: > >>>> >> >>> > >>>> >> >>> [mariadb:children] > >>>> >> >>> oldcontrol_db > >>>> >> >>> > >>>> >> >>> I still have to perform the migration of these underlying > services to > >>>> >> >>> the new control plane, I will let you know if there is any > hurdle. > >>>> >> >>> > >>>> >> >>> A few random things to note: > >>>> >> >>> > >>>> >> >>> - if run on existing control plane hosts, the baremetal role > removes > >>>> >> >>> some packages listed in `redhat_pkg_removals` which can > trigger the > >>>> >> >>> removal of OpenStack dependencies using them! I've changed this > >>>> >> >>> variable to an empty list. > >>>> >> >>> - compare your existing deployment with a Kolla Ansible one to > check > >>>> >> >>> for differences in endpoints, configuration files, database > users, > >>>> >> >>> service users, etc. For Heat, Kolla uses the domain > heat_user_domain, > >>>> >> >>> while your existing deployment may use another one (and this is > >>>> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the > "service" > >>>> >> >>> project while a couple of deployments I worked with were using > >>>> >> >>> "services". This shouldn't matter, except there was a bug in > Kolla > >>>> >> >>> which prevented it from setting the roles correctly: > >>>> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in > latest > >>>> >> >>> Rocky and Queens images) > >>>> >> >>> - the ml2_conf.ini generated for Neutron generates physical > network > >>>> >> >>> names like physnet1, physnet2… you may want to override > >>>> >> >>> bridge_mappings completely. > >>>> >> >>> - although sometimes it could be easier to change your existing > >>>> >> >>> deployment to match Kolla Ansible settings, rather than > configure > >>>> >> >>> Kolla Ansible to match your deployment." > >>>> >> >>> > >>>> >> >>> > Thanks > >>>> >> >>> > Ignazio > >>>> >> >>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed Oct 2 17:48:06 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 12:48:06 -0500 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On 10/1/2019 4:40 PM, Kenichi Omichi wrote: > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign country > from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. 
Ken'ichi, thank you for all of your work over the years both in nova and the QA team. You played a key role in making microversions happen in the compute API and that has spread out to other projects so it's something you can be proud of. Good luck in your next position. -- Thanks, Matt From mriedemos at gmail.com Wed Oct 2 18:04:45 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 13:04:45 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> On 9/30/2019 6:09 PM, Eric Fried wrote: > Every cycle we approve some number of blueprints and then complete a low > percentage [1] of them. > > [1] Like in the neighborhood of 60%. This is anecdotal; I'm not aware of > a good way to go back and mine actual data. When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed: Ocata: http://lists.openstack.org/pipermail/openstack-dev/2017-February/111639.html Pike: http://lists.openstack.org/pipermail/openstack-dev/2017-September/121875.html Queens: http://lists.openstack.org/pipermail/openstack-dev/2018-February/127402.html Rocky: http://lists.openstack.org/pipermail/openstack-dev/2018-August/133342.html Stein: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004234.html So there are numbers in there for calculating completion percentage over the last 5 releases before Train. Of course the size of the core team and diversity of contributors over that time has changed drastically so it's not comparing apples to apples. But you said you weren't aware of data to mine so I'm giving you an axe and shovel. -- Thanks, Matt From bitskrieg at bitskrieg.net Wed Oct 2 18:30:19 2019 From: bitskrieg at bitskrieg.net (Chris Apsey) Date: Wed, 02 Oct 2019 18:30:19 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <1226029673.2675287.1570034502180@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> Message-ID: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA -------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com wrote: > Thanks. This definitely helps. > > I am running a stable release of Queens. > Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. > > I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloads > every time a new MAC entry is added, deleted or updated. > > It seems to be related to > https://bugs.launchpad.net/neutron/+bug/1598078 > > The fix for the above bug was abandoned. > [Gerrit Code Review](https://review.opendev.org/#/c/336462/) > > https://review.opendev.org/#/c/336462/ > > Gerrit Code Review > > Any further fine tuning that can be done? > > Thanks, > Fred. 
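(A minimal sketch of the no-ping idea above, assuming the stock neutron DHCP
agent driving dnsmasq; file paths are illustrative:

    # /etc/neutron/dhcp_agent.ini
    [DEFAULT]
    dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

    # /etc/neutron/dnsmasq-neutron.conf
    # skip the ICMP probe dnsmasq normally makes before handing out a lease;
    # only safe when nothing else allocates addresses in the same ranges
    no-ping

The DHCP agents need a restart to pick this up. It does not stop the
SIGHUP-driven reloads from the bug above, it just leaves dnsmasq more
headroom to cope with them.)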
> > On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: > > Albert, > > Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ > > The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. > > Either way, that should solve your problem. > > r > > Chris Apsey > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ > On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: > >> When I create 100 VMs in our prod cluster: >> >> openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest >> >> Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” >> >> If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. >> >> What config variables should I be looking at? >> >> Here are the relevant log entries from the HV: >> >> 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') >> >> 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds >> >> More logs and data: >> >> http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Wed Oct 2 18:59:57 2019 From: dms at danplanet.com (Dan Smith) Date: Wed, 02 Oct 2019 11:59:57 -0700 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> (Matt Riedemann's message of "Wed, 2 Oct 2019 13:04:45 -0500") References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> Message-ID: > So there are numbers in there for calculating completion percentage > over the last 5 releases before Train. Of course the size of the core > team and diversity of contributors over that time has changed > drastically so it's not comparing apples to apples. But you said you > weren't aware of data to mine so I'm giving you an axe and shovel. Perhaps drastic over the last five, but not over the last three, IMHO. Some change, but not enough to account for going from 59 completed in Rocky to 25 in Train. 
Not all blueprints are the same size, nor require the same amount of effort on the part of any of the parties involved. Involvement ebbs and flows with other commitments, like downstream release timelines. Comparing numbers across many releases makes some sense to me, but I would definitely not think that saying "we completed 25 in T, so we will only approve 25 in U" is reasonable. > (B) Require a core to commit to "caring about" a spec before we > approve it. The point of this "core liaison" is to act as a mentor to > mitigate the cultural issues noted above [5], and to be a first point > of contact for reviews. I've proposed this to the spec template here > [6]. As I'm sure you know, we've tried the "core sponsor" thing before. I don't really think it's a bad idea, but it does have a history of not solving the problem like you might think. Constraining cores to not committing to a ton of things may help (although you'll end up with fewer things actually approved if you do that). --Dan From fsbiz at yahoo.com Wed Oct 2 19:01:00 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Wed, 2 Oct 2019 19:01:00 +0000 (UTC) Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> Message-ID: <1127664659.2766839.1570042860356@mail.yahoo.com> Thanks.Instances are spawning but not getting addresses.We have Infoblox as the IPAM so --no-ping should be fine.Will run the tests and update. Thanks,Fred. On Wednesday, October 2, 2019, 11:34:39 AM PDT, Chris Apsey wrote: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA -------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com < fsbiz at yahoo.com> wrote: Thanks. This definitely helps. I am running a stable release of Queens.Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloadsevery time a new MAC entry is added, deleted or updated.  It seems to be related tohttps://bugs.launchpad.net/neutron/+bug/1598078 The fix for the above bug was abandoned.  Gerrit Code Review | | | | Gerrit Code Review | | | Any further fine tuning that can be done?  Thanks,Fred. On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away.  I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: When I create 100 VMs in our prod cluster:   openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest   Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.”   If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure.   What config variables should I be looking at?   Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From kendall at openstack.org Wed Oct 2 19:06:57 2019 From: kendall at openstack.org (Kendall Waters) Date: Wed, 2 Oct 2019 14:06:57 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: <29C580AF-47C6-426A-B571-E0D0E9E8806E@openstack.org> Message-ID: <569B70C9-58F0-4860-B2A6-4F597D819FB4@openstack.org> Hi Pierre, Wonderful! You are confirmed for all day Friday. We will post an updated schedule on the website next week. Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 2, 2019, at 2:29 AM, Pierre Riteau wrote: > > Hi Kendall, > > I got confirmation from all participants that they will be available > all day on Friday. Thanks for adding us to the schedule. > > Best wishes, > Pierre > > On Tue, 1 Oct 2019 at 17:37, Kendall Waters wrote: >> >> Hi Pierre, >> >> Most of our space at the Shanghai PTG is shared space so we can offer you a designated table in the shared room all day Friday. There will be extra chairs in the room if you need to pull up more chairs to your table. >> >> Best, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Oct 1, 2019, at 5:53 AM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> Friday works for all who have replied so far, but I am still expecting >> answers from two people. >> >> Is there a room available for our Project Onboarding session that day? >> Probably in the morning, though I will confirm depending on >> availability of participants. >> We've never run one, so I don't know how many people to expect. >> >> Thanks, >> Pierre >> >> On Mon, 30 Sep 2019 at 23:29, Kendall Waters wrote: >> >> >> Hi Pierre, >> >> Apologies for the oversight on Blazar. Would all day Friday work for your team? 
>> >> Thanks, >> Kendall >> >> Kendall Waters >> OpenStack Marketing & Events >> kendall at openstack.org >> >> >> >> On Sep 30, 2019, at 12:27 PM, Pierre Riteau wrote: >> >> Hi Kendall, >> >> I couldn't see Blazar anywhere on the schedule. We had requested time >> for a Project Onboarding session. >> >> Additionally, there are more people travelling than initially planned, >> so we may want to allocate a half day for technical discussions as >> well (probably in the shared space, since we don't expect a huge >> turnout). >> >> Would it be possible to update the schedule accordingly? >> >> Thanks, >> Pierre >> >> On Fri, 27 Sep 2019 at 19:02, Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> Here is an updated schedule: https://usercontent.irccloud-cdn.com/file/z9iLyv8e/pvg-ptg-sched-2 >> >> The changes that were made are adding OpenStack QA to be all day Wednesday and shifting StarlingX to start on Wednesday and putting OpenStack Ops on Thursday afternoon. >> >> Please let me know if there are any conflicts! >> >> -Kendall (diablo_rojo) >> >> On Wed, Sep 25, 2019 at 2:13 PM Kendall Nelson wrote: >> >> >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads (PTLs, SIG leads...) mentioned in their PTG survey responses, although there was no perfect solution that would avoid all conflicts especially when the event is three-ish days long and we have over 40 teams meeting. >> >> If there are critical conflicts we missed or other issues, please let us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedule.png >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Oct 2 20:09:24 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 2 Oct 2019 13:09:24 -0700 Subject: [nova][kolla] questions on cells In-Reply-To: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> References: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> Message-ID: On 9/30/19 8:14 PM, melanie witt wrote: > On 9/30/19 12:08 PM, Matt Riedemann wrote: >> On 9/30/2019 12:27 PM, Dan Smith wrote: >>>> 2. Do console proxies need to live in the cells? This is what devstack >>>> does in superconductor mode. I did some digging through nova code, and >>>> it looks that way. Testing with novncproxy agrees. This suggests we >>>> need to expose a unique proxy endpoint for each cell, and configure >>>> all computes to use the right one via e.g. novncproxy_base_url, >>>> correct? >>> I'll punt this to Melanie, as she's the console expert at this point, >>> but I imagine you're right. >>> >> >> Based on the Rocky spec [1] which says: >> >> "instead we will resolve the cell database issue by running console >> proxies per cell instead of global to a deployment, such that the cell >> database is local to the console proxy" >> >> Yes it's per-cell. There was stuff in the Rock release notes about >> this [2] and a lot of confusion around the deprecation of the >> nova-consoleauth service for which Mel knows the details, but it looks >> like we really should have something documented about this too, here >> [3] and/or here [4]. > > To echo, yes, console proxies need to run per cell. 
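(In config terms that works out to something like the following for each
cell -- host names here are made up and the exact HTML file name depends on
the noVNC version shipped with your distribution:

    # nova.conf for the nova-novncproxy serving cell1; it needs the cell1
    # database so it can validate console tokens for that cell
    [database]
    connection = mysql+pymysql://nova:password@cell1-db/nova_cell1

    # nova.conf on cell1 compute nodes; the URL handed back to clients has
    # to point at cell1's proxy endpoint
    [vnc]
    novncproxy_base_url = https://cell1-console.example.com:6080/vnc_auto.html

A sketch only; nova's remote-console-access admin guide has the full
picture.)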
This used to be > mentioned in our docs and I looked and found it got removed by the > following commit: > > https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f > > > so, we just need to add back the bit about running console proxies per > cell. FYI I've proposed a patch to restore the doc about console proxies for review: https://review.opendev.org/686271 -melanie >> [1] >> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html >> >> [2] https://docs.openstack.org/releasenotes/nova/rocky.html >> [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html >> [4] >> https://docs.openstack.org/nova/latest/admin/remote-console-access.html >> > From openstack at fried.cc Wed Oct 2 20:32:28 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 2 Oct 2019 15:32:28 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> Message-ID: <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> > When Mel and I were PTLs we tracked and reported post-release numbers on blueprint activity, what was proposed, what was approved and what was completed Thanks Matt. I realized too late in Train that these weren't numbers I would be able to go back and collect after the fact (at least not without a great deal of manual effort) because a blueprint "disappears" from the release once we defer it. Best approximation: The specs directory for Train contains 37 approved specs. I count five completed specless blueprints in Train. So best case (assuming there were no deferred specless blueprints) that's 25/42=60%. Combining with Matt & Mel's data: Newton: 64% Ocata: 67% Pike: 72% Queens: 79% Rocky: 82% Stein: 59% Train: 60% The obvious trend is that new PTLs produce low completion percentages, and Matt would have hit 100% by V if only he hadn't quit :P But seriously... > Perhaps drastic over the last five, but not over the last three, > IMHO. Some change, but not enough to account for going from 59 > completed in Rocky to 25 in Train. Extraction of placement and departure of Jay are drastic, IMHO. But this is just the kind of thing I really wanted to avoid attempting to quantify -- see below. > I would definitely not think that saying "we > completed 25 in T, so we will only approve 25 in U" is reasonable. I agree it's an extremely primitive heuristic. It was a stab at having a cap (as opposed to *not* having a cap) without attempting to account for all the factors, an impossible ask. I'd love to discuss suggestions for other numbers, or other concrete mechanisms for saying "no" for reasons of resource rather than technical merit. My bid (as of [1]) is 30 approved, shooting for 25 completed (83%, approx the peak of the above numbers). Go. efried [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009860.html From dms at danplanet.com Wed Oct 2 20:46:23 2019 From: dms at danplanet.com (Dan Smith) Date: Wed, 02 Oct 2019 13:46:23 -0700 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> (Eric Fried's message of "Wed, 2 Oct 2019 15:32:28 -0500") References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> Message-ID: > Extraction of placement and departure of Jay are drastic, IMHO. 
But this > is just the kind of thing I really wanted to avoid attempting to > quantify -- see below. I'm pretty sure Jay wasn't doing 60% of the reviews in Nova, justifying an equivalent drop in our available throughput. Further, I thought splitting out placement was supposed to *reduce* the load on the nova core team? If anything that was a time sink that is now finished, placement is off soaring on its own merits and we have a bunch of resource back as a result, no? > I'd love to discuss suggestions for other numbers, or other concrete > mechanisms for saying "no" for reasons of resource rather than technical > merit. My bid (as of [1]) is 30 approved, shooting for 25 completed > (83%, approx the peak of the above numbers). Go. How about approved specs require a majority (or some larger-than-two number) of the cores to +2 it to indicate "yes we should do this, and yes we should do it this cycle"? Some might argue that this unfairly weight efforts that have a lot of cores interested in seeing them land, instead of the actual requisite two, but it sounds like that's what you're shooting for? --Dan From gouthampravi at gmail.com Wed Oct 2 20:58:58 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 2 Oct 2019 13:58:58 -0700 Subject: [manila] Proposal to add dviroel to the core maintainers team Message-ID: Dear Zorillas and other Stackers, I would like to formalize the conversations we've been having amongst ourselves over IRC and in-person. At the outset, we have a lot of incoming changes to review, but we have limited core maintainer attention. We haven't re-jigged our core maintainers team as often as we'd like, and that's partly to blame. We have some relatively new and enthusiastic contributors that we would love to encourage to become maintainers! We've mentored contributors 1-1, n-1 before before adding them to the maintainers team. We would like to do more of this!** In this spirit, I would like your inputs on adding Douglas Viroel (dviroel) to the core maintainers team for manila and its associated projects (manila-specs, manila-ui, python-manilaclient, manila-tempest-plugin, manila-test-image, manila-image-elements). Douglas has been an active contributor for the past two releases and has valuable review inputs in the project. While he's been around here less longer than some of us, he brings a lot of experience to the table with his background in networking and shared file systems. He has a good grasp of the codebase and is enthusiastic in adding new features and fixing bugs in the Ussuri cycle and beyond. Please give me a +/-1 for this proposal. ** If you're interested in helping us maintain Manila by being part of the manila core maintainer team, please reach out to me or any of the current maintainers, we would love to work with you and help you grow into that role! 
Thanks, Goutham Pacha Ravi (gouthamr) From mriedemos at gmail.com Wed Oct 2 21:05:29 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 Oct 2019 16:05:29 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191001123850.f7h4wmupoo3oyzta@barron.net> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <1569915055.26355.1@smtp.office365.com> <20191001123850.f7h4wmupoo3oyzta@barron.net> Message-ID: <61306048-2fe4-059b-f033-81c9945e61e7@gmail.com> On 10/1/2019 7:38 AM, Tom Barron wrote: > There is no better way to get ones reviews stalled than to beg for > reviews with patches that are not close to ready for review and at the > same time contribute no useful reviews oneself. > > There is nothing wrong with pinging to get attention to a review if it > is ready and languishing, or if it solves an urgent issue, but even in > these cases a ping from someone who doesn't "cry wolf" and who has built > a reputation as a contributor carries more weight. This is, in large part, why we started doing the runways stuff a few cycles ago so that people wouldn't have to beg when they had blueprint work that was ready to be reviewed, meaning there was mergeable code, i.e. not large chunks of it still in WIP status or untested. It also created a timed queue of blueprints to focus on in a two week window. However, it's not part of everyone's daily review process nor does something being in a runway queue make more than one core care about it, so it's not perfect. Related to the sponsors idea elsewhere in this thread, I do believe that since we've expanded the entire core team to be able to approve specs, people that are +2 on a spec should be expected to be willing to help in reviewing the resulting blueprint code that comes out of it, but that doesn't always happen. I'm sure I'm guilty of that as well, but in my defense I will say I know I've approved at least more than one spec I don't personally care about but have felt pressured to approve it just to stop getting asked to review it, i.e. the squeaky wheel thing. -- Thanks, Matt From openstack at fried.cc Wed Oct 2 21:18:55 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 2 Oct 2019 16:18:55 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> Message-ID: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> > I'm pretty sure Jay wasn't doing 60% of the reviews in Nova Clearly not what I was implying. > splitting out placement was supposed to *reduce* the load on the nova > core team? In a sense, that's exactly what I'm suggesting - but it took a couple releases (those releases) to get there. Both the effort to do the extraction and the overlap between the placement and nova teams during that time frame pulled resource away from nova itself. > If anything that was a time sink that is now finished, > placement is off soaring on its own merits and we have a bunch of > resource back as a result, no? Okay, I can buy that. Care to put a number on it? > How about approved specs require a majority (or some larger-than-two > number) of the cores to +2 it to indicate "yes we should do this, and > yes we should do it this cycle"? Some might argue that this unfairly > weight efforts that have a lot of cores interested in seeing them land, > instead of the actual requisite two, but it sounds like that's what > you're shooting for? 
I think the "core sponsor" thing will have this effect: if you can't get a core to sponsor your blueprint, it's a signal that "we" don't think it should be done (this cycle). I like the >2-core idea, though the real difference would be asking for cores to consider "should we do this *in this cycle*" when they +2 a spec. Which is good and valid, but (I think) difficult to explain/track/quantify/validate. And it's asking each core to have some sense of the "big picture" (understand the scope of all/most of the candidates) which is very difficult. > since we've expanded the entire core team to be able to approve specs, > people that are +2 on a spec should be expected to be willing to help in > reviewing the resulting blueprint code that comes out of it, but that > doesn't always happen. Agree. I considered trying to enforce that spec and/or blueprint approvers are implicitly signing up to "care about" those specs/blueprints, but I assumed that would result in a drastic reduction in willingness to be an approver :P Which I suppose would serve to reduce the number of approved blueprints in the cycle... Hm.... efried . From tpb at dyncloud.net Wed Oct 2 22:34:09 2019 From: tpb at dyncloud.net (Tom Barron) Date: Wed, 2 Oct 2019 18:34:09 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <20191002223409.zy5jqp7lziiznfdx@barron.net> +1 from me! On 02/10/19 13:58 -0700, Goutham Pacha Ravi wrote: >Dear Zorillas and other Stackers, > >I would like to formalize the conversations we've been having amongst >ourselves over IRC and in-person. At the outset, we have a lot of >incoming changes to review, but we have limited core maintainer >attention. We haven't re-jigged our core maintainers team as often as >we'd like, and that's partly to blame. We have some relatively new and >enthusiastic contributors that we would love to encourage to become >maintainers! We've mentored contributors 1-1, n-1 before before adding >them to the maintainers team. We would like to do more of this!** > >In this spirit, I would like your inputs on adding Douglas Viroel >(dviroel) to the core maintainers team for manila and its associated >projects (manila-specs, manila-ui, python-manilaclient, >manila-tempest-plugin, manila-test-image, manila-image-elements). >Douglas has been an active contributor for the past two releases and >has valuable review inputs in the project. While he's been around here >less longer than some of us, he brings a lot of experience to the >table with his background in networking and shared file systems. He >has a good grasp of the codebase and is enthusiastic in adding new >features and fixing bugs in the Ussuri cycle and beyond. > >Please give me a +/-1 for this proposal. > >** If you're interested in helping us maintain Manila by being part of >the manila core maintainer team, please reach out to me or any of the >current maintainers, we would love to work with you and help you grow >into that role! 
> >Thanks, >Goutham Pacha Ravi (gouthamr) > From rodrigo.barbieri2010 at gmail.com Wed Oct 2 22:45:22 2019 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Wed, 2 Oct 2019 19:45:22 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos On Wed, Oct 2, 2019, 18:04 Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). > Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xingyang105 at gmail.com Thu Oct 3 00:27:32 2019 From: xingyang105 at gmail.com (Xing Yang) Date: Wed, 2 Oct 2019 20:27:32 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). > Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. 
While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aoren at infinidat.com Thu Oct 3 06:18:17 2019 From: aoren at infinidat.com (Amit Oren) Date: Thu, 3 Oct 2019 09:18:17 +0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: +1 On Thu, Oct 3, 2019 at 3:31 AM Xing Yang wrote: > +1 > > On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi > wrote: > >> Dear Zorillas and other Stackers, >> >> I would like to formalize the conversations we've been having amongst >> ourselves over IRC and in-person. At the outset, we have a lot of >> incoming changes to review, but we have limited core maintainer >> attention. We haven't re-jigged our core maintainers team as often as >> we'd like, and that's partly to blame. We have some relatively new and >> enthusiastic contributors that we would love to encourage to become >> maintainers! We've mentored contributors 1-1, n-1 before before adding >> them to the maintainers team. We would like to do more of this!** >> >> In this spirit, I would like your inputs on adding Douglas Viroel >> (dviroel) to the core maintainers team for manila and its associated >> projects (manila-specs, manila-ui, python-manilaclient, >> manila-tempest-plugin, manila-test-image, manila-image-elements). >> Douglas has been an active contributor for the past two releases and >> has valuable review inputs in the project. While he's been around here >> less longer than some of us, he brings a lot of experience to the >> table with his background in networking and shared file systems. He >> has a good grasp of the codebase and is enthusiastic in adding new >> features and fixing bugs in the Ussuri cycle and beyond. >> >> Please give me a +/-1 for this proposal. >> >> ** If you're interested in helping us maintain Manila by being part of >> the manila core maintainer team, please reach out to me or any of the >> current maintainers, we would love to work with you and help you grow >> into that role! >> >> Thanks, >> Goutham Pacha Ravi (gouthamr) >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Thu Oct 3 07:35:16 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Thu, 3 Oct 2019 09:35:16 +0200 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 01.10.2019 12:00, Mark Goddard wrote: > Thanks all for your responses. Replies to Dan inline. > > On Mon, 30 Sep 2019 at 18:27, Dan Smith wrote: >> >>> 1. Is there any benefit to not having a superconductor? Presumably >>> it's a little more efficient in the single cell case? Also IIUC it >>> only requires a single message queue so is a little simpler? >> >> In a multi-cell case you need it, but you're asking about the case where >> there's only one (real) cell yeah? 
>> >> If the deployment is really small, then the overhead of having one is >> probably measurable and undesirable. I dunno what to tell you about >> where that cut-off is, unfortunately. However, once you're over a >> certain number of nodes, that probably shakes out a bit. The >> superconductor does things that the cell-specific ones won't have to do, >> so there's about the same amount of total load, just a potentially >> larger memory footprint for running extra services, which would be >> measurable at small scales. For a tiny deployment there's also overhead >> just in the complexity, but one of the goals of v2 has always been to >> get everyone on the same architecture, so having a "small mode" and a >> "large mode" brings with it its own complexity. > > Thanks for the explanation. We've built in a switch for single or > super mode, and single mode keeps us compatible with existing > deployments, so I guess we'll keep the switch. > >> >>> 2. Do console proxies need to live in the cells? This is what devstack >>> does in superconductor mode. I did some digging through nova code, and >>> it looks that way. Testing with novncproxy agrees. This suggests we >>> need to expose a unique proxy endpoint for each cell, and configure >>> all computes to use the right one via e.g. novncproxy_base_url, >>> correct? >> >> I'll punt this to Melanie, as she's the console expert at this point, >> but I imagine you're right. >> >>> 3. Should I upgrade the superconductor or conductor service first? >> >> Superconductor first, although they all kinda have to go around the same >> time. Superconductor, like the regular conductors, needs to look at the >> cell database directly, so if you were to upgrade superconductor before >> the cell database you'd likely have issues. I think probably the ideal >> would be to upgrade the db schema everywhere (which you can do without >> rolling code), then upgrade the top-level services (conductor, >> scheduler, api) and then you could probably get away with doing >> conductor in the cell along with computes, or whatever. If possible >> rolling the cell conductors with the top-level services would be ideal. > > I should have included my strawman deploy and upgrade flow for > context, but I'm still honing it. All DB schema changes will be done > up front in both cases. > > In terms of ordering, the API-level services (superconductor, API > scheduler) are grouped together and will be rolled first - agreeing > with what you've said. I think between Ansible's tags and limiting > actions to specific hosts, the code can be written to support > upgrading all cell conductors together, or at the same time as (well, > immediately before) the cell's computes. > > The thinking behind upgrading one cell at a time is to limit the blast > radius if something goes wrong. You suggest it would be better to roll > all cell conductors at the same time though - do you think it's safer > to run with the version disparity between conductor and computes > rather than super- and cell- conductors? I'd say upgrading one cell at a time may be in important consideration for EDGE (DCN) multi-cells deployments, where it may be technically impossible to roll it over all of the remote sites due to reasons. > >> >>> 4. Does the cell conductor need access to the API DB? >> >> Technically it should not be allowed to talk to the API DB for >> "separation of concerns" reasons. 
However, there are a couple of >> features that still rely on the cell conductor being able to upcall to >> the API database, such as the late affinity check. If you can only >> choose one, then I'd say configure the cell conductors to talk to the >> API DB, but if there's a knob for "isolate them" it'd be better. > > Knobs are easy to make, and difficult to keep working in all positions > :) It seems worthwhile in this case. > >> >>> 5. What DB configuration should be used in nova.conf when running >>> online data migrations? I can see some migrations that seem to need >>> the API DB, and others that need a cell DB. If I just give it the API >>> DB, will it use the cell mappings to get to each cell DB, or do I need >>> to run it once for each cell? >> >> The API DB has its own set of migrations, so you obviously need API DB >> connection info to make that happen. There is no fanout to all the rest >> of the cells (currently), so you need to run it with a conf file >> pointing to the cell, for each cell you have. The latest attempt >> at making this fan out was abanoned in July with no explanation, so it >> dropped off my radar at least. > > That makes sense. The rolling upgrade docs could be a little clearer > for multi-cell deployments here. > >> >>> 6. After an upgrade, when can we restart services to unpin the compute >>> RPC version? Looking at the compute RPC API, it looks like the super >>> conductor will remain pinned until all computes have been upgraded. >>> For a cell conductor, it looks like I could restart it to unpin after >>> upgrading all computes in that cell, correct? >> >> Yeah. >> >>> 7. Which services require policy.{yml,json}? I can see policy >>> referenced in API, conductor and compute. >> >> That's a good question. I would have thought it was just API, so maybe >> someone else can chime in here, although it's not specific to cells. > > Yeah, unrelated to cells, just something I wondered while digging > through our nova Ansible role. > > Here is the line that made me think policies are required in > conductors: https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > I guess this is only required for cell conductors though? > >> >> --Dan > -- Best regards, Bogdan Dobrelya, Irc #bogdando From sbauza at redhat.com Thu Oct 3 07:44:25 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 3 Oct 2019 09:44:25 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: On Wed, Oct 2, 2019 at 11:24 PM Eric Fried wrote: > > I'm pretty sure Jay wasn't doing 60% of the reviews in Nova > > Clearly not what I was implying. > > > splitting out placement was supposed to *reduce* the load on the nova > > core team? > > In a sense, that's exactly what I'm suggesting - but it took a couple > releases (those releases) to get there. Both the effort to do the > extraction and the overlap between the placement and nova teams during > that time frame pulled resource away from nova itself. > > > If anything that was a time sink that is now finished, > > placement is off soaring on its own merits and we have a bunch of > > resource back as a result, no? > > Okay, I can buy that. Care to put a number on it? 
> > > How about approved specs require a majority (or some larger-than-two > > number) of the cores to +2 it to indicate "yes we should do this, and > > yes we should do it this cycle"? Some might argue that this unfairly > > weights efforts that have a lot of cores interested in seeing them land, > > instead of the actual requisite two, but it sounds like that's what > > you're shooting for? > > I think the "core sponsor" thing will have this effect: if you can't get > a core to sponsor your blueprint, it's a signal that "we" don't think it > should be done (this cycle). > > I like the >2-core idea, though the real difference would be asking for > cores to consider "should we do this *in this cycle*" when they +2 a > spec. Which is good and valid, but (I think) difficult to > explain/track/quantify/validate. And it's asking each core to have some > sense of the "big picture" (understand the scope of all/most of the > candidates) which is very difficult. > > > since we've expanded the entire core team to be able to approve specs, > > people that are +2 on a spec should be expected to be willing to help in > > reviewing the resulting blueprint code that comes out of it, but that > > doesn't always happen. > > Agree. I considered trying to enforce that spec and/or blueprint > approvers are implicitly signing up to "care about" those > specs/blueprints, but I assumed that would result in a drastic reduction > in willingness to be an approver :P > > Actually, that sounds like a very reasonable suggestion from Matt. If you do care reviewing a spec, that also means you do care reviewing the implementation side. Of course, things can happen meanwhile and you can get dragged onto "other stuff" (call it what you want) so you won't have time to commit to the implementation review ASAP, but your interest is still fully there. Put another way, it's reasonable to assume that cores approving a spec take on some responsibility for moving the implementation forward, and can consequently be gently pinged for reviews. Which I suppose would serve to reduce the number of approved blueprints > in the cycle... Hm.... > > That's just a reflection of the reality IMHO. efried > . > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Thu Oct 3 07:45:14 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Thu, 3 Oct 2019 14:45:14 +0700 Subject: [FFE][requirements][mistral][amqp] Failing “docs” job due to the upper constraint conflict for amqp In-Reply-To: <20191002163415.nu7okcn5de44txoz@mthode.org> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> <20191002163415.nu7okcn5de44txoz@mthode.org> Message-ID: <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> Thanks Matthew, For now we did this: https://review.opendev.org/#/c/685932/. So we just added “kombu” explicitly into our dependencies that forces to load the right version of amqp before oslo.messaging. That works. If that looks OK for you we can skip the mentioned bump. Renat Akhmerov @Nokia On 2 Oct 2019, 23:35 +0700, Matthew Thode , wrote: > On 19-10-02 14:57:24, Renat Akhmerov wrote: > > Hi, > > > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraints.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2.
> > > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > > [3] https://review.opendev.org/#/c/681382 > > > > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 > to 2.5.2. > It looks like a bugfix only release so I'm fine with it. As long as we > don't need to mask 2.5.1 in global-requirements (which would cause a > re-release for openstack/oslo.messaging). > > https://github.com/celery/py-amqp/compare/2.5.1...2.5.2 > > So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I > approve. > > -- > Matthew Thode -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Thu Oct 3 07:47:07 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 3 Oct 2019 09:47:07 +0200 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: On Tue, Oct 1, 2019 at 11:45 PM Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > Your contributions were greatly appreciated over the time and thank you for all the hard work you made on polishing the API side. I can't wait for your proposals or bugs :-) Hopefully see you later. -Sylvain --- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Thu Oct 3 08:24:10 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 3 Oct 2019 09:24:10 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On Wed, 2 Oct 2019 at 14:48, Matt Riedemann wrote: > > On 10/1/2019 5:00 AM, Mark Goddard wrote: > >>> 5. What DB configuration should be used in nova.conf when running > >>> online data migrations? I can see some migrations that seem to need > >>> the API DB, and others that need a cell DB. If I just give it the API > >>> DB, will it use the cell mappings to get to each cell DB, or do I need > >>> to run it once for each cell? > >> The API DB has its own set of migrations, so you obviously need API DB > >> connection info to make that happen. There is no fanout to all the rest > >> of the cells (currently), so you need to run it with a conf file > >> pointing to the cell, for each cell you have. The latest attempt > >> at making this fan out was abanoned in July with no explanation, so it > >> dropped off my radar at least. > > That makes sense. The rolling upgrade docs could be a little clearer > > for multi-cell deployments here. > > > > This recently merged, hopefully it helps clarify: > > https://review.opendev.org/#/c/671298/ It does help a little for the schema migrations, but the point was about data migrations. > > >>> 6. After an upgrade, when can we restart services to unpin the compute > >>> RPC version? 
Looking at the compute RPC API, it looks like the super > >>> conductor will remain pinned until all computes have been upgraded. > >>> For a cell conductor, it looks like I could restart it to unpin after > >>> upgrading all computes in that cell, correct? > >> Yeah. > >> > >>> 7. Which services require policy.{yml,json}? I can see policy > >>> referenced in API, conductor and compute. > >> That's a good question. I would have thought it was just API, so maybe > >> someone else can chime in here, although it's not specific to cells. > > Yeah, unrelated to cells, just something I wondered while digging > > through our nova Ansible role. > > > > Here is the line that made me think policies are required in > > conductors:https://opendev.org/openstack/nova/src/commit/6d5fdb4ef4dc3e5f40298e751d966ca54b2ae902/nova/compute/api.py#L666. > > I guess this is only required for cell conductors though? > > > > That is not the conductor service, it's the API. My mistake, still learning the flow of communication. > > -- > > Thanks, > > Matt > From mark at stackhpc.com Thu Oct 3 08:28:34 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 3 Oct 2019 09:28:34 +0100 Subject: [nova][kolla] questions on cells In-Reply-To: References: <14cab401-c416-2eb8-b1d9-97aff0642a8e@gmail.com> Message-ID: On Wed, 2 Oct 2019 at 21:11, melanie witt wrote: > > On 9/30/19 8:14 PM, melanie witt wrote: > > On 9/30/19 12:08 PM, Matt Riedemann wrote: > >> On 9/30/2019 12:27 PM, Dan Smith wrote: > >>>> 2. Do console proxies need to live in the cells? This is what devstack > >>>> does in superconductor mode. I did some digging through nova code, and > >>>> it looks that way. Testing with novncproxy agrees. This suggests we > >>>> need to expose a unique proxy endpoint for each cell, and configure > >>>> all computes to use the right one via e.g. novncproxy_base_url, > >>>> correct? > >>> I'll punt this to Melanie, as she's the console expert at this point, > >>> but I imagine you're right. > >>> > >> > >> Based on the Rocky spec [1] which says: > >> > >> "instead we will resolve the cell database issue by running console > >> proxies per cell instead of global to a deployment, such that the cell > >> database is local to the console proxy" > >> > >> Yes it's per-cell. There was stuff in the Rock release notes about > >> this [2] and a lot of confusion around the deprecation of the > >> nova-consoleauth service for which Mel knows the details, but it looks > >> like we really should have something documented about this too, here > >> [3] and/or here [4]. > > > > To echo, yes, console proxies need to run per cell. This used to be > > mentioned in our docs and I looked and found it got removed by the > > following commit: > > > > https://github.com/openstack/nova/commit/009fd0f35bcb88acc80f12e69d5fb72c0ee5391f > > > > > > so, we just need to add back the bit about running console proxies per > > cell. > > FYI I've proposed a patch to restore the doc about console proxies for > review: > > https://review.opendev.org/686271 Great, thanks. I know it's merged, but I added a comment. 
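(To make the per-cell points above concrete, here is a rough sketch only -- the cell names, hostnames and file paths are invented, not taken from this thread. The pattern being discussed is: each cell runs its own console proxy against its own cell database, the computes in that cell point at that proxy, and the online data migrations are run once per cell configuration, e.g.:

  # run the online data migrations against each cell's database in turn
  nova-manage --config-file /etc/nova/nova-cell1.conf db online_data_migrations

  # nova.conf on cell1's computes, pointing at cell1's own nova-novncproxy
  [vnc]
  novncproxy_base_url = https://cell1-proxy.example.com:6080/vnc_auto.html

The same pattern repeats for cell2, cell3, and so on.)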
> > -melanie > > >> [1] > >> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > >> > >> [2] https://docs.openstack.org/releasenotes/nova/rocky.html > >> [3] https://docs.openstack.org/nova/latest/user/cellsv2-layout.html > >> [4] > >> https://docs.openstack.org/nova/latest/admin/remote-console-access.html > >> > > > > From a.settle at outlook.com Thu Oct 3 09:26:29 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Thu, 3 Oct 2019 09:26:29 +0000 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey, Could you add something for docs? Or combine with i18n again if Ian doesn't mind? We don't need a lot, just a room for people to ask questions about the future of the docs team. Stephen will be there, as co-PTL. There's 0 chance of it not conflicting with nova. Please :) Thank you! Alex On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > Hello Everyone! > > In the attached picture or link [0] you will find the proposed > schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads > (PTLs, SIG leads...) mentioned in their PTG survey responses, > although there was no perfect solution that would avoid all conflicts > especially when the event is three-ish days long and we have over 40 > teams meeting. > > If there are critical conflicts we missed or other issues, please let > us know, by October 6th at 7:00 UTC! > > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu > le.png -- Alexandra Settle IRC: asettle From kchamart at redhat.com Thu Oct 3 10:10:54 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Thu, 3 Oct 2019 12:10:54 +0200 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> Message-ID: <20191003101054.GB26595@paraplu> On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote: > Nova developers and maintainers- [...] > I'd like to try a couple more. > > (A) Constrain scope, drastically. We marked 25 blueprints complete in > Train [3]. Since there has been no change to the core team, let's > limit Ussuri to 25 blueprints [4]. If this turns out to be too few, > what's the worst thing that happens? We finish everything, early, and > wish we had done more. If that happens, drinks are on me, and we can > bump the number for V. I welcome scope reduction, focusing on fewer features, stability, and bug fixes than "more gadgetries and gongs". Which also means: less frenzy, less split attention, fewer mistakes, more retained concentration, and more serenity. And, yeah, any reasonable person would read '25' as _an_ educated limit, rather than some "optimal limit". If we end up with bags of "spare time", there's loads of tech-debt items, performance (it's a feature, let's recall) issues, and meaningful clean-ups waiting to be tackled. [...] -- /kashyap From thierry at openstack.org Thu Oct 3 10:24:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 3 Oct 2019 12:24:36 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai Message-ID: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Hi everyone, The summit is going to mainland China for the first time. It's a great opportunity to meet the Chinese community, make ourselves available for direct discussion, and on-board new team members. 
In order to facilitate that, the TC has been suggesting that the Foundation organizes two opportunities to "meet the project leaders" during the Summit in Shanghai: one around the Monday evening marketplace mixer, and one around the Wednesday lunch: https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24417/ https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24426/meet-the-project-leaders OpenStack PTLs, TC members, core reviewers, UC members interested in meeting the local community are all welcome. We'll also have leaders from the other OSF-supported projects around. See you there! -- Thierry Carrez (ttx) From a.settle at outlook.com Thu Oct 3 10:28:04 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Thu, 3 Oct 2019 10:28:04 +0000 Subject: [all] [tc] [docs] [release] [ptls] Docs as SIG: Ownership of docs.openstack.org In-Reply-To: References: <20190819154106.GA25909@sm-workstation> <9DABCC6E-1E61-45A6-8370-4F086428B3B6@doughellmann.com> <20190819174941.GA4730@sm-workstation> <20190819175652.dkbyerlmblqkvzdk@yuggoth.org> Message-ID: Dragging this thread back up from the depths as I've updated the governance patch as of this morning: https://review.opendev.org/#/c/657 142/ On Mon, 2019-08-19 at 14:16 -0400, Doug Hellmann wrote: > > On Aug 19, 2019, at 1:56 PM, Jeremy Stanley > > wrote: > > > > On 2019-08-19 12:49:41 -0500 (-0500), Sean McGinnis wrote: > > [...] > > > there seems to be a big difference between owning the task of > > > configuring the site for the next release (which totally makes > > > sense as a release team task) and owning the entire > > > docs.openstack.org site. > > > > That's why I also requested clarification in my earlier message on > > this thread. The vast majority of the content hosted under > > https://docs.openstack.org/ is maintained in a distributed fashion > > by the various teams writing documentation in their respective > > projects. The hosting (configuration apart from .htaccess files, > > storage, DNS, and so on) is handled by Infra/OpenDev folks. If it's > > *just* the stuff inside the "www" tree in the openstack-manuals > > repo > > then that's not a lot, but it's also possible what the release team > > actually needs to touch in there could be successfully scaled back > > even more (with the caveat that I haven't looked through it in > > detail). > > -- > > Jeremy Stanley > > > The suggestion is for the release team to take over the site > generator > for docs.openstack.org (the stuff under “www” in the current > openstack-manuals git repository) and for the SIG to own anything > that looks remotely like “content”. There isn’t much of that left > anyway, > now that most of it is in the project repositories. I like this. > > Most of what is under www is series-specific templates and data files > that tell the site generator how to insert links to parts of the > project documentation in the right places (the “install guides” page > links > to /foo/$series/install/ for example). They’re very simple, very > dumb, > templates, driven with a little custom script that wraps jinja2, > feeding the right data to the right templates based on series name. > There is pretty good documentation for how to use it in the > tools [1] and release [2] sections of the docs contributor guide. 
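(As a rough illustration of that series-driven templating -- this is a toy sketch, not the real openstack-manuals tooling, and the template text and data below are invented:

  import jinja2

  TEMPLATE = jinja2.Template(
      "{{ series|title }} install guide links point at /{{ project }}/{{ series }}/install/")

  for series in ("rocky", "stein", "train"):
      # the real site generator feeds per-series data files to per-series
      # page templates under www/; here the data is just hard-coded
      print(TEMPLATE.render(series=series, project="nova"))

)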
> > The current site-generator definitely could be simpler, especially > if it only linked to the master docs and *those* linked to the older > versions of themselves (so /nova/latest/ had a link that pointed to > /nova/ocata/ somewhere). That would take some work, though. > > The simplest thing we could do is just make the release team > committers > on openstack-manuals, leave everything else as it is, and exercise > trust between the two groups. If we absolutely want to separate the > builds, > then we could make a new repo with just the template-driven pages > under “www”, > but that’s going to involve changing/creating several publishing > jobs. I think this is a suitable option. I would like the docs team cores to review this, and approve. But I think this is the best/simplest option for now. -- Alexandra Settle IRC: asettle From paye600 at gmail.com Thu Oct 3 11:04:10 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Thu, 3 Oct 2019 13:04:10 +0200 Subject: [Airship-discuss] Fwd: OOK,Airship In-Reply-To: References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org> Message-ID: Thanks Ashlee! Charles, A few companies who work on development of Airship do use it, including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom and others. Many of those companies (if not all) use Airship + OpenStack Helm as well. Airship, as you have mentioned, is a collection of components for the undercloud control plane, which helps to deploy nodes with OS+Docker+Kubernetes on them, configure/manage it all in a GitOps way, and then helps to maintain the configuration. It also allows you to manage deployment and maintenance of whatever runs on top of the Kubernetes cluster, whether that be OpenStack Helm or other software packaged in Helm format. OpenStack Helm does not really need to run on an Airship-managed cluster. It could run standalone. Yes, you can roll out an open source production grade Airship/Openstack Helm deployment today. A good example of a production-grade configuration can be found in the airship/treasuremap repository [0] as the 'seaworthy' site definition. You are welcome to try, of course. For questions, reach out to us on IRC in #airshipit at Freenode or via the Airship-discuss mailing list. [0] https://opendev.org/airship/treasuremap Best regards, -- Roman Gorshunov On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson wrote: > > Hi Charles, > > Glad to hear you’re interested! Forwarding this to the Airship ML since there may be folks on this mailing list that will have pointers who didn't see the openstack-discuss post. > > Ashlee > > > > Begin forwarded message: > > From: Charles > Subject: OOK,Airship > Date: October 2, 2019 at 5:39:16 PM GMT+2 > To: openstack-discuss at lists.openstack.org > > Hi, > > > We are interested in OOK and Openstack Helm. > > Has anyone any experience with Airship (now that 1.0 is out)? > > Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) > > Is this a signal that there is momentum around Openstack Helm? > > Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? > > > Thoughts?
> > > Charles > > > > > _______________________________________________ > Airship-discuss mailing list > Airship-discuss at lists.airshipit.org > http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss From skaplons at redhat.com Thu Oct 3 11:29:03 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 3 Oct 2019 13:29:03 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Message-ID: <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Hi Thierry, I think it's an interesting idea. Should we somehow sign up for these events (one or both, depending on which we plan to attend) to let people know that the PTL of a specific project will be available there? Or is it enough to just show up when the time comes? Also, are project leaders expected to be available at both, or is one enough? > On 3 Oct 2019, at 12:24, Thierry Carrez wrote: > > Hi everyone, > > The summit is going to mainland China for the first time. It's a great > opportunity to meet the Chinese community, make ourselves available for > direct discussion, and on-board new team members. > > In order to facilitate that, the TC has been suggesting that the > Foundation organizes two opportunities to "meet the project leaders" > during the Summit in Shanghai: one around the Monday evening marketplace > mixer, and one around the Wednesday lunch: > > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24417/ > https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24426/meet-the-project-leaders > > OpenStack PTLs, TC members, core reviewers, UC members interested in > meeting the local community are all welcome. We'll also have leaders > from the other OSF-supported projects around. > > See you there! > > -- > Thierry Carrez (ttx) > — Slawek Kaplonski Senior software engineer Red Hat From cems at ebi.ac.uk Thu Oct 3 12:13:28 2019 From: cems at ebi.ac.uk (Charles) Date: Thu, 3 Oct 2019 13:13:28 +0100 Subject: [Airship-discuss] Fwd: OOK,Airship In-Reply-To: References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org> Message-ID: <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk> Hi Roman, Many thanks for the reply. I posted this on openstack-discuss because I was wondering if any users/Openstack operators out there (outside large corporations who are members of the Airship development framework) are actually running OOK in production. This could be Airship, or some other Kubernetes distribution running Openstack Helm. Our several years' experience of managing Openstack so far (RHOSP/TripleO) has been bumpy due to issues with configuration maintenance/upgrades. The idea of using CI/CD and Kubernetes/Helm to manage Openstack is compelling and fits nicely into the DevOps framework here. If we were to explore this route we could 'roll our own' with a deployment based, say, on https://opendev.org/airship/treasuremap, or pay for an Enterprise solution that incorporates the OOK model (upcoming Mirantis and SUSE, potentially). Regards Charles On 03/10/2019 12:04, Roman Gorshunov wrote: > Thanks Ashlee! > > Charles, > A few companies who work on development of Airship do use it, > including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom > and others. Many of those companies (if not all) use Airship + > OpenStack Helm as well.
> > Airship, as you have mentioned, is a collection of components for > undercloud control plane, which helps to deploy nodes with > OS+Docker+Kubernetes on it, configure/manage it all in GitOps way, and > then help to maintain the configuration. It also allows to manage > deploys and maintenance of whatever runs on top of Kubernetes cluster, > would that be OpenStack Helm or other software packaged in Helm > format. > > OpenStack Helm does not really require to be running on > Airship-managed cluster. It could run standalone. > > Yes, you can roll out an open source production grade > Airship/Openstack Helm deployment today. Good example of production > grade configuration could be found in airship/treasuremap repository > [0] as 'seaworthy' site definition. You are welcome to try, of course. > For the questions - reach out to us on IRC #airshipit at Freenode of via > Airship-discuss mailing list. > > [0] https://opendev.org/airship/treasuremap > > Best regards, > -- > Roman Gorshunov > > On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson wrote: >> Hi Charles, >> >> Glad to hear you’re interested! Forwarding this to the Airship ML since there may be folks on this mailing list that will have pointers who didn't see the openstack-discuss post. >> >> Ashlee >> >> >> >> Begin forwarded message: >> >> From: Charles >> Subject: OOK,Airship >> Date: October 2, 2019 at 5:39:16 PM GMT+2 >> To: openstack-discuss at lists.openstack.org >> >> Hi, >> >> >> We are interested in OOK and Openstack Helm. >> >> Has anyone any experience with Airship (now that 1.0 is out)? >> >> Noticed that a few Enterprise distributions are looking at managing the Openstack control plane with Kubernetes and have been testing Airship with a view to rolling it out (Mirantis,SUSE) >> >> Is this a signal that there is momentum around Openstack Helm? >> >> Is it possible to roll out an open source production grade Airship/Openstack Helm deployment today, or is it too early? >> >> >> Thoughts? >> >> >> Charles >> >> >> >> >> _______________________________________________ >> Airship-discuss mailing list >> Airship-discuss at lists.airshipit.org >> http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss -- Charles Short Senior Cloud Engineer EMBL-EBI Hinxton 01223494205 From fungi at yuggoth.org Thu Oct 3 12:17:56 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Oct 2019 12:17:56 +0000 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191003101054.GB26595@paraplu> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <20191003101054.GB26595@paraplu> Message-ID: <20191003121756.4u6k2jh5p47rap5j@yuggoth.org> On 2019-10-03 12:10:54 +0200 (+0200), Kashyap Chamarthy wrote: > On Mon, Sep 30, 2019 at 06:09:16PM -0500, Eric Fried wrote: [...] > > (A) Constrain scope, drastically. We marked 25 blueprints > > complete in Train [3]. Since there has been no change to the > > core team, let's limit Ussuri to 25 blueprints [4]. If this > > turns out to be too few, what's the worst thing that happens? We > > finish everything, early, and wish we had done more. If that > > happens, drinks are on me, and we can bump the number for V. > > I welcome scope reduction, focusing on fewer features, stability, > and bug fixes than "more gadgetries and gongs". Which also means: > less frenzy, less split attention, fewer mistakes, more retained > concentration, and more serenity. And, yeah, any reasonable > person would read '25' as _an_ educated limit, rather than some > "optimal limit". [...] 
Viewing this from outside, 25 specs in a cycle already sounds like planning to get a *lot* done... that's completing an average of one Nova spec per week (even when averaged through the freeze weeks). Maybe as a goal it's undershooting a bit, but it's still a very impressive quantity to be able to consistently accomplish. Many thanks and congratulations to all the folks who work so hard to make this happen in Nova, cycle after cycle. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dangtrinhnt at gmail.com Thu Oct 3 14:28:15 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 3 Oct 2019 23:28:15 +0900 Subject: [i18n] Request to be added as Vietnamese translation group coordinators In-Reply-To: References: <49e1a362-aeea-b230-536c-8778e3f3d885@suse.com> Message-ID: Thanks, Ian :) On Thu, Oct 3, 2019 at 2:11 AM Ian Y. Choi wrote: > Hello, > > Sorry for replying here late (I was travelling by the end of last week > and have been following-up many things which I couldn't take care of). > > Yesterday, I approved all the open requests including requests mentioned > below :) > > > With many thanks, > > /Ian > > Andreas Jaeger wrote on 9/26/2019 10:14 PM: > > On 26/09/2019 13.59, Trinh Nguyen wrote: > >> Hi i18n team, > >> > >> Dai and I would like to volunteer as the coordinators of the > >> Vietnamese translation group. If you find us qualified, please let us > >> know. > >> > > > > Looking at translate.openstack.org: > > > > I saw that Dai asked to be a translator and approved his request as an > > admin, I do not see you in Vietnamese, please apply as translator for > > Vietnamese first. > > > > Ian, will you reach out to the current coordinator? > > > > Ian, a couple of language teams have open requests, could you check > > those and whether the coordinators are still alive, please? > > > > Andreas > > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Thu Oct 3 14:44:17 2019 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Thu, 3 Oct 2019 14:44:17 +0000 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: <1570113853.14734.1@smtp.office365.com> On Tue, Oct 1, 2019 at 11:40 PM, Kenichi Omichi wrote: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, > make friends in the world and bring me and my family to foreign > country from our home country. > I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still > keep contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- Thank you for your hard work and good luck with your next endeavour! 
Cheers, gibi From doug at doughellmann.com Thu Oct 3 15:11:01 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 3 Oct 2019 11:11:01 -0400 Subject: [all] Planned Ussuri release schedule published In-Reply-To: <20191002160535.GA29937@sm-workstation> References: <20191001185707.GA17150@sm-workstation> <3aaeaa6a5ed64e98992516786464e72e@AUSX13MPS308.AMER.DELL.COM> <918bcba414f246bbb90c4866f4630e44@AUSX13MPS308.AMER.DELL.COM> <20191002124148.GA16684@sm-workstation> <31e6312c00d24f36aebb59617d40c9d2@AUSX13MPS308.AMER.DELL.COM> <05300447-5ddd-f6c0-a799-4e61b66f469b@nemebean.com> <20191002145723.GA27063@sm-workstation> <88053759ce094142b756c17a83e099a1@AUSX13MPS308.AMER.DELL.COM> <20191002160535.GA29937@sm-workstation> Message-ID: <932AE9B9-5EDB-44F9-84BA-5ADEAC384A74@doughellmann.com> > On Oct 2, 2019, at 12:05 PM, Sean McGinnis wrote: > > On Wed, Oct 02, 2019 at 04:01:22PM +0000, Arkady.Kanevsky at dell.com wrote: >> >> >> -----Original Message----- >> From: Sean McGinnis >> Sent: Wednesday, October 2, 2019 9:57 AM >> To: Ben Nemec >> Cc: Kanevsky, Arkady; gouthampravi at gmail.com; openstack-discuss at lists.openstack.org >> Subject: Re: [all] Planned Ussuri release schedule published >> >> >> [EXTERNAL EMAIL] >> >>> >>> On 10/2/19 8:43 AM, Arkady.Kanevsky at dell.com wrote: >>>> Sean, >>>> On https://releases.openstack.org/ussuri/schedule.html >>>> Feature freeze is R6 but >>>> Requirements freeze is R5. >>> >>> Is your browser dropping the background color for the table cells? >>> There are actually six bullet points in the R-5 one, but because it's >>> vertically centered some of them may appear to be under R-6. The only >>> thing that's in >>> R-6 though is the final non-client library release. >>> > > Looks like you fixed it? Any idea what you changed in case someone else has the > same issue? https://review.opendev.org/686420 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Oct 3 16:02:49 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 3 Oct 2019 11:02:49 -0500 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> <1569857750.5848.0@smtp.office365.com> <1569917983.26355.2@smtp.office365.com> Message-ID: <3ea5faa5-4d32-cb7e-6bf5-89892afa55b6@nemebean.com> TLDR: I've abandoned the revert. After looking at Gibi's investigation further I agree that rabbit was actually using the jsonutils version of dumps, so making the fake driver use it is consistent. Apologies for the confusion. -Ben On 10/1/19 3:35 PM, Ken Giusti wrote: > Sorry I'm late to the party.... > > At the risk of stating the obvious I wouldn't put much faith in the fact > that the Kafka and Amqp1 drivers use jsonutils.   The use of jsonutils > in these drivers is simply a cut-n-paste from the way old qpidd > driver.    Why jsonutils was used there... I dunno. > > IMHO the RabbitMQ driver is the authoritative source for correct driver > implementation - the Fake driver (and the others) should use the same > serialization as the rabbitmq driver if possible. 
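(A quick illustration of the datetime concern mentioned earlier in this thread -- the exact output is approximate and depends on the oslo.serialization release:

  import datetime
  import json
  from oslo_serialization import jsonutils

  stamp = datetime.datetime(2019, 10, 3, 12, 0, tzinfo=datetime.timezone.utc)

  # json.dumps({'at': stamp}) raises TypeError: datetime is not JSON serializable
  print(jsonutils.dumps({'at': stamp}))
  # prints something like {"at": "2019-10-03T12:00:00.000000"} -- the +00:00
  # offset is silently dropped, which is the tz-awareness worry

)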
> > -K > > On Tue, Oct 1, 2019 at 4:30 AM Balázs Gibizer > wrote: > > > > On Mon, Sep 30, 2019 at 5:35 PM, Balázs Gibizer > wrote: > > > > > > On Mon, Sep 30, 2019 at 4:45 PM, Ben Nemec > > > > wrote: > >>  Hi, > >> > >>  I've just proposed https://review.opendev.org/#/c/685724/ which > >>  reverts a change that recently went in to make the fake driver in > >>  oslo.messaging use jsonutils for message serialization instead of > >>  json.dumps. > >> > >>  As explained in the commit message on the revert, this is > >> problematic > >>  because the rabbit driver uses kombu's default serialization > method, > >>  which is json.dumps. By changing the fake driver to use jsonutils > >>  we've made it more lenient than the most used real driver which > >> opens > >>  us up to merging broken changes in consumers of oslo.messaging. > >> > >>  We did have some discussion of whether we should try to > override the > >>  kombu default and tell it to use jsonutils too, as a number of > other > >>  drivers do. The concern with this was that the jsonutils > handler for > >>  things like datetime objects is not tz-aware, which means if you > >> send > >>  a datetime object over RPC and don't explicitly handle it you could > >>  lose important information. > >> > >>  I'm open to being persuaded otherwise, but at the moment I'm > leaning > >>  toward less magic happening at the RPC layer and requiring projects > >>  to explicitly handle types that aren't serializable by the standard > >>  library json module. If you have a different preference, please > >> share > >>  it here. > > > > Hi, > > > > I might me totally wrong here and please help me understand how the > > RabbitDriver works. What I did when I created the original patch > that > > I > > looked at each drivers how they handle sending messages. The > > oslo_messaging._drivers.base.BaseDriver defines the interface with a > > send() message. The oslo_messaging._drivers.amqpdriver.AMQPDriverBase > > implements the BaseDriver interface's send() method to call _send(). > > Then _send() calls rpc_commom.serialize_msg which then calls > > jsonutils.dumps. > > > > The oslo_messaging._drivers.impl_rabbit.RabbitDriver driver inherits > > from AMQPDriverBase and does not override send() or _send() so I > think > > the AMQPDriverBase ._send() is called that therefore jsonutils is > used > > during sending a message with RabbitDriver. > > I did some tracing in devstack to prove my point. See the result in > https://review.opendev.org/#/c/685724/1//COMMIT_MSG at 11 > > Cheers, > gibi > > > > > Cheers, > > gibi > > > > > > [1] > > > https://github.com/openstack/oslo.messaging/blob/7734ac1376a1a9285c8245a91cf43599358bfa9d/oslo_messaging/_drivers/amqpdriver.py#L599 > > > >> > >>  Thanks. > >> > >>  -Ben > >> > > > > > > > > > -- > Ken Giusti  (kgiusti at gmail.com ) From mriedemos at gmail.com Thu Oct 3 16:16:28 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 11:16:28 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: <28312232-6a30-17de-6141-a47c2f282af9@gmail.com> On 10/2/2019 4:18 PM, Eric Fried wrote: > I like the >2-core idea, though the real difference would be asking for > cores to consider "should we do this*in this cycle*" when they +2 a > spec. 
Which is good and valid, but (I think) difficult to > explain/track/quantify/validate. And it's asking each core to have some > sense of the "big picture" (understand the scope of all/most of the > candidates) which is very difficult. Note that having that "big picture" is I think the main reason why historically, until very recently, there was a subgroup of the nova core team that was the specs core team, because what was approved in specs could have wide impacts to nova and thus knowing the big picture was important. I know that not all specs are of the same complexity and we changed how the core team works for specs for good reasons, but given the years of "why aren't they the same core team? it's not fair." I wanted to point out it can be, as you said, very difficult to be a specs core for different reasons from a nova core. -- Thanks, Matt From kendall at openstack.org Thu Oct 3 16:32:19 2019 From: kendall at openstack.org (Kendall Waters) Date: Thu, 3 Oct 2019 11:32:19 -0500 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey Alex, We still have tables available on Friday. Would half a day on Friday work for the docs team? Or, if Ian is okay with it, we can combine Docs with i18n in their Wednesday afternoon/Thursday morning slot. Just let me know! Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 3, 2019, at 4:26 AM, Alexandra Settle wrote: > > Hey, > > Could you add something for docs? Or combine with i18n again if Ian > doesn't mind? > > We don't need a lot, just a room for people to ask questions about the > future of the docs team. > > Stephen will be there, as co-PTL. There's 0 chance of it not > conflicting with nova. > > Please :) > > Thank you! > > Alex > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: >> Hello Everyone! >> >> In the attached picture or link [0] you will find the proposed >> schedule for the various tracks at the Shanghai PTG in November. >> >> We did our best to avoid the key conflicts that the track leads >> (PTLs, SIG leads...) mentioned in their PTG survey responses, >> although there was no perfect solution that would avoid all conflicts >> especially when the event is three-ish days long and we have over 40 >> teams meeting. >> >> If there are critical conflicts we missed or other issues, please let >> us know, by October 6th at 7:00 UTC! >> >> -Kendall (diablo_rojo) >> >> [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu >> le.png > -- > Alexandra Settle > > IRC: asettle -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 3 16:35:05 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 11:35:05 -0500 Subject: [nova][kolla] questions on cells In-Reply-To: References: Message-ID: On 10/3/2019 3:24 AM, Mark Goddard wrote: >> This recently merged, hopefully it helps clarify: >> >> https://review.opendev.org/#/c/671298/ > It does help a little for the schema migrations, but the point was > about data migrations. > That's an excellent point. Looking at devstack [1] and grenade [2] we don't necessarily do that properly. For devstack with a fresh install it doesn't really matter but it should matter for grenade since we should be migrating both cell0 and cell1. Grenade does not run in "superconductor" mode so some of the rules might be different there, i.e. grenade's nova.conf has the database pointed at cell1 while devstack has the database config pointed at cell0.
Either way we're not properly running the online data migrations per cell DB as far as I can tell. Maybe we just haven't had an online data migration yet that makes that important, but it's definitely wrong. I also don't see anything in the docs for the online_data_migrations command [3] to use the --config-file option to run it against the cell DB config. I can open a bug for that. The upgrade guide should also be updated to mention that like for db sync in https://review.opendev.org/#/c/671298/. [1] https://github.com/openstack/devstack/blob/1a46c898db9c16173013d95e2bc954992121077c/lib/nova#L764 [2] https://github.com/openstack/grenade/blob/bb14e02a464db2b268930bbba0152862fe0f805e/projects/60_nova/upgrade.sh#L79 [3] https://docs.openstack.org/nova/latest/cli/nova-manage.html -- Thanks, Matt From sombrafam at gmail.com Thu Oct 3 17:17:03 2019 From: sombrafam at gmail.com (Erlon Cruz) Date: Thu, 3 Oct 2019 14:17:03 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: Glad to see that! +1 Em qui, 3 de out de 2019 às 03:22, Amit Oren escreveu: > +1 > > On Thu, Oct 3, 2019 at 3:31 AM Xing Yang wrote: > >> +1 >> >> On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi >> wrote: >> >>> Dear Zorillas and other Stackers, >>> >>> I would like to formalize the conversations we've been having amongst >>> ourselves over IRC and in-person. At the outset, we have a lot of >>> incoming changes to review, but we have limited core maintainer >>> attention. We haven't re-jigged our core maintainers team as often as >>> we'd like, and that's partly to blame. We have some relatively new and >>> enthusiastic contributors that we would love to encourage to become >>> maintainers! We've mentored contributors 1-1, n-1 before before adding >>> them to the maintainers team. We would like to do more of this!** >>> >>> In this spirit, I would like your inputs on adding Douglas Viroel >>> (dviroel) to the core maintainers team for manila and its associated >>> projects (manila-specs, manila-ui, python-manilaclient, >>> manila-tempest-plugin, manila-test-image, manila-image-elements). >>> Douglas has been an active contributor for the past two releases and >>> has valuable review inputs in the project. While he's been around here >>> less longer than some of us, he brings a lot of experience to the >>> table with his background in networking and shared file systems. He >>> has a good grasp of the codebase and is enthusiastic in adding new >>> features and fixing bugs in the Ussuri cycle and beyond. >>> >>> Please give me a +/-1 for this proposal. >>> >>> ** If you're interested in helping us maintain Manila by being part of >>> the manila core maintainer team, please reach out to me or any of the >>> current maintainers, we would love to work with you and help you grow >>> into that role! >>> >>> Thanks, >>> Goutham Pacha Ravi (gouthamr) >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Thu Oct 3 16:44:05 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 03 Oct 2019 11:44:05 -0500 Subject: Proposed Forum Schedule Message-ID: <5D962555.7090508@openstack.org> Hello! I'm attaching a PDF of the proposed Shanghai Forum Schedule. I'll publish the same on the actual website later this afternoon. However, there is still time for feedback/time changes, assuming there aren't conflicts for speakers/moderators. 
This is also available for download here: https://drive.google.com/file/d/1qp0I9xnyOK3mhBitQnk2a7VuS9XClvyF/view?usp=sharing Please respond to this thread with any concerns. Cheers, Jimmy -------------- next part -------------- A non-text attachment was scrubbed... Name: Forum Mock Schedule.pdf Type: application/pdf Size: 88937 bytes Desc: not available URL: From ben at swartzlander.org Thu Oct 3 18:04:51 2019 From: ben at swartzlander.org (Ben Swartzlander) Date: Thu, 3 Oct 2019 14:04:51 -0400 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <8d939186-982d-429c-47fe-d95178ce0622@swartzlander.org> On 10/3/19 1:17 PM, Erlon Cruz wrote: > Glad to see that! +1 > > Em qui, 3 de out de 2019 às 03:22, Amit Oren > escreveu: > > +1 > > On Thu, Oct 3, 2019 at 3:31 AM Xing Yang > wrote: > > +1 > > On Wed, Oct 2, 2019 at 5:03 PM Goutham Pacha Ravi > > wrote: > > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been > having amongst > ourselves over IRC and in-person. At the outset, we have a > lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as > often as > we'd like, and that's partly to blame. We have some > relatively new and > enthusiastic contributors that we would love to encourage to > become > maintainers! We've mentored contributors 1-1, n-1 before > before adding > them to the maintainers team. We would like to do more of > this!** > > In this spirit, I would like your inputs on adding Douglas > Viroel > (dviroel) to the core maintainers team for manila and its > associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, > manila-image-elements). > Douglas has been an active contributor for the past two > releases and > has valuable review inputs in the project. While he's been > around here > less longer than some of us, he brings a lot of experience > to the > table with his background in networking and shared file > systems. He > has a good grasp of the codebase and is enthusiastic in > adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by > being part of > the manila core maintainer team, please reach out to me or > any of the > current maintainers, we would love to work with you and help > you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) +1 -Ben Swartzlander From mthode at mthode.org Thu Oct 3 18:35:14 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 3 Oct 2019 13:35:14 -0500 Subject: [FFE][requirements][mistral][amqp] Failing =?utf-8?B?4oCcZG9j?= =?utf-8?B?c+KAnQ==?= job due to the upper constraint conflict for amqp In-Reply-To: <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> References: <0567d184-ed82-4c83-ba79-2e586a300c07@Spark> <20191002163415.nu7okcn5de44txoz@mthode.org> <3cc2f690-313a-4e40-abec-8d7df96846ec@Spark> Message-ID: <20191003183514.iubdhip2bsjylcb3@mthode.org> On 19-10-03 14:45:14, Renat Akhmerov wrote: > Thanks Matthew, > > For now we did this: https://review.opendev.org/#/c/685932/. So we just added “kombu” explicitly into our dependencies that forces to load the right version of amqp before oslo.messaging. That works. If that looks OK for you we can skip the mentioned bump. 
> > > > Renat Akhmerov > @Nokia > On 2 Oct 2019, 23:35 +0700, Matthew Thode , wrote: > > On 19-10-02 14:57:24, Renat Akhmerov wrote: > > > Hi, > > > > > > We have a failing “docs” ([1]) CI job that fails because it implicitly brings amqp 2.5.2 but this lib is not allowed to be higher than 2.5.1 in the upper-constraings.txt in the requirements project ([2]). We see that there’s the patch [3] generated by the proposal bot that bumps the constraint to 2.5.2 for amqp (among others) but it was given -2. > > > > > > Please assist on how to address in the best way. Should we bump only amqp version in upper constraints for now? > > > > > > [1] https://zuul.opendev.org/t/openstack/build/6fe7c7d3e60b40458d2a98f3a293f412/log/job-output.txt#840 > > > [2] https://github.com/openstack/requirements/blob/master/upper-constraints.txt#L258 > > > [3] https://review.opendev.org/#/c/681382 > > > > > > > I'm going to be treating this as a FFE request to bump amqp from 2.5.1 > > to 2.5.2. > > It looks like a bugfix only release so I'm fine with it. As long as we > > don't need to mask 2.5.1 in global-requirements (which would cause a > > re-release for openstack/oslo.messaging). > > > > https://github.com/celery/py-amqp/compare/2.5.1...2.5.2 > > > > So, if you propose a constraints only bump of amqp-2.5.1 to 2.5.2 then I > > approve. > > Looks like a good workaround. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From colleen at gazlene.net Thu Oct 3 18:38:53 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 03 Oct 2019 11:38:53 -0700 Subject: [keystone] Ussuri roadmap Message-ID: <37661d40-2a1d-487b-8cd0-910219a34d01@www.fastmail.com> Hi team, In past cycles we used Trello for tracking our goals throughout a cycle, which gave us flexibility and visibility over the cycle plans as a whole in conjunction with specs and launchpad bugs. Trello is a proprietary platform and in the last few months changed its ToS to limit the number of public boards an organization can have, and the keystone team has reached that limit. Rather than try to backup and archive our old boards or create another team or a non-team board, for Ussuri I would like to try using a board on a different platform, Taiga: https://tree.taiga.io/project/keystone-ussuri-roadmap/kanban Taiga is AGPLv3 and has no restrictions on its hosted version that I've discovered yet. Many thanks to Morgan for discovering and researching it. I've copied over our incomplete stories from the Train roadmap[1] and arranged the kanban board more or less the same way as the old Trello board, but the platform seems to be very flexible and we could change the layout and workflows in any way that makes sense. For instance, while I only enabled the kanban feature, there is also a sprints/backlog mode if we wanted to take advantage of that. I can grant administrator privileges to anyone who is interested in investigating all the configuration options (or you can create your own sandbox projects to play with). The main deficiency seems to be the lack of support for "teams" or "organizations"[2], but users can be added to the board individually. 
Action required: * If you were a member of the old Trello keystone team and would to be a member of this board, send me an email address that I can send an invite to * Once you have an account and are added to the board, please have a look at the stories that are already there an assign yourself to the ones you are working on or plan to work on, and update their status or add relevant reviews as comments. Feel free to play with the platform's features and provide feedback in this thread or at next week's team meeting. Please also let me know if you have concerns about using this platform. Colleen [1] https://trello.com/b/ClKW9C8x/keystone-train-roadmap [2] https://tree.taiga.io/project/taiga/us/2129 From sean.mcginnis at gmx.com Thu Oct 3 19:32:45 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Oct 2019 14:32:45 -0500 Subject: [release] Release countdown for week R-1, October 7-11 Message-ID: <20191003193245.GA29220@sm-workstation> Development Focus ----------------- We are on the final mile of this Train ride! (You can thank Thierry for that one ^) Remember that the Train final release will include the latest release candidate (for cycle-with-rc deliverables) or the latest intermediary release (for cycle-with-intermediary deliverables) available. Thursday, October 10th is the deadline for final Train release candidates as well as any last cycle-with-intermediary deliverables. We will then enter a quiet period until we tag the final release on October 16th. Teams should be prioritizing fixing release-critical bugs, before that deadline. Otherwise it's time to start planning the Ussuri development cycle, including discussing Forum and PTG sessions content, in preparation of the Summit in Shanghai next month. Actions --------- Watch for any translation patches coming through on the stable/train branch and merge them quickly. If you discover a release-critical issue, please make sure to fix it on the master branch first, then backport the bugfix to the stable/train branch before triggering a new release. Please drop by #openstack-release with any questions or concerns about the upcoming release! Upcoming Deadlines & Dates -------------------------- Final Train release: October 16 Forum+PTG at Shanghai summit: November 4 From jean-philippe at evrard.me Thu Oct 3 19:58:16 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 03 Oct 2019 21:58:16 +0200 Subject: [tc] monthly meeting agenda Message-ID: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> Hello everyone, Here's the agenda for our monthly TC meeting. It will happen next Thursday (10 October) at the usual time (1400 UTC) in #openstack-tc . If you can't attend, please put your name in the "Apologies for Absence" section in the wiki [1] Our meeting chair will be Alexandra (asettle). * Follow up on past action items ** ricolin: Follow up with SIG chairs about guidelines https://etherpad.openstack.org/p/SIGs-guideline ** ttx: contact interested parties in a new 'large scale' sig (help with mnaser, jroll reaching out to verizon media) ** Release Naming - Results of the TC poll - Next action * New initiatives and/or report on previous initiatives ** Help gmann on the community goals following our new goal process ** mugsie: to sync with dhellmann or release-team to find the code for the proposal bot ** jroll - ttx: Feedback from the forum selection committee -- Follow up on https://etherpad.openstack.org/p/PVG-TC-brainstorming -- Final accepted list? 
** mnaser: sync up with swift team on python3 migration Thank you everyone! Regards, JP [1]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting From openstack at fried.cc Thu Oct 3 21:56:50 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 3 Oct 2019 16:56:50 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: (B) After some very productive discussion in the nova meeting and IRC channel this morning, I have updated the nova-specs patch introducing the "Core Liaison" concept [1]. The main change is a drastic edit of the README to include a "Core Liaison FAQ". Other changes of note: * We're now going to make distinct use of the launchpad blueprint's "Definition" and "Direction" fields. As such, we can still decide to defer a blueprint whose spec is merged in the 'approved' directory. (Which really isn't different than what we were doing before; it's just that now we can do it for reasons other than "oops, this didn't get finished in time".) * The single-core-approval rule for previously approved specifications is removed. (A) Note that the idea of capping the number of specs is (mostly) unrelated, and we still haven't closed on it. I feel like we've agreed to have a targeted discussion around spec freeze time where we decide whether to defer features for resource reasons. That would be a new (and good, IMO) thing. But it's still TBD whether "30 approved for 25 completed" will apply, and/or what criteria would be used to decide what gets cut. Collected odds and ends from elsewhere in this thread: > If you do care reviewing a spec, that also means you do care reviewing > the implementation side. I agree that would be nice, and I'd like to make it happen, but separately from what's already being discussed. I added a TODO in the spec README [2]. > If we end up with bags of "spare time", there's loads of tech-debt > items, performance (it's a feature, let's recall) issues, and meaningful > clean-ups waiting to be tackled. Hear hear. > Viewing this from outside, 25 specs in a cycle already sounds like > planning to get a *lot* done... that's completing an average of one > Nova spec per week (even when averaged through the freeze weeks). > Maybe as a goal it's undershooting a bit, but it's still a very > impressive quantity to be able to consistently accomplish. Many > thanks and congratulations to all the folks who work so hard to make > this happen in Nova, cycle after cycle. That perspective literally hadn't occurred to me from here with my face mashed up against the trees [3]. Thanks fungi. > Note that having that "big picture" is I think the main reason why > historically, until very recently, there was a subgroup of the nova core > team that was the specs core team, because what was approved in specs > could have wide impacts to nova and thus knowing the big picture was > important. Good point, Matt. (Not that I think we should, or could, go back to that...) 
efried [1] https://review.opendev.org/#/c/685857 [2] https://review.opendev.org/#/c/685857/4/README.rst at 219 [3] For non-native speakers, this is a reference to the following idiom: https://www.dictionary.com/browse/can-t-see-the-forest-for-the-trees From mriedemos at gmail.com Thu Oct 3 23:22:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Oct 2019 18:22:33 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration Message-ID: Hello Cinderinos, I've now got a working patch that migrates legacy volume attachments to new style v3 attachments [1]. The fun stuff showing it working is in this paste [2]. We want to do this data migration in nova because we still have a lot of compatibility code since Queens for pre-v3 style attachments and we can't remove that compatibility code (ever) if we don't first make sure we provide a data migration routine for operators to roll through. So for example if this lands in Ussuri we can can enforce a nova-status upgrade check in V and rip out code in X. Without digging into the patch, this is the flow: 1. On nova-compute restart, query the nova DB for instances on the compute host with legacy volume attachments. 2. For each of those, create a new style attachment with the host connector and update the BlockDeviceMapping information in the nova DB (attachment_id and connection_info). 3. Delete the existing legacy attachment so when the server is deleted the volume status goes back to 'available' due to proper attachment reference counting in the Cinder DB. My main question is on #3. Right now I'm calling the v3 attachment delete API rather than the v2 os-terminate_connection API. Is that sufficient to cleanup the legacy attachment on the storage backend even though the connection was created via os-initialize_connection originally? Looking at the cinder code, attachment_delete hits the connection terminate code under the covers [3]. So that looks OK. The only thing I can really think of is if a host connector is not provided or tracked with the legacy attachment, is that going to cause problems? Note that I think volume drivers are already required to deal with that today anyway because of the "local delete" scenario in the compute API where the compute host that the server is on is down and thus we don't have a host connector to provide to Cinder to terminate the connection. So Cinder people, are you OK with this flow? Hello Novaheads, Do you have any issues with the above? Note the migration routine is threaded out on compute start so it doesn't block, similar to the ironic flavor data migration introduced in Pike. One question I have is if we should add a config option for this so operators can enable/disable it as needed. Note that this requires nova to be configured with a service user that has the admin role to do this stuff in cinder since we don't have a user token, similar to nova doing things with neutron ports without a user token. Testing this with devstack requires [4]. By default [cinder]/auth_type is None and not required so by default this migration routine is not going to run so maybe that is sufficient? Hello Operatorites, Do you have any issues with what's proposed above? 
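A rough pseudo-Python sketch of the three-step flow described above. The attachment_* calls mirror the Cinder v3 attachments API that nova has wrapped since Queens, but the helper names, signatures and lookups here are assumptions for illustration, not code lifted from the patch in [1]:

    # Illustrative sketch only -- not the implementation under review in [1].
    from oslo_serialization import jsonutils

    def migrate_legacy_volume_attachments(context, volume_api, virt_driver, host):
        # 1. find this host's instances whose BDMs still lack a v3 attachment_id
        bdms = get_legacy_volume_bdms_for_host(context, host)   # hypothetical DB query helper
        connector = virt_driver.get_volume_connector()          # hypothetical; the host connector
        for bdm in bdms:
            # 2. create a new style attachment with the host connector and record
            #    it on the BDM (attachment_id + connection_info)
            attachment = volume_api.attachment_create(
                context, bdm.volume_id, bdm.instance_uuid, connector=connector)
            bdm.attachment_id = attachment['id']
            bdm.connection_info = jsonutils.dumps(attachment['connection_info'])
            bdm.save()
            # the follow-up discussion below adds a "complete" step to put the
            # volume back to in-use status
            volume_api.attachment_complete(context, attachment['id'])
            # 3. delete the legacy attachment so Cinder's reference counting is correct;
            #    its id has to be looked up from the volume's attachment list, not the BDM
            for legacy_id in get_legacy_attachment_ids(context, bdm.volume_id, attachment['id']):
                volume_api.attachment_delete(context, legacy_id)

Error handling (deleting the newly created attachment if a later step fails) is discussed further down in this thread and would hang off try/except blocks around the complete and delete steps.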
[1] https://review.opendev.org/#/c/549130/ [2] http://paste.openstack.org/show/781063/ [3] https://github.com/openstack/cinder/blob/410791580ef60ddb03104bf20766859ed9d78932/cinder/volume/manager.py#L4650 [4] https://review.opendev.org/#/c/685488/ -- Thanks, Matt From thierry at openstack.org Fri Oct 4 07:48:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 4 Oct 2019 09:48:36 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: Slawek Kaplonski wrote: > I think it’s interesting idea. Should we somehow sign up to this even (one or both, depends on which we plan to be) to let people know that PTL of specific project will be available there? Or it’s just enough to come there when will be time for that? It would be good to have a rough idea of who will be available at each opportunity. To keep it simple, I created a sign-up sheet at: https://etherpad.openstack.org/p/meet-the-project-leaders > Also, is it expected from project leaders to be available on both terms or only one is enough? You can do one or both (or none) -- no commitment. -- Thierry From akekane at redhat.com Fri Oct 4 09:28:19 2019 From: akekane at redhat.com (Abhishek Kekane) Date: Fri, 4 Oct 2019 14:58:19 +0530 Subject: [Glance][PTG]Shanghai PTG planning Message-ID: Hello Everyone, I have prepared an etherpad [1] to plan the Shanghai PTG discussion topics for glance. The etherpad contains template to add the topic for discussion. It has also references of previous PTG planning etherpads. Even if anyone is not going to attend the PTG but wants there topic needs to be discussed can add as well. Kindly add your topics. [1] https://etherpad.openstack.org/p/Glance-Ussuri-PTG-planning Thanks, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at tipit.net Fri Oct 4 12:57:32 2019 From: jimmy at tipit.net (Jimmy Mcarthur) Date: Fri, 04 Oct 2019 07:57:32 -0500 Subject: Proposed Forum Schedule Message-ID: <5D9741BC.6080909@tipit.net> The forum schedule is now live: https://www.openstack.org/summit/shanghai-2019/summit-schedule/global-search?t=forum If you'd prefer to use the spreadsheet view: https://drive.google.com/file/d/1qp0I9xnyOK3mhBitQnk2a7VuS9XClvyF Please let Kendall Nelson or myself know as soon as possible if you see any conflicts. Cheers, Jimmy From zigo at debian.org Fri Oct 4 13:35:15 2019 From: zigo at debian.org (Thomas Goirand) Date: Fri, 4 Oct 2019 15:35:15 +0200 Subject: [oslo][nova] Revert of oslo.messaging JSON serialization change In-Reply-To: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> References: <12c0db52-7255-f3ff-1338-238b61507a82@nemebean.com> Message-ID: On 9/30/19 4:45 PM, Ben Nemec wrote: > The concern with this was that the jsonutils handler for > things like datetime objects is not tz-aware, which means if you send a > datetime object over RPC and don't explicitly handle it you could lose > important information. echo Etc/UTC >/etc/timezone Problem solved... :) Thomas From corey.bryant at canonical.com Fri Oct 4 13:41:12 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 09:41:12 -0400 Subject: [charms] placement charm Message-ID: Hi All, I'd like to see if I can get some input on the current state of the Placement API split. 
For some background, the nova placement API was removed from nova in train, and it's been split into its own project. It's mostly just a basic API charm. The tricky part is the migration of tables from the nova_api database to the placement database. Code is located at: https://github.com/coreycb/charm-placement https://github.com/coreycb/charm-interface-placement https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) Test scenarios I've been testing with: 1) deploy nova-cc et al train, configure keystonev3, deploy instance 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy instance 3 There is currently an issue with the second test scenario where instance 2 creation errors because nova-scheduler can't find a valid placement candidate (not sure of the exact error atm). However if I delete instance 1 before creating instance 2 it is created successfully. It feels like a DB related issue but I'm really not sure so I'll keep digging. Thanks! Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From corey.bryant at canonical.com Fri Oct 4 13:48:13 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 09:48:13 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: One other issue is "pxc-strict-mode: disabled" for percona-cluster is required to test this. /usr/share/placement/mysql-migrate-db.sh may need some updates but I haven't dug into that yet. Thanks, Corey On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. > > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Thanks! > Corey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Oct 4 14:22:43 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 4 Oct 2019 15:22:43 +0100 (BST) Subject: [nova][ptg] Ussuri scope containment In-Reply-To: <20191003101054.GB26595@paraplu> References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <20191003101054.GB26595@paraplu> Message-ID: On Thu, 3 Oct 2019, Kashyap Chamarthy wrote: > I welcome scope reduction, focusing on fewer features, stability, and > bug fixes than "more gadgetries and gongs". 
Which also means: less > frenzy, less split attention, fewer mistakes, more retained > concentration, and more serenity. And, yeah, any reasonable person > would read '25' as _an_ educated limit, rather than some "optimal > limit". > > If we end up with bags of "spare time", there's loads of tech-debt > items, performance (it's a feature, let's recall) issues, and meaningful > clean-ups waiting to be tackled. Since I quoted the above text and referred back to this entire thread in it, I thought I better: a) say "here here" (or is "hear hear"?) to the above 2. link to https://anticdent.org/fix-your-debt-placement-performance-summary.html which has more to say and an example of what you can get with "retained concentration" -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From corey.bryant at canonical.com Fri Oct 4 14:54:10 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 10:54:10 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. > > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Nothing to see here. Small compute node with limited resources. So this is not an issue. Thanks! > Corey > -------------- next part -------------- An HTML attachment was scrubbed... URL: From corey.bryant at canonical.com Fri Oct 4 15:53:35 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 4 Oct 2019 11:53:35 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant wrote: > Hi All, > > I'd like to see if I can get some input on the current state of the > Placement API split. > > For some background, the nova placement API was removed from nova in > train, and it's been split into its own project. It's mostly just a basic > API charm. The tricky part is the migration of tables from the nova_api > database to the placement database. 
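The table move referred to above is what the placement package's /usr/share/placement/mysql-migrate-db.sh (mentioned earlier in this thread) automates: dump the placement-related tables out of nova_api and load them into the new placement database. A hand-rolled sketch of the same idea is below; the table list is an assumption based on the placement schema, not a copy of that script:

    # Illustrative only; use the packaged mysql-migrate-db.sh for the real procedure.
    TABLES="allocations consumers inventories placement_aggregates projects \
            resource_classes resource_providers resource_provider_aggregates \
            resource_provider_traits traits users"
    mysqldump --single-transaction nova_api $TABLES > /tmp/placement-tables.sql
    mysql placement < /tmp/placement-tables.sql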
> > Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > > Test scenarios I've been testing with: > 1) deploy nova-cc et al train, configure keystonev3, deploy instance > 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, > deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy > instance 3 > > There is currently an issue with the second test scenario where instance 2 > creation errors because nova-scheduler can't find a valid placement > candidate (not sure of the exact error atm). However if I delete instance 1 > before creating instance 2 it is created successfully. It feels like a DB > related issue but I'm really not sure so I'll keep digging. > > Thanks! > Corey > In case anyone needs these for testing prior to the code getting merged I've pushed placement and nova-cloud-controller charms to the charm store under my namespace. I've released them to the edge channel. https://jaas.ai/u/corey.bryant/placement/bionic/0 https://jaas.ai/u/corey.bryant/nova-cloud-controller/bionic/0 Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From waboring at hemna.com Fri Oct 4 16:03:40 2019 From: waboring at hemna.com (Walter Boring) Date: Fri, 4 Oct 2019 12:03:40 -0400 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: Message-ID: So looking into the cinder code, calling attachment_delete should be what we want to call. But. I think if we don't have a host connector passed in and the attachment record doesn't have a connector saved, then that results in the volume manager not calling the cinder driver to terminate_connection and return. This also bypasses the driver's remove_export() which is the last chance for a driver to unexport a volume. Walt On Thu, Oct 3, 2019 at 7:27 PM Matt Riedemann wrote: > Hello Cinderinos, > > I've now got a working patch that migrates legacy volume attachments to > new style v3 attachments [1]. The fun stuff showing it working is in > this paste [2]. > > We want to do this data migration in nova because we still have a lot of > compatibility code since Queens for pre-v3 style attachments and we > can't remove that compatibility code (ever) if we don't first make sure > we provide a data migration routine for operators to roll through. So > for example if this lands in Ussuri we can can enforce a nova-status > upgrade check in V and rip out code in X. > > Without digging into the patch, this is the flow: > > 1. On nova-compute restart, query the nova DB for instances on the > compute host with legacy volume attachments. > > 2. For each of those, create a new style attachment with the host > connector and update the BlockDeviceMapping information in the nova DB > (attachment_id and connection_info). > > 3. Delete the existing legacy attachment so when the server is deleted > the volume status goes back to 'available' due to proper attachment > reference counting in the Cinder DB. > > My main question is on #3. Right now I'm calling the v3 attachment > delete API rather than the v2 os-terminate_connection API. Is that > sufficient to cleanup the legacy attachment on the storage backend even > though the connection was created via os-initialize_connection > originally? 
Looking at the cinder code, attachment_delete hits the > connection terminate code under the covers [3]. So that looks OK. The > only thing I can really think of is if a host connector is not provided > or tracked with the legacy attachment, is that going to cause problems? > Note that I think volume drivers are already required to deal with that > today anyway because of the "local delete" scenario in the compute API > where the compute host that the server is on is down and thus we don't > have a host connector to provide to Cinder to terminate the connection. > > So Cinder people, are you OK with this flow? > > Hello Novaheads, > > Do you have any issues with the above? Note the migration routine is > threaded out on compute start so it doesn't block, similar to the ironic > flavor data migration introduced in Pike. > > One question I have is if we should add a config option for this so > operators can enable/disable it as needed. Note that this requires nova > to be configured with a service user that has the admin role to do this > stuff in cinder since we don't have a user token, similar to nova doing > things with neutron ports without a user token. Testing this with > devstack requires [4]. By default [cinder]/auth_type is None and not > required so by default this migration routine is not going to run so > maybe that is sufficient? > > Hello Operatorites, > > Do you have any issues with what's proposed above? > > [1] https://review.opendev.org/#/c/549130/ > [2] http://paste.openstack.org/show/781063/ > [3] > > https://github.com/openstack/cinder/blob/410791580ef60ddb03104bf20766859ed9d78932/cinder/volume/manager.py#L4650 > [4] https://review.opendev.org/#/c/685488/ > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Fri Oct 4 17:52:34 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 4 Oct 2019 12:52:34 -0500 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: > It would be good to have a rough idea of who will be available at each > opportunity. To keep it simple, I created a sign-up sheet at: > > https://etherpad.openstack.org/p/meet-the-project-leaders If a PTL will not be present, is it acceptable to send a delegate? efried . From fungi at yuggoth.org Fri Oct 4 18:07:12 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 4 Oct 2019 18:07:12 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: <20191004180712.323nlymaxedoib54@yuggoth.org> On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > It would be good to have a rough idea of who will be available > > at each opportunity. To keep it simple, I created a sign-up > > sheet at: > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > If a PTL will not be present, is it acceptable to send a delegate? The goal, as I understand it, is to reinforce to attendees in China that OpenStack project leadership is accessible and achievable, by providing opportunities for them to be able to meet and speak in-person with a representative cross-section of our community leaders. Is that something which can be delegated? 
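For reference, the two cleanup paths being weighed in this sub-thread differ at the REST level roughly as follows (Cinder block storage v3; the request body shown for the legacy action is a from-memory sketch rather than a copy from the API reference):

    # New style: delete the attachment record and let Cinder drive
    # terminate_connection/remove_export internally.
    DELETE /v3/{project_id}/attachments/{attachment_id}

    # Legacy style: explicitly terminate the connection, passing the host connector.
    POST /v3/{project_id}/volumes/{volume_id}/action
    {"os-terminate_connection": {"connector": {"host": "...", "initiator": "...", "ip": "..."}}}

Whether the first path still reaches the driver's terminate_connection/remove_export when no connector was ever recorded is exactly the open question raised above.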
Seems to me it might convey the opposite of what's intended, but I don't know if my impression is shared by others. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mriedemos at gmail.com Fri Oct 4 18:32:18 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Oct 2019 13:32:18 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: Message-ID: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> On 10/4/2019 11:03 AM, Walter Boring wrote: >   I think if we don't have a host connector passed in and the > attachment record doesn't have a connector saved, > then that results in the volume manager not calling the cinder driver to > terminate_connection and return. > This also bypasses the driver's remove_export() which is the last chance > for a driver to unexport a volume. Two things: 1. Yeah if the existing legacy attachment record doesn't have a connector I was worried about not properly cleaning on for that old connection, which is something I mentioned before, but also as mentioned we potentially have that case when a server is deleted and we can't get to the compute host to get the host connector, right? 2. If I were to use os-terminate_connection, I seem to have a tricky situation on the migration flow because right now I'm doing: a) create new attachment with host connector b) complete new attachment (put the volume back to in-use status) - if this fails I attempt to delete the new attachment c) delete the legacy attachment - I intentionally left this until the end to make sure (a) and (b) were successful. If I change (c) to be os-terminate_connection, will that screw up the accounting on the attachment created in (a)? If I did the terminate_connection first (before creating a new attachment), could that leave a window of time where the volume is shown as not attached/in-use? Maybe not since it's not the begin_detaching/os-detach API...I'm fuzzy on the cinder volume state machine here. Or maybe the flow would become: a) create new attachment with host connector b) terminate the connection for the legacy attachment - if this fails, delete the new attachment created in (a) c) complete the new attachment created in (a) - if this fails...? Without digging into the flow of a cold or live migration I want to say that's closer to what we do there, e.g. initialize_connection for the new host, terminate_connection for the old host, complete the new attachment. -- Thanks, Matt From gmann at ghanshyammann.com Fri Oct 4 19:30:19 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 04 Oct 2019 14:30:19 -0500 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191004180712.323nlymaxedoib54@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: <16d984064d2.bc633ba6242736.627005749645226424@ghanshyammann.com> ---- On Fri, 04 Oct 2019 13:07:12 -0500 Jeremy Stanley wrote ---- > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > > It would be good to have a rough idea of who will be available > > > at each opportunity. 
To keep it simple, I created a sign-up > > > sheet at: > > > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > > > If a PTL will not be present, is it acceptable to send a delegate? > > The goal, as I understand it, is to reinforce to attendees in China > that OpenStack project leadership is accessible and achievable, by > providing opportunities for them to be able to meet and speak > in-person with a representative cross-section of our community > leaders. Is that something which can be delegated? Seems to me it > might convey the opposite of what's intended, but I don't know if my > impression is shared by others. IMO, it should be ok to delegate to other Core of that project. the main idea here is to interact with Chinese communities and help new contributors to onboard or just convey them 'if you are interested in this project, I am here to talk to you'. I think it will be more useful sessions if we have more local Core members also along with PTLs which will solve the cultural or language barrier if any. -gmann > -- > Jeremy Stanley > From colleen at gazlene.net Sat Oct 5 00:05:27 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 04 Oct 2019 17:05:27 -0700 Subject: [keystone] Keystone Team Update - Week of 30 September 2019 Message-ID: # Keystone Team Update - Week of 30 September 2019 ## News Quiet week as we wait for the final release and start preparing for Forum and next cycle. ## Action Items Team members: see action required regarding the new roadmap tracker[1]. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009942.html ## Office Hours When there are topics to cover, the keystone team holds office hours on Tuesdays at 17:00 UTC. We won't plan to hold office hours next week. Add topics you would like to see covered during office hours to the etherpad: https://etherpad.openstack.org/p/keystone-office-hours-topics ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 7 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 33 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 1 new bugs and closed 3. Bugs opened (1) Bug #1846817 (keystone:Medium) opened by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1846817 Bugs fixed (3) Bug #968696 (keystone:High) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/968696 Bug #1630434 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1630434 Bug #1806762 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1806762 Notably, we closed #968696 *for keystone*, as we have completed the migration of our policies to understand system scope and, when [oslo_policy]/enforce_scope is set to true and deprecated policies are overridden, system-wide requests won't respond to project-scoped tokens. This does not mean the "admin"-ness problem is solved across OpenStack, as it will have to be addressed on a service-by-service basis. ## Milestone Outlook https://releases.openstack.org/train/schedule.html Next week will be the last chance to release another RC if we need one. Please help triage and address any RC-critical bugs should they come up. 
Also, the release schedule for Ussuri has been published: https://releases.openstack.org/ussuri/schedule.html ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From rico.lin.guanyu at gmail.com Sat Oct 5 02:56:31 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Sat, 5 Oct 2019 10:56:31 +0800 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Message-ID: On Thu, Oct 3, 2019 at 6:29 PM Thierry Carrez wrote: > > OpenStack PTLs, TC members, core reviewers, UC members interested in > meeting the local community are all welcome. We'll also have leaders > from the other OSF-supported projects around. > Is it possible to include SIG chairs as well? I think it is a good opportunity for people to meet SIGs and SIGs to find people and project teams too. > Thierry Carrez (ttx) > -- May The Force of OpenStack Be With You, Rico Lin irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Sat Oct 5 04:41:35 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Sat, 5 Oct 2019 10:11:35 +0530 Subject: Neutron Dhcp-agent Message-ID: Dear Team, Please clarify on how can I use dhcp-agent of neutron as a relay and to use existing dhcp for allocation of IPs to instances? -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sat Oct 5 17:32:30 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 5 Oct 2019 19:32:30 +0200 Subject: Neutron Dhcp-agent In-Reply-To: References: Message-ID: <7DC3F60F-41C2-4418-89A7-634D409AF40B@redhat.com> Hi, Neutron DHCP agent can’t configure DHCP relay for Your network. It don’t work like that. > On 5 Oct 2019, at 06:41, Jyoti Dahiwele wrote: > > Dear Team, > > Please clarify on how can I use dhcp-agent of neutron as a relay and to use existing dhcp for allocation of IPs to instances? — Slawek Kaplonski Senior software engineer Red Hat From akalambu at cisco.com Sat Oct 5 17:34:24 2019 From: akalambu at cisco.com (Ajay Kalambur (akalambu)) Date: Sat, 5 Oct 2019 17:34:24 +0000 Subject: [openstack][heat-cfn] CFN Signaling with heat Message-ID: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> Hi I was trying the Software Deployment/Structured deployment of heat. I somehow can never get the signaling to work I see that authentication is happening but I don’t see a POST from the VM as a result stack is stuck in CREATE_IN_PROGRESS I see this message in my heat api cfn log which seems to suggest authentication is successful but it does not seem to POST. Have included debug output from VM and also the sample heat template I used. Don’t know if the template is correct as I referred some online examples to build it 2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS credentials.. 2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials found, checking against keystone. 2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating with http://10.10.173.9:5000/v3/ec2tokens 2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS authentication successful. 
2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server [req-506f22c6-4062-4a84-8e85-40317a4099ed - adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 HTTP/1.1" 200 4669 1.418045 Some debugging output from my VM: [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug /var/lib/os-collect-config/local-data not found. Skipping [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase pre-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: pre-configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase pre-configure [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/20-os-apply-config [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf [2019/10/05 05:32:47 PM] [INFO] success dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/55-heat-config [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config 0.345 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose 0.064 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet 0.134 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config 0.065 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- 
[2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase configure [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase post-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed ++ os-apply-config --key completion-handle --type raw --key-default '' + HANDLE= ++ os-apply-config --key completion-signal --type raw --key-default '' + SIGNAL= ++ os-apply-config --key instance-id --type raw --key-default '' + ID=i-0000000d + '[' -n i-0000000d ']' + '[' -n '' ']' + '[' -n '' ']' ++ os-apply-config --key deployments --type raw --key-default '' ++ jq -r 'map(select(.group == "os-apply-config") | select(.inputs[].name == "deploy_signal_id") | .id + (.inputs | map(select(.name == "deploy_signal_id")) | .[].value)) | .[]' + DEPLOYMENTS= + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed completed dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: post-configure.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed 1.206 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase post-configure [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase migration dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: migration.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase migration onfig]# cat /var/run/heat-config/heat-config [{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "other_deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": 
"http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": "f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf [DEFAULT] command = os-refresh-config collectors = ec2 collectors = cfn collectors = local [cfn] metadata_url = http://172.29.85.87:8000/v1/ stack_name = test_stack secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG access_key_id = f7874ac9898248edaae53511230534a4 path = sig-vm-1.Metadata Here is my basic sample temple heat_template_version: 2013-05-23 description: > This template demonstrates how to use OS::Heat::StructuredDeployment to override substitute get_input placeholders defined in OS::Heat::StructuredConfig config. As there is no hook on the server to act on the configuration data, these deployment resource will perform no actual configuration. 
parameters: flavor: type: string default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947' image: type: string default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe public_net_id: type: string default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db private_net_id: type: string description: Private network id default: 995fc046-1c58-468a-b81c-e42c06fc8966 private_subnet_id: type: string description: Private subnet id default: 7598c805-3a9b-4c27-be5b-dca4d89f058c password: type: string description: SSH password default: lab123 resources: the_sg: type: OS::Neutron::SecurityGroup properties: name: the_sg description: Ping and SSH rules: - protocol: icmp - protocol: tcp port_range_min: 22 port_range_max: 22 config: type: OS::Heat::StructuredConfig properties: config: config_value_foo: {get_input: foo} config_value_bar: {get_input: bar} deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fooooo bar: baaaaa other_deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fu bar: barmy server1_port0: type: OS::Neutron::Port properties: network_id: { get_param: private_net_id } security_groups: - default fixed_ips: - subnet_id: { get_param: private_subnet_id } server1_public: type: OS::Neutron::FloatingIP properties: floating_network_id: { get_param: public_net_id } port_id: { get_resource: server1_port0 } sig-vm-1: type: OS::Nova::Server properties: name: sig-vm-1 image: { get_param: image } flavor: { get_param: flavor } networks: - port: { get_resource: server1_port0 } user_data_format: SOFTWARE_CONFIG user_data: get_resource: cloud_config cloud_config: type: OS::Heat::CloudConfig properties: cloud_config: password: { get_param: password } chpasswd: { expire: False } ssh_pwauth: True -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Sun Oct 6 09:51:11 2019 From: zigo at debian.org (Thomas Goirand) Date: Sun, 6 Oct 2019 11:51:11 +0200 Subject: Neutron Dhcp-agent In-Reply-To: References: Message-ID: <1fc4dc01-b50a-36d9-fb46-9ee412762930@debian.org> On 10/5/19 6:41 AM, Jyoti Dahiwele wrote: > Dear Team, > > Please clarify on how can I use dhcp-agent of neutron as a relay and to > use existing dhcp for allocation of IPs to instances? What Neutron does is setup a dnsmasq instance for each of your subnets, and setup L2 and L3 connectivity in the namespace of this subnet, where the dnsmasq runs. Subnets can be moved (manually) from one DHCP agent to another. Cheers, Thomas From zigo at debian.org Sun Oct 6 09:58:13 2019 From: zigo at debian.org (Thomas Goirand) Date: Sun, 6 Oct 2019 11:58:13 +0200 Subject: ANNOUNCE: Train packages repository for Debian Buster is now available and tested Message-ID: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org> Hi, It's been a few days already, there's some fully working (and tested) Debian repositories backported to Buster for train. The URLs are using the usual scheme: deb http://buster-train.debian.net/debian/ buster-train-backports main deb-src http://buster-train.debian.net/debian/ buster-train-backports main deb http://buster-train.debian.net/debian/ buster-train-backports-nochange main deb-src http://buster-train.debian.net/debian/ buster-train-backports-nochange main Early last week, I was able to test this, doing my first deployment, and starting my first VM on it. 
I haven't run tempest on this yet, though my manual tests went well (ie: floating IP, ssh to instance, mounting a cinder volume over Ceph and LVM, etc.). Please do test this, and report any eventual issue. If everything goes as planned, I'll be at the Debian cloud sprint in Boston the week of the release, discussing the Debian official images for the cloud. So I will only be able to upload the final versions of projects for Train only then after (probably, during the week-end, so it's available on Monday). Cheers, Thomas Goirand (zigo) From marcin.juszkiewicz at linaro.org Mon Oct 7 06:10:50 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Mon, 7 Oct 2019 08:10:50 +0200 Subject: ANNOUNCE: Train packages repository for Debian Buster is now available and tested In-Reply-To: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org> References: <9916226a-f844-8963-03ee-dd67bdac1dfd@debian.org> Message-ID: <2e57ae64-7aa2-c8ec-28fc-1869ffbbc386@linaro.org> W dniu 06.10.2019 o 11:58, Thomas Goirand pisze: > Hi, > > It's been a few days already, there's some fully working (and tested) > Debian repositories backported to Buster for train. The URLs are using > the usual scheme: > > deb http://buster-train.debian.net/debian/ buster-train-backports main > deb-src http://buster-train.debian.net/debian/ buster-train-backports main > deb http://buster-train.debian.net/debian/ > buster-train-backports-nochange main > deb-src http://buster-train.debian.net/debian/ > buster-train-backports-nochange main > > Early last week, I was able to test this, doing my first deployment, and > starting my first VM on it. I haven't run tempest on this yet, though my > manual tests went well (ie: floating IP, ssh to instance, mounting a > cinder volume over Ceph and LVM, etc.). > > Please do test this, and report any eventual issue. If everything goes > as planned, I'll be at the Debian cloud sprint in Boston the week of the > release, discussing the Debian official images for the cloud. So I will > only be able to upload the final versions of projects for Train only > then after (probably, during the week-end, so it's available on Monday). We use them in Kolla project. All images builds fine. Not tested deployment yet. From tbechtold at suse.com Mon Oct 7 07:30:30 2019 From: tbechtold at suse.com (Thomas Bechtold) Date: Mon, 7 Oct 2019 09:30:30 +0200 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: Message-ID: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> +1 from me, too. On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: > Dear Zorillas and other Stackers, > > I would like to formalize the conversations we've been having amongst > ourselves over IRC and in-person. At the outset, we have a lot of > incoming changes to review, but we have limited core maintainer > attention. We haven't re-jigged our core maintainers team as often as > we'd like, and that's partly to blame. We have some relatively new and > enthusiastic contributors that we would love to encourage to become > maintainers! We've mentored contributors 1-1, n-1 before before adding > them to the maintainers team. We would like to do more of this!** > > In this spirit, I would like your inputs on adding Douglas Viroel > (dviroel) to the core maintainers team for manila and its associated > projects (manila-specs, manila-ui, python-manilaclient, > manila-tempest-plugin, manila-test-image, manila-image-elements). 
> Douglas has been an active contributor for the past two releases and > has valuable review inputs in the project. While he's been around here > less longer than some of us, he brings a lot of experience to the > table with his background in networking and shared file systems. He > has a good grasp of the codebase and is enthusiastic in adding new > features and fixing bugs in the Ussuri cycle and beyond. > > Please give me a +/-1 for this proposal. > > ** If you're interested in helping us maintain Manila by being part of > the manila core maintainer team, please reach out to me or any of the > current maintainers, we would love to work with you and help you grow > into that role! > > Thanks, > Goutham Pacha Ravi (gouthamr) > > From bcafarel at redhat.com Mon Oct 7 08:21:35 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 7 Oct 2019 10:21:35 +0200 Subject: [neutron] Bug deputy report (week starting on 2019-09-30) Message-ID: Hello Neutrinos, train is almost ready to leave the station, and it is time for a new bug deputy rotation cycle! I was on duty last week, triaging bugs up to 1846703 included A quiet week, with most bugs having potential fixes or good discussions. First one listed could benefit from another pair of eyes Undecided: * neutron-openvswitch-agent and IPv6 - https://bugs.launchpad.net/neutron/+bug/1846494 Can not use an IPv6 address for OpenFlow connections listening address (of_listen_address) High: * Pyroute2 can return dictionary keys in bytes instead of strings - https://bugs.launchpad.net/neutron/+bug/1846360 Fix in progress: https://review.opendev.org/686206 * [mysql8] Unknown column 'public' in 'firewall_rules_v2' - https://bugs.launchpad.net/neutron/+bug/1846606 neutron-fwaas db creation failing with mysql 8 Fix in progress: https://review.opendev.org/686753 Medium: * Designate integration not fully multi region safe - https://bugs.launchpad.net/neutron/+bug/1845891 Fix released: https://review.opendev.org/684854 RFE: * routed network for hypervisor - https://bugs.launchpad.net/neutron/+bug/1846285 Proposition to have routed networks separation at hypervisor level directly, apparently already running in-house at bug reporter's Wishlist: * Avoid neutron to return error 500 when deleting port if designate is down - https://bugs.launchpad.net/neutron/+bug/1846703 Another bug for Designate support, port create and delete operations do not react the same when designate is down Some discussions also in https://review.opendev.org/685644 Opinion: * ovs VXLAN over IPv6 conflicts with linux native VXLAN over IPv4 using standard port - https://bugs.launchpad.net/neutron/+bug/1846507 Configuration issue in kolla-ansible CI, ovs-agent and CI configuration competing for IPv6 binding address. Neutron listed for possible insights on the issue Invalid: * ha router appear double vip - https://bugs.launchpad.net/neutron/+bug/1845900 Kolla issue with HA controllers when stopping L3 agent - keepalived processes are in same container and are killed at same time as agent, added Kolla to affected projects * packet loss during active L3 HA agent restart - https://bugs.launchpad.net/neutron/+bug/1846198 Similar issue for openstack-ansible, it kills all processes in control group (including keepalived processes) when restarting the systemd unit - added OSA as affected project Thanks! Passing the deputy role to slaweq -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Mon Oct 7 08:38:05 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 10:38:05 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> Message-ID: Eric Fried wrote: >> It would be good to have a rough idea of who will be available at each >> opportunity. To keep it simple, I created a sign-up sheet at: >> >> https://etherpad.openstack.org/p/meet-the-project-leaders > > If a PTL will not be present, is it acceptable to send a delegate? Sure! The goal is to provide an opportunity for the Chinese community to meet project team members, not to make it an exclusive event. Anyone's welcome. + we should use those opportunities to promote the on-boarding sessions which will happen later in the week. -- Thierry From thierry at openstack.org Mon Oct 7 08:39:13 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 10:39:13 +0200 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> Message-ID: Rico Lin wrote: > On Thu, Oct 3, 2019 at 6:29 PM Thierry Carrez > wrote: > > > > > OpenStack PTLs, TC members, core reviewers, UC members interested in > > meeting the local community are all welcome. We'll also have leaders > > from the other OSF-supported projects around. > > > Is it possible to include SIG chairs as well? > I think it is a good opportunity for people to meet SIGs and SIGs to > find people and project teams too. Yes, of course (see my other response for rationale). -- Thierry Carrez (ttx) From a.settle at outlook.com Mon Oct 7 08:40:01 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Mon, 7 Oct 2019 08:40:01 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191004180712.323nlymaxedoib54@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > > > It would be good to have a rough idea of who will be available > > > at each opportunity. To keep it simple, I created a sign-up > > > sheet at: > > > > > > https://etherpad.openstack.org/p/meet-the-project-leaders > > > > If a PTL will not be present, is it acceptable to send a delegate? > > The goal, as I understand it, is to reinforce to attendees in China > that OpenStack project leadership is accessible and achievable, by > providing opportunities for them to be able to meet and speak > in-person with a representative cross-section of our community > leaders. Is that something which can be delegated? Seems to me it > might convey the opposite of what's intended, but I don't know if my > impression is shared by others. That was indeed my intention with the initial idea proposal. Conceptually, the meetup would be to break down any preconceived notions that individuals may have. Of course, that isn't to say that there aren't many leaders in the community that don't hold an _official_ position. It's mostly to put a face to a name, to create open communication channels. I'd say it's up the the team's discretion as to whether or not they'd like to delegate the presence at this meetup. 
This meetup is not compulsory for anyone, so if you can't go and can't
delegate, that is also fine.

--
Alexandra Settle
IRC: asettle

From mark at stackhpc.com  Mon Oct 7 08:55:18 2019
From: mark at stackhpc.com (Mark Goddard)
Date: Mon, 7 Oct 2019 09:55:18 +0100
Subject: [kolla] Feature freeze
Message-ID:

Hello Koalas,

We are now in feature freeze for the Train release. Cores, please do not
approve feature patches on the master branch until we have created the
stable/train branch. We will allow some exceptions which must be approved
by the core team. Currently, we have nova cells support and IPv6-only
mode. Please apply for feature freeze exceptions either on
openstack-discuss or during the weekly IRC meeting. The deadline for
merging features with exceptions is Friday 18th October.

Please now focus on bug fixing and testing.

Thanks,
Mark

From bluejay.ahn at gmail.com  Mon Oct 7 10:36:50 2019
From: bluejay.ahn at gmail.com (Jaesuk Ahn)
Date: Mon, 7 Oct 2019 19:36:50 +0900
Subject: [Airship-discuss] Fwd: OOK,Airship
In-Reply-To: <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk>
References: <963B5DA1-1C3D-481B-A41B-D11369BC1848@openstack.org>
 <69277446-4470-3bd2-6cd4-b0f61c3e21e3@ebi.ac.uk>
Message-ID:

Hi Charles,

As briefly mentioned in the previous email, SKT is running OOK in several
production environments: SKT's LTE/5G NSA infrastructure for a certain VNF
(Virtualized Network Function), a private cloud, and the cloud
infrastructure for VDI.

SKT started exploring OOK in late 2016 exactly because of that "bumpy
experience due to issues with configuration maintenance/upgrade". We got
very lucky to work with AT&T from the beginning both on openstack-helm and
airship-armada. SKT now has a slightly different technology set from
Airship: we have ansible+ironic+kubeadm+airship-armada+openstack-helm.
You can see all the code and information at the following links. We opened
our codebase in July (we call it "taco": SKT All Container OpenStack).

- https://github.com/openinfradev
- https://github.com/openinfradev/tacoplay

In addition, we have a concrete plan to develop a "2nd generation of OOK"
that will be very similar to what Airship 2.0 looks like. We will work
with the Airship community on this route.

I hope this helps your research into the OOK option. You can always ask me
any questions on this topic; I will be happy to help.

FYI, here is a presentation about what we did.
- https://www.openstack.org/videos/summits/berlin-2018/you-can-start-small-and-grow-sk-telecoms-use-case-on-armada

Thanks!

On Thu, Oct 3, 2019 at 9:14 PM, Charles wrote:

> Hi Roman,
>
> Many thanks for the reply.
>
> I posted this on openstack-discuss because I was wondering if any
> users/Openstack operators out there (outside large corporations who are
> members of the Airship development framework) are actually running OOK
> in production. This could be Airship, or some other Kubernetes
> distribution running Openstack Helm.
>
> Our several years experience of managing Openstack so far
> (RHOSP/TripleO) has been bumpy due to issues with configuration
> maintenance /upgrades. The idea of using CI/CD and Kubernetes/Helm to
> manage Openstack is compelling and fits nicely into the DevOps framework
> here. If we were to explore this route we could 'roll our own' with a
> deployment say based on https://opendev.org/airship/treasuremap , or pay
> for and Enterprise solution that incorporates the OOK model (upcoming
> Mirantis and SUSE potentially).
>
> Regards
>
> Charles
>
>
> On 03/10/2019 12:04, Roman Gorshunov wrote:
> > Thanks Ashlee!
> > > > Charles, > > A few companies who work on development of Airship do use it, > > including production uses: AT&T, SUSE, Mirantis, Ericsson, SK Telekom > > and others. Many of those companies (if not all) use Airship + > > OpenStack Helm as well. > > > > Airship, as you have mentioned, is a collection of components for > > undercloud control plane, which helps to deploy nodes with > > OS+Docker+Kubernetes on it, configure/manage it all in GitOps way, and > > then help to maintain the configuration. It also allows to manage > > deploys and maintenance of whatever runs on top of Kubernetes cluster, > > would that be OpenStack Helm or other software packaged in Helm > > format. > > > > OpenStack Helm does not really require to be running on > > Airship-managed cluster. It could run standalone. > > > > Yes, you can roll out an open source production grade > > Airship/Openstack Helm deployment today. Good example of production > > grade configuration could be found in airship/treasuremap repository > > [0] as 'seaworthy' site definition. You are welcome to try, of course. > > For the questions - reach out to us on IRC #airshipit at Freenode of via > > Airship-discuss mailing list. > > > > [0] https://opendev.org/airship/treasuremap > > > > Best regards, > > -- > > Roman Gorshunov > > > > On Wed, Oct 2, 2019 at 9:27 PM Ashlee Ferguson > wrote: > >> Hi Charles, > >> > >> Glad to hear you’re interested! Forwarding this to the Airship ML since > there may be folks on this mailing list that will have pointers who didn't > see the openstack-discuss post. > >> > >> Ashlee > >> > >> > >> > >> Begin forwarded message: > >> > >> From: Charles > >> Subject: OOK,Airship > >> Date: October 2, 2019 at 5:39:16 PM GMT+2 > >> To: openstack-discuss at lists.openstack.org > >> > >> Hi, > >> > >> > >> We are interested in OOK and Openstack Helm. > >> > >> Has anyone any experience with Airship (now that 1.0 is out)? > >> > >> Noticed that a few Enterprise distributions are looking at managing the > Openstack control plane with Kubernetes and have been testing Airship with > a view to rolling it out (Mirantis,SUSE) > >> > >> Is this a signal that there is momentum around Openstack Helm? > >> > >> Is it possible to roll out an open source production grade > Airship/Openstack Helm deployment today, or is it too early? > >> > >> > >> Thoughts? > >> > >> > >> Charles > >> > >> > >> > >> > >> _______________________________________________ > >> Airship-discuss mailing list > >> Airship-discuss at lists.airshipit.org > >> http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss > > -- > Charles Short > Senior Cloud Engineer > EMBL-EBI > Hinxton > 01223494205 > > > _______________________________________________ > Airship-discuss mailing list > Airship-discuss at lists.airshipit.org > http://lists.airshipit.org/cgi-bin/mailman/listinfo/airship-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From no-reply at openstack.org Mon Oct 7 12:01:48 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 12:01:48 -0000 Subject: octavia 5.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for octavia for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/octavia/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. 
You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/octavia/src/branch/stable/train Release notes for octavia can be found at: https://docs.openstack.org/releasenotes/octavia/ If you find an issue that could be considered release-critical, please file it at: https://storyboard.openstack.org/#!/project/908 and tag it *train-rc-potential* to bring it to the octavia release crew's attention. From no-reply at openstack.org Mon Oct 7 12:03:51 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 12:03:51 -0000 Subject: storlets 4.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for storlets for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/storlets/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/storlets/src/branch/stable/train Release notes for storlets can be found at: https://docs.openstack.org/releasenotes/storlets/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/storlets/+bugs and tag it *train-rc-potential* to bring it to the storlets release crew's attention. From fungi at yuggoth.org Mon Oct 7 13:36:42 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Oct 2019 13:36:42 +0000 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> Message-ID: <20191007133641.f4q2ylxckr362pop@yuggoth.org> On 2019-10-07 08:40:01 +0000 (+0000), Alexandra Settle wrote: > On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: [...] > > > If a PTL will not be present, is it acceptable to send a > > > delegate? > > > > The goal, as I understand it, is to reinforce to attendees in > > China that OpenStack project leadership is accessible and > > achievable, by providing opportunities for them to be able to > > meet and speak in-person with a representative cross-section of > > our community leaders. Is that something which can be delegated? > > Seems to me it might convey the opposite of what's intended, but > > I don't know if my impression is shared by others. > > That was indeed my intention with the initial idea proposal. > Conceptually, the meetup would be to break down any preconceived > notions that individuals may have. > > Of course, that isn't to say that there aren't many leaders in the > community that don't hold an _official_ position. It's mostly to > put a face to a name, to create open communication channels. I'd > say it's up the the team's discretion as to whether or not they'd > like to delegate the presence at this meetup. Of course, I should have clarified. I think providing folks the opportunity to meet and speak with a Nova core reviewer is great. It's definitely a type of leadership we prize highly in our community and want to encourage more of. 
Being "the person who showed up on behalf of the Nova PTL because they're not present" doesn't really make the Nova PTL position any more approachable on the other hand. If anything, it seems to me that it might reinforce the impression it's a distant and unachievable position. > This meetup is not compulsory for anyone, so if you can't go, and > can't delegate, that is also fine. Yep, I think having a variety of different sorts of community leaders present is what's needed, it doesn't have to (and realistically, probably can't anyway?) involve every one of the ~hundred teams, SIGs, and other organized groups within the community. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From corey.bryant at canonical.com Mon Oct 7 13:58:04 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Mon, 7 Oct 2019 09:58:04 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 9:48 AM Corey Bryant wrote: > One other issue is "pxc-strict-mode: disabled" for percona-cluster is > required to test this. /usr/share/placement/mysql-migrate-db.sh may need > some updates but I haven't dug into that yet. > > I have a review up for this issue now at: https://review.opendev.org/#/c/687056/ Thanks, Corey > On Fri, Oct 4, 2019 at 9:41 AM Corey Bryant > wrote: > >> Hi All, >> >> I'd like to see if I can get some input on the current state of the >> Placement API split. >> >> For some background, the nova placement API was removed from nova in >> train, and it's been split into its own project. It's mostly just a basic >> API charm. The tricky part is the migration of tables from the nova_api >> database to the placement database. >> >> Code is located at: >> https://github.com/coreycb/charm-placement >> https://github.com/coreycb/charm-interface-placement >> >> https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) >> >> Test scenarios I've been testing with: >> 1) deploy nova-cc et al train, configure keystonev3, deploy instance >> 2) deploy nova-cc et al stein, configure keystonev3, deploy instance 1, >> deploy placement train, deploy instance 2, upgrade nova-cc to train, deploy >> instance 3 >> >> There is currently an issue with the second test scenario where instance >> 2 creation errors because nova-scheduler can't find a valid placement >> candidate (not sure of the exact error atm). However if I delete instance 1 >> before creating instance 2 it is created successfully. It feels like a DB >> related issue but I'm really not sure so I'll keep digging. >> >> Thanks! >> Corey >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Oct 7 14:24:59 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 7 Oct 2019 16:24:59 +0200 Subject: [oslo] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late Message-ID: Hi, I request a late feature freeze exception (FFE) for https://review.opendev.org/#/c/686598/ and https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will fix an issue that can be blocking for users so it's can be really valuable for operators, if we release it ASAP. They would be delighted if it were included in Train. Please let me know if you have any concerns or questions. Thank you for your consideration. 
Hervé -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From no-reply at openstack.org Mon Oct 7 14:40:21 2019 From: no-reply at openstack.org (no-reply at openstack.org) Date: Mon, 07 Oct 2019 14:40:21 -0000 Subject: cinder 15.0.0.0rc2 (train) Message-ID: Hello everyone, A new release candidate for cinder for the end of the Train cycle is available! You can find the source code tarball at: https://tarballs.openstack.org/cinder/ Unless release-critical issues are found that warrant a release candidate respin, this candidate will be formally released as the final Train release. You are therefore strongly encouraged to test and validate this tarball! Alternatively, you can directly test the stable/train release branch at: https://opendev.org/openstack/cinder/src/branch/stable/train Release notes for cinder can be found at: https://docs.openstack.org/releasenotes/cinder/ If you find an issue that could be considered release-critical, please file it at: https://bugs.launchpad.net/cinder/+bugs and tag it *train-rc-potential* to bring it to the cinder release crew's attention. From luka.peschke at objectif-libre.com Mon Oct 7 14:53:28 2019 From: luka.peschke at objectif-libre.com (Luka Peschke) Date: Mon, 07 Oct 2019 16:53:28 +0200 Subject: [cloudkitty] 07/10 IRC meeting recap Message-ID: <1b49a519ea12fb979e4cc688506a5e7c@objectif-libre.com> Hello everybody, This is the recap for today's IRC meeting of the cloudkitty team. The agenda can be found at [1] and the logs can be found at [2]. cloudkitty 11.0.0 and python-cloudkittyclient 3.1.0 =================================================== Cloudkitty 11.0.0 has been released on september 25th. If no critical bug is reported, it will be final release for the train cycle. The release notes for the train cycle can be found at [3]. New meeting schedule ==================== As discussed, the cloudkitty IRC meeting will now happen on the 1st and 3rd monday of each month at 14h00 UTC. This time has been chosen because cloudkitty's main contributors are split between montreal and france. Of course, if anyone from an incompatible timezone would like to take part in the meetings, we can re-adjust the schedule. From now on, we'll provide a recap to the ML after each meeting. New features / specs / projects =============================== First, I'd like to welcome our two new contributors, Quentin Anglade (qanglade) and Julien Pinchelimouroux (julien-pinchelim). Quentin has been working on porting some v1 API endpoints to v2, more specifically the ones used for rating module configuration (/v1/rating/modules). These endpoints will be included in the Ussuri version. 
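As a rough illustration of what is being ported, the v1 module listing can
be queried with something along these lines (a sketch only -- substitute
your own CloudKitty API endpoint and token):

    curl -H "X-Auth-Token: $TOKEN" $CLOUDKITTY_ENDPOINT/v1/rating/modules

The ported endpoints will expose the same information through the v2 API
once Quentin's work lands.
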
Julien is working on a standalone dashboard for cloudkitty. It will be compatible with the standalone mode, but will also support keystone integration. It should provide a more modern and easier to use interface than the cloudkitty-dashboard horizon plugin. It will require the v2 API to work. I've been busy with some improvements to v2 API performance, in particular regarding driver loading. The spec can be found at [4]. Tempest plugin ============== Justin (jferrieu) has been working on the tempest plugin. Now that the Elasticsearch v2 storage driver is supported in devstack, we plan to add a lot more tests and some complete scenarios. Some of Justin's work on differenciating v1 and v2 API tempest tests can be found at [5]. The next meeting will happen on October 21st at 14h00 UTC. Cheers, -- Luka Peschke (peschk_l) [1] https://etherpad.openstack.org/p/cloudkitty-meeting-topics [2] http://eavesdrop.openstack.org/meetings/cloudkitty/2019/cloudkitty.2019-10-07-14.00.log.html [3] https://docs.openstack.org/releasenotes/cloudkitty/train.html [4] https://review.opendev.org/#/c/686391/ [5] https://review.opendev.org/#/c/686210/ From openstack at nemebean.com Mon Oct 7 15:32:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 10:32:35 -0500 Subject: [oslo][release][requirements] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late In-Reply-To: References: Message-ID: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> Tagging with release and requirements as they need to sign off on this. On 10/7/19 9:24 AM, Herve Beraud wrote: > Hi, > > I request a late feature freeze exception (FFE) for > https://review.opendev.org/#/c/686598/ and > https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 > -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will > fix an issue that can be blocking for users so it's can be really > valuable for operators, if we release it ASAP. They would be delighted > if it were included in Train. I guess I'll reiterate my question from the review: Does this need to be in the initial Train release or can we backport it immediately after? Since qemu 4.1.0 released during the Train cycle I would argue that it's fair to backport patches to support it (I'm less sure about the stein patch, but that's a separate topic). If there are consumers of OpenStack who will take the initial Train release and not any subsequent bugfix releases then that would suggest we need to do this now, but I can't imagine anyone locks themselves into the .0 release of a piece of software and refuses to take any bug fixes after that. I'm open to being persuaded otherwise though. > > Please let me know if you have any concerns or questions. Thank you for > your consideration. 
> > Hervé > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From openstack at nemebean.com Mon Oct 7 15:44:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 10:44:04 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older Message-ID: Hi, This is related to the FFE for train, but I wanted to discuss it separately because I think the circumstances are a bit different. Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear to me that backporting bug fixes for it is valid. The original author of the patch actually wants it for Rocky, which is basically in the same situation as Stein. I should note he's willing to carry the patch downstream if necessary. On the one hand, it sounds like this is something at least one operator wants, but on the other I'm not sure the stable policy supports backporting patches to support a version of a dependency that didn't exist when the release was initially cut. I'm soliciting opinions on how to proceed here. Reference: https://review.opendev.org/#/c/686532 Thanks. -Ben From mthode at mthode.org Mon Oct 7 15:49:56 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 10:49:56 -0500 Subject: [oslo][release][requirements] FFE: Support "qemu-img info" virtual size in QEMU 4.1 and late In-Reply-To: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> References: <6d93472a-a191-7c01-42ad-960442b2f491@nemebean.com> Message-ID: <20191007154956.lukimg63dti4kdt5@mthode.org> On 19-10-07 10:32:35, Ben Nemec wrote: > Tagging with release and requirements as they need to sign off on this. > > On 10/7/19 9:24 AM, Herve Beraud wrote: > > Hi, > > > > I request a late feature freeze exception (FFE) for > > https://review.opendev.org/#/c/686598/ and https://github.com/openstack/oslo.utils/commit/89bccdee95f81ddb54b427d6af172bb987fd7545 > > -- "Support "qemu-img info" virtual size in QEMU 4.1 and later". It will > > fix an issue that can be blocking for users so it's can be really > > valuable for operators, if we release it ASAP. They would be delighted > > if it were included in Train. > > I guess I'll reiterate my question from the review: Does this need to be in > the initial Train release or can we backport it immediately after? Since > qemu 4.1.0 released during the Train cycle I would argue that it's fair to > backport patches to support it (I'm less sure about the stein patch, but > that's a separate topic). 
If there are consumers of OpenStack who will take > the initial Train release and not any subsequent bugfix releases then that > would suggest we need to do this now, but I can't imagine anyone locks > themselves into the .0 release of a piece of software and refuses to take > any bug fixes after that. I'm open to being persuaded otherwise though. > > > > > Please let me know if you have any concerns or questions. Thank you for > > your consideration. > > > > Hervé > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > Given that this is a backwards compatible change I think it's fine. https://github.com/openstack/oslo.utils/compare/3.41.1...89bccdee95f81ddb54b427d6af172bb987fd7545 the above link shows that this is the only commit (that's code related) as well so no issues here. The only thing we'll need to make sure of is to cherry-pick the requirements update into master (like was just done with the tempest release). -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From thierry at openstack.org Mon Oct 7 16:02:38 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 7 Oct 2019 18:02:38 +0200 Subject: [Release-job-failures] Tag of openstack/cinder for ref refs/tags/15.0.0.0rc2 failed In-Reply-To: References: Message-ID: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org> zuul at openstack.org wrote: > Build failed. > > - publish-openstack-releasenotes-python3 https://zuul.opendev.org/t/openstack/build/965908bbf69141c393d4728f7de07f7d : POST_FAILURE in 29m 45s Looks like a transient failure Collect sphinx build html: ssh: connect to host 162.242.237.111 port 22: Connection timed out rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] Collect artifacts: ssh: connect to host 162.242.237.111 port 22: Connection timed out rsync: connection unexpectedly closed (0 bytes received so far) [Receiver] rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1] Release notes should be picked up at the next RC or the final, so no need to retry/reenqueue? -- Thierry Carrez (ttx) From mthode at mthode.org Mon Oct 7 16:07:03 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 11:07:03 -0500 Subject: [all][requirements] requirements branched train - cycle-trailing are on notice that master is now ussuri Message-ID: <20191007160703.lzyan2777owo4cbw@mthode.org> Just a friendly ping that the train keeps rolling. Master is now ussuri. 
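For projects that pin to upper constraints in tox.ini, the switch usually
amounts to pointing the install command at the new branch's constraints
file, along these lines (a sketch -- the exact variable name and layout
differ per project):

    install_command = pip install -c{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/ussuri} {opts} {packages}
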
https://releases.openstack.org/constraints/upper/ussuri should work soon for those that want to switch their install_command in tox.ini within master earlier in the cycle (rather than having things pile up). I'll email again once it's working. cycle-trailing projects are on notice that if they track master, the dependencies may change from what they are currently working on (train). If you have any questions please let me know (in the #openstack-requirements channel preferably). P.S. FFE season is over -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From fungi at yuggoth.org Mon Oct 7 16:31:19 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Oct 2019 16:31:19 +0000 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: Message-ID: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: [...] > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > to me that backporting bug fixes for it is valid. The original > author of the patch actually wants it for Rocky [...] Neither the changes nor the bug report indicate what the motivation is for supporting newer Qemu with (much) older OpenStack. Is there some platform which has this Qemu behavior on which folks are trying to run Rocky? Or is it a homegrown build combining these dependency versions from disparate time periods? Or maybe some other reason I'm not imagining? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gouthampravi at gmail.com Mon Oct 7 17:00:41 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 7 Oct 2019 10:00:41 -0700 Subject: [all][tc][ptl] "Meet the project leaders" opportunities in Shanghai In-Reply-To: <20191007133641.f4q2ylxckr362pop@yuggoth.org> References: <3e0661de-2e4f-3c8f-1845-85a45ba15fb4@openstack.org> <98D6814A-EEB1-4D12-A4CC-4E5608563B3D@redhat.com> <20191004180712.323nlymaxedoib54@yuggoth.org> <20191007133641.f4q2ylxckr362pop@yuggoth.org> Message-ID: On Mon, Oct 7, 2019 at 6:40 AM Jeremy Stanley wrote: > On 2019-10-07 08:40:01 +0000 (+0000), Alexandra Settle wrote: > > On Fri, 2019-10-04 at 18:07 +0000, Jeremy Stanley wrote: > > > On 2019-10-04 12:52:34 -0500 (-0500), Eric Fried wrote: > [...] > > > > If a PTL will not be present, is it acceptable to send a > > > > delegate? > > > > > > The goal, as I understand it, is to reinforce to attendees in > > > China that OpenStack project leadership is accessible and > > > achievable, by providing opportunities for them to be able to > > > meet and speak in-person with a representative cross-section of > > > our community leaders. Is that something which can be delegated? > > > Seems to me it might convey the opposite of what's intended, but > > > I don't know if my impression is shared by others. > > > > That was indeed my intention with the initial idea proposal. > > Conceptually, the meetup would be to break down any preconceived > > notions that individuals may have. > > > > Of course, that isn't to say that there aren't many leaders in the > > community that don't hold an _official_ position. It's mostly to > > put a face to a name, to create open communication channels. 
I'd > > say it's up the the team's discretion as to whether or not they'd > > like to delegate the presence at this meetup. > > Of course, I should have clarified. I think providing folks the > opportunity to meet and speak with a Nova core reviewer is great. > It's definitely a type of leadership we prize highly in our > community and want to encourage more of. Being "the person who > showed up on behalf of the Nova PTL because they're not present" > doesn't really make the Nova PTL position any more approachable on > the other hand. If anything, it seems to me that it might reinforce > the impression it's a distant and unachievable position. > Sure hope it doesn't. I support the concept behind this and would love to be there, but cannot, because I'm unable to travel to Shanghai. Many of the PTL-driven tasks at the event for Manila have been delegated to project maintainers that are attending. I would like to find a suitable lead for this too, along with encouraging all core reviewers that are in attendance to be part of these events: the mixer and the lunch. > > > This meetup is not compulsory for anyone, so if you can't go, and > > can't delegate, that is also fine. > > Yep, I think having a variety of different sorts of community > leaders present is what's needed, it doesn't have to (and > realistically, probably can't anyway?) involve every one of the > ~hundred teams, SIGs, and other organized groups within the > community. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Mon Oct 7 17:09:27 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 7 Oct 2019 10:09:27 -0700 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Hey Alex, So since the TC stuff is Friday we managed to shuffle things around and now docs has the afternoon on Thursday. We will get the final schedule up on the website soon. -Kendall (diablo_rojo) On Thu, Oct 3, 2019 at 9:32 AM Kendall Waters wrote: > Hey Alex, > > We still have tables available on Friday. Would half a day on Friday work > for the docs team? Unless Ian is okay with it, we can combine Docs with > i18n in their Wednesday afternoon/Thursday morning slot. Just let me know! > > Cheers, > Kendall > > > > Kendall Waters > OpenStack Marketing & Events > kendall at openstack.org > > > > On Oct 3, 2019, at 4:26 AM, Alexandra Settle wrote: > > Hey, > > Could you add something for docs? Or combine with i18n again if Ian > doesn't mind? > > We don't need a lot, just a room for people to ask questions about the > future of the docs team. > > Stephen will be there, as co-PTL. There's 0 chance of it not > conflicting with nova. > > Please :) > > Thank you! > > Alex > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > > Hello Everyone! > > In the attached picture or link [0] you will find the proposed > schedule for the various tracks at the Shanghai PTG in November. > > We did our best to avoid the key conflicts that the track leads > (PTLs, SIG leads...) mentioned in their PTG survey responses, > although there was no perfect solution that would avoid all conflicts > especially when the event is three-ish days long and we have over 40 > teams meeting. > > If there are critical conflicts we missed or other issues, please let > us know, by October 6th at 7:00 UTC! 
> > -Kendall (diablo_rojo) > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_schedu > le.png > > -- > Alexandra Settle > IRC: asettle > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Mon Oct 7 18:02:32 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Mon, 7 Oct 2019 13:02:32 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals Message-ID: Hi Stackers, I want to request the inclusion of the support for Neutron Routed Networks in the Nova goals for the Ussuri cycle. As many of you might know, Routed networks is a feature in Neutron that enables the creation of very large virtual networks that avoids the performance penalties of large L2 broadcast domains ( https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks). This functionality can be very helpful for large deployers who have the need to have one or a few large virtual networks shared by all their users and has been available in Neutron since very soon after the Barcelona Summit in 2016. But it is really useless until there is code in Nova that can schedule VMs to compute hosts based on the segments topology of the routed networks. Without it, VMs can land in compute hosts where their traffic cannot be routed by the underlying network infrastructure. I would like the Nova team to consider the following when making a decision about this request: 1. Work for Routed Networks was approved as a priority for the Ocata cycle, although it wasn't concluded: https://specs.openstack.org/openstack/nova-specs/priorities/ocata-priorities.html#network-aware-scheduling and https://specs.openstack.org/openstack/nova-specs/specs/pike/index.html 2. The are several large deployers that need this feature. Verizon Media, my employer, is one of them. Others that come to mind include GoDaddy and Cern. And I am sure there are others. 3. There is a WIP patch to implement the functionality some of the functionality: https://review.opendev.org/#/c/656885. We, at Verizon Media, are proposing to take over this work and finish its implementation by the end of U. What we are requesting is Nova core reviewers bandwidth to help us merge the code I will be attending the PTG in Shanghai and will make myself available to discuss this further in person any day and any time. Hopefully, we can get this feature lined up very soon Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Oct 7 18:16:39 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Oct 2019 19:16:39 +0100 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: On Mon, 2019-10-07 at 16:31 +0000, Jeremy Stanley wrote: > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > [...] > > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > > to me that backporting bug fixes for it is valid. The original > > author of the patch actually wants it for Rocky > > [...] > > Neither the changes nor the bug report indicate what the motivation > is for supporting newer Qemu with (much) older OpenStack. Is there > some platform which has this Qemu behavior on which folks are trying > to run Rocky? Or is it a homegrown build combining these dependency > versions from disparate time periods? 
Or maybe some other reason I'm
> not imagining?

I suspect the motivation is the fact that distros like RHEL often bump qemu
and libvirt versions in minor releases. So if you originally deployed
Queens on, say, RHEL 7.5 but upgraded it to RHEL 7.7 over time, you would
end up running with a qemu/libvirt that may not have existed when Queens
was released. When qemu has broken its public API in the past and that
change in behavior has been addressed in a later OpenStack release, distros
have often had to backport that fix to an OpenStack that was released
before that dependency existed.

This depends on the distro. Canonical, for example, packages qemu and OVS
in the Ubuntu Cloud Archive for each given release, I believe, so you can
go from 18.04.0 to 18.04.1 and know it won't break your OpenStack install.
But on RHEL, QEMU and KVM are owned by a separate team, and layered
products like OpenStack consume the output of that team, which follows the
RHEL release cycle rather than the OpenStack one. So I expect this to vary
per distro. When a change is backportable upstream, that is obviously
preferable.

I don't actually think this needs to be fixed in the Train GA if an oslo
release is done promptly that can be consumed instead. I expect this to get
backported downstream anyway, so if we can avoid multiple distros doing
that and backport it upstream, given it is backward compatible, I think
that would be preferable.

Just my 2 cents

From smooney at redhat.com  Mon Oct 7 18:45:54 2019
From: smooney at redhat.com (Sean Mooney)
Date: Mon, 07 Oct 2019 19:45:54 +0100
Subject: [nova] Request to include routed networks support in the Ussuri
 cucly goals
In-Reply-To:
References:
Message-ID: <51911abf0b59bf482719d25a9b7c370931db981d.camel@redhat.com>

On Mon, 2019-10-07 at 13:02 -0500, Miguel Lavalle wrote:
> Hi Stackers,
>
> I want to request the inclusion of the support for Neutron Routed Networks
> in the Nova goals for the Ussuri cycle.
+1
> As many of you might know, Routed
> networks is a feature in Neutron that enables the creation of very large
> virtual networks that avoids the performance penalties of large L2
> broadcast domains (
> https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks).
> This functionality can be very helpful for large deployers who have the
> need to have one or a few large virtual networks shared by all their users
> and has been available in Neutron since very soon after the Barcelona
> Summit in 2016. But it is really useless until there is code in Nova that
> can schedule VMs to compute hosts based on the segments topology of the
> routed networks. Without it, VMs can land in compute hosts where their
> traffic cannot be routed by the underlying network infrastructure. I would
> like the Nova team to consider the following when making a decision about
> this request:
>
> 1. Work for Routed Networks was approved as a priority for the Ocata
> cycle, although it wasn't concluded:
> https://specs.openstack.org/openstack/nova-specs/priorities/ocata-priorities.html#network-aware-scheduling
> and
> https://specs.openstack.org/openstack/nova-specs/specs/pike/index.html
> 2. The are several large deployers that need this feature. Verizon
> Media, my employer, is one of them. Others that come to mind include
> GoDaddy and Cern. And I am sure there are others.
> 3. There is a WIP patch to implement the functionality some of the
> functionality: https://review.opendev.org/#/c/656885. We, at Verizon
> Media, are proposing to take over this work and finish its implementation
> by the end of U. What we are requesting is Nova core reviewers bandwidth to
> help us merge the code

For context for others, and to qualify my +1: the main work to "support
this in nova" is related to the scheduler and placement. If I remember
correctly, we previously discussed the idea of modeling the subnet/segment
affinity between compute hosts and routed IP subnets as placement
aggregates that would be created by Neutron. The available number of IPs in
each routed subnet would be modelled as an inventory of IPs on a sharing
resource provider. During spawn, when Nova retrieves the port info from the
pre-created Neutron port, Neutron would pass a resource request for an IP
and an aggregate using the existing resource-requests mechanism that was
introduced for bandwidth-aware scheduling. Nova then just needs to merge
the aggregate and IP request with the other requests from the port, flavor
and image when it queries placement, to ensure that the returned hosts are
connected to the correct routed segment.

> I will be attending the PTG in Shanghai and will make myself available to
> discuss this further in person any day and any time. Hopefully, we can get
> this feature lined up very soon

I won't be at the PTG, but I do think this is quite valuable, as today the
only way to use routed networks is with ip_allocation=defer, which means
you cannot choose the port's IP ahead of time. Also, because we don't
schedule based on compute-host-to-segment affinity today, it is not safe to
live migrate, cold migrate, resize or shelve an instance with routed
networks, as it could fail due to the segment being unreachable on the
selected host. If we finish move operations for ports with resource
requests, which we need for ports with minimum bandwidth, then it will fix
all of the above for routed networks too.

>
> Best regards
>
> Miguel

From rosmaita.fossdev at gmail.com  Mon Oct 7 19:42:13 2019
From: rosmaita.fossdev at gmail.com (Brian Rosmaita)
Date: Mon, 7 Oct 2019 15:42:13 -0400
Subject: [Release-job-failures] Tag of openstack/cinder for ref
 refs/tags/15.0.0.0rc2 failed
In-Reply-To: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org>
References: <861d9067-070a-4f75-1f71-d15baf221760@openstack.org>
Message-ID:

On 10/7/19 12:02 PM, Thierry Carrez wrote:
> zuul at openstack.org wrote:
>> Build failed.
>>
>> - publish-openstack-releasenotes-python3
>> https://zuul.opendev.org/t/openstack/build/965908bbf69141c393d4728f7de07f7d
>> : POST_FAILURE in 29m 45s
>
> Looks like a transient failure
>
> Collect sphinx build html:
> ssh: connect to host 162.242.237.111 port 22: Connection timed out
> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
> rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1]
>
> Collect artifacts:
> ssh: connect to host 162.242.237.111 port 22: Connection timed out
> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
> rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.1]
>
> Release notes should be picked up at the next RC or the final, so no
> need to retry/reenqueue?

That sounds OK to me.
From openstack at nemebean.com Mon Oct 7 19:43:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 14:43:04 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: On 10/7/19 11:31 AM, Jeremy Stanley wrote: > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > [...] >> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >> to me that backporting bug fixes for it is valid. The original >> author of the patch actually wants it for Rocky > [...] > > Neither the changes nor the bug report indicate what the motivation > is for supporting newer Qemu with (much) older OpenStack. Is there > some platform which has this Qemu behavior on which folks are trying > to run Rocky? Or is it a homegrown build combining these dependency > versions from disparate time periods? Or maybe some other reason I'm > not imagining? > In addition to the downstream reasons Sean mentioned, Mark (the original author of the patch) responded to my question on the train backport with this: """ Today, I need it in Rocky. But, I'm find to do local patching. Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu 4.1.0 is that this is the first release of Qemu to include proper support for migration of L1 guests that have L2 guests (nVMX / nested KVM). So, I expect it is pretty important to whoever realizes this, and whoever needs this. """ So basically a desire to use a feature of the newer qemu with older openstack, which is why I'm questioning whether this fits our stable policy. My inclination is to say it's a fairly simple, backward-compatible patch that will make users' lives easier, but I also feel like doing a backport to enable a feature, even if the actual patch is a "bugfix", is violating the spirit of the stable policy. From kennelson11 at gmail.com Mon Oct 7 19:53:04 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 7 Oct 2019 12:53:04 -0700 Subject: [all] Final PTG Schedule Message-ID: Hello Everyone! After a few weeks of shuffling and changes, we have a final schedule! It can be seen here on the 'Schedule' tab[1]. -Kendall (diablo_rojo) [1] https://www.openstack.org/PTG -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Mon Oct 7 20:00:27 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 15:00:27 -0500 Subject: [all][requirements] requirements branched train - cycle-trailing are on notice that master is now ussuri In-Reply-To: <20191007160703.lzyan2777owo4cbw@mthode.org> References: <20191007160703.lzyan2777owo4cbw@mthode.org> Message-ID: <20191007200027.cvdt754ftnncdz4u@mthode.org> On 19-10-07 11:07:03, Matthew Thode wrote: > Just a friendly ping that the train keeps rolling. > > Master is now ussuri. > https://releases.openstack.org/constraints/upper/ussuri should work soon > for those that want to switch their install_command in tox.ini within > master earlier in the cycle (rather than having things pile up). I'll > email again once it's working. > > cycle-trailing projects are on notice that if they track master, the > dependencies may change from what they are currently working on (train). > > If you have any questions please let me know (in the > #openstack-requirements channel preferably). > > P.S. 
FFE season is over > https://releases.openstack.org/constraints/upper/ussuri now works -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From smooney at redhat.com Mon Oct 7 20:08:19 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Oct 2019 21:08:19 +0100 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> Message-ID: <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: > > On 10/7/19 11:31 AM, Jeremy Stanley wrote: > > On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: > > [...] > > > Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear > > > to me that backporting bug fixes for it is valid. The original > > > author of the patch actually wants it for Rocky > > > > [...] > > > > Neither the changes nor the bug report indicate what the motivation > > is for supporting newer Qemu with (much) older OpenStack. Is there > > some platform which has this Qemu behavior on which folks are trying > > to run Rocky? Or is it a homegrown build combining these dependency > > versions from disparate time periods? Or maybe some other reason I'm > > not imagining? > > > > In addition to the downstream reasons Sean mentioned, Mark (the original > author of the patch) responded to my question on the train backport with > this: > > """ > Today, I need it in Rocky. But, I'm find to do local patching. > > Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu > 4.1.0 is that this is the first release of Qemu to include proper > support for migration of L1 guests that have L2 guests (nVMX / nested > KVM). So, I expect it is pretty important to whoever realizes this, and > whoever needs this. > """ > > So basically a desire to use a feature of the newer qemu with older > openstack, which is why I'm questioning whether this fits our stable > policy. My inclination is to say it's a fairly simple, > backward-compatible patch that will make users' lives easier, but I also > feel like doing a backport to enable a feature, even if the actual patch > is a "bugfix", is violating the spirit of the stable policy. in many distros the older qemus allow migration of the l1 guest eventhouhg it is unsafe to do so and either work by luck or the vm will curput its memroy and likely crash. the context of the qemu issue is for years people though that live migration with nested virt worked, then it was disabeld upstream and many distos reverted that as it would break there users where they got lucky and it worked, and in 4.1 it was fixed. this does not add or remvoe any functionality in openstack nova will try to live migarte if you tell it too regardless of the qemu it has it just will fail if the live migration check was complied in. similarly if all your images did not have fractional sizes you could use 4.1.0 with older oslo releases and it would be fine. i.e. you could get lucky and for your specific usecase this might not be needed but it would be nice not do depend on luck. anyway i woudl expect any disto the chooses to support qemu 4.1.0 to backport this as required. 
im not sure this problematic to require a late oslo version bump before train ga but i would hope it can be fixed on stable/train > From mthode at mthode.org Mon Oct 7 20:15:53 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Oct 2019 15:15:53 -0500 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) Message-ID: <20191007201553.xvaeejp2meoyw3ea@mthode.org> Salt has been harsh to deal with. Upstream adding and maintaining caps has caused it to be held back. This time it's pyyaml, I'm not going to hold back the version of pyyaml for one import of salt. In any case, heat-agents uses salt in one location and may not even be using the one we define via constraints in any case. File: heat-config-salt/install.d/50-heat-config-hook-salt Installs salt from package then runs heat-config-salt/install.d/hook-salt.py In heat-config-salt/install.d/hook-salt.py is defined the only import of salt I can find and likely uses the package version as it's installed after tox sets things up. Is the heat team ok with this? -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From sean.mcginnis at gmx.com Mon Oct 7 20:19:22 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 7 Oct 2019 15:19:22 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: References: Message-ID: <20191007201922.GA7126@sm-workstation> On Mon, Oct 07, 2019 at 01:02:32PM -0500, Miguel Lavalle wrote: > Hi Stackers, > > I want to request the inclusion of the support for Neutron Routed Networks > in the Nova goals for the Ussuri cycle. As many of you might know, Routed > networks is a feature in Neutron that enables the creation of very large > virtual networks that avoids the performance penalties of large L2 > broadcast domains ( > https://www.openstack.org/videos/summits/barcelona-2016/scaling-up-openstack-networking-with-routed-networks). > This functionality can be very helpful for large deployers who have the > need to have one or a few large virtual networks shared by all their users > and has been available in Neutron since very soon after the Barcelona > Summit in 2016. But it is really useless until there is code in Nova that > can schedule VMs to compute hosts based on the segments topology of the > routed networks. Without it, VMs can land in compute hosts where their > traffic cannot be routed by the underlying network infrastructure. I would > like the Nova team to consider the following when making a decision about > this request: > Is there a community-wide effort with this, or is this really just asking that Nova prioritize this work? The cycle goals (typically) have been used for things that we need the majority of the community to focus on in order to complete. If this is just something between Neutron and Nova, I don't think it really fits as a cycle goal. I do think it would be a good thing to try to complete in Ussuri though. Just maybe not as a community goal. 
Sean
From fsbiz at yahoo.com Mon Oct 7 20:33:20 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Mon, 7 Oct 2019 20:33:20 +0000 (UTC) Subject: Port creation times out for some VMs in large group In-Reply-To: <1127664659.2766839.1570042860356@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> <1127664659.2766839.1570042860356@mail.yahoo.com> Message-ID: <1645654897.4940251.1570480400983@mail.yahoo.com>
Thanks. Yes, it helps breathe some CPU cycles. This was traced to a fixed bug, https://bugs.launchpad.net/neutron/+bug/1760047, whose fix was applied to Queens in April 2019: https://review.opendev.org/#/c/649580/ Unfortunately, the patch simply makes the code more elegant by removing the semaphores. But it does not really fix the real issue, which is that dhcp-client serializes all the port update messages and each message is processed too slowly, resulting in PXE boot timeouts. The issue still remains open. thanks, Fred.
On Wednesday, October 2, 2019, 11:34:39 AM PDT, Chris Apsey wrote: Is that still spitting out a vif plug failure or are your instances spawning but not getting addresses? I've found that adding in the no-ping option to dnsmasq lowers load significantly, but can be dangerous if you've got potentially conflicting sources of address allocation. While it doesn't address the below bug report specifically, it may breathe some more CPU cycles into dnsmasq so it can handle other items better. R CA
-------- Original Message -------- On Oct 2, 2019, 12:41, fsbiz at yahoo.com < fsbiz at yahoo.com> wrote: Thanks. This definitely helps. I am running a stable release of Queens. Even after this change I still see 10-15 failures when I create 100 VMs in our cluster. I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by DNSMASQ reloads every time a new MAC entry is added, deleted or updated. It seems to be related to https://bugs.launchpad.net/neutron/+bug/1598078 The fix for the above bug was abandoned. Any further fine tuning that can be done? Thanks, Fred.
On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey wrote: Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once it's fully implemented. Either way, that should solve your problem. r Chris Apsey
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see fewer failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at?
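(For reference, the rootwrap daemon mode and dnsmasq no-ping settings suggested earlier in this thread are normally enabled with something along the following lines. This is a sketch only; the exact file locations and section names vary by release and deployment tool, and no-ping is only safe when neutron is the sole source of address allocation on the network.)

    # /etc/neutron/dhcp_agent.ini (similarly for the l3/metadata agents):
    # run rootwrap as a long-lived daemon instead of forking sudo per command
    [AGENT]
    root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

    # /etc/neutron/dhcp_agent.ini: hand dnsmasq an extra config file
    [DEFAULT]
    dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

    # /etc/neutron/dnsmasq-neutron.conf: skip the ICMP probe before handing
    # out a lease, removing one source of per-port latency
    no-ping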
Here are the relevant log entries from the HV:   2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds   More logs and data:   http://paste.openstack.org/show/779524/   -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Mon Oct 7 20:36:40 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Oct 2019 15:36:40 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> Message-ID: On 10/7/19 3:08 PM, Sean Mooney wrote: > On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: >> >> On 10/7/19 11:31 AM, Jeremy Stanley wrote: >>> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: >>> [...] >>>> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >>>> to me that backporting bug fixes for it is valid. The original >>>> author of the patch actually wants it for Rocky >>> >>> [...] >>> >>> Neither the changes nor the bug report indicate what the motivation >>> is for supporting newer Qemu with (much) older OpenStack. Is there >>> some platform which has this Qemu behavior on which folks are trying >>> to run Rocky? Or is it a homegrown build combining these dependency >>> versions from disparate time periods? Or maybe some other reason I'm >>> not imagining? >>> >> >> In addition to the downstream reasons Sean mentioned, Mark (the original >> author of the patch) responded to my question on the train backport with >> this: >> >> """ >> Today, I need it in Rocky. But, I'm find to do local patching. >> >> Anybody who needs Qemu 4.1.0 likely needs it. A key feature in Qemu >> 4.1.0 is that this is the first release of Qemu to include proper >> support for migration of L1 guests that have L2 guests (nVMX / nested >> KVM). So, I expect it is pretty important to whoever realizes this, and >> whoever needs this. >> """ >> >> So basically a desire to use a feature of the newer qemu with older >> openstack, which is why I'm questioning whether this fits our stable >> policy. My inclination is to say it's a fairly simple, >> backward-compatible patch that will make users' lives easier, but I also >> feel like doing a backport to enable a feature, even if the actual patch >> is a "bugfix", is violating the spirit of the stable policy. > in many distros the older qemus allow migration of the l1 guest eventhouhg it is > unsafe to do so and either work by luck or the vm will curput its memroy and likely > crash. 
the context of the qemu issue is for years people though that live migration with > nested virt worked, then it was disabeld upstream and many distos reverted that as it would > break there users where they got lucky and it worked, and in 4.1 it was fixed. > > this does not add or remvoe any functionality in openstack nova will try to live migarte if you > tell it too regardless of the qemu it has it just will fail if the live migration check was complied in. > > > similarly if all your images did not have fractional sizes you could use 4.1.0 with older > oslo releases and it would be fine. i.e. you could get lucky and for your specific usecase this > might not be needed but it would be nice not do depend on luck. > > anyway i woudl expect any disto the chooses to support qemu 4.1.0 to backport this as required. > im not sure this problematic to require a late oslo version bump before train ga but i would hope > it can be fixed on stable/train Note that this discussion is separate from the train patch. I agree we should do that backport, and actually we already have. That discussion was just about timing of the release. This thread is because the fix was also proposed to stable/stein. It merged before I had a chance to start this discussion, and I'm wondering if we need to revert it. From mriedemos at gmail.com Mon Oct 7 22:18:23 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 7 Oct 2019 17:18:23 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: <20191007201922.GA7126@sm-workstation> References: <20191007201922.GA7126@sm-workstation> Message-ID: On 10/7/2019 3:19 PM, Sean McGinnis wrote: > Is there a community-wide effort with this, or is this really just asking that > Nova prioritize this work? > > The cycle goals (typically) have been used for things that we need the majority > of the community to focus on in order to complete. If this is just something > between Neutron and Nova, I don't think it really fits as a cycle goal. > > I do think it would be a good thing to try to complete in Ussuri though. Just > maybe not as a community goal. Miguel isn't talking about cycle wide goals. There are some proposed process changes for nova in Ussuri [1] along with constraining the amount of feature work approved for the release. I think Miguel is just asking that routed networks support is included in that bucket and I'm sure the answer is, like for anything, "it depends". From a wider governance perspective, if people interested in developing this feature were looking for an officially blessed thing, this would be a pop-up team. [1] https://review.opendev.org/#/c/685857/ -- Thanks, Matt From openstack at fried.cc Mon Oct 7 22:28:42 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 7 Oct 2019 17:28:42 -0500 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: References: <20191007201922.GA7126@sm-workstation> Message-ID: <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> > Miguel isn't talking about cycle wide goals. There are some proposed > process changes for nova in Ussuri [1] along with constraining the > amount of feature work approved for the release. I think Miguel is just > asking that routed networks support is included in that bucket and I'm > sure the answer is, like for anything, "it depends". Agreed. What hasn't changed is that to get to the table it will need a blueprint [1] (which I don't see yet [2]) and spec [3] (likewise [4]). 
efried [1] https://blueprints.launchpad.net/nova/ussuri/+addspec [2] https://blueprints.launchpad.net/nova/ussuri [3] http://specs.openstack.org/openstack/nova-specs/readme.html [4] https://review.opendev.org/#/q/project:openstack/nova-specs+status:open
From fsbiz at yahoo.com Mon Oct 7 22:45:15 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Mon, 7 Oct 2019 22:45:15 +0000 (UTC) Subject: [neutron]: Latest Queens release: dhcp-client takes too long processing messages and falls behind. In-Reply-To: <556991713.4938348.1570478933095@mail.yahoo.com> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> <1226029673.2675287.1570034502180@mail.yahoo.com> <1127664659.2766839.1570042860356@mail.yahoo.com> <556991713.4938348.1570478933095@mail.yahoo.com> Message-ID: <1671513214.4995921.1570488315508@mail.yahoo.com>
Hi neutron team, We've been troubleshooting an issue with neutron's dhcp-client for some time now. We were previously on neutron 12.0.5 and observed that reloading 5-8 baremetals simultaneously almost always led to a few baremetals failing PXE boot during provisioning and/or cleaning. This was traced to a fixed bug, https://bugs.launchpad.net/neutron/+bug/1760047, whose fix was applied to Queens in April 2019: https://review.opendev.org/#/c/649580/
We patched the above fix but found out the problem was not resolved. The fix gets rid of the semaphores by serializing the multiple messages into a Priority Queue. The Priority Queue then drains the messages serially one by one, making sure not to yield during the processing of each message. All in all this just seems like a more elegant way of getting rid of the semaphores but does not really fix the issue at hand.
Below are the logs from dhcp-agent in neutron release 12.0.5. As can be seen, the semaphore locks all threads for almost 6 seconds. While the below has been fixed using https://review.opendev.org/#/c/649580/ the underlying problem has not been fixed. The semaphore has been removed but instead the message is being serialized and does not yield, resulting in PXE boot failures on the baremetal nodes. Any pointers would be appreciated.
thanks,Fred 2019-10-03 18:07:37.454 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Acquired semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212 2019-10-03 18:07:37.455 318956DEBUG neutron.agent.dhcp.agent [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Calling driver for network:077aa2d1-605c-48ec-842d-7dd6767bfd01 action: reload_allocations call_driver/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py:135 2019-10-03 18:07:37.456 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.77', '9c:71:3a:cb:7c:43', '01:9c:71:3a:cb:7c:43']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:38.101 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:38.717 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb', 'ff:3a:cb:7b:fb:00:04:8a:ef:2f:58:b4:20:45:03:80:27:0f:15:84:a4:70:7b']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:39.631 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Building host file: /var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:695 2019-10-03 18:07:39.632 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Donebuilding host file/var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:734 2019-10-03 18:07:39.633 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', 'addr', 'show','ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:40.263 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['kill', '-HUP', '319109']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:40.843 318956DEBUG 
neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Reloading allocations for network: 077aa2d1-605c-48ec-842d-7dd6767bfd01reload_allocations/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:524 2019-10-03 18:07:40.843 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', '-4', 'route', 'list','dev', 'ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:41.462 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Running command (rootwrap daemon):['ip', 'netns', 'exec', 'qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip','-6', 'route', 'list', 'dev', 'ns-8387b854-d1'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2019-10-03 18:07:42.101 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Releasing semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:228   -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Tue Oct 8 00:15:56 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Mon, 7 Oct 2019 17:15:56 -0700 Subject: [dev][taskflow] Accepting any decision In-Reply-To: References: Message-ID: Hi Raja, You can have a decider that goes down one of two paths and then continues with common tasks. See the Octavia code starting at Line 228 to Line 289 here: https://github.com/openstack/octavia/blob/master/octavia/controller/worker/v2/flows/amphora_flows.py#L228 We "decide" if we can use a pre-booted VM and if not, we boot one. Then once we have a VM by either path, we finish configuring it. Michael On Fri, Sep 27, 2019 at 6:49 AM Jiří Rája wrote: > > Hi, > I wrote the code in the attachment and I would like to ask if it's possible to execute next task (step3) even if one decider returns True (link from step 1 to step 3) and one returns False (link from step 2 to step 3). If it is possible could someone alter the code? Or is there any other way to do it? And if the task wouldn't have to wait for all of the links, it would be great. Thank you! > > All the best, > Rája From soulxu at gmail.com Tue Oct 8 04:46:04 2019 From: soulxu at gmail.com (Alex Xu) Date: Tue, 8 Oct 2019 12:46:04 +0800 Subject: [nova] Stepping down from core reviewer In-Reply-To: References: Message-ID: Kenichi, thanks for your contribution, I also learned a lot from you. All the best for your future endeavors! Kenichi Omichi 于2019年10月2日周三 上午5:47写道: > Hello, > > Today my job description is changed and I cannot have enough time for > regular reviewing work of Nova project. > So I need to step down from the core reviewer. > > I spend 6 years in the project, the experience is amazing. > OpenStack gave me a lot of chances to learn technical things deeply, make > friends in the world and bring me and my family to foreign country from our > home country. 
> I'd like to say thank you for everyone in the community :-) > > My personal private cloud is based on OpenStack, so I'd like to still keep > contributing for the project if I find bugs or idea. > > Thanks > Kenichi Omichi > > --- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Tue Oct 8 08:05:20 2019 From: zigo at debian.org (Thomas Goirand) Date: Tue, 8 Oct 2019 10:05:20 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: Message-ID: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> On 10/7/19 4:40 PM, no-reply at openstack.org wrote: > Hello everyone, > > A new release candidate for cinder for the end of the Train > cycle is available! You can find the source code tarball at: > > https://tarballs.openstack.org/cinder/ Hi, For the 2nd time, could we *please* re-add the tag: [release-announce] when announcing for a release? I don't mind if it's sent to -discuss instead of the announce this, but this breaks mail filters... Cheers, Thomas Goirand (zigo) From thierry at openstack.org Tue Oct 8 08:47:22 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 8 Oct 2019 10:47:22 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID: Thomas Goirand wrote: > For the 2nd time, could we *please* re-add the tag: [release-announce] > when announcing for a release? I don't mind if it's sent to -discuss > instead of the announce this, but this breaks mail filters... There was no clear decision last time we brought this up (and nobody proposed patches to fix it). I think we'll just move RC announcements to release-announce. Let me see if I can push a patch for this today (there may be a few more RCs sent like this one in the mean time). -- Thierry Carrez (ttx) From smooney at redhat.com Tue Oct 8 09:25:01 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Oct 2019 10:25:01 +0100 Subject: [nova] Request to include routed networks support in the Ussuri cucly goals In-Reply-To: <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> References: <20191007201922.GA7126@sm-workstation> <85498548-b657-7b96-35e5-ed493bec0056@fried.cc> Message-ID: On Mon, 2019-10-07 at 17:28 -0500, Eric Fried wrote: > > Miguel isn't talking about cycle wide goals. There are some proposed > > process changes for nova in Ussuri [1] along with constraining the > > amount of feature work approved for the release. I think Miguel is just > > asking that routed networks support is included in that bucket and I'm > > sure the answer is, like for anything, "it depends". > > Agreed. What hasn't changed is that to get to the table it will need a > blueprint [1] (which I don't see yet [2]) and spec [3] (likewise [4]). for this specific effort while it would not be a community wide goal this effort might benefit form a pop-up team of nova, placement and neutron developers to Shepard it along. i have to admit while we discussed this at some length at the PTG i did not follow the neutron development to see if they had got to the point of modelling subnets/segments as placement aggregates and sharing resource providers of ips. we have definitely made progress on the nova side thanks to gibi on move operations for ports with resource requests. having a fourm to bring the 3 project together may help finally get this over the line. that said i am not sure what remains to be done on the neutron side and what nova needs to do. 
I speculated about the gaps in my previous response based on the design we discussed in the past. The current WIP patch was uploaded by Matt, https://review.opendev.org/#/c/656885 so I think he understands the nova process better than most; that said, if Miguel and Matt are tied up with things I can try and help with the paperwork. Matt, you have not been active on that patch since May; is this something you have time for/intend to work on for Ussuri? I'm not necessarily signing up to work on this at this point, but it is a feature I think we should add, and given I have not finalised what work I intend to do in U I might be able to help.
@matt, one point on your last comment to that patch that does perplex me somewhat was the assertion/implication that configuration of nova host aggregates would be required. Part of the goal as I understood it was to require no configuration on the nova side at all, i.e. instead of having a config option for a prefilter to update the request spec by transforming the subnets into placement aggregates, we would build on the port requests feature we used for bandwidth-based scheduling so that neutron can provide a resource request for an IP and aggregate per port. We could discuss this in a spec, but the reason I bring it up is that the current patch looks like it would be problematic if you have a cloud with multiple network backends, say SR-IOV and Calico, as it is a global config rather than a backend-specific behavior that builds on the generic per-port resource requests. Anyway, that is an implementation detail/design choice that we can discuss elsewhere; I just wanted to point it out.
> > efried > > [1] https://blueprints.launchpad.net/nova/ussuri/+addspec > [2] https://blueprints.launchpad.net/nova/ussuri > [3] http://specs.openstack.org/openstack/nova-specs/readme.html > [4] https://review.opendev.org/#/q/project:openstack/nova-specs+status:open >
From zigo at debian.org Tue Oct 8 09:58:03 2019 From: zigo at debian.org (Thomas Goirand) Date: Tue, 8 Oct 2019 11:58:03 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID:
On 10/8/19 10:47 AM, Thierry Carrez wrote: > Thomas Goirand wrote: >> For the 2nd time, could we *please* re-add the tag: [release-announce] >> when announcing for a release? I don't mind if it's sent to -discuss >> instead of the announce this, but this breaks mail filters... > > There was no clear decision last time we brought this up (and nobody > proposed patches to fix it). > > I think we'll just move RC announcements to release-announce. Let me see > if I can push a patch for this today (there may be a few more RCs sent > like this one in the mean time).
Thierry, Hopefully, I'm not too moronic here... :) I'm not trying to push any decision of changing any habits. Just trying to not forget one artifact. There's no need for any decision to add the [release announce] tag! :) I'm not sure where to propose the patch (I've searched for it), otherwise I would have done it. Thomas
From thierry at openstack.org Tue Oct 8 12:00:26 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 8 Oct 2019 14:00:26 +0200 Subject: cinder 15.0.0.0rc2 (train) In-Reply-To: References: <373aef08-c753-20d8-89d7-d090973a077b@debian.org> Message-ID:
Thomas Goirand wrote: > On 10/8/19 10:47 AM, Thierry Carrez wrote: >> Thomas Goirand wrote: >>> For the 2nd time, could we *please* re-add the tag: [release-announce] >>> when announcing for a release?
I don't mind if it's sent to -discuss >>> instead of the announce this, but this breaks mail filters... >> >> There was no clear decision last time we brought this up (and nobody >> proposed patches to fix it). >> >> I think we'll just move RC announcements to release-announce. Let me see >> if I can push a patch for this today (there may be a few more RCs sent >> like this one in the mean time). > > Thierry, > > Hopefully, I'm not too moronic here... :) > I'm not trying to push any decision of changing any habits. Just trying > to not forget one artifact. There's no need for any decision to add the > [release announce] tag! :) Actually the [release-announce] prefix is added by the mailing-list itself, so if we add it to the subject line we'd also have to change ML settings so that it's not added twice... > I'm not sure where to propose the patch (I've searched for it), > otherwise I would have done it. That should do it: https://review.opendev.org/687275 -- Thierry Carrez (ttx) From jim at jimrollenhagen.com Tue Oct 8 12:12:57 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Tue, 8 Oct 2019 08:12:57 -0400 Subject: [tc] monthly meeting agenda In-Reply-To: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> Message-ID: On Thu, Oct 3, 2019 at 3:59 PM Jean-Philippe Evrard wrote: > Hello everyone, > > Here's the agenda for our monthly TC meeting. It will happen next > Thursday (10 October) at the usual time (1400 UTC) in #openstack-tc . > > If you can't attend, please put your name in the "Apologies for > Absence" section in the wiki [1] > > Our meeting chair will be Alexandra (asettle). > > * Follow up on past action items > ** ricolin: Follow up with SIG chairs about guidelines > https://etherpad.openstack.org/p/SIGs-guideline > ** ttx: contact interested parties in a new 'large scale' sig (help > with mnaser, jroll reaching out to verizon media) > ** Release Naming - Results of the TC poll - Next action > > * New initiatives and/or report on previous initiatives > ** Help gmann on the community goals following our new goal process > ** mugsie: to sync with dhellmann or release-team to find the code for > the proposal bot > ** jroll - ttx: Feedback from the forum selection committee -- Follow > up on https://etherpad.openstack.org/p/PVG-TC-brainstorming -- Final > accepted list? > To follow up on this asynchronously: Final schedule is here: https://www.openstack.org/summit/shanghai-2019/summit-schedule/global-search?t=forum I made notes on the etherpad about which were accepted or not. Of course, we can still discuss in the meeting. :) // jim > ** mnaser: sync up with swift team on python3 migration > > Thank you everyone! > > Regards, > JP > > [1]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Oct 8 13:03:34 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 8 Oct 2019 09:03:34 -0400 Subject: [cinder] Train release status Message-ID: <81b0820f-8694-c827-d82e-e2e1562f9a83@gmail.com> You may have noticed that RC-2 was released yesterday. We aren't planning to do an RC-3 unless a critical bugfix is approved for backport. Here's the timeline: - Now through Friday 11 October: RC-3, etc. 
are cut as necessary - The "final RC" is whatever RC-n exists on 11 October - The coordinated release date is 16 October * Any bugfixes caught after 11 Oct can be merged into stable/train, but it is up to the release team whether they can be included in the release. Please do some exploratory testing on the 15.0.0.0rc2 tag (which right now is the HEAD of stable/train). If you find a critical bug, please file it in Launchpad and tag it 'train-rc-potential'. Also add it to the etherpad: https://etherpad.openstack.org/p/cinder-train-backport-potential and make some noise in #openstack-cinder so we are all aware of it. cheers, brian From jean-philippe at evrard.me Tue Oct 8 14:04:27 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Oct 2019 16:04:27 +0200 Subject: [tc] monthly meeting agenda In-Reply-To: References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> Message-ID: <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> On Tue, 2019-10-08 at 08:12 -0400, Jim Rollenhagen wrote: > I made notes on the etherpad about which were accepted or not. > Of course, we can still discuss in the meeting. :) Thanks! Maybe we could only discuss about what to do for our rejected sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? Regards, JP From mihalis68 at gmail.com Tue Oct 8 15:18:42 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 8 Oct 2019 11:18:42 -0400 Subject: [ops] ops meetups team meeting 2019-10-8 Message-ID: Minutes from todays meeting are here: 10:56 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.html 10:56 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.txt 10:56 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-08-14.06.log.html The Ops Community attending the upcoming Summit in Shanghai will have one Forum session (ops war stories). On day 4 we will also have a 3 hours session for further ops related topic discussion. Details still to be arranged. Chris - on behalf of the openstack ops meetups team -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From fsbiz at yahoo.com Tue Oct 8 16:14:22 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:14:22 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> Message-ID: <459655647.5382428.1570551262388@mail.yahoo.com> Hi folks, We have a rather large flat network consisting of over 300 ironic baremetal nodesand are constantly having the baremetals timing out during their PXE boot due tothe dhcp agent not able to respond in time. Looking for inputs on successful DHCP scaling techniques that would help mitigate this. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ruslanas at lpic.lt Tue Oct 8 16:24:20 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Tue, 8 Oct 2019 18:24:20 +0200 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <459655647.5382428.1570551262388@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: Hi, I am just curious, how much dhcp agents do you have on a network? Is controller monolithic? How much controllers do you have? 
On Tue, 8 Oct 2019, 18:17 fsbiz at yahoo.com, wrote: > Hi folks, > > We have a rather large flat network consisting of over 300 ironic > baremetal nodes > and are constantly having the baremetals timing out during their PXE boot > due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help > mitigate this. > > thanks, > Fred. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Oct 8 16:28:45 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Oct 2019 09:28:45 -0700 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <459655647.5382428.1570551262388@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: While not necessarily direct scaling of that subnet, you may want to look at ironic.conf's [neutron]port_setup_delay option. The default value is zero seconds, but increasing that value will cause the process to pause a little longer to give time for the neutron agent configuration to update, as the agent may not even know about the configuration as there are multiple steps with-in neutron, by the time the baremetal machine tries to PXE boot. We're hoping that in the U cycle, we'll finally have things in place where neutron tells ironic that the port setup is done and that the machine can be powered-on, but not all the code made it during Train. -Julia On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > Hi folks, > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > and are constantly having the baremetals timing out during their PXE boot due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > thanks, > Fred. From fsbiz at yahoo.com Tue Oct 8 16:52:09 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:52:09 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: <1923689261.5373083.1570553529145@mail.yahoo.com> We have 3 controller nodes each running a DHCP agent (so 3 DHCP agents in all). Fred. On Tuesday, October 8, 2019, 09:24:34 AM PDT, Ruslanas Gžibovskis wrote: Hi, I am just curious, how much dhcp agents do you have on a network?Is controller monolithic?How much controllers do you have? On Tue, 8 Oct 2019, 18:17 fsbiz at yahoo.com, wrote: Hi folks, We have a rather large flat network consisting of over 300 ironic baremetal nodesand are constantly having the baremetals timing out during their PXE boot due tothe dhcp agent not able to respond in time. Looking for inputs on successful DHCP scaling techniques that would help mitigate this. thanks,Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Tue Oct 8 16:54:40 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Tue, 8 Oct 2019 10:54:40 -0600 Subject: [tripleo] owls at ptg Message-ID: Greetings, A number of folks from TripleO will be at the OpenDev PTG. If you would like to discuss anything and collaborate please list your topic on this etherpad [1] Thank you! [1] https://etherpad.openstack.org/p/tripleo-ussuri-topics -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fsbiz at yahoo.com Tue Oct 8 16:55:06 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 16:55:06 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> Message-ID: <1716201708.5392618.1570553706947@mail.yahoo.com> Thanks Julia.   We have set the port_setup_delay to 30. # Delay value to wait for Neutron agents to setup sufficient# DHCP configuration for port. (integer value)# Minimum value: 0port_setup_delay = 30 >We're hoping that in the U >cycle, we'll finally have things in place where neutron tells ironic >that the port setup is done and that the machine can be powered-on, >but not all the code made it during Train. This would be perfect. Fred. On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: While not necessarily direct scaling of that subnet, you may want to look at ironic.conf's [neutron]port_setup_delay option. The default value is zero seconds, but increasing that value will cause the process to pause a little longer to give time for the neutron agent configuration to update, as the agent may not even know about the configuration as there are multiple steps with-in neutron, by the time the baremetal machine tries to PXE boot. We're hoping that in the U cycle, we'll finally have things in place where neutron tells ironic that the port setup is done and that the machine can be powered-on, but not all the code made it during Train. -Julia On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > Hi folks, > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > and are constantly having the baremetals timing out during their PXE boot due to > the dhcp agent not able to respond in time. > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > thanks, > Fred. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Tue Oct 8 16:57:19 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 9 Oct 2019 00:57:19 +0800 Subject: [tc] monthly meeting agenda In-Reply-To: <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> Message-ID: I added two more topics in agenda suggestion today which might worth discuss about. * define goal select process schedule * Maintain issue with Telemetery On Tue, Oct 8, 2019 at 10:10 PM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > > Thanks! Maybe we could only discuss about what to do for our rejected > sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? That sounds like a good idea. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Oct 8 17:05:24 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Oct 2019 10:05:24 -0700 Subject: Neutron dhcp-agent scalability techniques In-Reply-To: <1716201708.5392618.1570553706947@mail.yahoo.com> References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> <1716201708.5392618.1570553706947@mail.yahoo.com> Message-ID: One other thing that comes to mind at 30 seconds is spanning-tree port forwarding delay. 
PXE boot often thinks once carrier is up, that it can try and send/receive packets, however switches may still block traffic waiting for spanning-tree packets. Just from a limiting possible issues, it might be a good thing to double check network side to make sure "portfast" is the operating mode for the physical ports attached to that flat network. What this would look like is the machine appears to DHCP, but the packets would never actually reach the DHCP server. -Julia On Tue, Oct 8, 2019 at 9:55 AM fsbiz at yahoo.com wrote: > > Thanks Julia. We have set the port_setup_delay to 30. > > > # Delay value to wait for Neutron agents to setup sufficient > # DHCP configuration for port. (integer value) > # Minimum value: 0 > port_setup_delay = 30 > > >We're hoping that in the U > >cycle, we'll finally have things in place where neutron tells ironic > >that the port setup is done and that the machine can be powered-on, > >but not all the code made it during Train. > > This would be perfect. > > Fred. > > > > > On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: > > > While not necessarily direct scaling of that subnet, you may want to > look at ironic.conf's [neutron]port_setup_delay option. The default > value is zero seconds, but increasing that value will cause the > process to pause a little longer to give time for the neutron agent > configuration to update, as the agent may not even know about the > configuration as there are multiple steps with-in neutron, by the time > the baremetal machine tries to PXE boot. We're hoping that in the U > cycle, we'll finally have things in place where neutron tells ironic > that the port setup is done and that the machine can be powered-on, > but not all the code made it during Train. > > -Julia > > On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > > > Hi folks, > > > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > > and are constantly having the baremetals timing out during their PXE boot due to > > the dhcp agent not able to respond in time. > > > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > > > thanks, > > Fred. > From fsbiz at yahoo.com Tue Oct 8 18:34:54 2019 From: fsbiz at yahoo.com (fsbiz at yahoo.com) Date: Tue, 8 Oct 2019 18:34:54 +0000 (UTC) Subject: Neutron dhcp-agent scalability techniques In-Reply-To: References: <459655647.5382428.1570551262388.ref@mail.yahoo.com> <459655647.5382428.1570551262388@mail.yahoo.com> <1716201708.5392618.1570553706947@mail.yahoo.com> Message-ID: <667768633.5458229.1570559694224@mail.yahoo.com> Thanks Julia.  Yes, portfast is enabled on the ports of the TOR switch. Regards,Fred. On Tuesday, October 8, 2019, 10:09:35 AM PDT, Julia Kreger wrote: One other thing that comes to mind at 30 seconds is spanning-tree port forwarding delay. PXE boot often thinks once carrier is up, that it can try and send/receive packets, however switches may still block traffic waiting for spanning-tree packets.  Just from a limiting possible issues, it might be a good thing to double check network side to make sure "portfast" is the operating mode for the physical ports attached to that flat network. What this would look like is the machine appears to DHCP, but the packets would never actually reach the DHCP server. -Julia On Tue, Oct 8, 2019 at 9:55 AM fsbiz at yahoo.com wrote: > > Thanks Julia.  We have set the port_setup_delay to 30. 
> > > # Delay value to wait for Neutron agents to setup sufficient > # DHCP configuration for port. (integer value) > # Minimum value: 0 > port_setup_delay = 30 > > >We're hoping that in the U > >cycle, we'll finally have things in place where neutron tells ironic > >that the port setup is done and that the machine can be powered-on, > >but not all the code made it during Train. > > This would be perfect. > > Fred. > > > > > On Tuesday, October 8, 2019, 09:32:44 AM PDT, Julia Kreger wrote: > > > While not necessarily direct scaling of that subnet, you may want to > look at ironic.conf's [neutron]port_setup_delay option. The default > value is zero seconds, but increasing that value will cause the > process to pause a little longer to give time for the neutron agent > configuration to update, as the agent may not even know about the > configuration as there are multiple steps with-in neutron, by the time > the baremetal machine tries to PXE boot. We're hoping that in the U > cycle, we'll finally have things in place where neutron tells ironic > that the port setup is done and that the machine can be powered-on, > but not all the code made it during Train. > > -Julia > > On Tue, Oct 8, 2019 at 9:15 AM fsbiz at yahoo.com wrote: > > > > Hi folks, > > > > We have a rather large flat network consisting of over 300 ironic baremetal nodes > > and are constantly having the baremetals timing out during their PXE boot due to > > the dhcp agent not able to respond in time. > > > > Looking for inputs on successful DHCP scaling techniques that would help mitigate this. > > > > thanks, > > Fred. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Tue Oct 8 18:38:23 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Tue, 8 Oct 2019 11:38:23 -0700 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Thank you all for responding; I've added Douglas to https://review.opendev.org/#/admin/groups/213,members. Thank you Douglas for your hard work - welcome, and glad to have you on board! On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold wrote: > +1 from me, too. > > On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: > > Dear Zorillas and other Stackers, > > > > I would like to formalize the conversations we've been having amongst > > ourselves over IRC and in-person. At the outset, we have a lot of > > incoming changes to review, but we have limited core maintainer > > attention. We haven't re-jigged our core maintainers team as often as > > we'd like, and that's partly to blame. We have some relatively new and > > enthusiastic contributors that we would love to encourage to become > > maintainers! We've mentored contributors 1-1, n-1 before before adding > > them to the maintainers team. We would like to do more of this!** > > > > In this spirit, I would like your inputs on adding Douglas Viroel > > (dviroel) to the core maintainers team for manila and its associated > > projects (manila-specs, manila-ui, python-manilaclient, > > manila-tempest-plugin, manila-test-image, manila-image-elements). > > Douglas has been an active contributor for the past two releases and > > has valuable review inputs in the project. While he's been around here > > less longer than some of us, he brings a lot of experience to the > > table with his background in networking and shared file systems. 
He > > has a good grasp of the codebase and is enthusiastic in adding new > > features and fixing bugs in the Ussuri cycle and beyond. > > > > Please give me a +/-1 for this proposal. > > > > ** If you're interested in helping us maintain Manila by being part of > > the manila core maintainer team, please reach out to me or any of the > > current maintainers, we would love to work with you and help you grow > > into that role! > > > > Thanks, > > Goutham Pacha Ravi (gouthamr) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucioseki at gmail.com Tue Oct 8 18:47:54 2019 From: lucioseki at gmail.com (Lucio Seki) Date: Tue, 8 Oct 2019 15:47:54 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Congratulations, Douglas! On Tue, Oct 8, 2019 at 3:41 PM Goutham Pacha Ravi wrote: > Thank you all for responding; I've added Douglas to > https://review.opendev.org/#/admin/groups/213,members. > Thank you Douglas for your hard work - welcome, and glad to have you on > board! > > On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold > wrote: > >> +1 from me, too. >> >> On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: >> > Dear Zorillas and other Stackers, >> > >> > I would like to formalize the conversations we've been having amongst >> > ourselves over IRC and in-person. At the outset, we have a lot of >> > incoming changes to review, but we have limited core maintainer >> > attention. We haven't re-jigged our core maintainers team as often as >> > we'd like, and that's partly to blame. We have some relatively new and >> > enthusiastic contributors that we would love to encourage to become >> > maintainers! We've mentored contributors 1-1, n-1 before before adding >> > them to the maintainers team. We would like to do more of this!** >> > >> > In this spirit, I would like your inputs on adding Douglas Viroel >> > (dviroel) to the core maintainers team for manila and its associated >> > projects (manila-specs, manila-ui, python-manilaclient, >> > manila-tempest-plugin, manila-test-image, manila-image-elements). >> > Douglas has been an active contributor for the past two releases and >> > has valuable review inputs in the project. While he's been around here >> > less longer than some of us, he brings a lot of experience to the >> > table with his background in networking and shared file systems. He >> > has a good grasp of the codebase and is enthusiastic in adding new >> > features and fixing bugs in the Ussuri cycle and beyond. >> > >> > Please give me a +/-1 for this proposal. >> > >> > ** If you're interested in helping us maintain Manila by being part of >> > the manila core maintainer team, please reach out to me or any of the >> > current maintainers, we would love to work with you and help you grow >> > into that role! >> > >> > Thanks, >> > Goutham Pacha Ravi (gouthamr) >> > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rodrigo.barbieri2010 at gmail.com Tue Oct 8 18:57:21 2019 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Tue, 8 Oct 2019 15:57:21 -0300 Subject: [manila] Proposal to add dviroel to the core maintainers team In-Reply-To: References: <6c9d15d4-9600-7dcd-3d19-237b49a2958e@suse.com> Message-ID: Congratulations! On Tue, Oct 8, 2019 at 3:55 PM Lucio Seki wrote: > Congratulations, Douglas! 
> > On Tue, Oct 8, 2019 at 3:41 PM Goutham Pacha Ravi > wrote: > >> Thank you all for responding; I've added Douglas to >> https://review.opendev.org/#/admin/groups/213,members. >> Thank you Douglas for your hard work - welcome, and glad to have you on >> board! >> >> On Mon, Oct 7, 2019 at 12:30 AM Thomas Bechtold >> wrote: >> >>> +1 from me, too. >>> >>> On 10/2/19 10:58 PM, Goutham Pacha Ravi wrote: >>> > Dear Zorillas and other Stackers, >>> > >>> > I would like to formalize the conversations we've been having amongst >>> > ourselves over IRC and in-person. At the outset, we have a lot of >>> > incoming changes to review, but we have limited core maintainer >>> > attention. We haven't re-jigged our core maintainers team as often as >>> > we'd like, and that's partly to blame. We have some relatively new and >>> > enthusiastic contributors that we would love to encourage to become >>> > maintainers! We've mentored contributors 1-1, n-1 before before adding >>> > them to the maintainers team. We would like to do more of this!** >>> > >>> > In this spirit, I would like your inputs on adding Douglas Viroel >>> > (dviroel) to the core maintainers team for manila and its associated >>> > projects (manila-specs, manila-ui, python-manilaclient, >>> > manila-tempest-plugin, manila-test-image, manila-image-elements). >>> > Douglas has been an active contributor for the past two releases and >>> > has valuable review inputs in the project. While he's been around here >>> > less longer than some of us, he brings a lot of experience to the >>> > table with his background in networking and shared file systems. He >>> > has a good grasp of the codebase and is enthusiastic in adding new >>> > features and fixing bugs in the Ussuri cycle and beyond. >>> > >>> > Please give me a +/-1 for this proposal. >>> > >>> > ** If you're interested in helping us maintain Manila by being part of >>> > the manila core maintainer team, please reach out to me or any of the >>> > current maintainers, we would love to work with you and help you grow >>> > into that role! >>> > >>> > Thanks, >>> > Goutham Pacha Ravi (gouthamr) >>> > >>> > >>> >> -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfidente at redhat.com Tue Oct 8 20:28:29 2019 From: gfidente at redhat.com (Giulio Fidente) Date: Tue, 8 Oct 2019 22:28:29 +0200 Subject: [tripleo] owls at ptg In-Reply-To: References: Message-ID: On 10/8/19 6:54 PM, Wesley Hayutin wrote: > Greetings, > > A number of folks from TripleO will be at the OpenDev PTG.  If you would > like to discuss anything and collaborate please list your topic on this > etherpad [1] hi Wes, thanks for starting this thread. I think the Edge topic is quite big and interesting. For example, I have seen in the etherpad a proposal to discuss the nodes lifecycle management; I'd like to lead myself a session about Edge as well to review the status and the plans to support storage at the Edge ... 
there is quite a lot to be said regarding all most common storage components cinder, glance, manila, swift and regarding ceph support (at the Edge) Another topic about which I'd like to learn about is mistral workflows deprecation; I don't think I'd be the best person to drive this conversation though, would be nice if somebody else could pick it up I think it would also be interesting to try generalize the ffu process and review the existing process/structures created to support upgrade and ffu but not sure if we have upgrade/ffu experts in shanghai? Hopefully we can find at least half day to get people together > [1] https://etherpad.openstack.org/p/tripleo-ussuri-topics -- Giulio Fidente GPG KEY: 08D733BA From zbitter at redhat.com Tue Oct 8 21:22:34 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 8 Oct 2019 17:22:34 -0400 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) In-Reply-To: <20191007201553.xvaeejp2meoyw3ea@mthode.org> References: <20191007201553.xvaeejp2meoyw3ea@mthode.org> Message-ID: <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> On 7/10/19 4:15 PM, Matthew Thode wrote: > Salt has been harsh to deal with. :( > Upstream adding and maintaining caps has caused it to be held back. > This time it's pyyaml, I'm not going to hold back the version of pyyaml > for one import of salt. > > In any case, heat-agents uses salt in one location and may not even be > using the one we define via constraints in any case. > > File: heat-config-salt/install.d/50-heat-config-hook-salt > > Installs salt from package then runs > heat-config-salt/install.d/hook-salt.py This is true, in that the repo itself appears in the form of a set of disk-image-builder elements. (Though it may not necessarily be used this way - for example in RDO we package the actual agents as RPMs, ignoring the d-i-b elements.) > In heat-config-salt/install.d/hook-salt.py is defined the only import of > salt I can find and likely uses the package version as it's installed > after tox sets things up. However, that module is tested in the unit tests (which is why salt appears in test-requirements.txt). > Is the heat team ok with this? We discussed this a little at the time that I added it to global constraints: https://review.opendev.org/604386 The issue for us is that we'd like to be able to use a lower-constraints job. There's a certain library (*cough*paunch) that keeps releasing new major versions, so it's very helpful to have tests to verify when we rewrite for the new API whether or not we have to bump the minimum version. The rest of the requirements tooling seems useful as well, and given that the team obviously maintains other repos in OpenStack we know how to use it, and it gives some confidence that we're providing the right guidance to distros wanting to package this. That said, nothing in heat-agents necessarily needs to be co-installable with OpenStack - the agents run on guest machines. So if it's not tied to the global-requirements any more then that may not be the worst thing. But IIRC when we last discussed this there was no recommended way for a project to run in that kind of configuration. If somebody with more knowledge of the requirements tooling were able to help out with suggestions then I'd be more than happy to implement them. cheers, Zane. 
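(For context, the lower-constraints job Zane mentions is typically just a tox environment that installs the project against a file pinning every dependency at its declared minimum version, roughly like the following. This is a sketch of the common OpenStack pattern at the time, not heat-agents' actual tox.ini.)

    [testenv:lower-constraints]
    deps =
      -c{toxinidir}/lower-constraints.txt
      -r{toxinidir}/test-requirements.txt
      -r{toxinidir}/requirements.txt

Running the unit tests in that environment is what catches a change that quietly starts relying on a newer release of a library such as paunch, since the constraints file holds it at the declared minimum.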
From daniel at preussker.net Tue Oct 8 09:20:08 2019
From: daniel at preussker.net (Daniel 'f0o' Preussker)
Date: Tue, 8 Oct 2019 11:20:08 +0200
Subject: [OSSA-2019-005] Octavia Amphora-Agent not requiring Client-Certificate (CVE-2019-17134)
Message-ID:

=====================================================================
OSSA-2019-005: Octavia Amphora-Agent not requiring Client-Certificate
=====================================================================

:Date: October 07, 2019
:CVE: CVE-2019-17134

Affects
~~~~~~~
- Octavia: >=0.10.0 <2.1.2, >=3.0.0 <3.2.0, >=4.0.0 <4.1.0

Description
~~~~~~~~~~~
Daniel Preussker reported a vulnerability in amphora-agent, running
within Octavia Amphora Instances, which allows unauthenticated access
from the management network. This leads to information disclosure and
also allows changes to the configuration of the Amphora via simple HTTP
requests, because the cmd/agent.py gunicorn cert_reqs option is
incorrectly set to True instead of ssl.CERT_REQUIRED.

Patches
~~~~~~~
- https://review.opendev.org/686547 (Ocata)
- https://review.opendev.org/686546 (Pike)
- https://review.opendev.org/686545 (Queens)
- https://review.opendev.org/686544 (Rocky)
- https://review.opendev.org/686543 (Stein)
- https://review.opendev.org/686541 (Train)

Credits
~~~~~~~
- Daniel Preussker (CVE-2019-17134)

References
~~~~~~~~~~
- https://storyboard.openstack.org/#!/story/2006660
- http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17134

Notes
~~~~~
- The stable/ocata and stable/pike branches are under extended
  maintenance and will receive no new point releases, but patches for
  them are provided as a courtesy.

-------------- next part -------------- An HTML attachment was scrubbed... URL:
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL:

From akalambu at cisco.com Tue Oct 8 19:55:59 2019
From: akalambu at cisco.com (Ajay Kalambur (akalambu))
Date: Tue, 8 Oct 2019 19:55:59 +0000
Subject: [openstack][heat-cfn] CFN Signaling with heat
In-Reply-To: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com>
References: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com>
Message-ID:

Would be great if someone has an example template where CFN_SIGNAL works so we can see what's going on

From: "Ajay Kalambur (akalambu)"
Date: Saturday, October 5, 2019 at 10:34 AM
To: "openstack-discuss at lists.openstack.org"
Subject: [openstack][heat-cfn] CFN Signaling with heat

Hi
I was trying the Software Deployment/Structured Deployment feature of heat.
I somehow can never get the signaling to work. I see that authentication is happening, but I don't see a POST from the VM, and as a result the stack is stuck in CREATE_IN_PROGRESS.
I see this message in my heat-api-cfn log, which seems to suggest authentication is successful, but the VM does not seem to POST. I have included debug output from the VM and also the sample heat template I used. I don't know if the template is correct, as I referred to some online examples to build it.

2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS credentials..
2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials found, checking against keystone.
2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating with http://10.10.173.9:5000/v3/ec2tokens
2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS authentication successful.
2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server [req-506f22c6-4062-4a84-8e85-40317a4099ed - adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 HTTP/1.1" 200 4669 1.418045 Some debugging output from my VM: [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug /var/lib/os-collect-config/local-data not found. Skipping [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase pre-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: pre-configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase pre-configure [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/20-os-apply-config [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf [2019/10/05 05:32:47 PM] [INFO] success dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/configure.d/55-heat-config [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group Heat::Ungrouped with no hook script None dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config completed dib-run-parts Sat Oct 5 17:32:47 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Target: configure.d dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 20-os-apply-config 0.345 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-docker-compose 0.064 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 50-heat-config-kubelet 0.134 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 55-heat-config 0.065 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 dib-run-parts Sat Oct 5 17:32:47 UTC 2019 --------------------- END PROFILING --------------------- 
[2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase configure [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase post-configure dib-run-parts Sat Oct 5 17:32:47 UTC 2019 Running /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed ++ os-apply-config --key completion-handle --type raw --key-default '' + HANDLE= ++ os-apply-config --key completion-signal --type raw --key-default '' + SIGNAL= ++ os-apply-config --key instance-id --type raw --key-default '' + ID=i-0000000d + '[' -n i-0000000d ']' + '[' -n '' ']' + '[' -n '' ']' ++ os-apply-config --key deployments --type raw --key-default '' ++ jq -r 'map(select(.group == "os-apply-config") | select(.inputs[].name == "deploy_signal_id") | .id + (.inputs | map(select(.name == "deploy_signal_id")) | .[].value)) | .[]' + DEPLOYMENTS= + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed completed dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: post-configure.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 99-refresh-completed 1.206 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase post-configure [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase migration dib-run-parts Sat Oct 5 17:32:49 UTC 2019 ----------------------- PROFILING ----------------------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Target: migration.d dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 Script Seconds dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------------------------- ---------- dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 dib-run-parts Sat Oct 5 17:32:49 UTC 2019 --------------------- END PROFILING --------------------- [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase migration onfig]# cat /var/run/heat-config/heat-config [{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "other_deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": 
"http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of the server being deployed to"}, {"type": "String", "name": "deploy_action", "value": "CREATE", "description": "Name of the current action being deployed"}, {"type": "String", "name": "deploy_stack_id", "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of the stack this deployment belongs to"}, {"type": "String", "name": "deploy_resource_name", "value": "deployment", "description": "Name of this deployment resource in the stack"}, {"type": "String", "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": "How the server should signal to heat with the deployment output values."}, {"type": "String", "name": "deploy_signal_id", "value": "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", "description": "ID of signal to use for signaling output values"}, {"type": "String", "name": "deploy_signal_verb", "value": "POST", "description": "HTTP verb to use for signaling outputvalues"}], "group": "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": "f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf [DEFAULT] command = os-refresh-config collectors = ec2 collectors = cfn collectors = local [cfn] metadata_url = http://172.29.85.87:8000/v1/ stack_name = test_stack secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG access_key_id = f7874ac9898248edaae53511230534a4 path = sig-vm-1.Metadata Here is my basic sample temple heat_template_version: 2013-05-23 description: > This template demonstrates how to use OS::Heat::StructuredDeployment to override substitute get_input placeholders defined in OS::Heat::StructuredConfig config. As there is no hook on the server to act on the configuration data, these deployment resource will perform no actual configuration. 
parameters: flavor: type: string default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947' image: type: string default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe public_net_id: type: string default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db private_net_id: type: string description: Private network id default: 995fc046-1c58-468a-b81c-e42c06fc8966 private_subnet_id: type: string description: Private subnet id default: 7598c805-3a9b-4c27-be5b-dca4d89f058c password: type: string description: SSH password default: lab123 resources: the_sg: type: OS::Neutron::SecurityGroup properties: name: the_sg description: Ping and SSH rules: - protocol: icmp - protocol: tcp port_range_min: 22 port_range_max: 22 config: type: OS::Heat::StructuredConfig properties: config: config_value_foo: {get_input: foo} config_value_bar: {get_input: bar} deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fooooo bar: baaaaa other_deployment: type: OS::Heat::StructuredDeployment properties: signal_transport: CFN_SIGNAL config: get_resource: config server: get_resource: sig-vm-1 input_values: foo: fu bar: barmy server1_port0: type: OS::Neutron::Port properties: network_id: { get_param: private_net_id } security_groups: - default fixed_ips: - subnet_id: { get_param: private_subnet_id } server1_public: type: OS::Neutron::FloatingIP properties: floating_network_id: { get_param: public_net_id } port_id: { get_resource: server1_port0 } sig-vm-1: type: OS::Nova::Server properties: name: sig-vm-1 image: { get_param: image } flavor: { get_param: flavor } networks: - port: { get_resource: server1_port0 } user_data_format: SOFTWARE_CONFIG user_data: get_resource: cloud_config cloud_config: type: OS::Heat::CloudConfig properties: cloud_config: password: { get_param: password } chpasswd: { expire: False } ssh_pwauth: True -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Wed Oct 9 00:02:40 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 8 Oct 2019 19:02:40 -0500 Subject: [requirements][heat] remove salt from requirements (used by heat-agents tests only) In-Reply-To: <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> References: <20191007201553.xvaeejp2meoyw3ea@mthode.org> <167b0004-862d-d689-511a-504585ebf2f9@redhat.com> Message-ID: <20191009000240.a4joopyehlx7pdk6@mthode.org> On 19-10-08 17:22:34, Zane Bitter wrote: > On 7/10/19 4:15 PM, Matthew Thode wrote: > > Salt has been harsh to deal with. > > :( > > > Upstream adding and maintaining caps has caused it to be held back. > > This time it's pyyaml, I'm not going to hold back the version of pyyaml > > for one import of salt. > > > > In any case, heat-agents uses salt in one location and may not even be > > using the one we define via constraints in any case. > > > > File: heat-config-salt/install.d/50-heat-config-hook-salt > > > > Installs salt from package then runs > > heat-config-salt/install.d/hook-salt.py > > This is true, in that the repo itself appears in the form of a set of > disk-image-builder elements. (Though it may not necessarily be used this way > - for example in RDO we package the actual agents as RPMs, ignoring the > d-i-b elements.) > > > In heat-config-salt/install.d/hook-salt.py is defined the only import of > > salt I can find and likely uses the package version as it's installed > > after tox sets things up. 
> > However, that module is tested in the unit tests (which is why salt appears > in test-requirements.txt). > > > Is the heat team ok with this? > > We discussed this a little at the time that I added it to global > constraints: https://review.opendev.org/604386 > > The issue for us is that we'd like to be able to use a lower-constraints > job. There's a certain library (*cough*paunch) that keeps releasing new > major versions, so it's very helpful to have tests to verify when we rewrite > for the new API whether or not we have to bump the minimum version. The rest > of the requirements tooling seems useful as well, and given that the team > obviously maintains other repos in OpenStack we know how to use it, and it > gives some confidence that we're providing the right guidance to distros > wanting to package this. > > That said, nothing in heat-agents necessarily needs to be co-installable > with OpenStack - the agents run on guest machines. So if it's not tied to > the global-requirements any more then that may not be the worst thing. But > IIRC when we last discussed this there was no recommended way for a project > to run in that kind of configuration. If somebody with more knowledge of the > requirements tooling were able to help out with suggestions then I'd be more > than happy to implement them. Ya, I remember that conversation :D If it helps I think we can sidestep this as the tests are currenly not using salt managed by requirements. Also, salt will not be updated while they are capping pyyaml. IIRC there was a way to tell the package installer a version you wanted. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From zbitter at redhat.com Wed Oct 9 01:29:27 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 8 Oct 2019 21:29:27 -0400 Subject: [openstack][heat-cfn] CFN Signaling with heat In-Reply-To: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> References: <5757C208-29A4-4D6B-9F82-1FE5B16B8359@cisco.com> Message-ID: <053c6d35-6834-8e09-2cd9-d90f030b2833@redhat.com> I'm not an expert on stuff that happens on the guest, but it looks like this is the problem: > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None You're using the default group that has no handler for it configured. It looks like 55_heat_config bails out before attempting to signal a response in this case. (That seems crazy to me, but here we are.) Try configuring a group (like 'script') that actually does something. Also why not use HEAT_SIGNAL as the transport? It's 2019 ;) cheers, Zane. On 5/10/19 1:34 PM, Ajay Kalambur (akalambu) wrote: > Hi > > I was trying the Software Deployment/Structured deployment of heat. > > I somehow can never get the signaling to work I see that authentication > is happening but I don’t see a POST from the VM as a result stack is > stuck in CREATE_IN_PROGRESS > > I see this message in my heat api cfn log which seems to suggest > authentication is successful but it does not seem to POST. Have included > debug output from VM and also the sample heat template I used. Don’t > know if the template is correct as I referred some online examples to > build it > > 2019-10-05 10:30:00.908 7 INFO heat.api.aws.ec2token [-] Checking AWS > credentials.. > > 2019-10-05 10:30:00.909 7 INFO heat.api.aws.ec2token [-] AWS credentials > found, checking against keystone. 
> > 2019-10-05 10:30:00.910 7 INFO heat.api.aws.ec2token [-] Authenticating > with http://10.10.173.9:5000/v3/ec2tokens > > 2019-10-05 10:30:01.315 7 INFO heat.api.aws.ec2token [-] AWS > authentication successful. > > 2019-10-05 10:30:02.326 7 INFO eventlet.wsgi.server > [req-506f22c6-4062-4a84-8e85-40317a4099ed - > adccd09df89e4b71b0a42f462679e75a-b1c6eb69-3877-466b-b00d-03dc051 - > 0ecadd4762a34de1ac08508db4d3caa9 0ecadd4762a34de1ac08508db4d3caa9] > 10.11.59.36,10.10.173.9 - - [05/Oct/2019 10:30:02] "GET > /v1/?SignatureVersion=2&AWSAccessKeyId=f7874ac9898248edaae53511230534a4&StackName=test_stack&SignatureMethod=HmacSHA256&Signature=c03Q7Hb35q9tPPuYOv6YByn5YekF96p2s5zx36sX7x4%3D&Action=DescribeStackResource&LogicalResourceId=sig-vm-1 > HTTP/1.1" 200 4669 1.418045 > > Some debugging output from my VM: > > [root at sig-vm-1 fedora]# sudo os-collect-config --force --one-time --debug > > /var/lib/os-collect-config/local-data not found. Skipping > > [2019-10-05 17:32:47,058] (os-refresh-config) [INFO] Starting phase > pre-configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Target: pre-configure.d > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:47,091] (os-refresh-config) [INFO] Completed phase > pre-configure > > [2019-10-05 17:32:47,092] (os-refresh-config) [INFO] Starting phase > configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/20-os-apply-config > > [2019/10/05 05:32:47 PM] [INFO] writing /var/run/heat-config/heat-config > > [2019/10/05 05:32:47 PM] [INFO] writing /etc/os-collect-config.conf > > [2019/10/05 05:32:47 PM] [INFO] success > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 20-os-apply-config completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/50-heat-config-docker-compose > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 50-heat-config-docker-compose > completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/50-heat-config-kubelet > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 50-heat-config-kubelet completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/configure.d/55-heat-config > > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None > > [2019-10-05 17:32:47,724] (heat-config) [ERROR] Skipping group > Heat::Ungrouped with no hook script None > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 55-heat-config completed > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Target: configure.d > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Script >                        Seconds > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > ---------------------------------------  
---------- > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 20-os-apply-config                            0.345 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 50-heat-config-docker-compose                 0.064 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 50-heat-config-kubelet                        0.134 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > 55-heat-config                                0.065 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Completed phase > configure > > [2019-10-05 17:32:47,787] (os-refresh-config) [INFO] Starting phase > post-configure > > dib-run-parts Sat Oct  5 17:32:47 UTC 2019 Running > /usr/libexec/os-refresh-config/post-configure.d/99-refresh-completed > > ++ os-apply-config --key completion-handle --type raw --key-default '' > > + HANDLE= > > ++ os-apply-config --key completion-signal --type raw --key-default '' > > + SIGNAL= > > ++ os-apply-config --key instance-id --type raw --key-default '' > > + ID=i-0000000d > > + '[' -n i-0000000d ']' > > + '[' -n '' ']' > > + '[' -n '' ']' > > ++ os-apply-config --key deployments --type raw --key-default '' > > ++ jq -r 'map(select(.group == "os-apply-config") | > >               select(.inputs[].name == "deploy_signal_id") | > >               .id + (.inputs | map(select(.name == "deploy_signal_id")) > | .[].value)) | > >               .[]' > > + DEPLOYMENTS= > > + DEPLOYED_DIR=/var/lib/os-apply-config-deployments/deployed > > + '[' '!' -d /var/lib/os-apply-config-deployments/deployed ']' > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 99-refresh-completed completed > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 Target: post-configure.d > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > 99-refresh-completed                          1.206 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:49,041] (os-refresh-config) [INFO] Completed phase > post-configure > > [2019-10-05 17:32:49,042] (os-refresh-config) [INFO] Starting phase > migration > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 ----------------------- > PROFILING ----------------------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 Target: migration.d > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > Script                                     Seconds > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > ---------------------------------------  ---------- > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 > > dib-run-parts Sat Oct  5 17:32:49 UTC 2019 --------------------- END > PROFILING --------------------- > > [2019-10-05 17:32:49,073] (os-refresh-config) [INFO] Completed phase > migration > > onfig]# cat /var/run/heat-config/heat-config > > 
[{"inputs": [{"type": "String", "name": "foo", "value": "fu"}, {"type": > "String", "name": "bar", "value": "barmy"}, {"type": "String", "name": > "deploy_server_id", "value": "226ed96d-2335-436e-9707-95af73041e5f", > "description": "ID of the server being deployed to"}, {"type": "String", > "name": "deploy_action", "value": "CREATE", "description": "Name of the > current action being deployed"}, {"type": "String", "name": > "deploy_stack_id", "value": > "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", "description": "ID of > the stack this deployment belongs to"}, {"type": "String", "name": > "deploy_resource_name", "value": "other_deployment", "description": > "Name of this deployment resource in the stack"}, {"type": "String", > "name": "deploy_signal_transport", "value": "CFN_SIGNAL", "description": > "How the server should signal to heat with the deployment output > values."}, {"type": "String", "name": "deploy_signal_id", "value": > "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/other_deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=28a09f5d996240b8b4a117ecb0e0142b&SignatureVersion=2&Signature=IqXbRf9MzJ%2FnzqM7CLNAsR3BiwmaaHyWQspegxYc3D8%3D", > "description": "ID of signal to use for signaling output values"}, > {"type": "String", "name": "deploy_signal_verb", "value": "POST", > "description": "HTTP verb to use for signaling outputvalues"}], "group": > "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": > [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": > {"config_value_foo": "fu", "config_value_bar": "barmy"}, "id": > "5c404619-ce79-48cd-b001-00ac6ff4f4e8"}, {"inputs": [{"type": "String", > "name": "foo", "value": "fooooo"}, {"type": "String", "name": "bar", > "value": "baaaaa"}, {"type": "String", "name": "deploy_server_id", > "value": "226ed96d-2335-436e-9707-95af73041e5f", "description": "ID of > the server being deployed to"}, {"type": "String", "name": > "deploy_action", "value": "CREATE", "description": "Name of the current > action being deployed"}, {"type": "String", "name": "deploy_stack_id", > "value": "test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893", > "description": "ID of the stack this deployment belongs to"}, {"type": > "String", "name": "deploy_resource_name", "value": "deployment", > "description": "Name of this deployment resource in the stack"}, > {"type": "String", "name": "deploy_signal_transport", "value": > "CFN_SIGNAL", "description": "How the server should signal to heat with > the deployment output values."}, {"type": "String", "name": > "deploy_signal_id", "value": > "http://172.29.85.87:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3Aadccd09df89e4b71b0a42f462679e75a%3Astacks/test_stack/b1c6eb69-3877-466b-b00d-03dc051d1893/resources/deployment?Timestamp=2019-10-05T01%3A11%3A46Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=4c3d718796e0452ea94f2ce8dc6973ef&SignatureVersion=2&Signature=rxtSBNUSF%2FEXn9wvVK4XMU%2F1RzXVDGILtZr1hmkl7gg%3D", > "description": "ID of signal to use for signaling output values"}, > {"type": "String", "name": "deploy_signal_verb", "value": "POST", > "description": "HTTP verb to use for signaling outputvalues"}], "group": > "Heat::Ungrouped", "name": "test_stack-config-bmekpj67pq6p", "outputs": > [], "creation_time": "2019-10-05T01:14:31Z", "options": {}, "config": > {"config_value_foo": "fooooo", "config_value_bar": "baaaaa"}, "id": > 
"f4dea0c1-73c9-4ce4-aa04-c76ef9b08859"}][root at sig-vm-1 heat-config]# > > [root at sig-vm-1 heat-config]# cat /etc/os-collect-config.conf > > [DEFAULT] > > command = os-refresh-config > > collectors = ec2 > > collectors = cfn > > collectors = local > > [cfn] > > metadata_url = http://172.29.85.87:8000/v1/ > > stack_name = test_stack > > secret_access_key = npa^GWsPtbRL7D*MYObOI*kV0i1yqKOG > > access_key_id = f7874ac9898248edaae53511230534a4 > > path = sig-vm-1.Metadata > > *Here is my basic sample temple* > > heat_template_version: 2013-05-23 > > description: > > >   This template demonstrates how to use OS::Heat::StructuredDeployment > >   to override substitute get_input placeholders defined in > >   OS::Heat::StructuredConfig config. > >   As there is no hook on the server to act on the configuration data, > >   these deployment resource will perform no actual configuration. > > parameters: > >   flavor: > >     type: string > >     default: 'a061cb6c-99e7-4bdb-93e4-f0037ee3e947' > >   image: > >     type: string > >     default: 3be29d9f-2ce6-4b95-b80c-0dbca7acfdfe > >   public_net_id: > >     type: string > >     default: 67ae0e17-6258-4fb6-8b9b-0f29f6adb9db > >   private_net_id: > >     type: string > >     description: Private network id > >     default: 995fc046-1c58-468a-b81c-e42c06fc8966 > >   private_subnet_id: > >     type: string > >     description: Private subnet id > >     default: 7598c805-3a9b-4c27-be5b-dca4d89f058c > >   password: > >     type: string > >     description: SSH password > >     default: lab123 > > resources: > >   the_sg: > >     type: OS::Neutron::SecurityGroup > >     properties: > >       name: the_sg > >       description: Ping and SSH > >       rules: > >       - protocol: icmp > >       - protocol: tcp > >         port_range_min: 22 > >         port_range_max: 22 > >   config: > >     type: OS::Heat::StructuredConfig > >     properties: > >       config: > >        config_value_foo: {get_input: foo} > >        config_value_bar: {get_input: bar} > >   deployment: > >     type: OS::Heat::StructuredDeployment > >     properties: > >       signal_transport: CFN_SIGNAL > >       config: > >         get_resource: config > >       server: > >         get_resource: sig-vm-1 > >       input_values: > >         foo: fooooo > >         bar: baaaaa > >   other_deployment: > >     type: OS::Heat::StructuredDeployment > >     properties: > >       signal_transport: CFN_SIGNAL > >       config: > >         get_resource: config > >       server: > >         get_resource: sig-vm-1 > >       input_values: > >         foo: fu > >         bar: barmy > >   server1_port0: > >     type: OS::Neutron::Port > >     properties: > >       network_id: { get_param: private_net_id } > >       security_groups: > >         - default > >       fixed_ips: > >         - subnet_id: { get_param: private_subnet_id } > >   server1_public: > >     type: OS::Neutron::FloatingIP > >     properties: > >       floating_network_id: { get_param: public_net_id } > >       port_id: { get_resource: server1_port0 } > >   sig-vm-1: > >     type: OS::Nova::Server > >     properties: > >       name: sig-vm-1 > >       image: { get_param: image } > >       flavor: { get_param: flavor } > >       networks: > >         - port: { get_resource: server1_port0 } > >       user_data_format: SOFTWARE_CONFIG > >       user_data: > >         get_resource: cloud_config > >   cloud_config: > >     type: OS::Heat::CloudConfig > >     properties: > >       cloud_config: > >         password: { get_param: password } 
> >         chpasswd: { expire: False } > >         ssh_pwauth: True > From li.canwei2 at zte.com.cn Wed Oct 9 03:55:28 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 9 Oct 2019 11:55:28 +0800 (CST) Subject: =?UTF-8?B?W1dhdGNoZXJdIHRlYW0gbWVldGluZyBhdCAwODowMCBVVEMgdG9kYXk=?= Message-ID: <201910091155285054778@zte.com.cn> Hi, Watcher team will have a meeting at 08:00 UTC today in the #openstack-meeting-alt channel. The agenda is available on https://wiki.openstack.org/wiki/Watcher_Meeting_Agenda feel free to add any additional items. Thanks! Canwei Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From frode.nordahl at canonical.com Wed Oct 9 06:05:56 2019 From: frode.nordahl at canonical.com (Frode Nordahl) Date: Wed, 9 Oct 2019 08:05:56 +0200 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Fri, Oct 4, 2019 at 3:46 PM Corey Bryant wrote: > Hi All, > Hey Corey, Great to see the charm coming along! Code is located at: > https://github.com/coreycb/charm-placement > https://github.com/coreycb/charm-interface-placement > > https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) > 1) Since the interface is new I would love to see it based on the ``Endpoint`` class instead of the aging ``RelationBase`` class. Also the interface code needs unit tests. We have multiple examples of interface implementations with both in place you can get inspiration from [0]. Also consider having both a ``connected`` and ``available`` state, the available state could be set on the first relation-changed event. This increases the probability of your charm detecting a live charm in the other end of the relation, both states are also required to use the ``charms.openstack`` required relation gating code. 2) In the reactive handler you do a bespoke import of the charm class module just to activate the code, this is no longer necessary as there has been implemented a module that does automatic search and import of the class for you. Please use that instead. [1] import charms_openstack.bus import charms_openstack.charm as charm charms_openstack.bus.discover() 0: https://github.com/search?q=org%3Aopenstack+%22from+charms.reactive+import+Endpoint%22&type=Code 1: https://github.com/search?q=org%3Aopenstack+charms_openstack.bus&type=Code -- Frode Nordahl -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Wed Oct 9 08:02:54 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 09 Oct 2019 10:02:54 +0200 Subject: [tc] Weekly update Message-ID: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> Hello friends, Here's what need attention for the OpenStack TC this week. 1. You should probably prepare our next meeting, happening on Thursday. Alexandra is preparing the topics and warming up the gifs already. 2. We still need someone to step up for the OpenStack User survey on the ML [2] 3. We have plenty of patches which haven't received a vote. 4. We only have two goals for Ussuri [3]. Having more goals makes it easier to select the goals amongst the suggested ones :) If you can socialize about those, that would be awesome. Thank you everyone! 
JP & Rico [1]: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2]: http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [3]: https://etherpad.openstack.org/p/PVG-u-series-goals From dtantsur at redhat.com Wed Oct 9 10:04:11 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 9 Oct 2019 12:04:11 +0200 Subject: Release Cycle Observations In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org> Message-ID: On Wed, Oct 2, 2019 at 10:31 AM Thomas Goirand wrote: > On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > > wrote: > > > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > > >> I know we'd like to have everyone CD'ing master > > > > > > Watch who you're lumping in with the "we" statement. ;) > > > > You've pinpointed what the problem is. > > > > Everyone but OpenStack upstream would like to stop having to upgrade > > every 6 months. > > > > > > Yep, but the same "everyone" want to have features now or better > > yesterday, not in 2-3 years ;) > > This probably was the case a few years ago, when OpenStack was young. > Now that it has matured, and has all the needed features, things have > changed a lot. > This is still the case often enough in my world. IPv6 comes to mind as an example. > > Thomas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.settle at outlook.com Wed Oct 9 10:13:47 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Wed, 9 Oct 2019 10:13:47 +0000 Subject: [all][PTG] Strawman Schedule In-Reply-To: References: Message-ID: Thanks so much! On Mon, 2019-10-07 at 10:09 -0700, Kendall Nelson wrote: > Hey Alex, > > So since the TC stuff is Friday we managed to shuffle things around > and now docs has the afternoon on Thursday. > > We will get the final schedule up on the website soon. > > -Kendall (diablo_rojo) > > On Thu, Oct 3, 2019 at 9:32 AM Kendall Waters > wrote: > > Hey Alex, > > > > We still have tables available on Friday. Would half a day on > > Friday work for the docs team? Unless Ian is okay with it, we can > > combine Docs with i18n in their Wednesday afternoon/Thursday > > morning slot. Just let me know! > > > > Cheers, > > Kendall > > > > > > > > Kendall Waters > > OpenStack Marketing & Events > > kendall at openstack.org > > > > > > > > > On Oct 3, 2019, at 4:26 AM, Alexandra Settle > > m> wrote: > > > > > > Hey, > > > > > > Could you add something for docs? Or combine with i18n again if > > > Ian > > > doesn't mind? > > > > > > We don't need a lot, just a room for people to ask questions > > > about the > > > future of the docs team. > > > > > > Stephen will be there, as co-PTL. There's 0 chance of it not > > > conflicting with nova. > > > > > > Please :) > > > > > > Thank you! > > > > > > Alex > > > > > > On Wed, 2019-09-25 at 14:13 -0700, Kendall Nelson wrote: > > > > Hello Everyone! > > > > > > > > In the attached picture or link [0] you will find the proposed > > > > schedule for the various tracks at the Shanghai PTG in > > > > November. > > > > > > > > We did our best to avoid the key conflicts that the track leads > > > > (PTLs, SIG leads...) mentioned in their PTG survey responses, > > > > although there was no perfect solution that would avoid all > > > > conflicts > > > > especially when the event is three-ish days long and we have > > > > over 40 > > > > teams meeting. 
> > > > If there are critical conflicts we missed or other issues, > > > > please let > > > > us know, by October 6th at 7:00 UTC! > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > [0] https://usercontent.irccloud-cdn.com/file/00mZ3Q3M/pvg_ptg_ > > > > schedu > > > > le.png > > > -- > > > Alexandra Settle > > > IRC: asettle -- Alexandra Settle IRC: asettle

From smooney at redhat.com Wed Oct 9 10:40:54 2019
From: smooney at redhat.com (Sean Mooney)
Date: Wed, 09 Oct 2019 11:40:54 +0100
Subject: Release Cycle Observations
In-Reply-To: References: <40ab2bd3-e23a-6877-e515-63bbc1663f66@gmail.com> <362a82bc-a2a8-b77c-d1f2-4adad992de56@debian.org>
Message-ID:

On Wed, 2019-10-09 at 12:04 +0200, Dmitry Tantsur wrote: > On Wed, Oct 2, 2019 at 10:31 AM Thomas Goirand wrote: > > > On 10/1/19 12:05 PM, Dmitry Tantsur wrote: > > > > > > > > > On Fri, Sep 27, 2019 at 10:47 PM Thomas Goirand > > > wrote: > > > > > > On 9/26/19 9:51 PM, Sean McGinnis wrote: > > > >> I know we'd like to have everyone CD'ing master > > > > > > > > Watch who you're lumping in with the "we" statement. ;) > > > > > > You've pinpointed what the problem is. > > > > > > Everyone but OpenStack upstream would like to stop having to upgrade > > > every 6 months.

I'm not sure that is true. I think if upgrades were as easy as a yum update or apt upgrade, people would not mind a 6 month or shorter upgrade cycle, but even though tooling has improved we are a long way from upgrades being trivial.

> > > > > > > > > Yep, but the same "everyone" want to have features now or better > > > yesterday, not in 2-3 years ;)

Yes, and this is a double-edged sword in more ways than one. We have a large proportion of our customer base that are only now upgrading to Queens from Newton, so they are already running a 2-3 year out of date OpenStack, and when they upgrade they would also like all the features that were only added in Train backported to Queens, which is our current LTS downstream. Our internal data on deployments more or less shows that most non-LTS releases downstream are ignored by larger customers, creating pressure to backport features, which we can't reasonably do given our current tooling and our desire to not create a large fork.

> > > > This probably was the case a few years ago, when OpenStack was young. > > Now that it has matured, and has all the needed features, things have > > changed a lot. > >

I don't think it has. I think many of the needed features are now available in master, although looking at our downstream backlog there are also a lot of features that are not available. The issue is that because upgrading has been so painful for many for so long, they are not willing in many cases to go to the latest release. Maybe in another 2 years' time this statement will be more correct, as the majority of clouds will be running Stein+ (I hope).

> > This is still the case often enough in my world. IPv6 comes to mind as an > example. > > > > > > Thomas > > > >

From jungleboyj at gmail.com Wed Oct 9 13:28:16 2019
From: jungleboyj at gmail.com (Jay Bryant)
Date: Wed, 9 Oct 2019 08:28:16 -0500
Subject: [tc] Weekly update
In-Reply-To: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me>
References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me>
Message-ID: <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com>

JP,

I thought I had responded to the ML about helping with the OpenStack user survey.  Maybe I only thought about responding and didn't actually do it.  :-)

Anyway, I am willing to take a look at it and put together a summary.
What is the time frame the TC is looking for on this? Thanks! Jay On 10/9/2019 3:02 AM, Jean-Philippe Evrard wrote: > Hello friends, > > Here's what need attention for the OpenStack TC this week. > > 1. You should probably prepare our next meeting, happening on Thursday. > Alexandra is preparing the topics and warming up the gifs already. > 2. We still need someone to step up for the OpenStack User survey on > the ML [2] > 3. We have plenty of patches which haven't received a vote. > 4. We only have two goals for Ussuri [3]. Having more goals makes it > easier to select the goals amongst the suggested ones :) If you can > socialize about those, that would be awesome. > > Thank you everyone! > JP & Rico > > [1]: > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > [2]: > http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html > > [3]: https://etherpad.openstack.org/p/PVG-u-series-goals > > From jungleboyj at gmail.com Wed Oct 9 13:31:10 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 9 Oct 2019 08:31:10 -0500 Subject: [PTLs] [TC] OpenStack User Survey - PTL & TC Feedback Message-ID: <032d7c6a-6b73-4ff9-b061-468aed7b546e@gmail.com> Jimmy, I will take a User Survey feedback for the TC and let you know if we need additional information on anything. Thanks! Jay From sean.mcginnis at gmx.com Wed Oct 9 13:44:11 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 9 Oct 2019 08:44:11 -0500 Subject: [ptl][release] Last call for RC updates Message-ID: <20191009134411.GA9816@sm-workstation> Hey everyone, This is just a reminder about tomorrow's deadline for a final RC for Train. There are several projects that have changes merged since cutting the stable/train branch. Not all of these changes need to be included in the initial Train coordinated release, but it would be good if there are translations and bug fixes merged to get them into a final RC while there's still time. After tomorrow's (Oct 10) deadline, we will only want to release something if it's absolutely critical. We will enter a quiet period from tomorrow until the coordinated release date next week to give time for packagers to complete their work and to make sure things are stable. Next week the last RC releases will then be retagged as the final release. Again, not all changes need to be included if they are not critical bugfixes or translations at this point. Stable releases can be done at any point after the official release date. Thanks for your help as we reach the end of the Train. Sean From jean-philippe at evrard.me Wed Oct 9 15:39:34 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 09 Oct 2019 17:39:34 +0200 Subject: [tc] Weekly update In-Reply-To: <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> Message-ID: On Wed, 2019-10-09 at 08:28 -0500, Jay Bryant wrote: > Anyway, I am willing to take a look at it and put together a > summary. > What is the time frame the TC is looking for on this? I think it's an important exercise. However, I don't think there is a strict timeline. As long as we learn from it, I would say that we are good. Maybe we could discuss the teachings of it during the summit? The next meeting (tomorrow) seems a little bit ambitious to me... 
Regards, JP From sean.mcginnis at gmx.com Wed Oct 9 15:46:29 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 9 Oct 2019 10:46:29 -0500 Subject: [ptl][release] Last call for RC updates In-Reply-To: <20191009134411.GA9816@sm-workstation> References: <20191009134411.GA9816@sm-workstation> Message-ID: <20191009154629.GA26100@sm-workstation> On Wed, Oct 09, 2019 at 08:44:11AM -0500, Sean McGinnis wrote: > Hey everyone, > > This is just a reminder about tomorrow's deadline for a final RC for Train. > > There are several projects that have changes merged since cutting the > stable/train branch. Not all of these changes need to be included in the > initial Train coordinated release, but it would be good if there are > translations and bug fixes merged to get them into a final RC while there's > still time. > To try to help some teams, I have proposed RC2 releases for those deliverables that looked like they had relevant things that would be good to pick up for the final Train release. They can be found under the train-rc2 topic: https://review.opendev.org/#/q/topic:train-rc2+(status:open+OR+status:merged) Again, not all changes are necessary to be included, so we will only process these if we get an explicit ack from the PTL or release liaison that the team actually wants these extra RC releases. Feel free to +1 if you would like us to proceed, or -1 if you do not want the RC or just need a little more time to get anything else merged before tomorrow's deadline. If the latter, please take over the patch and update with the new commit hash that should be used for the release. Thanks! Sean From jungleboyj at gmail.com Wed Oct 9 15:54:59 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 9 Oct 2019 10:54:59 -0500 Subject: [tc] Weekly update In-Reply-To: References: <5c52a4fa0a39e05151f52f89dbddc8554520bd7f.camel@evrard.me> <5af7b363-4333-6fce-38c2-cf0dc8541d4c@gmail.com> Message-ID: On 10/9/2019 10:39 AM, Jean-Philippe Evrard wrote: > On Wed, 2019-10-09 at 08:28 -0500, Jay Bryant wrote: >> Anyway, I am willing to take a look at it and put together a >> summary. >> What is the time frame the TC is looking for on this? > I think it's an important exercise. > However, I don't think there is a strict timeline. As long as we learn > from it, I would say that we are good. > > Maybe we could discuss the teachings of it during the summit? The next > meeting (tomorrow) seems a little bit ambitious to me... > > Regards, > JP > JP, Awesome.  I will try to skim through what is out there before the meeting and at least have a proposal on how to proceed from tomorrow's meeting to the summit. Thanks! Jay From kendall at openstack.org Wed Oct 9 17:06:46 2019 From: kendall at openstack.org (Kendall Waters) Date: Wed, 9 Oct 2019 12:06:46 -0500 Subject: Important Shanghai PTG Information Message-ID: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Hello Everyone! As I’m sure you already know, the Shanghai PTG is going to be a very different event from PTGs in the past so we wanted to spell out the differences so you can be better prepared. Registration & Badges Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. Since there is a single registration for the event, there is also one badge for both events. You will pick it up when you check in for the Summit and keep it until the end of the PTG. The Space Rooms The space we are contracted to have for the PTG will be laid out differently. 
We only have a couple dedicated rooms which are allocated to those groups with the largest numbers of people. The rest of the teams will be in a single larger room together. To help people gather teams in an organized fashion, we will be naming the arrangements of tables after OpenStack releases (Austin, Bexar, Cactus, etc). Food & Beverage Rules Unfortunately, the venue does not allow ANY food or drink in any of the rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in the beautiful pre-function space outside of the Blue Hall. Moving Furniture You are allowed to! Yay! If the table arrangements your project/team/group lead requested don’t work for you, feel free to move the furniture around. That being said, try to keep the tables marked with their names so that others can find them during their time slots. There will also be extra chairs stacked in the corner if your team needs them. Hours This venue is particularly strict about the hours we are allowed to be there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the evening. Its reasonably likely that if you try to come early or stay late, security will talk to you. So please be kind and respectfully leave if they ask you to. Resources Power While we have been working with the venue to accomodate our power needs, we won’t have as many power strips as we have had in the past. For this reason, we want to remind everyone to charge all their devices every night and share the power strips we do have during the day. Sharing is caring! Flipcharts While we won’t have projection available, we will have some flipcharts around. Each dedicated room will have one flipchart and the big main room will have a few to share. Please feel free to grab one when you need it, but put it back when you are finished so that others can use it if they need. Again, sharing is caring! :) Onboarding A lot of the usual PTG attendees won’t be able to attend this event, but we will also have a lot of new faces. With this in mind, we have decided to add project onboarding to the PTG so that the new contributors can get up to speed with the projects meeting that week. The teams gathering that will be doing onboarding will have that denoted on the print and digital schedule on site. They have also been encouraged to promote when they will be doing their onboarding via the PTGBot and on the mailing lists. If you have any questions, please let us know! Cheers, The Kendalls (wendallkaters & diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucioseki at gmail.com Wed Oct 9 17:49:24 2019 From: lucioseki at gmail.com (Lucio Seki) Date: Wed, 9 Oct 2019 14:49:24 -0300 Subject: Important Shanghai PTG Information In-Reply-To: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Hi Kendall, thanks for the info. > While we won’t have projection available Will be there projection for summit speakers? Lucio On Wed, Oct 9, 2019 at 2:11 PM Kendall Waters wrote: > Hello Everyone! > > As I’m sure you already know, the Shanghai PTG is going to be a very > different event from PTGs in the past so we wanted to spell out the > differences so you can be better prepared. > > Registration & Badges > > Registration for the PTG is included in the cost of the Summit. It is a > single registration for both events. Since there is a single registration > for the event, there is also one badge for both events. 
You will pick it up > when you check in for the Summit and keep it until the end of the PTG. > > The Space > > Rooms > > The space we are contracted to have for the PTG will be laid out > differently. We only have a couple dedicated rooms which are allocated to > those groups with the largest numbers of people. The rest of the teams will > be in a single larger room together. To help people gather teams in an > organized fashion, we will be naming the arrangements of tables after > OpenStack releases (Austin, Bexar, Cactus, etc). > > Food & Beverage Rules > > Unfortunately, the venue does not allow ANY food or drink in any of the > rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in > the beautiful pre-function space outside of the Blue Hall. > > Moving Furniture > > You are allowed to! Yay! If the table arrangements your project/team/group > lead requested don’t work for you, feel free to move the furniture around. > That being said, try to keep the tables marked with their names so that > others can find them during their time slots. There will also be extra > chairs stacked in the corner if your team needs them. > > Hours > > This venue is particularly strict about the hours we are allowed to be > there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the > evening. Its reasonably likely that if you try to come early or stay late, > security will talk to you. So please be kind and respectfully leave if they > ask you to. > > Resources > > Power > > While we have been working with the venue to accomodate our power needs, > we won’t have as many power strips as we have had in the past. For this > reason, we want to remind everyone to charge all their devices every night > and share the power strips we do have during the day. Sharing is caring! > > Flipcharts > > While we won’t have projection available, we will have some flipcharts > around. Each dedicated room will have one flipchart and the big main room > will have a few to share. Please feel free to grab one when you need it, > but put it back when you are finished so that others can use it if they > need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, but > we will also have a lot of new faces. With this in mind, we have decided to > add project onboarding to the PTG so that the new contributors can get up > to speed with the projects meeting that week. The teams gathering that will > be doing onboarding will have that denoted on the print and digital > schedule on site. They have also been encouraged to promote when they will > be doing their onboarding via the PTGBot and on the mailing lists. > > If you have any questions, please let us know! > > Cheers, > The Kendalls > (wendallkaters & diablo_rojo) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Wed Oct 9 18:06:31 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 9 Oct 2019 11:06:31 -0700 Subject: [PTL] PTG Team Photos Message-ID: Hello Everyone! We are excited to see you in a few weeks at the PTG and wanted to share that we will be taking team photos again! Here is an ethercalc signup for the available time slots [1]. We will be providing time on Thursday Morning/Afternoon and Friday morning to come as a team to get your photo taken. Slots are only ten minutes so its *important that everyone be on time*! 
The location is TBD at this point, but it will likely be in the prefunction space near registration. Thanks, -Kendall Nelson (diablo_rojo) [1] https://ethercalc.openstack.org/lnupu1sx6ljl -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Wed Oct 9 19:05:09 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 9 Oct 2019 21:05:09 +0200 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) Message-ID: Hello Tackers! Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] Eduardo (thanks!) did some debugging to discover that you started requiring internal Glance configuration for Tacker to make it use the local filesystem via the filestore backend (internally in Tacker, not via the deployed Glance) [2] This makes us, Koalas, wonder how to approach a proper production deployment of Tacker. Tacker docs have not been updated regarding this new feature and following them may result in broken Tacker deployment (as we have now). We are especially interested in how to deal with multinode Tacker deployment. Do these new paths require any synchronization? [1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 [2] https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 Kind regards, Radek -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Oct 9 22:58:58 2019 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 9 Oct 2019 18:58:58 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: This thread deserves an update: - tripleo-ansible has now a paunch module, calling openstack/paunch as a library. https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/ansible_plugins/modules/paunch.py And is called here for paunch apply: https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/common/deploy-steps-tasks.yaml#L232-L254 In theory, we could deprecate "paunch apply" now as we don't need it anymore. I was working on porting "paunch cleanup" but it's still WIP. - I've been working on a new Ansible role which could totally replace Paunch, called "tripleo-container-manage", which has been enough for me to deploy an Undercloud: https://review.opendev.org/#/c/686196. It's being tested here: https://review.opendev.org/#/c/687651/ and as you can see the undercloud was successfully deployed without Paunch. Note that some container parameters haven't been ported and upgrade untested (this is a prototype). The second approach is a serious prototype I would like to continue further but before I would like some feedback. As for the feedback received in the previous answers, people would like to keep a "print-cmd" like, which makes total sense. I was thinking we could write a proper check mode for the podman_container module, which could output the podman commands that are run by the module. We could also extract the container management tasks to its own playbook so an operator who would usually run: $ paunch debug (...) --action print-cmd replaced by: $ ansible-playbook --check -i inventory.yaml containers.yaml A few benefits of this new role: - leverage ansible modules (we plan to upstream podman_container module) - could be easier to maintain and contribute (python vs ansible) - could potentially be faster. 
I want to investigate usage of async actions/polls in the role. Challenges: - no unit tests like in paunch, will need good testing with Molecule - we need to invest a lot in testing it, Paunch has a lot of edge cases that we carried over the cycles to manage containers. More feedback is very welcome and anyone interested to contribute please let me know. On Tue, Sep 17, 2019 at 5:03 AM Bogdan Dobrelya wrote: > On 16.09.2019 18:07, Emilien Macchi wrote: > > On Mon, Sep 16, 2019 at 11:47 AM Rabi Mishra > > wrote: > > > > I'm not sure if podman as container tool would move in that > > direction, as it's meant to be a command line tool. If we really > > want to reduce the overhead of so many layers in TripleO and podman > > is the container tool for us (I'll ignore the k8s related > > discussions for the time being), I would think the logic of > > translating the JSON configs to podman calls should be be in ansible > > (we can even write a TripleO specific podman module). > > > > > > I think we're both in strong agreement and say "let's convert paunch > > into ansible module". > > I support the idea of calling paunch code as is from an ansible module. > Although I'm strongly opposed against re-implementing the paunch code > itself as ansible modules. That only brings maintenance burden (harder > will be much to backport fixes into Queens and Train) and more place for > potential regressions, without any functional improvements. > > > And make the module robust enough for our needs. Then we could replace > > paunch by calling the podman module directly. > > -- > > Emilien Macchi > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Oct 10 00:53:01 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 00:53:01 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> Message-ID: We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? Message with logs got moderated so logs are here: http://paste.openstack.org/show/782622/ From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. 
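For reference, the daemon-mode change being recommended here comes down to two pieces of configuration on each node running the neutron agents. The sketch below is illustrative only: the option and binary names come from this thread, but the section placement and file locations are assumptions that vary by distribution.

    # neutron.conf (assumed location /etc/neutron/neutron.conf), [agent] section:
    # switch the agents from spawning sudo+rootwrap for every command to the
    # long-running rootwrap daemon
    [agent]
    root_helper = sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
    root_helper_daemon = sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

    # sudoers (assumed location /etc/sudoers.d/neutron): the daemon binary needs
    # its own passwordless entry, otherwise the agent fails with
    # "sudo: no tty present and no askpass program specified" as reported
    # elsewhere in this thread
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf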
r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ali74.ebrahimpour at gmail.com Wed Oct 9 07:49:42 2019 From: ali74.ebrahimpour at gmail.com (Ali Ebrahimpour) Date: Wed, 9 Oct 2019 11:19:42 +0330 Subject: monitoring openstack Message-ID: hi guys i want to install monitoring in my horizon Ui and i'm confused in setting up ceilometer or gnocchi or aodh or monasca in my project because all of them where deprecated. i setup openstack with ansible and i want to monitor the usage of cpu and ram and etc in my dashboard and i also want to know how much resources each customer used for one hour and day. Thanks in advance for your precise guidance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Oct 10 00:20:24 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 00:20:24 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net> Message-ID: We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. 
In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? [1] Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 2. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 3. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 4. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Start request repeated too quickly. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Failed to start Openstack Neutron Linux Bridge Agent. [2] 2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] Logging enabled! 
2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 13.0.4 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'physnet1': 'eno1'} 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {} 2019-10-09 17:05:24.522 5803 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/plugins/ml2/linuxbridge_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpmdyxcD/privsep.sock'] 2019-10-09 17:05:25.071 5803 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2019-10-09 17:05:25.022 5828 INFO oslo.privsep.daemon [-] privsep daemon starting 2019-10-09 17:05:25.025 5828 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/none 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep daemon running as pid 5828 2019-10-09 17:05:25.125 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Agent initialized successfully, now running... 2019-10-09 17:05:25.193 5803 ERROR neutron.agent.linux.utils [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Error starting thread.: Exception: Failed to spawn rootwrap process. 
stderr: sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Traceback (most recent call last): 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 794, in run_service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service service.start() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 86, in start 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.setup_rpc() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 153, in setup_rpc 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.context, self.sg_plugin_rpc, defer_refresh_firewall=True) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 58, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.init_firewall(defer_refresh_firewall, integration_bridge) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 83, in init_firewall 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.firewall = firewall_class() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 88, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service zone_per_port=self.CONNTRACK_ZONE_PER_PORT) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 58, in get_conntrack 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute, namespace, zone_per_port) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 75, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._populate_initial_zone_map() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 182, in _populate_initial_zone_map 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service rules = self.get_rules_for_table_func('raw') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_manager.py", line 477, in get_rules_for_table 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return self.execute(args, run_as_root=True).split('\n') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 122, in execute 
2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute_rootwrap_daemon(cmd, process_input, addl_env)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service LOG.error("Rootwrap error running command: %s", cmd) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.force_reraise() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service six.reraise(self.type_, self.value, self.tb) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return client.execute(cmd, process_input) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 148, in execute 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._ensure_initialized() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 115, in _ensure_initialized 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._initialize() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 85, in _initialize 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service (stderr,)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service stderr: 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.197 5803 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Stopping Linux bridge agent agent. 
2019-10-09 17:05:25.198 5803 CRITICAL neutron [-] Unhandled error: AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron Traceback (most recent call last): 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/bin/neutron-linuxbridge-agent", line 10, in 2019-10-09 17:05:25.198 5803 ERROR neutron sys.exit(main()) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py", line 21, in main 2019-10-09 17:05:25.198 5803 ERROR neutron agent_main.main() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1051, in main 2019-10-09 17:05:25.198 5803 ERROR neutron launcher.wait() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 392, in wait 2019-10-09 17:05:25.198 5803 ERROR neutron status, signo = self._wait_for_exit_or_signal() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 377, in _wait_for_exit_or_signal 2019-10-09 17:05:25.198 5803 ERROR neutron self.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 292, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.services.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 760, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron service.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 117, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.set_rpc_timeout(self.quitting_rpc_timeout) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 476, in set_rpc_timeout 2019-10-09 17:05:25.198 5803 ERROR neutron self.state_rpc): 2019-10-09 17:05:25.198 5803 ERROR neutron AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Thu Oct 10 01:40:17 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Thu, 10 Oct 2019 01:40:17 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? [1] Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 2. 
Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:38 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 3. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Starting Openstack Neutron Linux Bridge Agent... Oct 9 13:48:41 us01odc-qa-ctrl1 systemd[1]: Started Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Main process exited, code=exited, status=1/FAILURE Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Service hold-off time over, scheduling restart. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Scheduled restart job, restart counter is at 4. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Stopped Openstack Neutron Linux Bridge Agent. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Start request repeated too quickly. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: neutron-linuxbridge-agent.service: Failed with result 'exit-code'. Oct 9 13:48:43 us01odc-qa-ctrl1 systemd[1]: Failed to start Openstack Neutron Linux Bridge Agent. [2] 2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] Logging enabled! 
2019-10-09 17:05:24.519 5803 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 13.0.4 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'physnet1': 'eno1'} 2019-10-09 17:05:24.520 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {} 2019-10-09 17:05:24.522 5803 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/plugins/ml2/linuxbridge_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpmdyxcD/privsep.sock'] 2019-10-09 17:05:25.071 5803 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2019-10-09 17:05:25.022 5828 INFO oslo.privsep.daemon [-] privsep daemon starting 2019-10-09 17:05:25.025 5828 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN/none 2019-10-09 17:05:25.027 5828 INFO oslo.privsep.daemon [-] privsep daemon running as pid 5828 2019-10-09 17:05:25.125 5803 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Agent initialized successfully, now running... 2019-10-09 17:05:25.193 5803 ERROR neutron.agent.linux.utils [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service [req-8aaf64a2-8f0d-44ce-888f-09ae3d1acd78 - - - - -] Error starting thread.: Exception: Failed to spawn rootwrap process. 
stderr: sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Traceback (most recent call last): 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 794, in run_service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service service.start() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 86, in start 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.setup_rpc() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service result = f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 153, in setup_rpc 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.context, self.sg_plugin_rpc, defer_refresh_firewall=True) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 58, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.init_firewall(defer_refresh_firewall, integration_bridge) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/securitygroups_rpc.py", line 83, in init_firewall 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.firewall = firewall_class() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_firewall.py", line 88, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service zone_per_port=self.CONNTRACK_ZONE_PER_PORT) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return f(*args, **kwargs) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 58, in get_conntrack 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute, namespace, zone_per_port) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 75, in __init__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._populate_initial_zone_map() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_conntrack.py", line 182, in _populate_initial_zone_map 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service rules = self.get_rules_for_table_func('raw') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/iptables_manager.py", line 477, in get_rules_for_table 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return self.execute(args, run_as_root=True).split('\n') 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 122, in execute 
2019-10-09 17:05:25.194 5803 ERROR oslo_service.service execute_rootwrap_daemon(cmd, process_input, addl_env)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service LOG.error("Rootwrap error running command: %s", cmd) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self.force_reraise() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service six.reraise(self.type_, self.value, self.tb) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service return client.execute(cmd, process_input) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 148, in execute 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._ensure_initialized() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 115, in _ensure_initialized 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service self._initialize() 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service File "/usr/lib/python2.7/dist-packages/oslo_rootwrap/client.py", line 85, in _initialize 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service (stderr,)) 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service Exception: Failed to spawn rootwrap process. 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service stderr: 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service sudo: no tty present and no askpass program specified 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.194 5803 ERROR oslo_service.service 2019-10-09 17:05:25.197 5803 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Stopping Linux bridge agent agent. 
2019-10-09 17:05:25.198 5803 CRITICAL neutron [-] Unhandled error: AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron Traceback (most recent call last): 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/bin/neutron-linuxbridge-agent", line 10, in 2019-10-09 17:05:25.198 5803 ERROR neutron sys.exit(main()) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/cmd/eventlet/plugins/linuxbridge_neutron_agent.py", line 21, in main 2019-10-09 17:05:25.198 5803 ERROR neutron agent_main.main() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 1051, in main 2019-10-09 17:05:25.198 5803 ERROR neutron launcher.wait() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 392, in wait 2019-10-09 17:05:25.198 5803 ERROR neutron status, signo = self._wait_for_exit_or_signal() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 377, in _wait_for_exit_or_signal 2019-10-09 17:05:25.198 5803 ERROR neutron self.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 292, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.services.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 760, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron service.stop() 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 117, in stop 2019-10-09 17:05:25.198 5803 ERROR neutron self.set_rpc_timeout(self.quitting_rpc_timeout) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper 2019-10-09 17:05:25.198 5803 ERROR neutron result = f(*args, **kwargs) 2019-10-09 17:05:25.198 5803 ERROR neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 476, in set_rpc_timeout 2019-10-09 17:05:25.198 5803 ERROR neutron self.state_rpc): 2019-10-09 17:05:25.198 5803 ERROR neutron AttributeError: 'CommonAgentLoop' object has no attribute 'state_rpc' 2019-10-09 17:05:25.198 5803 ERROR neutron From: Chris Apsey Sent: Friday, September 27, 2019 9:34 AM To: Albert Braden Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Albert, Do this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ The problem will go away. I'm of the opinion that daemon mode for rootwrap should be the default since the performance improvement is an order of magnitude, but privsep may obviate that concern once its fully implemented. Either way, that should solve your problem. r Chris Apsey ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, September 27, 2019 12:17 PM, Albert Braden > wrote: When I create 100 VMs in our prod cluster: openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest Most of them build successfully in about a minute. 
5 or 10 will stay in BUILD status for 5 minutes and then fail with “ BuildAbortException: Build of instance aborted: Failed to allocate the network(s), not rescheduling.” If I build smaller numbers, I see less failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure. What config variables should I be looking at? Here are the relevant log entries from the HV: 2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap18f4e419-b1') 2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds More logs and data: http://paste.openstack.org/show/779524/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From anlin.kong at gmail.com Thu Oct 10 06:09:28 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Thu, 10 Oct 2019 19:09:28 +1300 Subject: [trove] Core team change Message-ID: Hi, As the Ussuri dev cycle begins, it's time to make some changes to Trove core team. Unfortunately, except myself, there is no one actively contributing(including coding and reviewing) to Trove according to https://www.stackalytics.com/report/contribution/trove/90. Some of the reasons are probably because there was a significant change in Trove community in the history and the security concerns. However, as a member of a public cloud provider who deployed Trove, I've been trying my best to bring Trove back on track during the recent several dev cycles.
Although it's very hard to make this decision, I am going to remove all the current members from the core team according to https://docs.openstack.org/project-team-guide/open-development.html, but everyone is encouraged to help review changes and join the core team again with enough valuable reviews and contributions. Thanks for all your contributions in the past. - Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Oct 10 07:22:33 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 10 Oct 2019 09:22:33 +0200 Subject: [ptl][release] Last call for RC updates In-Reply-To: <20191009154629.GA26100@sm-workstation> References: <20191009134411.GA9816@sm-workstation> <20191009154629.GA26100@sm-workstation> Message-ID: Hi, > On 9 Oct 2019, at 17:46, Sean McGinnis wrote: > > On Wed, Oct 09, 2019 at 08:44:11AM -0500, Sean McGinnis wrote: >> Hey everyone, >> >> This is just a reminder about tomorrow's deadline for a final RC for Train. >> >> There are several projects that have changes merged since cutting the >> stable/train branch. Not all of these changes need to be included in the >> initial Train coordinated release, but it would be good if there are >> translations and bug fixes merged to get them into a final RC while there's >> still time. >> > > To try to help some teams, I have proposed RC2 releases for those deliverables > that looked like they had relevant things that would be good to pick up for the > final Train release. They can be found under the train-rc2 topic: > > https://review.opendev.org/#/q/topic:train-rc2+(status:open+OR+status:merged) Thx a lot for doing this Sean :) > > Again, not all changes are necessary to be included, so we will only process > these if we get an explicit ack from the PTL or release liaison that the team > actually wants these extra RC releases. > > Feel free to +1 if you would like us to proceed, or -1 if you do not want the > RC or just need a little more time to get anything else merged before > tomorrow's deadline. If the latter, please take over the patch and update with > the new commit hash that should be used for the release. > > Thanks! > Sean > — Slawek Kaplonski Senior software engineer Red Hat From geguileo at redhat.com Thu Oct 10 10:00:50 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Thu, 10 Oct 2019 12:00:50 +0200 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> Message-ID: <20191010100050.hn546tikeihaho7e@localhost> On 04/10, Matt Riedemann wrote: > On 10/4/2019 11:03 AM, Walter Boring wrote: > >   I think if we don't have a host connector passed in and the > > attachment record doesn't have a connector saved, > > then that results in the volume manager not calling the cinder driver to > > terminate_connection and return. > > This also bypasses the driver's remove_export() which is the last chance > > for a driver to unexport a volume. > > Two things: > > 1. Yeah if the existing legacy attachment record doesn't have a connector I > was worried about not properly cleaning on for that old connection, which is > something I mentioned before, but also as mentioned we potentially have that > case when a server is deleted and we can't get to the compute host to get > the host connector, right? > Hi, Not really... 
In that case we still have the BDM info in the DB, so we can just make the 3 Cinder REST API calls ourselves (begin_detaching, terminate_connection and detach) to have the volume unmapped, the export removed, and the volume return to available as usual, without needing to go to the storage array manually. > 2. If I were to use os-terminate_connection, I seem to have a tricky > situation on the migration flow because right now I'm doing: > > a) create new attachment with host connector > b) complete new attachment (put the volume back to in-use status) > - if this fails I attempt to delete the new attachment > c) delete the legacy attachment - I intentionally left this until the end to > make sure (a) and (b) were successful. > > If I change (c) to be os-terminate_connection, will that screw up the > accounting on the attachment created in (a)? > > If I did the terminate_connection first (before creating a new attachment), > could that leave a window of time where the volume is shown as not > attached/in-use? Maybe not since it's not the begin_detaching/os-detach > API...I'm fuzzy on the cinder volume state machine here. > > Or maybe the flow would become: > > a) create new attachment with host connector This is a good idea in itself, but it's not taking into account weird behaviors that some Cinder drivers may have when you call them twice to initialize the connection on the same host. Some drivers end up creating a different mapping for the volume instead of returning the existing one; we've had bugs like this before, and that's why Nova made a change in its live instance migration code to not call intialize_connection on the source host to get the connection_info for detaching. > b) terminate the connection for the legacy attachment > - if this fails, delete the new attachment created in (a) > c) complete the new attachment created in (a) > - if this fails...? > > Without digging into the flow of a cold or live migration I want to say > that's closer to what we do there, e.g. initialize_connection for the new > host, terminate_connection for the old host, complete the new attachment. > I think any workaround we try to find has a good chance of resulting in a good number of bugs. In my opinion our options are: 1- Completely detach and re-attach the volume 2- Write new code in Cinder The new code can be either a new action or we can just add a microversion to attachment create to also accept "connection_info", and when we provide connection_info on the call the method confirms that it's a "migration" (the volume is 'in-use' and doesn't have any attachments) and it doesn't bother to call the cinder-volume to export and map the volume again and simply saves this information in the DB. I know the solution it's not "clean/nice/elegant", and I'd rather go with option 1, but that would be terrible user experience, so I'll settle for a solution that doesn't add much code to Cinder, is simple for Nova, and is less likely to result in bugs. What do you think? Regards, Gorka. PS: In this week's meeting we briefly discussed this topic and agreed to continue the conversation here and retake it on next week's meeting. 
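For reference, the three detach-side calls mentioned above map roughly onto python-cinderclient like the sketch below; everything named in it (endpoint, credentials, volume ID, connector values) is a placeholder rather than something from this thread, and in the real flow the connector would be taken from the connection_info stashed in the BDM:

    from keystoneauth1 import session as ks_session
    from keystoneauth1.identity import v3 as ks_v3
    from cinderclient import client

    # Placeholder auth endpoint and credentials.
    auth = ks_v3.Password(auth_url='http://controller:5000/v3',
                          username='admin', password='secret', project_name='admin',
                          user_domain_id='default', project_domain_id='default')
    cinder = client.Client('3', session=ks_session.Session(auth=auth))

    # Placeholder volume ID and host connector.
    vol = cinder.volumes.get('11111111-2222-3333-4444-555555555555')
    connector = {'host': 'compute-1', 'initiator': 'iqn.1993-08.org.debian:01:deadbeef'}

    cinder.volumes.begin_detaching(vol)                  # volume status -> 'detaching'
    cinder.volumes.terminate_connection(vol, connector)  # backend unmaps / removes the export
    cinder.volumes.detach(vol)                           # volume status -> 'available'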
> -- > > Thanks, > > Matt > From rico.lin.guanyu at gmail.com Thu Oct 10 10:23:45 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Thu, 10 Oct 2019 18:23:45 +0800 Subject: [tc] monthly meeting agenda In-Reply-To: References: <6665a2cba0fc7b3a80312638e82f4a383ac169a7.camel@evrard.me> <1e6f227d2b341b7d7d528d30f4b3c9821e66ffe9.camel@evrard.me> Message-ID: Hi all I just add topic `overall review for TC summit and PTG plans` to agenda since this is the last meeting we have before Summit and we should take some time to confirm it. On Wed, Oct 9, 2019 at 12:57 AM Rico Lin wrote: > I added two more topics in agenda suggestion today which might worth > discuss about. > * define goal select process schedule > * Maintain issue with Telemetery > > On Tue, Oct 8, 2019 at 10:10 PM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > > > > > Thanks! Maybe we could only discuss about what to do for our rejected > > sessions (in https://etherpad.openstack.org/p/PVG-TC-brainstorming )? > > That sounds like a good idea. > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Thu Oct 10 12:19:04 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 10 Oct 2019 14:19:04 +0200 Subject: [tc] Time off for JP! Message-ID: Hello, I will have limited access to internet and emails until the 23rd of October. For all urgent matters, you can contact Rico Lin, our vice-chair. Talk to you all later! Regards, JP From thierry at openstack.org Thu Oct 10 15:40:17 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 10 Oct 2019 17:40:17 +0200 Subject: [ptg] Auto-generated etherpad links ! Message-ID: <86f9ea36-5c38-ef64-aa7c-dd5849143c5d@openstack.org> Hi everyone, The PTGbot grew a new feature over the summer. It now dynamically generates the list of PTG track etherpads. You can find that list at: http://ptg.openstack.org/etherpads.html If you haven't created your etherpad already, just follow the link there to create your etherpad. If you have created your track etherpad already under a different name, you can overload the automatically-generated name using the PTGbot. Just join the #openstack-ptg channel and (as a Freenode authenticated user) send the following command: #TRACKNAME etherpad Example: #keystone etherpad https://etherpad.openstack.org/p/awesome-keystone-pad That will update the link on that page automatically. Hoping to see you in Shanghai! -- Thierry Carrez (ttx) From Albert.Braden at synopsys.com Thu Oct 10 18:04:41 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 18:04:41 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. 
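One quick way to confirm whether the daemon rule is actually in effect on a given node (assuming stock sudo) is to list the neutron user's allowed commands as root:

    # sudo -l -U neutron

If /usr/bin/neutron-rootwrap-daemon does not appear in that output, the agent's attempt to spawn the daemon ends in a password prompt, which with no terminal attached becomes the "sudo: no tty present and no askpass program specified" error seen in the paste.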
From: Erik Olof Gunnar Andersson Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Thu Oct 10 18:07:37 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 10 Oct 2019 13:07:37 -0500 Subject: [nova] No meeting today Message-ID: Hi all. I'm going to be on PTO, it sounds like several other key members will be absent, the US-time meetings have been very sparsely attended lately, and there are a few things on the agenda for which we should really have a quorum, so I'm canceling today's meeting. https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting Thanks, efried . From eandersson at blizzard.com Thu Oct 10 18:08:02 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Thu, 10 Oct 2019 18:08:02 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: Yea - if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. 
You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 10 18:21:31 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 Oct 2019 13:21:31 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <20191010100050.hn546tikeihaho7e@localhost> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> Message-ID: On 10/10/2019 5:00 AM, Gorka Eguileor wrote: >> 1. Yeah if the existing legacy attachment record doesn't have a connector I >> was worried about not properly cleaning on for that old connection, which is >> something I mentioned before, but also as mentioned we potentially have that >> case when a server is deleted and we can't get to the compute host to get >> the host connector, right? >> > Hi, > > Not really... In that case we still have the BDM info in the DB, so we > can just make the 3 Cinder REST API calls ourselves (begin_detaching, > terminate_connection and detach) to have the volume unmapped, the export > removed, and the volume return to available as usual, without needing to > go to the storage array manually. I'm not sure what you mean. Yes we have the BDM in nova but if it's really old it won't have the host connector stashed away in the connection_info dict and we won't be able to pass that to the terminate_connection API: https://github.com/openstack/nova/blob/19.0.0/nova/compute/api.py#L2186 Are you talking about something else? I realize ^ is very edge case since we've been storing the connector in the BDM.connection_info since I think at least Liberty or Mitaka. > > >> 2. If I were to use os-terminate_connection, I seem to have a tricky >> situation on the migration flow because right now I'm doing: >> >> a) create new attachment with host connector >> b) complete new attachment (put the volume back to in-use status) >> - if this fails I attempt to delete the new attachment >> c) delete the legacy attachment - I intentionally left this until the end to >> make sure (a) and (b) were successful. >> >> If I change (c) to be os-terminate_connection, will that screw up the >> accounting on the attachment created in (a)? >> >> If I did the terminate_connection first (before creating a new attachment), >> could that leave a window of time where the volume is shown as not >> attached/in-use? Maybe not since it's not the begin_detaching/os-detach >> API...I'm fuzzy on the cinder volume state machine here. 
>> >> Or maybe the flow would become: >> >> a) create new attachment with host connector > This is a good idea in itself, but it's not taking into account weird > behaviors that some Cinder drivers may have when you call them twice to > initialize the connection on the same host. Some drivers end up > creating a different mapping for the volume instead of returning the > existing one; we've had bugs like this before, and that's why Nova made > a change in its live instance migration code to not call > intialize_connection on the source host to get the connection_info for > detaching. Huh...I thought attachments in cinder were a dime a dozen and you could create/delete them as needed, or that was the idea behind the new v3 attachments stuff. It seems to at least be what I remember John Griffith always saying we should be able to do. Also if you can't refresh the connection info on the same host then a change like this: https://review.opendev.org/#/c/579004/ Which does just that - refreshes the connection info doing reboot and start instance operations - would break on those volume drivers if I'm following you. > > >> b) terminate the connection for the legacy attachment >> - if this fails, delete the new attachment created in (a) >> c) complete the new attachment created in (a) >> - if this fails...? >> >> Without digging into the flow of a cold or live migration I want to say >> that's closer to what we do there, e.g. initialize_connection for the new >> host, terminate_connection for the old host, complete the new attachment. >> > I think any workaround we try to find has a good chance of resulting in > a good number of bugs. > > In my opinion our options are: > > 1- Completely detach and re-attach the volume I'd really like to avoid this if possible because it could screw up running applications and the migration operation itself is threaded out to not hold up the restart of the compute service. But maybe that's also true of what I've got written up today though it's closer to what we do during resize/cold migrate (though those of course involve downtime for the guest). > 2- Write new code in Cinder > > The new code can be either a new action or we can just add a > microversion to attachment create to also accept "connection_info", and > when we provide connection_info on the call the method confirms that > it's a "migration" (the volume is 'in-use' and doesn't have any > attachments) and it doesn't bother to call the cinder-volume to export > and map the volume again and simply saves this information in the DB. If the volume is in-use it would have attachments, so I'm not following you there. Even if the volume is attached the "legacy" way from a nova perspective, using os-initialize_connection, there is a volume attachment record in the cinder DB (I confirmed this in my devstack testing and the notes are in my patch). It's also precisely the problem I'm trying to solve which is without deleting the old legacy attachment, when you delete the server the volume is detached but still shows up in cinder as in-use because of the orphaned attachment. > > I know the solution it's not "clean/nice/elegant", and I'd rather go > with option 1, but that would be terrible user experience, so I'll > settle for a solution that doesn't add much code to Cinder, is simple > for Nova, and is less likely to result in bugs. > > What do you think? > > Regards, > Gorka. 
> > PS: In this week's meeting we briefly discussed this topic and agreed to > continue the conversation here and retake it on next week's meeting. > Thanks for discussing it and getting back to me. -- Thanks, Matt From ildiko.vancsa at gmail.com Thu Oct 10 18:26:38 2019 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Thu, 10 Oct 2019 20:26:38 +0200 Subject: [keystone][edge][k8s] Keystone - StarlingX integration feedback Message-ID: Hi, I wanted to point you to a thread that’s just started on the edge-computing mailing list: http://lists.openstack.org/pipermail/edge-computing/2019-October/000642.html The mail contains information about a use case that StarlingX has to use Keystone integrated with Kubernetes which I believe is valuable information to the Keystone team to see if there are any items to discuss further/fix/implement. Thanks, Ildikó From Albert.Braden at synopsys.com Thu Oct 10 19:13:02 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 19:13:02 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone... for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren't erroring at this time. I changed neutron's shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea - if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. 
Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn't start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: "Exception: Failed to spawn rootwrap process." If I comment out 'root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"' and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Oct 10 19:20:37 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 14:20:37 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> Message-ID: <20191010192037.GA1037@sm-workstation> > > > > > > a) create new attachment with host connector > > This is a good idea in itself, but it's not taking into account weird > > behaviors that some Cinder drivers may have when you call them twice to > > initialize the connection on the same host. Some drivers end up > > creating a different mapping for the volume instead of returning the > > existing one; we've had bugs like this before, and that's why Nova made > > a change in its live instance migration code to not call > > intialize_connection on the source host to get the connection_info for > > detaching. > > Huh...I thought attachments in cinder were a dime a dozen and you could > create/delete them as needed, or that was the idea behind the new v3 > attachments stuff. It seems to at least be what I remember John Griffith > always saying we should be able to do. 
> > Also if you can't refresh the connection info on the same host then a change > like this: > > https://review.opendev.org/#/c/579004/ > > Which does just that - refreshes the connection info doing reboot and start > instance operations - would break on those volume drivers if I'm following > you. > Creating attachements, using the new attachments API, is a pretty low overhead thing. The issue is/was with the way Nova was calling initialize_connection expecting it to be an idempotent operation. I think we've identified most drivers that had an issue with this. It wasn't a documented assumption on the Cinder side, so I remember when we first realized that was what Nova was doing, we found a lot of Cinder backends that had issues with this. With initialize_connection, at least how it was intended, it is telling the backend to create a new connection between the storage and the host. So every time initialize_connection was called, most backends would make configuration changes on the storage backend to expose the volume to the requested host. Depending on how that backend worked, this could result in multiple separate (and different) connection settings for how the host can access the volume. Most drivers are now aware of this (mis?)use of the call and will first check if there is an existing configuration and just return those settings if it's found. There's no real way to test and enforce that easily, so making sure all drivers including newly added ones behave that way has been up to cores remembering to look for it during code reviews. But you can create as many attachment objects in the database as you want. Sean From corey.bryant at canonical.com Thu Oct 10 20:03:58 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Thu, 10 Oct 2019 16:03:58 -0400 Subject: [charms] placement charm In-Reply-To: References: Message-ID: On Wed, Oct 9, 2019 at 2:06 AM Frode Nordahl wrote: > On Fri, Oct 4, 2019 at 3:46 PM Corey Bryant > wrote: > >> Hi All, >> > > Hey Corey, > > Great to see the charm coming along! > > Code is located at: >> https://github.com/coreycb/charm-placement >> https://github.com/coreycb/charm-interface-placement >> >> https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) >> > > 1) Since the interface is new I would love to see it based on the > ``Endpoint`` class instead of the aging ``RelationBase`` class. Also the > interface code needs unit tests. We have multiple examples of interface > implementations with both in place you can get inspiration from [0]. > > Also consider having both a ``connected`` and ``available`` state, the > available state could be set on the first relation-changed event. This > increases the probability of your charm detecting a live charm in the other > end of the relation, both states are also required to use the > ``charms.openstack`` required relation gating code. > > 2) In the reactive handler you do a bespoke import of the charm class > module just to activate the code, this is no longer necessary as there has > been implemented a module that does automatic search and import of the > class for you. Please use that instead. [1] > > > import charms_openstack.bus > import charms_openstack.charm as charm > > charms_openstack.bus.discover() > > > 0: > https://github.com/search?q=org%3Aopenstack+%22from+charms.reactive+import+Endpoint%22&type=Code > 1: > https://github.com/search?q=org%3Aopenstack+charms_openstack.bus&type=Code > > -- > Frode Nordahl > Hey Frode, Thanks very much for the input. 
I have these up in gerrit now with the changes you mentioned so I think we can move further reviews to gerrit: https://review.opendev.org/#/q/topic:charms-train-placement+(status:open+OR+status:merged) Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Oct 10 20:16:46 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 Oct 2019 15:16:46 -0500 Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration In-Reply-To: <20191010192037.GA1037@sm-workstation> References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com> <20191010100050.hn546tikeihaho7e@localhost> <20191010192037.GA1037@sm-workstation> Message-ID: <0d024d78-3f54-e633-bda8-fee57e1c9999@gmail.com> On 10/10/2019 2:20 PM, Sean McGinnis wrote: > Most drivers are now aware of this (mis?)use of the call and will first check > if there is an existing configuration and just return those settings if it's > found. There's no real way to test and enforce that easily, so making sure all > drivers including newly added ones behave that way has been up to cores > remembering to look for it during code reviews. It's unrelated to what I'm trying to solve, but could a cinder tempest plugin test be added which hits the initialize_connection API multiple times without changing host connector and assert the driver is doing the correct thing, whatever that is? Maybe it's just asserting that the connection_info returned from the first call is identical to subsequent calls if the host connector dict input doesn't change? > > But you can create as many attachment objects in the database as you want. But you have to remember to delete them otherwise the volume doesn't leave in-use status even if the volume is detached from the server, so there is attachment counting that needs to happen somewhere (I know cinder does it, but part of that is also on the client side - nova in this case). With the legacy attach flow nova would being_detaching, terminate_connection and then call os-detach and I suppose os-detach could puke if the client hadn't done the attachment cleanup properly, i.e. hadn't called terminate_connection. With the v3 attachments flow we don't have that, we just create attachment, update it with host connector and then complete it. On detach we just delete the attachment and if it's the last one the volume is no longer in-use. I'm not advocating adding another os-detach-like API for the v3 attachment flow, just saying it's an issue if the client isn't aware of all that. -- Thanks, Matt From Albert.Braden at synopsys.com Thu Oct 10 20:45:38 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 10 Oct 2019 20:45:38 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, Message-ID: The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? 
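For what it's worth, the ['iptables-save', '-t', 'raw'] command comes from the agent's own iptables manager (the linuxbridge agent uses iptables for security groups even if nothing else on the host does); the failure in the log is in spawning the rootwrap daemon, not in iptables itself. One way to reproduce the spawn attempt outside of neutron, assuming the same packaged paths, is to run it as the neutron user with sudo in non-interactive mode (as root):

    # sudo -u neutron sudo -n /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

If the sudoers rule matches, the daemon starts and prints its socket path (Ctrl-C to stop it); if it does not, sudo -n fails straight away with "a password is required", which is the same rule mismatch the agent surfaces as "no tty present and no askpass program specified".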
From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. 
In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From najoy at cisco.com Thu Oct 10 23:26:25 2019 From: najoy at cisco.com (Naveen Joy (najoy)) Date: Thu, 10 Oct 2019 23:26:25 +0000 Subject: Networking-vpp 19.08.1 for VPP 19.08.1 is now available Message-ID: Hello All, We'd like to invite you to try out Networking-vpp 19.08.1. As many of you may already know, VPP is a fast user space forwarder based on the DPDK toolkit. VPP uses vector packet processing algorithms to minimize the CPU time spent on each packet to maximize throughput. Networking-vpp is a ML2 mechanism driver that controls VPP on your control and compute hosts to provide fast L2 forwarding under Neutron. This latest version of Networking-vpp is updated to work with VPP 19.08.1 In this release, we've worked on the below updates: - We've added ERSPAN support for Tap-as-a-Service (TaaS). Since ERSPAN provides remote port mirroring, you can now mirror your OpenStack traffic to a destination outside of OpenStack or to a remote OpenStack VM. This is a customized version of OpenStack TaaS. We will be working with the community to push our custom TaaS extensions upstream. In the meantime, you can access our modified TaaS code here[2]. For further info on installation and usage, you can read the TaaS-README[3]. - We've updated the code to be compatible with VPP 19.08 & 19.08.1 API changes. - We've updated the unit test framework to support python3.5 onwards. - We've added security-group support for Trunk subports. We've added support for neutron trunk_details extension. - We've fixed some bugs in our Trunk and L3 plugins that caused a race condition during port binding. - We've migrated our repo from OpenStack to OpenDev. - A recent change in nova caused live migration to fail for instances with NUMA characteristics. We've found that this is a limitation in nova and not VPP/Networking-vpp. It is still possible to use live migration with VPP/Networking-vpp. Please refer to the README[1] for further details. - We've been doing the usual round of bug fixes and updates. The code will work with VPP 19.08.1 and has been updated to keep up with Neutron Rocky and Stein. The README [1] explains how you can try out VPP with Networking-vpp using Devstack: the Devstack plugin will deploy the mechanism driver and VPP and should give you a working system with a minimum of hassle. We will be continuing our development for VPP's 20.01 release. We welcome anyone who would like to come help us. -- Naveen, Ian & Jerome [1] https://opendev.org/x/networking-vpp/src/branch/master/README.rst [2] https://github.com/jbeuque/tap-as-a-service [3] https://opendev.org/x/networking-vpp/src/branch/master/README_taas.txt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sean.mcginnis at gmx.com Thu Oct 10 23:34:42 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 18:34:42 -0500 Subject: [release] Release countdown for week R-0, October 14-18 Message-ID: <20191010233442.GA24173@sm-workstation> Development Focus ----------------- We will be releasing the coordinated OpenStack Train release next week, on Wednesday October 16th. Thanks to everyone involved in the Train cycle! We are now in pre-release freeze, so no new deliverable will be created until final release, unless a release-critical regression is spotted. Otherwise, teams attending the PTG in Shanghai should start to plan what they will be discussing there, by creating and filling team etherpads. You can find the list of etherpads at: http://ptg.openstack.org/etherpads.html General Information ------------------- On release day, the release team will produce final versions of deliverables following the cycle-with-rc release model, by re-tagging the commit used for the last RC. A patch doing just that will be proposed. PTLs and release liaisons should watch for that final release patch from the release team. While not required, we would appreciate having an ack from each team before we approve it on the 16th. Upcoming Deadlines & Dates -------------------------- Final Train release: October 16 Forum+PTG at Shanghai summit: November 4 From sean.mcginnis at gmx.com Thu Oct 10 23:38:58 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Oct 2019 18:38:58 -0500 Subject: [release] Release countdown for week R-0, October 14-18 In-Reply-To: <20191010233442.GA24173@sm-workstation> References: <20191010233442.GA24173@sm-workstation> Message-ID: <20191010233858.GB24173@sm-workstation> > > General Information > ------------------- > > On release day, the release team will produce final versions of deliverables > following the cycle-with-rc release model, by re-tagging the commit used for > the last RC. > > A patch doing just that will be proposed. PTLs and release liaisons should > watch for that final release patch from the release team. While not required, > we would appreciate having an ack from each team before we approve it on the > 16th. > And that patch has been proposed. PTL's, please ack this patch when you have a moment: https://review.opendev.org/#/c/687991/ From kendall at openstack.org Thu Oct 10 23:40:02 2019 From: kendall at openstack.org (Kendall Waters) Date: Thu, 10 Oct 2019 18:40:02 -0500 Subject: Important Shanghai PTG Information In-Reply-To: References: <9FDF61D8-22A5-4CA6-8F5B-BAF8122121BA@openstack.org> Message-ID: Hi Lucio, Great question! Yes, there will be projection in all Summit breakout rooms. Cheers, Kendall Kendall Waters OpenStack Marketing & Events kendall at openstack.org > On Oct 9, 2019, at 12:49 PM, Lucio Seki wrote: > > Hi Kendall, thanks for the info. > > > While we won’t have projection available > Will be there projection for summit speakers? > > Lucio > > On Wed, Oct 9, 2019 at 2:11 PM Kendall Waters > wrote: > Hello Everyone! > > As I’m sure you already know, the Shanghai PTG is going to be a very different event from PTGs in the past so we wanted to spell out the differences so you can be better prepared. > > Registration & Badges > > Registration for the PTG is included in the cost of the Summit. It is a single registration for both events. Since there is a single registration for the event, there is also one badge for both events. 
You will pick it up when you check in for the Summit and keep it until the end of the PTG. > > The Space > > Rooms > > The space we are contracted to have for the PTG will be laid out differently. We only have a couple dedicated rooms which are allocated to those groups with the largest numbers of people. The rest of the teams will be in a single larger room together. To help people gather teams in an organized fashion, we will be naming the arrangements of tables after OpenStack releases (Austin, Bexar, Cactus, etc). > > Food & Beverage Rules > > Unfortunately, the venue does not allow ANY food or drink in any of the rooms. This includes coffee and tea. Lunch will be from 12:30 to 1:30 in the beautiful pre-function space outside of the Blue Hall. > > Moving Furniture > > You are allowed to! Yay! If the table arrangements your project/team/group lead requested don’t work for you, feel free to move the furniture around. That being said, try to keep the tables marked with their names so that others can find them during their time slots. There will also be extra chairs stacked in the corner if your team needs them. > > Hours > > This venue is particularly strict about the hours we are allowed to be there. The PTG is scheduled to run from 9:00 in the morning to 4:30 in the evening. Its reasonably likely that if you try to come early or stay late, security will talk to you. So please be kind and respectfully leave if they ask you to. > > Resources > > Power > > While we have been working with the venue to accomodate our power needs, we won’t have as many power strips as we have had in the past. For this reason, we want to remind everyone to charge all their devices every night and share the power strips we do have during the day. Sharing is caring! > > Flipcharts > > While we won’t have projection available, we will have some flipcharts around. Each dedicated room will have one flipchart and the big main room will have a few to share. Please feel free to grab one when you need it, but put it back when you are finished so that others can use it if they need. Again, sharing is caring! :) > > Onboarding > > A lot of the usual PTG attendees won’t be able to attend this event, but we will also have a lot of new faces. With this in mind, we have decided to add project onboarding to the PTG so that the new contributors can get up to speed with the projects meeting that week. The teams gathering that will be doing onboarding will have that denoted on the print and digital schedule on site. They have also been encouraged to promote when they will be doing their onboarding via the PTGBot and on the mailing lists. > > If you have any questions, please let us know! > > Cheers, > The Kendalls > (wendallkaters & diablo_rojo) > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Fri Oct 11 01:18:30 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:18:30 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , Message-ID: Maybe double check that your rootwrap config is up to date? 
/etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. 
From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eandersson at blizzard.com Fri Oct 11 01:21:26 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:21:26 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. 
I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmendra.kushwaha at india.nec.com Fri Oct 11 07:31:12 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Fri, 11 Oct 2019 07:31:12 +0000 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) In-Reply-To: References: Message-ID: Hi Radosław, Sorry for inconvenience. We added support for vnf package with limited scope [1] in train cycle, and have ongoing activity for U cycle, so we didn't published proper doc for this feature. But yes, we will add doc for current dependent changes. I have just pushed a manual installation doc changes in [2]. We needs vnf_package_csar_path(i.e. /var/lib/tacker/vnfpackages/) path to keep extracted data locally for further actions, and filesystem_store_datadir(i.e. /var/lib/tacker/csar_files) for glance store. In case of multi node deployment, we recommend to configure filesystem_store_datadir option on shared storage to make sure the availability from other nodes. [1]: https://github.com/openstack/tacker/blob/master/releasenotes/notes/bp-tosca-csar-mgmt-driver-6dbf9e847c8fe77a.yaml [2]: https://review.opendev.org/#/c/688045/ Thanks & Regards Dharmendra Kushwaha ________________________________________ From: Radosław Piliszek Sent: Thursday, October 10, 2019 12:35 AM To: openstack-discuss Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) Hello Tackers! Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] Eduardo (thanks!) did some debugging to discover that you started requiring internal Glance configuration for Tacker to make it use the local filesystem via the filestore backend (internally in Tacker, not via the deployed Glance) [2] This makes us, Koalas, wonder how to approach a proper production deployment of Tacker. Tacker docs have not been updated regarding this new feature and following them may result in broken Tacker deployment (as we have now). We are especially interested in how to deal with multinode Tacker deployment. Do these new paths require any synchronization? 
[1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 [2] https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 Kind regards, Radek ________________________________ The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or NECTI or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of NECTI or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. From ionut at fleio.com Fri Oct 11 11:06:27 2019 From: ionut at fleio.com (Ionut Biru) Date: Fri, 11 Oct 2019 14:06:27 +0300 Subject: [nova] rescue instances with volumes Message-ID: Hello guys, How do you guys rescue instances that are booted from volumes or have volumes attached to them? If i use nova rescue on instances booted from volume, api returns that it's not compatible and if i rescue an instance that has a volume attached, the drive is not available in the OS. -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Oct 11 11:53:17 2019 From: eblock at nde.ag (Eugen Block) Date: Fri, 11 Oct 2019 11:53:17 +0000 Subject: [nova] rescue instances with volumes In-Reply-To: Message-ID: <20191011115317.Horde.t_YCohCT_o3lHR8iJ0RKQ3y@webmail.nde.ag> Hi, with nova rescue you only get access to the root disk if it's an ephemeral disk. If the instance is booted from volume you can shutdown the instance, reset the volume state to "available" and attach-status to "detached" with CLI (because you can't actually detach the root volume), then you should be able to attach that volume to a different instance. Other volumes of the affected instance can be detached and re-attached with Horizon or CLI to another instance if you need them all for the rescue mode. But with this workaround you can't use the "nova rescue" command, so you have to revert all those attachments to the original state manually. Regards, Eugen Zitat von Ionut Biru : > Hello guys, > > How do you guys rescue instances that are booted from volumes or have > volumes attached to them? > > If i use nova rescue on instances booted from volume, api returns that it's > not compatible and if i rescue an instance that has a volume attached, the > drive is not available in the OS. > > -- > Ionut Biru - https://fleio.com From satish.txt at gmail.com Fri Oct 11 12:13:25 2019 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 11 Oct 2019 08:13:25 -0400 Subject: monitoring openstack In-Reply-To: References: Message-ID: <9DD5FA05-11A8-4DE7-8C09-F46BB0E3CC32@gmail.com> You only need ceilometer and gnocchi (aodh for alerting) Also look into grafana + gnocchi for beautiful graphing. Sent from my iPhone > On Oct 9, 2019, at 3:49 AM, Ali Ebrahimpour < ali74.ebrahimpour at gmail.com> wrote: > > hi guys > i want to install monitoring in my horizon Ui and i'm confused in setting up ceilometer or gnocchi or aodh or monasca in my project because all of them where deprecated. 
i setup openstack with ansible and i want to monitor the usage of cpu and ram and etc in my dashboard and i also want to know how much resources each customer used for one hour and day. > Thanks in advance for your precise guidance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From angeiv.zhang at gmail.com Fri Oct 11 12:22:55 2019 From: angeiv.zhang at gmail.com (Xing Zhang) Date: Fri, 11 Oct 2019 20:22:55 +0800 Subject: [neutron][security group][IPv6] IPv6 ICMPv6 port security in security group Message-ID: Hi all, When using neutron on CentOS 7 with OVSHybridIptablesFirewallDriver, create a vm with IPv4/IPv6 dual stack port, then remove all security group, we can get response with ping dhcp or router using IPv6 address in vm, while IPv4 can't. IPv6 works different with IPv4 in some cases and some useful function must work with ICMPv6 like NDP, NS, NA. Checking these two links below, neutron only drop IPv6 RA from vm, and allow all ICMPv6 ICMPv6 Type 128 Echo Request and Type 129 Echo Reply are allowed by default. Should we try to restrict ICMPv6 some types or there are some considerations and just follow ITEF 4890? IETF 4890 [section 4.3.2. Traffic That Normally Should Not Be Dropped] mentioned that: As discussed in Section 3.2 , the risks from port scanning in an IPv6 network are much less severe, and it is not necessary to filter IPv6 Echo Request messages. [section 3.2. Probing] However, the very large address space of IPv6 makes probing a less effective weapon as compared with IPv4 provided that addresses are not allocated in an easily guessable fashion. https://github.com/openstack/neutron/commit/a8a9d225d8496c044db7057552394afd6c950a8e https://www.ietf.org/rfc/rfc4890.txt Commands are: neutron port-update --no-security-groups 0307f016-0cc8-468b-bf3e-36ebe50e13ac ping6 from vm to dhcp ip6tables rules in compute node: PS: seems rules for type 131/135/143 are included in the rule # ip6tables-save | grep 08a0812a -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback full rules are at Ref #3 REF #1 ml2_config.ini [securitygroup] firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver Ref #2 Chain neutron-openvswi-o08a0812a-9 (2 references) pkts bytes target prot opt in out source destination 0 0 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 131 /* Allow IPv6 ICMP traffic. */ 1 72 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 135 /* Allow IPv6 ICMP traffic. */ 2 152 RETURN icmpv6 * * :: ff02::/16 ipv6-icmptype 143 /* Allow IPv6 ICMP traffic. */ 5 344 neutron-openvswi-s08a0812a-9 all * * ::/0 ::/0 0 0 DROP icmpv6 * * ::/0 ::/0 ipv6-icmptype 134 /* Drop IPv6 Router Advts from VM Instance. */ 5 344 RETURN icmpv6 * * ::/0 ::/0 /* Allow IPv6 ICMP traffic. 
*/ 0 0 RETURN udp * * ::/0 ::/0 udp spt:546 dpt:547 /* Allow DHCP client traffic. */ 0 0 DROP udp * * ::/0 ::/0 udp spt:547 dpt:546 /* Prevent DHCP Spoofing by VM. */ 0 0 RETURN all * * ::/0 ::/0 state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */ 0 0 DROP all * * ::/0 ::/0 state INVALID /* Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack. */ 0 0 neutron-openvswi-sg-fallback all * * ::/0 ::/0 /* Send unmatched traffic to the fallback chain. */ Ref #3 # ip6tables-save | grep 08a0812a -A neutron-openvswi-PREROUTING -m physdev --physdev-in qvb08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 -A neutron-openvswi-PREROUTING -i qvb08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 -A neutron-openvswi-PREROUTING -m physdev --physdev-in tap08a0812a-9e -m comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 :neutron-openvswi-i08a0812a-9 - [0:0] :neutron-openvswi-o08a0812a-9 - [0:0] :neutron-openvswi-s08a0812a-9 - [0:0] -A neutron-openvswi-FORWARD -m physdev --physdev-out tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-FORWARD -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-INPUT -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM to the security group chain." -j neutron-openvswi-o08a0812a-9 -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 130 -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 136 -j RETURN -A neutron-openvswi-i08a0812a-9 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -j RETURN -A neutron-openvswi-i08a0812a-9 -d 20ff::c/128 -p udp -m udp --sport 547 --dport 546 -j RETURN -A neutron-openvswi-i08a0812a-9 -d fe80::/64 -p udp -m udp --sport 547 --dport 546 -j RETURN -A neutron-openvswi-i08a0812a-9 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-i08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -j neutron-openvswi-s08a0812a-9 -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 -m comment --comment "Drop IPv6 Router Advts from VM Instance." 
-j DROP -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow IPv6 ICMP traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 546 --dport 547 -m comment --comment "Allow DHCP client traffic." -j RETURN -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 547 --dport 546 -m comment --comment "Prevent DHCP Spoofing by VM." -j DROP -A neutron-openvswi-o08a0812a-9 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-o08a0812a-9 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-s08a0812a-9 -s 20ff::c/128 -m mac --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined IP/MAC pairs." -j RETURN -A neutron-openvswi-s08a0812a-9 -s fe80::f816:3eff:fe7c:d8c0/128 -m mac --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined IP/MAC pairs." -j RETURN -A neutron-openvswi-s08a0812a-9 -m comment --comment "Drop traffic without an IP/MAC allow rule." -j DROP -A neutron-openvswi-sg-chain -m physdev --physdev-out tap08a0812a-9e --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-i08a0812a-9 -A neutron-openvswi-sg-chain -m physdev --physdev-in tap08a0812a-9e --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-o08a0812a-9 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sshnaidm at redhat.com Fri Oct 11 13:29:01 2019 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Fri, 11 Oct 2019 16:29:01 +0300 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: > As for the feedback received in the previous answers, people would like to > keep a "print-cmd" like, which makes total sense. > I was thinking we could write a proper check mode for the podman_container > module, which could output the podman commands that are run by the module. > We could also extract the container management tasks to its own playbook > so an operator who would usually run: > $ paunch debug (...) --action print-cmd > > replaced by: > $ ansible-playbook --check -i inventory.yaml containers.yaml > > It's totally doable. Currently the module prints podman commands to syslog exactly as they are executed. We can definitely support check mode with it. > Challenges: > - no unit tests like in paunch, will need good testing with Molecule > The podman ansible modules are written in python, so i think we can still use some unittests in addition to molecule tests. > -- > Emilien Macchi > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Fri Oct 11 14:55:26 2019 From: james.slagle at gmail.com (James Slagle) Date: Fri, 11 Oct 2019 10:55:26 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Wed, Oct 9, 2019 at 7:05 PM Emilien Macchi wrote: > > This thread deserves an update: > > - tripleo-ansible has now a paunch module, calling openstack/paunch as a library. 
> https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansible/ansible_plugins/modules/paunch.py > > And is called here for paunch apply: > https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/common/deploy-steps-tasks.yaml#L232-L254 > > In theory, we could deprecate "paunch apply" now as we don't need it anymore. I was working on porting "paunch cleanup" but it's still WIP. > > - I've been working on a new Ansible role which could totally replace Paunch, called "tripleo-container-manage", which has been enough for me to deploy an Undercloud: https://review.opendev.org/#/c/686196. It's being tested here: https://review.opendev.org/#/c/687651/ and as you can see the undercloud was successfully deployed without Paunch. Note that some container parameters haven't been ported and upgrade untested (this is a prototype). > > The second approach is a serious prototype I would like to continue further but before I would like some feedback. > As for the feedback received in the previous answers, people would like to keep a "print-cmd" like, which makes total sense. > I was thinking we could write a proper check mode for the podman_container module, which could output the podman commands that are run by the module. > We could also extract the container management tasks to its own playbook so an operator who would usually run: > $ paunch debug (...) --action print-cmd > > replaced by: > $ ansible-playbook --check -i inventory.yaml containers.yaml > > A few benefits of this new role: > - leverage ansible modules (we plan to upstream podman_container module) > - could be easier to maintain and contribute (python vs ansible) > - could potentially be faster. I want to investigate usage of async actions/polls in the role. > > Challenges: > - no unit tests like in paunch, will need good testing with Molecule > - we need to invest a lot in testing it, Paunch has a lot of edge cases that we carried over the cycles to manage containers. > > More feedback is very welcome and anyone interested to contribute please let me know. Nice work! I like the approach with the new ansible role. I do think there will be a balance between what makes sense to keep in a python module vs an ansible task. If/then branching logic and conditional tasks based on previous results is of course all possible with ansible tasks, but it starts to become complex and difficult to manage. A higher level language (python) is much better at that. Personally, I prefer to view ansible as just an execution engine and would look to keep the actual application and business logic in proper reusable/testable code modules (python). Finding that right balance is likely something we can figure out in review feedback, ad-hoc discussions, etc. An idea for a future improvement I would like to see as we move in this direction is to switch from reading the container startup configs from a single file per step (/var/lib/tripleo-config/container-startup-config-step_{{ step }}.json), to using a directory per step instead. It would look something like: /var/lib/tripleo-config/container-startup-config/step1 /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json etc. That way each service template can be converted to a proper ansible role in tripleo-ansible that just drops its config into the right directory on the managed node. When the tripleo-container-manage role is then executed, it will operate on those files. 
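As a very rough sketch of how the role could pick those up (the task is illustrative only, none of the names are settled):

  - name: Find container startup configs for this step
    find:
      paths: "/var/lib/tripleo-config/container-startup-config/step{{ step }}"
      patterns: "*.json"
    register: container_startup_configs

and then loop over container_startup_configs.files so each service's config is handled separately.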
This would also make it much more clear what container caused a failure, since we could log the results individually instead of just getting back the union of all logs per step. I think you're patches already address this to some degree since you are looping over the contents of the single file. The other feedback I would offer is perhaps continue to think about keeping the container implementation pluggable in some fashion. Right now you have a tasks/podman.yaml. What might it look like if we wanted to have a tasks/kubernetes.yaml in the future, and how would that be enabled? Thanks -- -- James Slagle -- From emilien at redhat.com Fri Oct 11 15:08:18 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 11 Oct 2019 11:08:18 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Fri, Oct 11, 2019 at 10:55 AM James Slagle wrote: [snip] > Nice work! I like the approach with the new ansible role. > > I do think there will be a balance between what makes sense to keep in > a python module vs an ansible task. If/then branching logic and > conditional tasks based on previous results is of course all possible > with ansible tasks, but it starts to become complex and difficult to > manage. A higher level language (python) is much better at that. > Personally, I prefer to view ansible as just an execution engine and > would look to keep the actual application and business logic in proper > reusable/testable code modules (python). Finding that right balance is > likely something we can figure out in review feedback, ad-hoc > discussions, etc. > Ack & agreed on my side. An idea for a future improvement I would like to see as we move in > this direction is to switch from reading the container startup configs > from a single file per step > (/var/lib/tripleo-config/container-startup-config-step_{{ step > }}.json), to using a directory per step instead. It would look > something like: > > /var/lib/tripleo-config/container-startup-config/step1 > > /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json > > /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json > etc. > > That way each service template can be converted to a proper ansible > role in tripleo-ansible that just drops its config into the right > directory on the managed node. When the tripleo-container-manage role > is then executed, it will operate on those files. This would also make > it much more clear what container caused a failure, since we could log > the results individually instead of just getting back the union of all > logs per step. I think you're patches already address this to some > degree since you are looping over the contents of the single file. > This is an excellent idea. One of the feedback I've got from the Upgrade folks is the need to be able to easily upgrade one service, and the current structure doesn't easily allow it. Your proposal is I think exactly addressing it; and indeed it'll help when migrating container config into their individual roles in tripleo-ansible. I'll add that to the backlog. The other feedback I would offer is perhaps continue to think about > keeping the container implementation pluggable in some fashion. Right > now you have a tasks/podman.yaml. What might it look like if we wanted > to have a tasks/kubernetes.yaml in the future, and how would that be > enabled? > Yes, that's what I had in mind when starting the role. The podman.yaml is for Podman logic. 
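Roughly, I would expect the role's main task file to just dispatch on a variable, something like this (the variable and file names are illustrative, nothing is settled):

  - name: Manage containers with the selected engine
    include_tasks: "{{ tripleo_container_manage_engine | default('podman') }}.yaml"

so supporting another engine is mostly a matter of dropping in a new tasks file.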
We will probably have docker.yaml if we want to support Docker for FFU from Queens to Train. And we can easily add a playbook "kubernetes.yaml" which will read the container config data, generate k8s YAML and then consume it via https://docs.ansible.com/ansible/latest/modules/k8s_module.html . Really there is no limit if we can make it really pluggable. Thanks for the input and the great feedback, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From lshort at redhat.com Fri Oct 11 16:32:35 2019 From: lshort at redhat.com (Luke Short) Date: Fri, 11 Oct 2019 12:32:35 -0400 Subject: [tripleo] Deprecating paunch CLI? In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: Hey folks, +1 to the direction we're going with this. Like Emilien said, the skies the limit when using a flexible automation framework like Ansible. We're definitely going to need Molecule tests for the role and unit/integration tests for the podman_container module itself. I left a comment in the podman_container feature request in Ansible to let the broader community know that we're working towards stabilizing that module. Hopefully that will get more contributors to help us fast track upstreaming it. It doesn't seem like Paunch is really used outside of TripleO so having it in Ansible, which has wider adoption, seems really ideal. As for backports, I think it's fair to say that Paunch for the most part "just works." When it does break it's a pain to fix. Which is even more reason to move away from it. Sincerely, Luke Short, RHCE Software Engineer, OpenStack Deployment Framework Red Hat, Inc. On Fri, Oct 11, 2019 at 11:13 AM Emilien Macchi wrote: > On Fri, Oct 11, 2019 at 10:55 AM James Slagle > wrote: > [snip] > >> Nice work! I like the approach with the new ansible role. >> >> I do think there will be a balance between what makes sense to keep in >> a python module vs an ansible task. If/then branching logic and >> conditional tasks based on previous results is of course all possible >> with ansible tasks, but it starts to become complex and difficult to >> manage. A higher level language (python) is much better at that. >> Personally, I prefer to view ansible as just an execution engine and >> would look to keep the actual application and business logic in proper >> reusable/testable code modules (python). Finding that right balance is >> likely something we can figure out in review feedback, ad-hoc >> discussions, etc. >> > > Ack & agreed on my side. > > An idea for a future improvement I would like to see as we move in >> this direction is to switch from reading the container startup configs >> from a single file per step >> (/var/lib/tripleo-config/container-startup-config-step_{{ step >> }}.json), to using a directory per step instead. It would look >> something like: >> >> /var/lib/tripleo-config/container-startup-config/step1 >> >> /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json >> >> /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json >> etc. >> >> That way each service template can be converted to a proper ansible >> role in tripleo-ansible that just drops its config into the right >> directory on the managed node. When the tripleo-container-manage role >> is then executed, it will operate on those files. 
This would also make >> it much more clear what container caused a failure, since we could log >> the results individually instead of just getting back the union of all >> logs per step. I think you're patches already address this to some >> degree since you are looping over the contents of the single file. >> > > This is an excellent idea. One of the feedback I've got from the Upgrade > folks is the need to be able to easily upgrade one service, and the current > structure doesn't easily allow it. Your proposal is I think exactly > addressing it; and indeed it'll help when migrating container config into > their individual roles in tripleo-ansible. > I'll add that to the backlog. > > The other feedback I would offer is perhaps continue to think about >> keeping the container implementation pluggable in some fashion. Right >> now you have a tasks/podman.yaml. What might it look like if we wanted >> to have a tasks/kubernetes.yaml in the future, and how would that be >> enabled? >> > > Yes, that's what I had in mind when starting the role. The podman.yaml is > for Podman logic. > We will probably have docker.yaml if we want to support Docker for FFU > from Queens to Train. > And we can easily add a playbook "kubernetes.yaml" which will read the > container config data, generate k8s YAML and then consume it via > https://docs.ansible.com/ansible/latest/modules/k8s_module.html . Really > there is no limit if we can make it really pluggable. > > > Thanks for the input and the great feedback, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From info at dantalion.nl Fri Oct 11 06:52:39 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 11 Oct 2019 08:52:39 +0200 Subject: [watcher] Thesis on improving Watcher and collaborating with OpenStack community Message-ID: Hello everyone, I am a Dutch student at the Amsterdam University of Applied Sciences (AUAS) and have recently finished my thesis. My thesis was written on improvements that were made to OpenStack Watcher between February and Juli of 2019. Specifically, many of these improvements were written to aid CERN in deploying Watcher. In addition, the thesis describes methods of collaboration and engaging in communities as well as evaluating strengths and weaknesses of communties. Since the thesis primarily resolves around OpenStack I would like to share it with the community as well. Please find the thesis attached to this email. Any feedback, remarks, future advice or other responses are appreciated. Kind regards, Corne Lukken (Dantali0n) -------------- next part -------------- A non-text attachment was scrubbed... Name: EffectsOpenStackWatcherDeploymentR.pdf Type: application/pdf Size: 1309768 bytes Desc: not available URL: From pkliczew at redhat.com Fri Oct 11 08:22:21 2019 From: pkliczew at redhat.com (Piotr Kliczewski) Date: Fri, 11 Oct 2019 10:22:21 +0200 Subject: [Openstack] FOSDEM 2020 Virtualization & IaaS Devroom CfP Message-ID: We are excited to announce that the call for proposals is now open for the Virtualization & IaaS devroom at the upcoming FOSDEM 2020, to be hosted on February 1st 2020. This year will mark FOSDEM’s 20th anniversary as one of the longest-running free and open source software developer events, attracting thousands of developers and users from all over the world. FOSDEM will be held once again in Brussels, Belgium, on February 1st & 2nd, 2020. 
This devroom is a collaborative effort, and is organized by dedicated folks from projects such as OpenStack, Xen Project, oVirt, QEMU, KVM, and Foreman. We would like to invite all those who are involved in these fields to submit your proposals by December 1st, 2019. About the Devroom The Virtualization & IaaS devroom will feature session topics such as open source hypervisors and virtual machine managers such as Xen Project, KVM, bhyve, and VirtualBox, and Infrastructure-as-a-Service projects such as KubeVirt, Apache CloudStack, OpenStack, oVirt, QEMU and OpenNebula. This devroom will host presentations that focus on topics of shared interest, such as KVM; libvirt; shared storage; virtualized networking; cloud security; clustering and high availability; interfacing with multiple hypervisors; hyperconverged deployments; and scaling across hundreds or thousands of servers. Presentations in this devroom will be aimed at developers working on these platforms who are looking to collaborate and improve shared infrastructure or solve common problems. We seek topics that encourage dialog between projects and continued work post-FOSDEM. Important Dates Submission deadline: 1 December 2019 Acceptance notifications: 10 December 2019 Final schedule announcement: 15th December 2019 Devroom: 1st February 2020 Submit Your Proposal All submissions must be made via the Pentabarf event planning site[1]. If you have not used Pentabarf before, you will need to create an account. If you submitted proposals for FOSDEM in previous years, you can use your existing account. After creating the account, select Create Event to start the submission process. Make sure to select Virtualization and IaaS devroom from the Track list. Please fill out all the required fields, and provide a meaningful abstract and description of your proposed session. Submission Guidelines We expect more proposals than we can possibly accept, so it is vitally important that you submit your proposal on or before the deadline. Late submissions are unlikely to be considered. All presentation slots are 30 minutes, with 20 minutes planned for presentations, and 10 minutes for Q&A. All presentations will be recorded and made available under Creative Commons licenses. In the Submission notes field, please indicate that you agree that your presentation will be licensed under the CC-By-SA-4.0 or CC-By-4.0 license and that you agree to have your presentation recorded. For example: "If my presentation is accepted for FOSDEM, I hereby agree to license all recordings, slides, and other associated materials under the Creative Commons Attribution Share-Alike 4.0 International License. Sincerely, ." In the Submission notes field, please also confirm that if your talk is accepted, you will be able to attend FOSDEM and deliver your presentation. We will not consider proposals from prospective speakers who are unsure whether they will be able to secure funds for travel and lodging to attend FOSDEM. (Sadly, we are not able to offer travel funding for prospective speakers.) Submission Guidelines Mentored presentations will have 25-minute slots, where 20 minutes will include the presentation and 5 minutes will be reserved for questions. The number of newcomer session slots is limited, so we will probably not be able to accept all applications. You must submit your talk and abstract to apply for the mentoring program, our mentors are volunteering their time and will happily provide feedback but won't write your presentation for you! 
If you are experiencing problems with Pentabarf, the proposal submission interface, or have other questions, you can email our devroom mailing list[2] and we will try to help you. How to Apply In addition to agreeing to video recording and confirming that you can attend FOSDEM in case your session is accepted, please write "speaker mentoring program application" in the "Submission notes" field, and list any prior speaking experience or other relevant information for your application. Code of Conduct Following the release of the updated code of conduct for FOSDEM, we'd like to remind all speakers and attendees that all of the presentations and discussions in our devroom are held under the guidelines set in the CoC and we expect attendees, speakers, and volunteers to follow the CoC at all times. If you submit a proposal and it is accepted, you will be required to confirm that you accept the FOSDEM CoC. If you have any questions about the CoC or wish to have one of the devroom organizers review your presentation slides or any other content for CoC compliance, please email us and we will do our best to assist you. Call for Volunteers We are also looking for volunteers to help run the devroom. We need assistance watching time for the speakers, and helping with video for the devroom. Please contact devroom mailing list [2] for more information. Questions? If you have any questions about this devroom, please send your questions to our devroom mailing list. You can also subscribe to the list to receive updates about important dates, session announcements, and to connect with other attendees. See you all at FOSDEM! [1] https://penta.fosdem.org/submission/FOSDEM20 [2] iaas-virt-devroom at lists.fosdem.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Fri Oct 11 01:16:17 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Thu, 10 Oct 2019 21:16:17 -0400 Subject: [kolla] Support for removing previously enabled services.. Message-ID: Hey everyone, I'm pretty sure I know the answer but are there any support within Kolla itself to disable Services that we're previously enabled. For example, I was testing the Skydive Agent/Analyzer combo till I realized that it was using about 90-100% of the CPUs or computes and controllers. [image: image.png] Re-running Kolla with reconfigure but with Service set to "No" didn't remove the containers. I had to remove the containers after the reconfigure finished. This is Kolla 8.0.1 with a Stein install. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 82637 bytes Desc: not available URL: From eandersson at blizzard.com Fri Oct 11 01:19:56 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Oct 2019 01:19:56 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. 
root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We’re not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone… for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren’t erroring at this time. I changed neutron’s shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 11:08 AM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group Yea – if you look at your sudoers its only allowing the old traditional rootwrap, and not the new daemon. You need both. 
Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf Best Regards, Erik Olof Gunnar Andersson From: Albert Braden > Sent: Thursday, October 10, 2019 11:05 AM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group I have the neutron sudoers line under sudoers.d: root at us01odc-qa-ctrl1:/etc/sudoers.d# cat neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * Whatever is causing this didn’t start until I had been running the rootwrap daemon for 2 weeks, and it has not started in our prod cluster. From: Erik Olof Gunnar Andersson > Sent: Wednesday, October 9, 2019 6:40 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group You are probably missing an entry in your sudoers file. You need something like neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Albert Braden > Sent: Wednesday, October 9, 2019 5:20 PM To: Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group We tested this in dev and qa and then implemented in production and it did make a difference, but 2 weeks later we started seeing an issue, first in dev, and then in qa. In syslog we see neutron-linuxbridge-agent.service stopping and starting[1]. In neutron-linuxbridge-agent.log we see a rootwrap error[2]: “Exception: Failed to spawn rootwrap process.” If I comment out ‘root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"’ and restart neutron services then the error goes away. How can I use the root_helper_daemon setting without creating this new error? http://paste.openstack.org/show/782622/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Fri Oct 11 20:44:03 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Oct 2019 13:44:03 -0700 Subject: [keystone][edge][k8s] Keystone - StarlingX integration feedback In-Reply-To: References: Message-ID: On Thu, Oct 10, 2019, at 11:26, Ildiko Vancsa wrote: > Hi, > > I wanted to point you to a thread that’s just started on the > edge-computing mailing list: > http://lists.openstack.org/pipermail/edge-computing/2019-October/000642.html > > The mail contains information about a use case that StarlingX has to > use Keystone integrated with Kubernetes which I believe is valuable > information to the Keystone team to see if there are any items to > discuss further/fix/implement. > > Thanks, > Ildikó > > Thanks for highlighting this, I've responded on the other mailing list. Colleen From Albert.Braden at synopsys.com Fri Oct 11 21:03:40 2019 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 11 Oct 2019 21:03:40 +0000 Subject: Port creation times out for some VMs in large group In-Reply-To: References: <6tdNg1VrWQpIPk394P4Q8nVwh6f16_uXvkzjopF6DGOLUhl5l_4cvvEhJOrq4pjoopsvqjwgUov1p_Xvw22rld-gN1VtE-M2PXA9mWZkM-c=@bitskrieg.net>, , , Message-ID: It appears that the extra * was the issue. After removing it I can run the rootwrap daemon without errors. 
I'm not 100% sure because the issue took 2 weeks to show up after the initial config change, but this seems to have fixed the problem. From: Erik Olof Gunnar Andersson Sent: Thursday, October 10, 2019 6:21 PM To: Albert Braden ; Chris Apsey Cc: openstack-discuss at lists.openstack.org Subject: Re: Port creation times out for some VMs in large group Btw I still think your suders is slightly incorrect. I feel like that is significant, but not a hundred. Drop the star at the end of the last line. root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf ________________________________ From: Erik Olof Gunnar Andersson > Sent: Thursday, October 10, 2019 6:18 PM To: Albert Braden >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: Re: Port creation times out for some VMs in large group Maybe double check that your rootwrap config is up to date? /etc/neutron/rootwrap .conf and /etc/neutron/rootwrap.d (Make sure to pick the appropriate branch in github) https://github.com/openstack/neutron/blob/master/etc/rootwrap.conf https://github.com/openstack/neutron/tree/master/etc/neutron/rootwrap.d ________________________________ From: Albert Braden > Sent: Thursday, October 10, 2019 1:45 PM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org > Subject: RE: Port creation times out for some VMs in large group The errors appear to start with this line: 2019-10-10 13:42:48.261 1211336 ERROR neutron.agent.linux.utils [req-42c530f6-6e08-47c1-8ed4-dcb31c9cd972 - - - - -] Rootwrap error running command: ['iptables-save', '-t', 'raw']: Exception: Failed to spawn rootwrap process. We're not running iptables. Do we need it, to use the rootwrap daemon? From: Albert Braden > Sent: Thursday, October 10, 2019 12:13 PM To: Erik Olof Gunnar Andersson >; Chris Apsey > Cc: openstack-discuss at lists.openstack.org Subject: RE: Port creation times out for some VMs in large group It looks like something is still missing. I added the line to /etc/sudoers.d/neutron_sudoers: root at us01odc-qa-ctrl3:/var/log/neutron# cat /etc/sudoers.d/neutron_sudoers Defaults:neutron !requiretty neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf * neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf * Then I restarted neutron services and the error was gone... for a few minutes, and then it came back on ctrl3. Ctrl1/2 aren't erroring at this time. I changed neutron's shell and tested the daemon command and it seems to work: root at us01odc-qa-ctrl3:~# su - neutron neutron at us01odc-qa-ctrl3:~$ /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf /tmp/rootwrap-5b1QoP/rootwrap.sock Z%▒"▒▒▒Vs▒▒5-▒,a▒▒▒▒G▒▒▒▒v▒▒ But neutron-linuxbridge-agent.log still scrolls errors: http://paste.openstack.org/show/782740/ It appears that there is another factor besides the config, because even when the sudoers line was missing, it would work for hours or days before the error started. It has been working in our prod cluster for about a week now, without the sudoers line. It seems like it should not work that way. What am I missing? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From haleyb.dev at gmail.com Fri Oct 11 21:21:09 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 11 Oct 2019 17:21:09 -0400 Subject: [neutron][security group][IPv6] IPv6 ICMPv6 port security in security group In-Reply-To: References: Message-ID: <4c668285-d214-3099-a9f0-3842bc659639@gmail.com> On 10/11/19 8:22 AM, Xing Zhang wrote: > Hi all, > > When using neutron on CentOS 7 with OVSHybridIptablesFirewallDriver, > create a vm with IPv4/IPv6 dual stack port, > then remove all security group, we can get response with ping dhcp or > router using IPv6 address in vm, while IPv4 can't. > IPv6 works different with IPv4 in some cases and some useful function > must work with ICMPv6 like NDP, NS, NA. > > Checking these two links below, neutron only drop IPv6 RA from vm, and > allow all ICMPv6 > ICMPv6 Type 128 Echo Request and Type 129 Echo Reply are allowed by default. > Should we try to restrict ICMPv6 some types or there are some > considerations and just follow ITEF 4890? The iptables rules you listed below are for egress traffic, and by default the firewall driver only drops things that could allow one instance to interfere with operation of another, for example, sending DHCP replies or IPv6 router advertisements. Only privileged neutron ports (router and dhcp) can do that. I believe the reason we were so permissive on allowing all ICMPv6 out is to not interfere with NS/NA/RS packets by accident, looking back we probably could have written more specific rules here. The OVS firewall driver actually does add more specific rules for outbound NS/NA/RS, and has been the current default for neutron for a couple of cycles. Regarding dropping other outbound IPv6 traffic, I don't think we should filter anything else by default, it would be a not-backwards-compatible change that would cause a lot of confusion. -Brian > IETF 4890 [section 4.3.2. Traffic That Normally Should Not Be Dropped] > mentioned that: > > As discussed in > Section 3.2 , the risks from port scanning in an IPv6 network are much > less severe, and it is not necessary to filter IPv6 Echo Request > messages. > > [section 3.2. Probing] > > However, the very large address space of IPv6 makes probing a less > effective weapon as compared with IPv4 provided that addresses are > not allocated in an easily guessable fashion. > > > https://github.com/openstack/neutron/commit/a8a9d225d8496c044db7057552394afd6c950a8e > > > https://www.ietf.org/rfc/rfc4890.txt > > > > Commands are: > neutron port-update --no-security-groups > 0307f016-0cc8-468b-bf3e-36ebe50e13ac > > ping6 from vm to dhcp > > ip6tables rules in compute node: > PS: seems rules for type 131/135/143 are included in the rule > > # ip6tables-save | grep 08a0812a > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow > IPv6 ICMP traffic." 
-j RETURN > -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > > full rules are at Ref #3 > > > > > REF #1 > ml2_config.ini > [securitygroup] > firewall_driver = > neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver > > Ref #2 > Chain neutron-openvswi-o08a0812a-9 (2 references) >  pkts bytes target     prot opt in     out     source > destination >     0     0 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 131 /* Allow IPv6 ICMP traffic. */ >     1    72 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 135 /* Allow IPv6 ICMP traffic. */ >     2   152 RETURN     icmpv6    *      *       :: > ff02::/16            ipv6-icmptype 143 /* Allow IPv6 ICMP traffic. */ >     5   344 neutron-openvswi-s08a0812a-9  all      *      *       ::/0 >                 ::/0 >     0     0 DROP       icmpv6    *      *       ::/0 > ::/0                 ipv6-icmptype 134 /* Drop IPv6 Router Advts from VM > Instance. */ >     5   344 RETURN     icmpv6    *      *       ::/0 > ::/0                 /* Allow IPv6 ICMP traffic. */ >     0     0 RETURN     udp      *      *       ::/0 > ::/0                 udp spt:546 dpt:547 /* Allow DHCP client traffic. */ >     0     0 DROP       udp      *      *       ::/0 > ::/0                 udp spt:547 dpt:546 /* Prevent DHCP Spoofing by VM. */ >     0     0 RETURN     all      *      *       ::/0 > ::/0                 state RELATED,ESTABLISHED /* Direct packets > associated with a known session to the RETURN chain. */ >     0     0 DROP       all      *      *       ::/0 > ::/0                 state INVALID /* Drop packets that appear related > to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in > conntrack. */ >     0     0 neutron-openvswi-sg-fallback  all      *      *       ::/0 >                 ::/0                 /* Send unmatched traffic to the > fallback chain. */ > > Ref #3 > # ip6tables-save | grep 08a0812a > > -A neutron-openvswi-PREROUTING -m physdev --physdev-in qvb08a0812a-9e -m > comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT > --zone 4104 > -A neutron-openvswi-PREROUTING -i qvb08a0812a-9e -m comment --comment > "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT --zone 4104 > -A neutron-openvswi-PREROUTING -m physdev --physdev-in tap08a0812a-9e -m > comment --comment "Set zone for 812a-9ef7-45e3-9d81-9463dd80e63e" -j CT > --zone 4104 > :neutron-openvswi-i08a0812a-9 - [0:0] > :neutron-openvswi-o08a0812a-9 - [0:0] > :neutron-openvswi-s08a0812a-9 - [0:0] > -A neutron-openvswi-FORWARD -m physdev --physdev-out tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-FORWARD -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-INPUT -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Direct incoming traffic from > VM to the security group chain." 
-j neutron-openvswi-o08a0812a-9 > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 130 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 135 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 136 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-i08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -j RETURN > -A neutron-openvswi-i08a0812a-9 -d 20ff::c/128 -p udp -m udp --sport 547 > --dport 546 -j RETURN > -A neutron-openvswi-i08a0812a-9 -d fe80::/64 -p udp -m udp --sport 547 > --dport 546 -j RETURN > -A neutron-openvswi-i08a0812a-9 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection > (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-i08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 131 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 135 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -s ::/128 -d ff02::/16 -p ipv6-icmp -m > icmp6 --icmpv6-type 143 -m comment --comment "Allow IPv6 ICMP traffic." > -j RETURN > -A neutron-openvswi-o08a0812a-9 -j neutron-openvswi-s08a0812a-9 > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m icmp6 --icmpv6-type 134 > -m comment --comment "Drop IPv6 Router Advts from VM Instance." -j DROP > -A neutron-openvswi-o08a0812a-9 -p ipv6-icmp -m comment --comment "Allow > IPv6 ICMP traffic." -j RETURN > -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 546 --dport 547 -m > comment --comment "Allow DHCP client traffic." -j RETURN > -A neutron-openvswi-o08a0812a-9 -p udp -m udp --sport 547 --dport 546 -m > comment --comment "Prevent DHCP Spoofing by VM." -j DROP > -A neutron-openvswi-o08a0812a-9 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-o08a0812a-9 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection > (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-o08a0812a-9 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-s08a0812a-9 -s 20ff::c/128 -m mac --mac-source > FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from defined > IP/MAC pairs." -j RETURN > -A neutron-openvswi-s08a0812a-9 -s fe80::f816:3eff:fe7c:d8c0/128 -m mac > --mac-source FA:16:3E:7C:D8:C0 -m comment --comment "Allow traffic from > defined IP/MAC pairs." -j RETURN > -A neutron-openvswi-s08a0812a-9 -m comment --comment "Drop traffic > without an IP/MAC allow rule." -j DROP > -A neutron-openvswi-sg-chain -m physdev --physdev-out tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Jump to the VM specific > chain." -j neutron-openvswi-i08a0812a-9 > -A neutron-openvswi-sg-chain -m physdev --physdev-in tap08a0812a-9e > --physdev-is-bridged -m comment --comment "Jump to the VM specific > chain." 
-j neutron-openvswi-o08a0812a-9 From colleen at gazlene.net Sat Oct 12 00:12:39 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Oct 2019 17:12:39 -0700 Subject: [keystone] Keystone Team Update - Week of 7 October 2019 Message-ID: <546861eb-62a6-4c53-9e84-d6b2e285a4e6@www.fastmail.com> # Keystone Team Update - Week of 7 October 2019 ## News ### RC2 We ended up cutting a second RC in order to remove the policy.v3cloudsample.json file[1] and to include placeholder schema migrations[2]. That RC should become the Train release next week[3]. [1] https://review.opendev.org/687639 [2] https://review.opendev.org/687775 [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010074.html ## Office Hours When there are topics to cover, the keystone team holds office hours on Tuesdays at 17:00 UTC. There won't be office hours next week. Add topics you would like to see covered during office hours to the etherpad: https://etherpad.openstack.org/p/keystone-office-hours-topics ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 17 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 32 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Milestone Outlook https://releases.openstack.org/train/schedule.html Next week is the Train release! ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From radoslaw.piliszek at gmail.com Sat Oct 12 07:40:00 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 12 Oct 2019 09:40:00 +0200 Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR packages issues) In-Reply-To: References: Message-ID: Hi Dharmendra, thanks for the insights. We will see what we can do. In the worst case we will leave it to the operator to provide the shared filesystem (by documenting the need). Are you planning to move to using glance-api? It would solve the locality problem. Kind regards, Radek pt., 11 paź 2019 o 09:31 Dharmendra Kushwaha < dharmendra.kushwaha at india.nec.com> napisał(a): > Hi Radosław, > > Sorry for inconvenience. > We added support for vnf package with limited scope [1] in train cycle, > and have ongoing activity for U cycle, so we didn't published proper doc > for this feature. But yes, we will add doc for current dependent changes. I > have just pushed a manual installation doc changes in [2]. > We needs vnf_package_csar_path(i.e. /var/lib/tacker/vnfpackages/) path to > keep extracted data locally for further actions, and > filesystem_store_datadir(i.e. /var/lib/tacker/csar_files) for glance store. > In case of multi node deployment, we recommend to configure > filesystem_store_datadir option on shared storage to make sure the > availability from other nodes. > > [1]: > https://github.com/openstack/tacker/blob/master/releasenotes/notes/bp-tosca-csar-mgmt-driver-6dbf9e847c8fe77a.yaml > [2]: https://review.opendev.org/#/c/688045/ > > Thanks & Regards > Dharmendra Kushwaha > ________________________________________ > From: Radosław Piliszek > Sent: Thursday, October 10, 2019 12:35 AM > To: openstack-discuss > Subject: [kolla][tacker][glance] Deployment of Tacker Train (VNF CSAR > packages issues) > > Hello Tackers! > > Some time ago I reported a bug in Kolla-Ansible Tacker deployment [1] > Eduardo (thanks!) 
did some debugging to discover that you started > requiring internal Glance configuration for Tacker to make it use the local > filesystem via the filestore backend (internally in Tacker, not via the > deployed Glance) [2] > This makes us, Koalas, wonder how to approach a proper production > deployment of Tacker. > Tacker docs have not been updated regarding this new feature and following > them may result in broken Tacker deployment (as we have now). > We are especially interested in how to deal with multinode Tacker > deployment. Do these new paths require any synchronization? > > [1] https://bugs.launchpad.net/kolla-ansible/+bug/1845142 > [2] > https://review.opendev.org/#/c/684275/2/ansible/roles/tacker/templates/tacker.conf.j2 > > Kind regards, > Radek > > ________________________________ > The contents of this e-mail and any attachment(s) are confidential and > intended for the named recipient(s) only. It shall not attach any liability > on the originator or NECTI or its affiliates. Any views or opinions > presented in this email are solely those of the author and may not > necessarily reflect the opinions of NECTI or its affiliates. Any form of > reproduction, dissemination, copying, disclosure, modification, > distribution and / or publication of this message without the prior written > consent of the author of this e-mail is strictly prohibited. If you have > received this email in error please delete it and notify the sender > immediately. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Sat Oct 12 13:27:11 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 12 Oct 2019 15:27:11 +0200 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: Hi Laurent, Unfortunately Kolla Ansible does not provide this functionality at the moment. On the other hand, we would welcome such functionality gladly. It needs some discussion regarding how it would work to suit operators' needs. The interesting part is the real clean-up - e.g. removing leftovers, databases, rabbitmq objects... PS: bacon rlz Kind regards, Radek On Fri, Oct 11, 2019, 19:12 Laurent Dumont wrote: > Hey everyone, > > I'm pretty sure I know the answer but are there any support within Kolla > itself to disable Services that we're previously enabled. > > For example, I was testing the Skydive Agent/Analyzer combo till I > realized that it was using about 90-100% of the CPUs or computes and > controllers. > > Re-running Kolla with reconfigure but with Service set to "No" didn't > remove the containers. I had to remove the containers after the reconfigure > finished. > > This is Kolla 8.0.1 with a Stein install. > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sun Oct 13 01:01:57 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sat, 12 Oct 2019 21:01:57 -0400 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: Hey! Not a problem - it's a big rabbit hole and I get why it's really not easy to implement. It's easy to clean up containers but as you mentioned, all the rest of the bits and pieces is a tough fit. Laurent On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > Hi Laurent, > > Unfortunately Kolla Ansible does not provide this functionality at the > moment. 
> On the other hand, we would welcome such functionality gladly.
> It needs some discussion regarding how it would work to suit operators'
> needs.
> The interesting part is the real clean-up - e.g. removing leftovers,
> databases, rabbitmq objects...
>
> PS: bacon rlz
>
> Kind regards,
> Radek
>
>
> On Fri, Oct 11, 2019, 19:12 Laurent Dumont 
> wrote:
>
>> Hey everyone,
>>
>> I'm pretty sure I know the answer but are there any support within Kolla
>> itself to disable Services that we're previously enabled.
>>
>> For example, I was testing the Skydive Agent/Analyzer combo till I
>> realized that it was using about 90-100% of the CPUs or computes and
>> controllers.
>>
>> Re-running Kolla with reconfigure but with Service set to "No" didn't
>> remove the containers. I had to remove the containers after the reconfigure
>> finished.
>>
>> This is Kolla 8.0.1 with a Stein install.
>>
>> Thanks!
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From pramchan at yahoo.com  Sun Oct 13 01:45:48 2019
From: pramchan at yahoo.com (prakash RAMCHANDRAN)
Date: Sun, 13 Oct 2019 01:45:48 +0000 (UTC)
Subject: [Indian OpenStack User Group] Need Volunteer Mentors for Bangalore Friday Oct 18 10AM-4 PM
References: <2038858173.780724.1570931148265.ref@mail.yahoo.com>
Message-ID: <2038858173.780724.1570931148265@mail.yahoo.com>

Hi all,

We have 150+ students and 3 classrooms & need technical mentors who can
deliver the following content. Information about the training:

OpenStack Docs: OpenStack Upstream Institute Training Content
https://docs.openstack.org/upstream-training/upstream-training-content.html

If you are in Bangalore (India) and your company allows you to fulfill your
Corporate Social Responsibility (CSR), or you are a motivated professional
ready to help engineering students, please respond to the highlighted link
and RSVP so that the event coordinators can reach you to seek your help.

Alternatively, you can contact madhuri.rai07 or ganesh.hiregaoudar or
digambarpat, all AT gmailDOTcom.

Once again, we appreciate all the help from the tech folks in Bangalore,
India, who have been instrumental in supporting OpenStack for the last
decade.

Thanks
Prakash Ramchandran
Event Coordinator
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From smooney at redhat.com  Sun Oct 13 10:22:34 2019
From: smooney at redhat.com (Sean Mooney)
Date: Sun, 13 Oct 2019 11:22:34 +0100
Subject: [kolla] Support for removing previously enabled services..
In-Reply-To: 
References: 
Message-ID: 

On Sat, 2019-10-12 at 21:01 -0400, Laurent Dumont wrote:
> Hey!
>
> Not a problem - it's a big rabbit hole and I get why it's really not easy
> to implement. It's easy to clean up containers but as you mentioned, all
> the rest of the bits and pieces is a tough fit.
i did add a tiny step in that direction years ago for mainly dev use
https://github.com/openstack/kolla-ansible/commit/2ffb35ee5308ece3717263d38163e5fd9b29a3ae
basically the tools/cleanup-containers script takes a regex of the containers to clean up as its
first argument.
e.g. tools/cleanup-containers "neutron|openvswitch"

that is totally a hack but it was so useful for dev.


i believe there is already a request to limit kolla-ansible --destroy by tags

destroy should in theory also clean up dbs in addition to removing the containers, but given
that currently it just removes everything or nothing it is not that useful for an operator
wanting to remove a deployed service.
my hack when used to be a tiny ansible script that copied that tool to all hosts then invoked
it with the relevant regex.

anyway if you just want to do this for dev or on a small number of hosts it might help, but
keep in mind that it will not clean up the configs or dbs and it only works if no vms are running on the host
>
> Laurent
>
> On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek <
> radoslaw.piliszek at gmail.com> wrote:
>
> > Hi Laurent,
> >
> > Unfortunately Kolla Ansible does not provide this functionality at the
> > moment.
> > On the other hand, we would welcome such functionality gladly.
> > It needs some discussion regarding how it would work to suit operators'
> > needs.
> > The interesting part is the real clean-up - e.g. removing leftovers,
> > databases, rabbitmq objects...
> >
> > PS: bacon rlz
> >
> > Kind regards,
> > Radek
> >
> >
> > On Fri, Oct 11, 2019, 19:12 Laurent Dumont 
> > wrote:
> >
> > > Hey everyone,
> > >
> > > I'm pretty sure I know the answer but are there any support within Kolla
> > > itself to disable Services that we're previously enabled.
> > >
> > > For example, I was testing the Skydive Agent/Analyzer combo till I
> > > realized that it was using about 90-100% of the CPUs or computes and
> > > controllers.
> > >
> > > Re-running Kolla with reconfigure but with Service set to "No" didn't
> > > remove the containers. I had to remove the containers after the reconfigure
> > > finished.
> > >
> > > This is Kolla 8.0.1 with a Stein install.
> > >
> > > Thanks!
> > >

From gmann at ghanshyammann.com  Sun Oct 13 15:10:34 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Sun, 13 Oct 2019 10:10:34 -0500
Subject: [goals][IPv6-Only Deployments and Testing] Update
Message-ID: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com>

Hello Everyone,

Below is the update on the IPv6 goal. All the projects have the ipv6 job patch proposed now.
The next step is to review them as per the guidelines mentioned below, or to help in debugging the failures.

As stable/train is already cut for all the projects, we will keep merging the remaining projects
listed below in the Ussuri release. If your project is listed below, check the project patch and
help review it or debug failures.

Summary:

The projects below are waiting for their IPv6 job patch to merge:
If the patch is failing, help me to debug it; otherwise review and merge.

* Barbican
* Tricircle
* Vitrage
* Zaqar
* Glance
* Monasca
* Neutron stadium projects (added a more generic job for all. needs debugging as a few tests are failing- https://review.opendev.org/#/c/686043/)
* Qinling
* Sahara
* Searchlight
* Senlin
* Tacker
* Ec2-Api
* Freezer
* Heat
* Ironic
* Karbor
* kuryr-kubernetes (not yet ready for IPv6. as per IRC chat with dulek, IPv6 support is planned for ussuri cycle - https://review.opendev.org/#/c/682531/)
* Magnum
* Masakari
* Mistral
* Octavia (johnsom is working on this)

Storyboard:
=========
- https://storyboard.openstack.org/#!/story/2005477

IPv6 missing support found:
=====================
1. https://review.opendev.org/#/c/673397/
2. https://review.opendev.org/#/c/673449/
3. https://review.opendev.org/#/c/677524/
There are a few more that need to be tracked.

How you can help:
==============
- Each project needs to look for and review the ipv6 job patch.
- Verify it works fine on IPv6 and that no IPv4 is used in conf etc (a small illustrative check is sketched below).
- Any other specific scenario needs to be added as part of project IPv6 verification.
- Help debug and fix the bugs where the IPv6 job is failing.
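
As an illustration of the IPv6 verification point above, here is a minimal
sketch of the kind of check a project-side post-run script could run. It is
not part of the goal tooling; the host name and port below are placeholders
for whatever endpoint your deployment publishes:

    # ipv6_check.py - illustrative only, host/port are placeholders
    import socket

    def listens_on_ipv6(host, port, timeout=5):
        """Return True if a TCP connection to (host, port) works over IPv6."""
        try:
            infos = socket.getaddrinfo(host, port, socket.AF_INET6,
                                       socket.SOCK_STREAM)
        except socket.gaierror:
            return False  # host resolves to no IPv6 (AAAA) address
        for family, socktype, proto, _canon, sockaddr in infos:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                try:
                    sock.connect(sockaddr)
                    return True  # endpoint accepts connections over IPv6
                except OSError:
                    continue
        return False

    if __name__ == "__main__":
        # e.g. the Keystone endpoint of the deployment under test
        print(listens_on_ipv6("controller.example.org", 5000))

The real verification in the gate job is done by the base
'devstack-tempest-ipv6' scripts and any project-specific post-run playbooks
mentioned below; the snippet only shows the kind of check they perform.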
Everything related to this goal can be found under this topic: Topic: https://review.opendev.org/#/q/topic:ipv6-only-deployment-and-testing+(status:open+OR+status:merged) How to define and run new IPv6 Job on project side: ======================================= - I prepared a wiki page to describe this section - https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing Review suggestion: ============== - Main goal of these jobs will be whether your service is able to listen on IPv6 and can communicate to any other services either OpenStack or DB or rabbitmq etc on IPv6 or not. So check your proposed job with that point of view. If anything missing, comment on patch. - One example was - I missed to configure novnc address to IPv6- https://review.opendev.org/#/c/672493/ - base script as part of 'devstack-tempest-ipv6' will do basic checks for endpoints on IPv6 and some devstack var setting. But if your project needs more specific verification then it can be added in project side job as post-run playbooks as described in wiki page[1]. [1] https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing From anlin.kong at gmail.com Mon Oct 14 04:24:00 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 14 Oct 2019 17:24:00 +1300 Subject: [Trove] [Qinling] PTL on vacation Message-ID: Hi all, I will be away from 15 Oct to 15 Nov. - Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... URL: From josephine.seifert at secustack.com Mon Oct 14 06:21:01 2019 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Mon, 14 Oct 2019 08:21:01 +0200 Subject: [image-encryption] No meeting today Message-ID: <7d1b35dc-7a7e-76cd-c45a-1419cfa74920@secustack.com> Hi, unfortunately neither Markus (mhen) nor me can hold the meeting today. We will have our next meeting next monday. greetings Josephine (Luzi) From skaplons at redhat.com Mon Oct 14 08:11:46 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 10:11:46 +0200 Subject: [goals][IPv6-Only Deployments and Testing] Update In-Reply-To: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com> References: <16dc5abcf69.dfc6953743477.8454873137806209108@ghanshyammann.com> Message-ID: Hi, > On 13 Oct 2019, at 17:10, Ghanshyam Mann wrote: > > Hello Everyone, > > Below is the updated on IPv6 goal. All the projects have the ipv6 job patch proposed now. Next step is to review then as per mentioned guidelines below or help in debugging the failure. > > As stable/train is already cut for all the projects, we will keep merging the remaining projects listed below in Ussuri release. If your project is listed below, check the project patch and help in review/debug failure. > > Summary: > > The projects waiting for IPv6 job patch to merge: > If patch is failing, help me to debug that otherwise review and merge. > > * Barbican > * Tricircle > * Vitrage > * Zaqar > * Glance > * Monasca > * Neutron stadium projects (added a more generic job for all. need debugging as few tests failing- https://review.opendev.org/#/c/686043/) I will investigate and update this patch this week. > * Qinling > * Sahara > * Searchlight > * Senlin > * Tacker > * Ec2-Api > * Freezer > * Heat > * Ironic > * Karbor > * kuryr-kubernetes (not yet ready for IPv6. 
as per IRC chat with dulek, IPv6 support is planned for ussuri cycle - https://review.opendev.org/#/c/682531/) > * Magnum > * Masakari > * Mistral > * Octavia (johnsom is working on this) > > Storyboard: > ========= > - https://storyboard.openstack.org/#!/story/2005477 > > IPv6 missing support found: > ===================== > 1. https://review.opendev.org/#/c/673397/ > 2. https://review.opendev.org/#/c/673449/ > 3. https://review.opendev.org/#/c/677524/ > There are few more but need to be tracked. > > How you can help: > ============== > - Each project needs to look for and review the ipv6 job patch. > - Verify it works fine on ipv6 and no ipv4 used in conf etc > - Any other specific scenario needs to be added as part of project IPv6 verification. > - Help on debugging and fix the bug in IPv6 job is failing. > > Everything related to this goal can be found under this topic: > Topic: https://review.opendev.org/#/q/topic:ipv6-only-deployment-and-testing+(status:open+OR+status:merged) > > How to define and run new IPv6 Job on project side: > ======================================= > - I prepared a wiki page to describe this section - https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing > > Review suggestion: > ============== > - Main goal of these jobs will be whether your service is able to listen on IPv6 and can communicate to any > other services either OpenStack or DB or rabbitmq etc on IPv6 or not. So check your proposed job with > that point of view. If anything missing, comment on patch. > - One example was - I missed to configure novnc address to IPv6- https://review.opendev.org/#/c/672493/ > - base script as part of 'devstack-tempest-ipv6' will do basic checks for endpoints on IPv6 and some devstack var > setting. But if your project needs more specific verification then it can be added in project side job as post-run > playbooks as described in wiki page[1]. > > [1] https://wiki.openstack.org/wiki/Goal-IPv6-only-deployments-and-testing > > — Slawek Kaplonski Senior software engineer Red Hat From mark at stackhpc.com Mon Oct 14 08:51:32 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 14 Oct 2019 09:51:32 +0100 Subject: [kolla] Support for removing previously enabled services.. In-Reply-To: References: Message-ID: On Sun, 13 Oct 2019 at 11:24, Sean Mooney wrote: > > On Sat, 2019-10-12 at 21:01 -0400, Laurent Dumont wrote: > > Hey! > > > > Not a problem - it's a big rabbit hole and I get why it's really not easy > > to implement. It's easy to clean up containers but as you mentioned, all > > the rest of the bits and pieces is a tough fit. True, although simply removing the containers and load balancer configuration would be a good start. > > i did add a tiny step in that direction years ago for mainly dev use > https://github.com/openstack/kolla-ansible/commit/2ffb35ee5308ece3717263d38163e5fd9b29a3ae > basically the tools/cleanup-containers script takes a regex of the continers to clean up as its > first argument. > e.g. tools/cleanup-containers "neutron|openvswitch" > > that is totally a hack but it was so useful for dev. > > > i belive there is already a request to limit kolla-ansible --destroy by tags Yes - https://review.opendev.org/504592. It will need some work to get it merged. > > destroy should in theory be cleanup dbs in addtion to remvoing the contianers but give > that currently it just remvoed everything of nothing it not that useful for operator > wanting to remvoe a deployed service. 
> > my hack when used to be an tiny ansible script that copied that tool to all host then invoked > it the relevent regex. > > anywya if you jsut wasnt to do this for dev or on a small number of hosts it might help but > keep in mind that it will not clean up the configs or dbs and it only works if no vms are runnign on the host > > > > > > Laurent > > > > On Sat, Oct 12, 2019 at 9:27 AM Radosław Piliszek < > > radoslaw.piliszek at gmail.com> wrote: > > > > > Hi Laurent, > > > > > > Unfortunately Kolla Ansible does not provide this functionality at the > > > moment. > > > On the other hand, we would welcome such functionality gladly. > > > It needs some discussion regarding how it would work to suit operators' > > > needs. > > > The interesting part is the real clean-up - e.g. removing leftovers, > > > databases, rabbitmq objects... > > > > > > PS: bacon rlz > > > > > > Kind regards, > > > Radek > > > > > > > > > On Fri, Oct 11, 2019, 19:12 Laurent Dumont > > > wrote: > > > > > > > Hey everyone, > > > > > > > > I'm pretty sure I know the answer but are there any support within Kolla > > > > itself to disable Services that we're previously enabled. > > > > > > > > For example, I was testing the Skydive Agent/Analyzer combo till I > > > > realized that it was using about 90-100% of the CPUs or computes and > > > > controllers. > > > > > > > > Re-running Kolla with reconfigure but with Service set to "No" didn't > > > > remove the containers. I had to remove the containers after the reconfigure > > > > finished. > > > > > > > > This is Kolla 8.0.1 with a Stein install. > > > > > > > > Thanks! > > > > > > From thierry at openstack.org Mon Oct 14 10:14:54 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 14 Oct 2019 12:14:54 +0200 Subject: [tc] Status update on naming our next releases Message-ID: <4cf0c4be-5b22-2565-6449-367670ba577d@openstack.org> Hi, Following last week TC meeting, I took the action to summarize the situation and way forward of naming releases. Naming our U release was a bit of a painful process, mostly due to the subjectivity of the process, combined with a difficult combination of letter vs. naming criteria. It triggered proposals to change the naming process and/or criteria, as we expect similar difficulties with the rest of the alphabet. We did a first round of proposals that covered the V-Z part of the alphabet. A Condorcet poll was run to select the best options. The two best ones were then run in a new Condorcet poll against "keep things the same", and the result was a draw. It was a good indication that the TC membership found none of the proposals on the table was significantly better than keeping things the same, and therefore no change was effected. That said, it does not mean we should stop proposing new models, as the current system is flawed: its subjectivity combined with a popularity contest creates problems, and its criteria (strongly tied to event locations) will not work well in the future as we work to reduce the number of global events we run while increasing the number of local events. The way forward is as follows: proposals can still be made, but they should address "any foreseeable future". That means they need to explain how they will name the V-Z releases, but also how they will roll over past Z. TC members should rollcall-vote +1 on those proposals if they think they are better than keeping things the same. They can rollcall-vote -1 on the proposals for which they think keeping things the same would be better. 
If one proposal gets a majority of votes (seven +1s), then after the usual grace period of 3 calendar days, it should be approved (unless a competing proposals gathers *more* positive votes in the mean time). There is no deadline for proposing, even after one such proposal is approved. Those things can always be changed in the future. However I personally don't think we should change naming systems too often, because they are only fun if they become some sort of tradition. We currently have three proposals up: Cities with 100,000+ inhabitants, TC-only poll: https://review.opendev.org/#/c/677745/ Vancouver, then words present in movie quotes about "release": https://review.opendev.org/#/c/684688/ Vancouver, then cities with 100,000+ inhabitants, community poll: https://review.opendev.org/#/c/687764/ Cheers, -- Thierry Carrez (ttx) From skaplons at redhat.com Mon Oct 14 10:47:28 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 12:47:28 +0200 Subject: [neutron] Bug deputy report - week of October 7th Message-ID: Hi, I was on bug deputy last week. It was really quiet week. Below is summary of new bugs reported: Critical test_update_firewall_calls_get_dvr_hosts_for_router failure on rocky - https://bugs.launchpad.net/neutron/+bug/1847019 - patch is already proposed https://review.opendev.org/#/c/687085/ Medium Loading neutron-lib internationalized file - https://bugs.launchpad.net/neutron/+bug/1847586 - fix proposed already https://review.opendev.org/687861 Low '--sql' option of neutron-db-manage does not work - https://bugs.launchpad.net/neutron/+bug/1847210 Undecided l3-agent stops processing router updates - https://bugs.launchpad.net/neutron/+bug/1847203 - I asked some additional questions there but would be also good if some L3 experts could take a look into that, Incomplete cannot reuse floating IP as port's fixed-ip - https://bugs.launchpad.net/neutron/+bug/1847763 - I think it’s for Calico project but lets wait until reporter clarify that RFEs [RFE] create option in neutron.conf to disable designate+neutron consistency - https://bugs.launchpad.net/neutron/+bug/1847068 Others [RPC] digging RPC timeout for client and server - https://bugs.launchpad.net/oslo.messaging/+bug/1847747 - More oslo_messaging issue IMO, neutron added only as impacted project. — Slawek Kaplonski Senior software engineer Red Hat From rfolco at redhat.com Mon Oct 14 13:23:02 2019 From: rfolco at redhat.com (Rafael Folco) Date: Mon, 14 Oct 2019 10:23:02 -0300 Subject: [tripleo] TripleO CI Summary: Sprint 37 Message-ID: Greetings, The TripleO CI team has just completed Sprint 37 / Unified Sprint 16 (Sep 19 thru Oct 09). The following is a summary of completed work during this sprint cycle: - Started Train release branching prep work and bootstrapped a centos8 nodepool node. - Designed and implemented tests for verifying changes in the promotion server. - Added multi-arch support w/ manifests to container push in the promoter code. - Designed a test strategy for building and running jobs in zuul on ceph-ansible and podman repositories against pull requests on Github. The planned work for the next sprint [1] are: - Complete the manifest implementation with a test strategy for not breaking promotion workflow. - Improve tests for verifying a full promotion workflow running on the staging environment. - Implement CI jobs in zuul to build and run tests against ceph-ansible and podman pull requests in github. - Close-out Train release branching preparation work. 
- Address required changes for building a CentOS8 node for upcoming distro release support across TripleO CI jobs. The Ruck and Rover for this sprint are Rafael Folco (rfolco) and Marios Andreou (marios). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes are being tracked in etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-17 [2] https://etherpad.openstack.org/p/ruckroversprint17 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Oct 14 14:00:56 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 14 Oct 2019 15:00:56 +0100 Subject: [nova][all] Roadmap for dropping Python 2 support Message-ID: <4f66a849e1d26f56ef9272e69f43460a6a6a9614.camel@redhat.com> The time has come. Train is almost out the door and we're already well into Ussuri planning and work. We agreed some time ago that this cycle was the right one to drop support for Python 2.7 [1] and to that effect I've proposed a patch to do just this in nova [2]. However, I have noticed a large number of the third party CIs are failing on this patch. This is because they are still testing with Python 2 and the patch marks nova as only supporting Python 3. As you can see in that patch, the effort to switch things over is not that significant, and I'd ask that any owners of third party CIs prioritise work to switch these things over in the next few weeks leading up to the PTG so we can merge this change as soon as possible. Please reach out on IRC if you have any concerns or questions. Cheers, Stephen [1] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html#python2-deprecation-timeline [2] https://review.opendev.org/#/c/687954/ From mriedemos at gmail.com Mon Oct 14 15:14:55 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 14 Oct 2019 10:14:55 -0500 Subject: [watcher][nova] Thesis on improving Watcher and collaborating with OpenStack community In-Reply-To: References: Message-ID: <400783eb-8ca7-e10f-5481-0940caa53dca@gmail.com> On 10/11/2019 1:52 AM, info at dantalion.nl wrote: > Hello everyone, > > I am a Dutch student at the Amsterdam University of Applied Sciences > (AUAS) and have recently finished my thesis. My thesis was written on > improvements that were made to OpenStack Watcher between February and > Juli of 2019. Specifically, many of these improvements were written to > aid CERN in deploying Watcher. In addition, the thesis describes methods > of collaboration and engaging in communities as well as evaluating > strengths and weaknesses of communties. > > Since the thesis primarily resolves around OpenStack I would like to > share it with the community as well. Please find the thesis attached to > this email. > > Any feedback, remarks, future advice or other responses are appreciated. > > Kind regards, > Corne Lukken (Dantali0n) > Thanks for sharing these results Corne, looks great. I've added the [nova] tag to the subject line to sub-thread this just for awareness to nova developers. Section 8 should be interesting for nova developers to see how simple client-side optimizations can be done to improve performance when working with the compute API. The particularly interesting one to me is the regression fix to not use limit=-1 in novaclient when listing servers but specify a hard limit to avoid extra API calls to list servers. 
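
To make that pattern concrete, here is a rough client-side sketch
(placeholder credentials; assuming python-novaclient's servers.list()
'detailed' and 'limit' arguments; this is not the actual Watcher code):

    # list_servers_sketch.py - placeholder auth values, illustrative only
    from keystoneauth1 import loading, session
    from novaclient import client

    loader = loading.get_plugin_loader('password')
    auth = loader.load_from_options(
        auth_url='http://controller.example.org:5000/v3',  # placeholder
        username='admin', password='secret', project_name='admin',
        user_domain_name='Default', project_domain_name='Default')
    nova = client.Client('2.1', session=session.Session(auth=auth))

    # limit=-1 keeps issuing paged GET /servers calls until the API has
    # nothing more to return, which is the extra round-trip cost noted above.
    # all_servers = nova.servers.list(detailed=False, limit=-1)

    # A hard limit caps the number of round trips when the data model only
    # needs a bounded view of the deployment.
    some_servers = nova.servers.list(detailed=False, limit=1000)
    print(len(some_servers))

The right value for the limit obviously depends on how much of the
deployment the data model needs to cover.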
Personally I enjoyed this little short-term side project working with the Watcher team on improving the performance of Watcher's nova data model builder code. Thanks to the Watcher team for welcoming my contributions. -- Thanks, Matt From openstack at nemebean.com Mon Oct 14 15:30:50 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 14 Oct 2019 10:30:50 -0500 Subject: [oslo] New courtesy ping list for Ussuri In-Reply-To: <473f0fcb-c8c2-e8ae-812e-15575e898d66@nemebean.com> References: <473f0fcb-c8c2-e8ae-812e-15575e898d66@nemebean.com> Message-ID: Final notice! As I mentioned in the meeting today, next week I'll move to using the new ping list. I think most people have already re-upped for next cycle, but if you haven't yet and want to, now is the time. Of course, you can always add yourself to the ping list any time. It is a wiki, after all. :-) On 9/23/19 2:59 PM, Ben Nemec wrote: > As we discussed at the beginning of the cycle, I'll be clearing the > current ping list in the next few weeks. This is to prevent courtesy > pinging people who are no longer active on the project. If you wish to > continue receiving courtesy pings at the start of the Oslo meeting > please add yourself to the new list on the agenda template [0]. Note > that the new list is above the template, called "Courtesy ping list for > Ussuri". If you add yourself again to the end of the existing list I'll > assume you want to be left on though. :-) > > Thanks. > > -Ben > > 0: https://wiki.openstack.org/wiki/Meetings/Oslo#Agenda_Template From openstack at nemebean.com Mon Oct 14 15:52:56 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 14 Oct 2019 10:52:56 -0500 Subject: [stable][oslo] Supporting qemu 4.1.0 on stein and older In-Reply-To: References: <20191007163119.g2bpn22lsooulf6b@yuggoth.org> <1c17ad14272bddd29f46ea9790d128f4ff005099.camel@redhat.com> Message-ID: <6ad1f914-c43e-5ae8-57fc-51d3e000b953@nemebean.com> Okay, circling back to wrap this topic up. It sounds like this is a pretty big win in terms of avoiding random failures either from trying to migrate a VM with nested guests on older qemu or using newer qemu with older OpenStack. Since it's a pretty simple patch and it allows our stable branches to behave more sanely, I'm inclined to go with the backport. If anyone strongly objects, please let me know ASAP before we release it. On 10/7/19 3:36 PM, Ben Nemec wrote: > > > On 10/7/19 3:08 PM, Sean Mooney wrote: >> On Mon, 2019-10-07 at 14:43 -0500, Ben Nemec wrote: >>> >>> On 10/7/19 11:31 AM, Jeremy Stanley wrote: >>>> On 2019-10-07 10:44:04 -0500 (-0500), Ben Nemec wrote: >>>> [...] >>>>> Qemu 4.1.0 did not exist during the Stein cycle, so it's not clear >>>>> to me that backporting bug fixes for it is valid. The original >>>>> author of the patch actually wants it for Rocky >>>> >>>> [...] >>>> >>>> Neither the changes nor the bug report indicate what the motivation >>>> is for supporting newer Qemu with (much) older OpenStack. Is there >>>> some platform which has this Qemu behavior on which folks are trying >>>> to run Rocky? Or is it a homegrown build combining these dependency >>>> versions from disparate time periods? Or maybe some other reason I'm >>>> not imagining? >>>> >>> >>> In addition to the downstream reasons Sean mentioned, Mark (the original >>> author of the patch) responded to my question on the train backport with >>> this: >>> >>> """ >>> Today, I need it in Rocky. But, I'm find to do local patching. >>> >>> Anybody who needs Qemu 4.1.0 likely needs it. 
A key feature in Qemu
>>> 4.1.0 is that this is the first release of Qemu to include proper
>>> support for migration of L1 guests that have L2 guests (nVMX / nested
>>> KVM). So, I expect it is pretty important to whoever realizes this, and
>>> whoever needs this.
>>> """
>>>
>>> So basically a desire to use a feature of the newer qemu with older
>>> openstack, which is why I'm questioning whether this fits our stable
>>> policy. My inclination is to say it's a fairly simple,
>>> backward-compatible patch that will make users' lives easier, but I also
>>> feel like doing a backport to enable a feature, even if the actual patch
>>> is a "bugfix", is violating the spirit of the stable policy.
>> in many distros the older qemus allow migration of the l1 guest even though it is
>> unsafe to do so, and it either works by luck or the vm will corrupt its memory and likely
>> crash.  the context of the qemu issue is that for years people thought that
>> live migration with
>> nested virt worked, then it was disabled upstream and many distros
>> reverted that as it would
>> break their users where they got lucky and it worked, and in 4.1 it
>> was fixed.
>>
>> this does not add or remove any functionality in openstack; nova will
>> try to live migrate if you
>> tell it to regardless of the qemu it has, it just will fail if the
>> live migration check was compiled in.
>>
>>
>> similarly if all your images did not have fractional sizes you could
>> use 4.1.0 with older
>> oslo releases and it would be fine. i.e. you could get lucky and for
>> your specific usecase this
>> might not be needed but it would be nice not to depend on luck.
>>
>> anyway i would expect any distro that chooses to support qemu 4.1.0 to
>> backport this as required.
>> i'm not sure it is problematic to require a late oslo version bump
>> before train ga but i would hope
>> it can be fixed on stable/train
>
> Note that this discussion is separate from the train patch. I agree we
> should do that backport, and actually we already have. That discussion
> was just about timing of the release.
>
> This thread is because the fix was also proposed to stable/stein. It
> merged before I had a chance to start this discussion, and I'm wondering
> if we need to revert it.
>

From rico.lin.guanyu at gmail.com  Mon Oct 14 16:44:29 2019
From: rico.lin.guanyu at gmail.com (Rico Lin)
Date: Tue, 15 Oct 2019 00:44:29 +0800
Subject: [all][tc] What happened in OpenStack Governance recently
Message-ID: 

Hello everyone,

Here are a few things that happened recently:

- *We've got the last volley of new PTLs!* Lucian Petrut for winstackers and
Nicholas Bock for Designate. Congratulations!

*- Rico Lin is now the Vice-chair of the TC.*

- As mentioned in [1], *we will have two `Meet the project leaders` events*
during the Shanghai summit. It would be nice if OpenStack PTLs, SIG Chairs,
TC members, core reviewers, and UC members interested in joining could come.
You can sign up in [2] to let others know you're coming. And if you think
you might be part of this, or you would like to meet any of them - yes! You
should come!

- *We're open for goal ideas*, so add them in [3] if you have any. We are
also looking for V cycle goals as well. And here's some backlog [4] if
you're interested in being a champion for one.

- At this point you should already know that *the newest OpenStack User
Survey results are already out* [8]. So analyze them and make decisions
accordingly (assuming those questions are essential for team decisions).

- *The Shanghai PTG schedule is finalized*.
Please check [7] for more detail on each forum. We hope we can have users, operators, and developers all together to collaborate and make more successful and valuable outcome from each forum. So please join! - Big thanks to our release team, *Train final releases for cycle-with-rc projects is right around the corner* [9]. - Our official meeting happened on October 10, as announced on the ML. We decided the following: - As one of meeting actions, a ML `[tc] Status update on naming our next releases` [6] is out (Thanks to Thierry). If you would like to give your review/feedback before proposals (mentioned in that mail) gets a majority of TC votes, now is the time. As always your review and feedback matter. - Summit and PTG are near, currently, Summit presentations and Forum sessions [7] are released. At this point, teams (which plan to join PTG) should start planning for their PTG topics and formats. So please help with teams to be prepared. For TC PTG, please propose topics before October 17, so we will have two weeks to discuss, finalize, and prepare. - V cycle goal discussion will be more asynchronously, more ML for U and V cycle goal process will be out soon. - Thanks for the effort of swift team, current python 3 support for swift is good. This makes OpenStack more ready for python 3 first. - There will be a forum [5] for large scale SIG, so if you're interested, please join (we hope to have more large scale users join to provide hands and feedbacks) so we can make OpenStack a better place for large scale. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/009947.html [2] https://etherpad.openstack.org/p/meet-the-project-leaders [3] https://etherpad.openstack.org/p/PVG-u-series-goals [4] https://etherpad.openstack.org/p/community-goals [5] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24405/facilitating-running-openstack-at-scale-join-the-large-scale-sig [6] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010106.html [7] https://www.openstack.org/summit/shanghai-2019/summit-schedule#track_groups=90 [8] http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009501.html [9] https://review.opendev.org/#/c/687991 Regards, JP & Rico -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Oct 14 17:24:13 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 14 Oct 2019 12:24:13 -0500 Subject: [all][qa][forum] Etherpad for Users / Operators adoption of QA tools / plugins sessions at Shanghai Summit Message-ID: <16dcb4c89b5.d4afc3b579599.8480309289121917289@ghanshyammann.com> Hello Everyone, I've created the etherpad for the QA feedback sessions[1] which is scheduled on Monday, November 4, 1:20pm-2:00pm. I have added a few basic feedback questions there and If you have any additional items to add or modify, please feel free to do that. In case, you are not able to attend this session, you can still write your feedback with your irc/name contact. 
Etherpad: https://etherpad.openstack.org/p/PVG-forum-qa-ops-user-feedback

[1] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24401/users-operators-adoption-of-qa-tools-plugins

-gmann

From gmann at ghanshyammann.com  Mon Oct 14 17:24:24 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Mon, 14 Oct 2019 12:24:24 -0500
Subject: [qa][ptg] Ussuri PTG Planning for QA
Message-ID: <16dcb4cb379.cb1085b779605.6576175916005057549@ghanshyammann.com>

Hello Everyone,

This is the etherpad[1] to collect the Ussuri cycle PTG topic ideas for QA. Please start adding
the items/topics you want to discuss at the PTG. Even if you are not making it to the PTG physically,
still add the topics which you want us to discuss or give a thought to.

Anyone is welcome to add the cross-project testing topics they want to discuss related to QA.

[1] https://etherpad.openstack.org/p/shanghai-ptg-qa

-gmann

From rico.lin.guanyu at gmail.com  Mon Oct 14 17:27:23 2019
From: rico.lin.guanyu at gmail.com (Rico Lin)
Date: Tue, 15 Oct 2019 01:27:23 +0800
Subject: [tc] Weekly update
Message-ID: 

Hello friends,

Here's what needs attention for the OpenStack TC this week.

1. We had our meeting last Thursday [1], so please work on the actions [2].
And notice that some actions might require TC members to vote or help on
(like in [4]).
2. We have three potential goals for Ussuri [7] now. Please propose more if
you find any suitable goal ideas for the Ussuri (or V) cycle.
3. For the TC PTG, please propose topics before October 17, so we will have
two weeks to discuss, finalize, and prepare.
4. Rico will help with the chair's responsibilities during JP's time off [5].
5. Some recently started mailing list threads with [tc] or [all] tags:
- [all][tc] What happened in OpenStack Governance recently [3]
- [tc] Status update on naming our next releases [4]
- [tc] Time off for JP! [5]
- [nova][all] Roadmap for dropping Python 2 support [6]

Thank you everyone!

[1] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.log.html
[2] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.txt
[3] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010113.html
[4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010106.html
[5] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010061.html
[6] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html
[7] https://etherpad.openstack.org/p/PVG-u-series-goals

Regards,
JP & Rico
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ali74.ebrahimpour at gmail.com  Mon Oct 14 05:49:01 2019
From: ali74.ebrahimpour at gmail.com (Ali Ebrahimpour)
Date: Mon, 14 Oct 2019 09:19:01 +0330
Subject: monitoring
Message-ID: 

hi guys,
I want to install monitoring in my Horizon UI and I'm confused about setting up
Ceilometer or Gnocchi or Aodh or Monasca in my project because all of them were
deprecated.
I set up OpenStack with Ansible and I want to monitor the usage of CPU, RAM,
etc. in my dashboard, and I also want to know how many resources each customer
used per hour and per day.
Thanks in advance for your precise guidance.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From skaplons at redhat.com Mon Oct 14 21:26:17 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 23:26:17 +0200 Subject: [neutron] Team dinner in Shanghai Message-ID: <76CDA132-87D8-4FAD-A993-76D65E879F5E@redhat.com> Hi neutrinos, We are planning to organise some team dinner during PTG in Shanghai. If You are interested to go for such dinner, please write it in etherpad [1] together with days which works the best for You. [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Mon Oct 14 21:46:44 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 14 Oct 2019 23:46:44 +0200 Subject: [neutron][ptg] Ussuri planning etherpad for Neutron Message-ID: <74600817-8AD4-4593-8862-4BCB024C9B57@redhat.com> Hi, At [1] there is Ussuri PTG planning etherpad. Please add Your topics to it, even if You are not planning to be there and You want team to discuss about it. I plan to start preparing agenda during the week of October 28th so would be great if You could add Your topics to it before this date. But of course any “last minute” topics are always welcome :) [1] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning — Slawek Kaplonski Senior software engineer Red Hat From allison at openstack.org Mon Oct 14 21:59:29 2019 From: allison at openstack.org (Allison Price) Date: Mon, 14 Oct 2019 16:59:29 -0500 Subject: Writing a Train blog post? Message-ID: Hi everyone, Are you planning on writing a blog post about OpenStack Train ahead of (or after) the release this Wednesday, October 16? If so, please respond and let me know so we can include it in Train promotional activities, including the Open Infrastructure Community Newsletter. Thanks! Allison Allison Price OpenStack Foundation allison at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Oct 14 22:52:31 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 14 Oct 2019 17:52:31 -0500 Subject: [tc][all] Community-wide goal Ussuri and V cycle forum collaboration idea Message-ID: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> Hello Everyone, During the TC meeting on 10th Oct, we discussed the community-wide goal planning for Ussuri as well as the V cycle[1] but there was no consensus on V cycle planning so I am bringing this to ML for further thoughts. Ussuri cycle goal planning is all good here, That has been already started[2] and we have a dedicated forum session [3] for the same to discuss the goals in more detail. Question is for V cycle goal planning, whether we should discuss the V cycle goal in Ussuri goal fourm sessoin[3] or it is too early to kick off V cycle goal at least until we finalize U cycle goal first. I would like to list the below two options to proceed further (at least to decide if we need to change the existing U cycle goal forum sessions title). 1. Merge the Forum session for both cycle goal discussion (divide both in two half). This need forum session title and description change. 2. Keep forum session for U cycle goal only and start the V cycle over ML asynchronously. This will help to avoid any confusion or mixing the both cycle goal discussions. Thoughts? 
[1] http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-10-10-14.00.log.html#l-211 [2] https://etherpad.openstack.org/p/PVG-u-series-goals [3] https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24398/ussuri-cycle-community-wide-goals-discussion -gmann From pojadhav at redhat.com Tue Oct 15 06:15:20 2019 From: pojadhav at redhat.com (Pooja Jadhav) Date: Tue, 15 Oct 2019 11:45:20 +0530 Subject: Request to update email of Launchpad Account Message-ID: Hi Team, I am Pooja Jadhav. I have a Launchpad account with pooja.jadhav at nttdata.com but now I have left the company so I am not able to access this email any more, Hence I am not entitled to do forget password as well. So I am requesting community to suggest me alternative way to solve this problem through which I can put my Personal Email Id/ Current Corporate Id. Looking forward for your reply. Thanks & Regards, Pooja Jadhav -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinzs2048 at gmail.com Tue Oct 15 08:16:32 2019 From: kevinzs2048 at gmail.com (Shuai Zhao) Date: Tue, 15 Oct 2019 16:16:32 +0800 Subject: [neutron][Kolla] Failed to get DHCP offer packet at qvo/qvb in compute node Message-ID: Hi Neutron, I've deployed Rocky-rc2 version on Debian Buster(compute node), kernel Linux 4.19 Now the issue: The VM running on the Host(Debian Buster) could not get IP when Booting. I use tcpdump to get the packet on tap, qbr, qvb and qvo. *The DHCP broadcast packet could be dumped at tap and qbr, but not at qvo/qvb.* So the DHCP failed. All the firewall policy is neutron automatic generated. The firewall policy is never changed. (neutron-openvswitch-agent)[root@** /]# iptables -S | grep tapba5cd56c-46 -A neutron-openvswi-FORWARD -m physdev --physdev-out tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-FORWARD -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Direct traffic from the VM interface to the security group chain." -j neutron-openvswi-sg-chain -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM to the security group chain." -j neutron-openvswi-oba5cd56c-4 -A neutron-openvswi-sg-chain -m physdev --physdev-out tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-iba5cd56c-4 -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-oba5cd56c-4 (neutron-openvswitch-agent)[root@*** /]#* iptables -S | grep neutron-openvswi-oba5cd56c-4* -N neutron-openvswi-oba5cd56c-4 -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM to the security group chain." -j neutron-openvswi-oba5cd56c-4 -A neutron-openvswi-oba5cd56c-4 -s 0.0.0.0/32 -d 255.255.255.255/32 -p udp -m udp --sport 68 --dport 67 -m comment --comment "Allow DHCP client traffic." -j RETURN -A neutron-openvswi-oba5cd56c-4 -j neutron-openvswi-sba5cd56c-4 -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 68 --dport 67 -m comment --comment "Allow DHCP client traffic." -j RETURN -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 67 --dport 68 -m comment --comment "Prevent DHCP Spoofing by VM." 
-j DROP -A neutron-openvswi-oba5cd56c-4 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-oba5cd56c-4 -j RETURN -A neutron-openvswi-oba5cd56c-4 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-oba5cd56c-4 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-oba5cd56c-4 Pls help to give some advices about that. Thanks a lot! -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinzs2048 at gmail.com Tue Oct 15 08:38:10 2019 From: kevinzs2048 at gmail.com (Shuai Zhao) Date: Tue, 15 Oct 2019 16:38:10 +0800 Subject: [neutron][Kolla] Failed to get DHCP offer packet at qvo/qvb in compute node In-Reply-To: References: Message-ID: Sorry missed ingress rules: (neutron-openvswitch-agent)[root at uk-dc-tx2-01 /]# *iptables -S | grep neutron-openvswi-iba5cd56c-4* -N neutron-openvswi-iba5cd56c-4 -A neutron-openvswi-iba5cd56c-4 -m state --state RELATED,ESTABLISHED -m comment --comment "Direct packets associated with a known session to the RETURN chain." -j RETURN -A neutron-openvswi-iba5cd56c-4 -d 192.168.200.6/32 -p udp -m udp --sport 67 --dport 68 -j RETURN -A neutron-openvswi-iba5cd56c-4 -d 255.255.255.255/32 -p udp -m udp --sport 67 --dport 68 -j RETURN -A neutron-openvswi-iba5cd56c-4 -p tcp -m tcp -m multiport --dports 1:65535 -j RETURN -A neutron-openvswi-iba5cd56c-4 -p icmp -j RETURN -A neutron-openvswi-iba5cd56c-4 -p tcp -m tcp --dport 22 -j RETURN -A neutron-openvswi-iba5cd56c-4 -m set --match-set NIPv40cd3823f-af20-4015-b9f4- src -j RETURN -A neutron-openvswi-iba5cd56c-4 -m state --state INVALID -m comment --comment "Drop packets that appear related to an existing connection (e.g. TCP ACK/FIN) but do not have an entry in conntrack." -j DROP -A neutron-openvswi-iba5cd56c-4 -m comment --comment "Send unmatched traffic to the fallback chain." -j neutron-openvswi-sg-fallback -A neutron-openvswi-sg-chain -m physdev --physdev-out tapba5cd56c-46 --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." -j neutron-openvswi-iba5cd56c-4 And *ml2_conf.ini*: [ml2] type_drivers = flat,vlan,vxlan tenant_network_types = vxlan mechanism_drivers = openvswitch,l2population extension_drivers = port_security [ml2_type_vlan] network_vlan_ranges = [ml2_type_flat] flat_networks = physnet1 [ml2_type_vxlan] vni_ranges = 1:1000 vxlan_group = 239.1.1.1 [securitygroup] firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver [agent] tunnel_types = vxlan l2_population = true arp_responder = true [ovs] datapath_type = system ovsdb_connection = tcp:127.0.0.1:6640 local_ip = 10.22.20.4 On Tue, Oct 15, 2019 at 4:16 PM Shuai Zhao wrote: > Hi Neutron, > I've deployed Rocky-rc2 version on Debian Buster(compute node), kernel > Linux 4.19 > > Now the issue: > The VM running on the Host(Debian Buster) could not get IP when Booting. I > use tcpdump to get the packet on tap, qbr, qvb and qvo. > *The DHCP broadcast packet could be dumped at tap and qbr, but not at > qvo/qvb.* So the DHCP failed. All the firewall policy is neutron > automatic generated. > > The firewall policy is never changed. 
> (neutron-openvswitch-agent)[root@** /]# iptables -S | grep tapba5cd56c-46 > -A neutron-openvswi-FORWARD -m physdev --physdev-out tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-FORWARD -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct traffic from the VM > interface to the security group chain." -j neutron-openvswi-sg-chain > -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM > to the security group chain." -j neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-sg-chain -m physdev --physdev-out tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-iba5cd56c-4 > -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-oba5cd56c-4 > > (neutron-openvswitch-agent)[root@*** /]#* iptables -S | grep > neutron-openvswi-oba5cd56c-4* > -N neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-INPUT -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Direct incoming traffic from VM > to the security group chain." -j neutron-openvswi-oba5cd56c-4 > -A neutron-openvswi-oba5cd56c-4 -s 0.0.0.0/32 -d 255.255.255.255/32 -p > udp -m udp --sport 68 --dport 67 -m comment --comment "Allow DHCP client > traffic." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -j neutron-openvswi-sba5cd56c-4 > -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 68 --dport 67 -m > comment --comment "Allow DHCP client traffic." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -p udp -m udp --sport 67 --dport 68 -m > comment --comment "Prevent DHCP Spoofing by VM." -j DROP > -A neutron-openvswi-oba5cd56c-4 -m state --state RELATED,ESTABLISHED -m > comment --comment "Direct packets associated with a known session to the > RETURN chain." -j RETURN > -A neutron-openvswi-oba5cd56c-4 -j RETURN > -A neutron-openvswi-oba5cd56c-4 -m state --state INVALID -m comment > --comment "Drop packets that appear related to an existing connection (e.g. > TCP ACK/FIN) but do not have an entry in conntrack." -j DROP > -A neutron-openvswi-oba5cd56c-4 -m comment --comment "Send unmatched > traffic to the fallback chain." -j neutron-openvswi-sg-fallback > -A neutron-openvswi-sg-chain -m physdev --physdev-in tapba5cd56c-46 > --physdev-is-bridged -m comment --comment "Jump to the VM specific chain." > -j neutron-openvswi-oba5cd56c-4 > > Pls help to give some advices about that. > Thanks a lot! > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Tue Oct 15 09:13:39 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 15 Oct 2019 11:13:39 +0200 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? Message-ID: (adding TripleO because of potential effect) Hi all, I'm working on making ironic-python-agent images smaller than they currently are. The proposed patches already reduce the default image (as built by IPA-builder) size from around 420 MiB to around 380 MiB. My next idea is to get rid of RPM and YUM databases (in case of a CentOS/RHEL image). 
They amount for nearly 100 MiB of the uncompressed image: $ du -sh var/lib/rpm 91M var/lib/rpm $ du -sh var/lib/yum 6.6M var/lib/yum How important for anyone is the ability to install/inspect packages inside a ramdisk? Dmitry -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Tue Oct 15 10:18:04 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 15 Oct 2019 11:18:04 +0100 Subject: [nova][ptg] Ussuri planning etherpad for nova Message-ID: <8258f512417de0b2cc70740f7aff1b1309bd3ec6.camel@redhat.com> With the PTG only a few weeks away, it's about time we started figuring out what we need to discuss there. The Ussuri PTG planning etherpad can be found at [1]. If you have something you want to discuss at the PTG then please include it there, even if you're not going to be there in person. I'm going to be away from this Friday until the summit but if we could get the bones of this in place this week, it should leave us (well, others in nova) enough time to organize the usual jumble of topics into something approaching an agenda. Stephen [1] https://etherpad.openstack.org/p/nova-shanghai-ptg From sfinucan at redhat.com Tue Oct 15 10:26:06 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 15 Oct 2019 11:26:06 +0100 Subject: [nova][ptg] Team dinner in Shanghai Message-ID: We haven't done one of these in a while (formally, anyway), so I think it would be a good idea to take advantage of the lack of fussy eaters [1] present and organise a team dinner. For anyone that hasn't joined one of these before, it's a good opportunity for people that regularly work on nova to spend some time with their fellow nova contributors IRL (in real life) and for newer contributors to m̶e̶e̶t̶ ̶t̶h̶e̶i̶r̶ ̶h̶e̶r̶o̶e̶s̶ get to know the people reviewing their code. If you are interested, please state so in the Etherpad [2] along with days that work for you and I'll try organize a suitable venue. Stephen [1] Sorry Dan, Jay :P [2] https://etherpad.openstack.org/p/nova-shanghai-ptg From thierry at openstack.org Tue Oct 15 12:41:46 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 15 Oct 2019 14:41:46 +0200 Subject: Request to update email of Launchpad Account In-Reply-To: References: Message-ID: <4ffbb61c-cd8c-4bfb-3ac4-33ab9213d591@openstack.org> Pooja Jadhav wrote: > Hi Team, > > I am Pooja Jadhav. I have a Launchpad account with > pooja.jadhav at nttdata.com but now I > have left the company so I am not able to access this email any more, > Hence I am not entitled to do forget password as well. > > So I am requesting community to suggest me alternative way to solve this > problem through which  I can put my Personal Email Id/ Current Corporate > Id. Hi Pooja, Launchpad is run by Canonical, so unfortunately the OpenStack community can't really help you. You should ask your question on #launchpad on Freenode IRC, the launchpad-users mailing-list[1] or on Launchpad itself[2]. To access those last two you'll likely have to create a new Launchpad account, but they might be able to merge them afterwards. 
[1] https://launchpad.net/~launchpad-users [2] https://answers.launchpad.net/launchpad/+addquestion -- Thierry Carrez (ttx) From hberaud at redhat.com Tue Oct 15 12:48:13 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 15 Oct 2019 14:48:13 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: I proposed some patches through heat templates and puppet-cinder to remove lock files older than 1 week and avoid file system growing. This is a solution based on a cron job, to fix that on stable branches, in a second time I'll help the fasteners project to fix the root cause by reviewing and testing the proposed patch (lock based on file offset). In next versions I hope we will use a patched fasteners and so we could drop the cron based solution. Please can you give /reviews/feedbacks: - https://review.opendev.org/688413 - https://review.opendev.org/688414 - https://review.opendev.org/688415 Thanks Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo a écrit : > On 2019/09/28 1:44, Ben Nemec wrote: > > > > > > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >> Hi Eric, > >> > >> On 2019/09/20 23:10, Eric Harney wrote: > >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >>>> Hi, > >>>> > >>>> I'm using Queens cinder with the following setting. > >>>> > >>>> --------------------------------- > >>>> [coordination] > >>>> backend_url = file://$state_path > >>>> --------------------------------- > >>>> > >>>> As a result, the files like the following were remained under the > state path after some operations.[1] > >>>> > >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >>>> > >>>> In my understanding, these are lock-files created for synchronization > by tooz. > >>>> But, these lock-files were not deleted after finishing operations. > >>>> Is this behaviour correct? > >>>> > >>>> [1] > >>>> e.g. Delete volume, Delete snapshot > >>> > >>> This is a known bug that's described here: > >>> > >>> https://github.com/harlowja/fasteners/issues/26 > >>> > >>> (The fasteners library is used by tooz, which is used by Cinder for > managing these lock files.) > >>> > >>> There's an old Cinder bug for it here: > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >>> > >>> but that's marked as "Won't Fix" because Cinder needs it to be fixed > in the underlying libraries. > >> Thank you for your explanation. > >> I understood the state. > >> > >> But, I have one more question. > >> Can I think this bug doesn't affect synchronization? > > > > It does not. In fact, it's important to not remove lock files while a > service is running or you can end up with synchronization issues. > > > > To clean up the leftover lock files, we generally recommend clearing the > lock_path for each service on reboot before the services have started. > > Thank you for your information. > I think that I understood this issue completely. 
> > Best Regards, > > > >> > >> Best regards, > >> > >>> Thanks, > >>> Eric > >>> > >> > > > > -- > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > Rikimaru Honjo > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgolovat at redhat.com Tue Oct 15 12:55:41 2019 From: sgolovat at redhat.com (Sergii Golovatiuk) Date: Tue, 15 Oct 2019 14:55:41 +0200 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? In-Reply-To: References: Message-ID: Hi, Operator may run "rpm --rebuilddb" in case he needs some packages installed. Alternatively, he may build a new image with rpm/yum databases. вт, 15 окт. 2019 г. в 11:15, Dmitry Tantsur : > (adding TripleO because of potential effect) > > Hi all, > > I'm working on making ironic-python-agent images smaller than they > currently are. The proposed patches already reduce the default image (as > built by IPA-builder) size from around 420 MiB to around 380 MiB. > > My next idea is to get rid of RPM and YUM databases (in case of a > CentOS/RHEL image). They amount for nearly 100 MiB of the uncompressed > image: > $ du -sh var/lib/rpm > 91M var/lib/rpm > $ du -sh var/lib/yum > 6.6M var/lib/yum > > How important for anyone is the ability to install/inspect packages inside > a ramdisk? > > Dmitry > -- Sergii Golovatiuk Senior Software Developer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Tue Oct 15 13:21:39 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 15 Oct 2019 09:21:39 -0400 Subject: [ops] Shanghai Meetup Message-ID: Greetings Operators! We have been allocated a half-day session on Thursday afternoon as part of the PTG. We have room for 50 people and would like to use it as a mini Ops Meetup. For those who haven't attended one before, these are working sessions like the forum or PTG rather than formal presentations. Sessions don't need to be of fixed length for this, especially since it's a fairly short period of time. How much time we spend on a topic will be dictated by the cadence of the discussion and interest of the attendees. Please take a few minutes to add topic suggestions here, and +1 others that you would like to talk about. https://etherpad.openstack.org/p/PVG-OPS-Forum-Brainstorming Thanks, and see you in Shanghai! -Erik -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From denise at openstack.org Tue Oct 15 12:13:20 2019 From: denise at openstack.org (denise at openstack.org) Date: Tue, 15 Oct 2019 06:13:20 -0600 (MDT) Subject: OSF booth at KubeCon+CloudNativeCon in San Diego Message-ID: <1571141600.054118968@apps.rackspace.com> Hello Everyone, We wanted to let you know that the OpenStack Foundation will have a booth at the upcoming KubeCon+CloudNativeCon event in San Diego, CA on November 18-21, 2019. We are in booth #S23 and we will be featuring the OpenStack Foundation in addition to all the projects - Airship, Kata Containers, StarlingX, OpenStack and Zuul. At the booth we will have stickers and educational collateral about each project to distribute. We would like to invite you to help us in the following areas: Represent your project by staffing the OSF booth If you can spare 1 hour/day (or any time at all!) to represent your specific project in the OSF booth Here is the [ link ]( https://docs.google.com/spreadsheets/d/1mZzK0GHm9OQ0IL9njTWLxPLuAeR6jqmyeM60T7sQQOc/edit#gid=0 ) to the google doc to sign up Project Demos in the OSF booth If you are interested in delivering a project-specific demo in the OSF booth, please contact [ Denise ]( http://denise at openstack.org ) Looking forward to seeing all of you in San Diego! Best regards, OSF Marketing team Denise, Claire, Allison and Ashlee -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 14:47:55 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 09:47:55 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: On 10/15/19 7:48 AM, Herve Beraud wrote: > I proposed some patches through heat templates and puppet-cinder to > remove lock files older than 1 week and avoid file system growing. > > This is a solution based on a cron job, to fix that on stable branches, > in a second time I'll help the fasteners project to fix the root cause > by reviewing and testing the proposed patch (lock based on file offset). > In next versions I hope we will use a patched fasteners and so we could > drop the cron based solution. > > Please can you give /reviews/feedbacks: > - https://review.opendev.org/688413 > - https://review.opendev.org/688414 > - https://review.opendev.org/688415 I'm rather hesitant to recommend this. It looks like the change is only removing the -delete lock files, which are a fraction of the total lock files created by Cinder, and I don't particularly want to encourage people to start monkeying with the lock files while a service is running. Even with this limited set of deletions, I think we need a Cinder person to look and verify that we aren't making bad assumptions about how the locks are used. In essence, I don't think this is going to meaningfully reduce the amount of leftover lock files and it sets a bad precedent for how to handle them. Personally, I'd rather see a boot-time service added for each OpenStack service that goes out and wipes the lock file directory before starting the service. On a more general note, I'm going to challenge the assertion that "Customer file system growing slowly and so customer risk to facing some issues to file system usage after a long period." I have yet to hear an actual bug report from the leftover lock files. 
Every time this comes up it's because someone noticed a lot of lock files and thought we were leaking them. I've never heard anyone report an actual functional or performance problem as a result of the lock files. I don't think we should "fix" this until someone reports that it's actually broken. Especially because previous attempts have all resulted in very real bugs that did break people. Maybe we should have oslo.concurrency drop a file named _README (or something else likely to sort first in the file listing) into the configured lock_path that explains why the files are there and the proper way to deal with them. > > Thanks > > > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo > > a écrit : > > On 2019/09/28 1:44, Ben Nemec wrote: > > > > > > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >> Hi Eric, > >> > >> On 2019/09/20 23:10, Eric Harney wrote: > >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >>>> Hi, > >>>> > >>>> I'm using Queens cinder with the following setting. > >>>> > >>>> --------------------------------- > >>>> [coordination] > >>>> backend_url = file://$state_path > >>>> --------------------------------- > >>>> > >>>> As a result, the files like the following were remained under > the state path after some operations.[1] > >>>> > >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >>>> > >>>> In my understanding, these are lock-files created for > synchronization by tooz. > >>>> But, these lock-files were not deleted after finishing operations. > >>>> Is this behaviour correct? > >>>> > >>>> [1] > >>>> e.g. Delete volume, Delete snapshot > >>> > >>> This is a known bug that's described here: > >>> > >>> https://github.com/harlowja/fasteners/issues/26 > >>> > >>> (The fasteners library is used by tooz, which is used by Cinder > for managing these lock files.) > >>> > >>> There's an old Cinder bug for it here: > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >>> > >>> but that's marked as "Won't Fix" because Cinder needs it to be > fixed in the underlying libraries. > >> Thank you for your explanation. > >> I understood the state. > >> > >> But, I have one more question. > >> Can I think this bug doesn't affect synchronization? > > > > It does not. In fact, it's important to not remove lock files > while a service is running or you can end up with synchronization > issues. > > > > To clean up the leftover lock files, we generally recommend > clearing the lock_path for each service on reboot before the > services have started. > > Thank you for your information. > I think that I understood this issue completely. 
> > Best Regards, > > > >> > >> Best regards, > >> > >>> Thanks, > >>> Eric > >>> > >> > > > > -- > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > Rikimaru Honjo > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From mihalis68 at gmail.com Tue Oct 15 14:51:48 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 15 Oct 2019 10:51:48 -0400 Subject: [Ops] ops meetups team meeting 2019-10-15 - minutes Message-ID: We had a brief meeting for the OpenStack Ops Meetups today on IRC, minutes linked below. The preparations for ops events at the Shanghai summit continue. There will be an "ops war stories" session during the forum, and then a mini ops meetup on day 4 (thursday) agenda still tbd, please make suggestions here : https://etherpad.openstack.org/p/PVG-OPS-Forum-Brainstorming Another bit of news is that Bloomberg intends to offer to host the next OpenStack Ops Meetup. If accepted this would be an event in our London headquarters on January 7th and 8th. More news will be shared here as and when available and via the ops notifications twitter account ( https://twitter.com/osopsmeetup) Today's meeting minutes: 10:40 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.html 10:40 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.txt 10:40 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-10-15-14.04.log.html Cheers, Chris - on behalf of the OpenStack Ops Meetups team -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 14:54:50 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 09:54:50 -0500 Subject: [ironic] [tripleo] IPA images without RPM and YUM/DNF? In-Reply-To: References: Message-ID: <8b106a3f-e999-31d8-8497-8d84bd992832@nemebean.com> On 10/15/19 4:13 AM, Dmitry Tantsur wrote: > (adding TripleO because of potential effect) > > Hi all, > > I'm working on making ironic-python-agent images smaller than they > currently are. The proposed patches already reduce the default image (as > built by IPA-builder) size from around 420 MiB to around 380 MiB. > > My next idea is to get rid of RPM and YUM databases (in case of a > CentOS/RHEL image). They amount for nearly 100 MiB of the uncompressed > image: > $ du -sh var/lib/rpm > 91M var/lib/rpm > $ du -sh var/lib/yum > 6.6M var/lib/yum > > How important for anyone is the ability to install/inspect packages > inside a ramdisk? 
Back when we were building the ramdisks with DIB I'm pretty sure we were wiping the RPM db too, so unless something has changed since then I would expect this to be fine. > > Dmitry From mriedemos at gmail.com Tue Oct 15 15:24:58 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 15 Oct 2019 10:24:58 -0500 Subject: [tc][all] Community-wide goal Ussuri and V cycle forum collaboration idea In-Reply-To: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> References: <16dcc79196d.b7dfa21684317.2121277505699030183@ghanshyammann.com> Message-ID: <05fa700e-dba6-36ce-cf42-c7023f2515c9@gmail.com> On 10/14/2019 5:52 PM, Ghanshyam Mann wrote: > Question is for V cycle goal planning, whether we should discuss the V cycle goal in Ussuri goal fourm sessoin[3] or > it is too early to kick off V cycle goal at least until we finalize U cycle goal first. I would like to list the below two > options to proceed further (at least to decide if we need to change the existing U cycle goal forum sessions title). > > 1. Merge the Forum session for both cycle goal discussion (divide both in two half). This need forum session title and description change. > 2. Keep forum session for U cycle goal only and start the V cycle over ML asynchronously. This will help to avoid any confusion or mixing the both cycle goal discussions. So you have 40 minutes to discuss something that is notoriously hard to sort out for one release let alone the future, and to date there are only 3 goals proposed for Ussuri. Why even consider goals for V at this point when settling on goals for Train was kind of a (train)wreck (get it?!) and goal champions for Ussuri aren't necessarily champing at the bit? I won't be there so I don't have a horse in this race (yay more idioms), just commenting from the peanut gallery. -- Thanks, Matt From hberaud at redhat.com Tue Oct 15 15:41:19 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 15 Oct 2019 17:41:19 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: Le mar. 15 oct. 2019 à 16:48, Ben Nemec a écrit : > > > On 10/15/19 7:48 AM, Herve Beraud wrote: > > I proposed some patches through heat templates and puppet-cinder to > > remove lock files older than 1 week and avoid file system growing. > > > > This is a solution based on a cron job, to fix that on stable branches, > > in a second time I'll help the fasteners project to fix the root cause > > by reviewing and testing the proposed patch (lock based on file offset). > > In next versions I hope we will use a patched fasteners and so we could > > drop the cron based solution. > > > > Please can you give /reviews/feedbacks: > > - https://review.opendev.org/688413 > > - https://review.opendev.org/688414 > > - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. > Yes these changes should be validated by the cinder team. 
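To make the mechanics concrete, the proposed cleanup boils down to a periodic find over the lock directory, roughly like this (a simplified sketch only; the real templates are in the reviews above, and the path and retention shown here are assumptions):

  # /etc/cron.d style entry; /var/lib/cinder is the assumed state_path / lock dir
  0 0 * * * root find /var/lib/cinder -maxdepth 1 -type f -name 'cinder-*-delete_*' -mtime +7 -delete
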
I chosen this approach to allow use to fix that on stable branches too, and to avoid to introduce a non backportable new feature. > > In essence, I don't think this is going to meaningfully reduce the > amount of leftover lock files and it sets a bad precedent for how to > handle them. > > Personally, I'd rather see a boot-time service added for each OpenStack > service that goes out and wipes the lock file directory before starting > the service. > I agree it can be an alternative to the proposed changes. I guess it's related to some sort of puppet code too, I'm right? (the boot-time service) > > On a more general note, I'm going to challenge the assertion that > "Customer file system growing slowly and so customer risk to facing some > issues to file system usage after a long period." I have yet to hear an > actual bug report from the leftover lock files. Every time this comes up > it's because someone noticed a lot of lock files and thought we were > leaking them. I've never heard anyone report an actual functional or > performance problem as a result of the lock files. I don't think we > should "fix" this until someone reports that it's actually broken. > Especially because previous attempts have all resulted in very real bugs > that did break people. > Yes I agreee it's more an assumption than a reality, I never seen anybody report a disk usage issue or things like this due to leftover lock files. > Maybe we should have oslo.concurrency drop a file named _README (or > something else likely to sort first in the file listing) into the > configured lock_path that explains why the files are there and the > proper way to deal with them. > Good idea. Anyway, even if nobody facing a file system issue related to files leftover, I think it's not a good thing to lets grow a FS, and we need to try to address it to prevent potential file system issues related to disk usage and lock files, but in a secure way to avoid to introduce race conditions with cinder. Cinder people need to confirm that my proposed changes can fit well with cinder's mechanismes or choose a better approach. > > > > > Thanks > > > > > > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo > > > a > écrit : > > > > On 2019/09/28 1:44, Ben Nemec wrote: > > > > > > > > > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > > >> Hi Eric, > > >> > > >> On 2019/09/20 23:10, Eric Harney wrote: > > >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > > >>>> Hi, > > >>>> > > >>>> I'm using Queens cinder with the following setting. > > >>>> > > >>>> --------------------------------- > > >>>> [coordination] > > >>>> backend_url = file://$state_path > > >>>> --------------------------------- > > >>>> > > >>>> As a result, the files like the following were remained under > > the state path after some operations.[1] > > >>>> > > >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > > >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > > >>>> > > >>>> In my understanding, these are lock-files created for > > synchronization by tooz. > > >>>> But, these lock-files were not deleted after finishing > operations. > > >>>> Is this behaviour correct? > > >>>> > > >>>> [1] > > >>>> e.g. Delete volume, Delete snapshot > > >>> > > >>> This is a known bug that's described here: > > >>> > > >>> https://github.com/harlowja/fasteners/issues/26 > > >>> > > >>> (The fasteners library is used by tooz, which is used by Cinder > > for managing these lock files.) 
> > >>> > > >>> There's an old Cinder bug for it here: > > >>> https://bugs.launchpad.net/cinder/+bug/1432387 > > >>> > > >>> but that's marked as "Won't Fix" because Cinder needs it to be > > fixed in the underlying libraries. > > >> Thank you for your explanation. > > >> I understood the state. > > >> > > >> But, I have one more question. > > >> Can I think this bug doesn't affect synchronization? > > > > > > It does not. In fact, it's important to not remove lock files > > while a service is running or you can end up with synchronization > > issues. > > > > > > To clean up the leftover lock files, we generally recommend > > clearing the lock_path for each service on reboot before the > > services have started. > > > > Thank you for your information. > > I think that I understood this issue completely. > > > > Best Regards, > > > > > > >> > > >> Best regards, > > >> > > >>> Thanks, > > >>> Eric > > >>> > > >> > > > > > > > -- > > _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > > Rikimaru Honjo > > E-mail:honjo.rikimaru at ntt-tx.co.jp > > > > > > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cjeanner at redhat.com Tue Oct 15 15:24:55 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 15 Oct 2019 17:24:55 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: On 10/15/19 4:47 PM, Ben Nemec wrote: > > > On 10/15/19 7:48 AM, Herve Beraud wrote: >> I proposed some patches through heat templates and puppet-cinder to >> remove lock files older than 1 week and avoid file system growing. >> >> This is a solution based on a cron job, to fix that on stable >> branches, in a second time I'll help the fasteners project to fix the >> root cause by reviewing and testing the proposed patch (lock based on >> file offset). In next versions I hope we will use a patched fasteners >> and so we could drop the cron based solution. >> >> Please can you give /reviews/feedbacks: >> - https://review.opendev.org/688413 >> - https://review.opendev.org/688414 >> - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. I'm also not that happy with the cron way - but apparently it might create some issues in some setup with the current way things are done: - inodes aren't infinit on ext* FS (ext3, ext4, blah) - see bellow for context - perfs might be bad (see bellow for context) So one way or another, cleanup is needed. > > In essence, I don't think this is going to meaningfully reduce the > amount of leftover lock files and it sets a bad precedent for how to > handle them. The filter is strict - on purpose, to address this specific issue. Of course, we might want to loosen it, but... do we really want that? IF we're to go with the cron thingy of course. Some more thinking is needed I guess. > > Personally, I'd rather see a boot-time service added for each OpenStack > service that goes out and wipes the lock file directory before starting > the service. Well.... former sysops here: don't count on regular reboot - once a year is a fair average - and it's usually due to some power cut... Sad world, I know ;). So a "boot-time cleanup" will help. A little. And wouldn't hurt anyway. So +1 for that idea, but I wouldn't rely only on it. And there might be some issues (see bellow) > > On a more general note, I'm going to challenge the assertion that > "Customer file system growing slowly and so customer risk to facing some > issues to file system usage after a long period." I have yet to hear an > actual bug report from the leftover lock files. Every time this comes up > it's because someone noticed a lot of lock files and thought we were > leaking them. I've never heard anyone report an actual functional or > performance problem as a result of the lock files. I don't think we > should "fix" this until someone reports that it's actually broken. > Especially because previous attempts have all resulted in very real bugs > that did break people. I'm not on your side here. Waiting to get a fire before thinking about correction and counter-measures isn't good. 
Since we know there's an issue, and that, eventually, it might be a really big one, it would be good to address it before it explodes in our face. The disk *space* is probably not the issue. File with 1k, on a couple of gigas, it's good. But there are other concerns: - inodes. Yes, we're supposed to get things on XFS, and that dude doesn't have inodes. But some ops might want to rely on the good(?) old ext3, or ext4. There, we might get some issues, and pretty quickly depending on the speed of lock creation (so, linked to cinder actions in this case). Or it might be some NFS with an ext4 FS. - rm limits: you probably never ever hit "rm argument list limit". But it does exist, and I did hit it (like 15 years ago - maybe it's sorted out, but I have some doubts). This means that rm will NOT be able to cope with the directory content after a certain amount (which is huge, right, but still... we're talking about some long-lasting process filling a directory). Of course, "find" might be the right tool in such case, but it will take a long, long time to delete (thinking about the "boot-time cleanup proposal" for instance). - performances: it might impact the system, for instance if one has some backup process running and wanting to eat the /var/lib/cinder directory: if the op doesn't know about this issue, they might get some surprises with long, long, loooong running backups. With a target on some ext4 - hello Inodes! Sooo... yeah. I think this issue should be addressed. Really. But I +1 the fact that it should be done "the right way", whatever it is. The "cron" might be the wrong one. Or not. We need some more feedbacks on that :). > > Maybe we should have oslo.concurrency drop a file named _README (or > something else likely to sort first in the file listing) into the > configured lock_path that explains why the files are there and the > proper way to deal with them. Hmmm... who reads README anyway? Like, really. Better getting some cleanup un place to avoid questions ;). Cheers, C. > >> >> Thanks >> >> >> Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo >> > a >> écrit : >> >>     On 2019/09/28 1:44, Ben Nemec wrote: >>      > >>      > >>      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: >>      >> Hi Eric, >>      >> >>      >> On 2019/09/20 23:10, Eric Harney wrote: >>      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: >>      >>>> Hi, >>      >>>> >>      >>>> I'm using Queens cinder with the following setting. >>      >>>> >>      >>>> --------------------------------- >>      >>>> [coordination] >>      >>>> backend_url = file://$state_path >>      >>>> --------------------------------- >>      >>>> >>      >>>> As a result, the files like the following were remained under >>     the state path after some operations.[1] >>      >>>> >>      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume >>      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot >>      >>>> >>      >>>> In my understanding, these are lock-files created for >>     synchronization by tooz. >>      >>>> But, these lock-files were not deleted after finishing >> operations. >>      >>>> Is this behaviour correct? >>      >>>> >>      >>>> [1] >>      >>>> e.g. Delete volume, Delete snapshot >>      >>> >>      >>> This is a known bug that's described here: >>      >>> >>      >>> https://github.com/harlowja/fasteners/issues/26 >>      >>> >>      >>> (The fasteners library is used by tooz, which is used by Cinder >>     for managing these lock files.) 
>>      >>> >>      >>> There's an old Cinder bug for it here: >>      >>> https://bugs.launchpad.net/cinder/+bug/1432387 >>      >>> >>      >>> but that's marked as "Won't Fix" because Cinder needs it to be >>     fixed in the underlying libraries. >>      >> Thank you for your explanation. >>      >> I understood the state. >>      >> >>      >> But, I have one more question. >>      >> Can I think this bug doesn't affect synchronization? >>      > >>      > It does not. In fact, it's important to not remove lock files >>     while a service is running or you can end up with synchronization >>     issues. >>      > >>      > To clean up the leftover lock files, we generally recommend >>     clearing the lock_path for each service on reboot before the >>     services have started. >> >>     Thank you for your information. >>     I think that I understood this issue completely. >> >>     Best Regards, >> >> >>      >> >>      >> Best regards, >>      >> >>      >>> Thanks, >>      >>> Eric >>      >>> >>      >> >>      > >> >>     --     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ >>     Rikimaru Honjo >>     E-mail:honjo.rikimaru at ntt-tx.co.jp >>     >> >> >> >> >> -- >> Hervé Beraud >> Senior Software Engineer >> Red Hat - Openstack Oslo >> irc: hberaud >> -----BEGIN PGP SIGNATURE----- >> >> wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ >> Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ >> RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP >> F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G >> 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g >> glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw >> m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ >> hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 >> qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y >> F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 >> B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O >> v6rDpkeNksZ9fFSyoY2o >> =ECSj >> -----END PGP SIGNATURE----- >> > -- Cédric Jeanneret (He/Him/His) Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ From openstack at nemebean.com Tue Oct 15 17:00:38 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 12:00:38 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: <4083c539-6b3f-1908-16ac-edbbfe8eb04a@nemebean.com> On 10/15/19 10:24 AM, Cédric Jeanneret wrote: > > > On 10/15/19 4:47 PM, Ben Nemec wrote: >> >> >> On 10/15/19 7:48 AM, Herve Beraud wrote: >>> I proposed some patches through heat templates and puppet-cinder to >>> remove lock files older than 1 week and avoid file system growing. >>> >>> This is a solution based on a cron job, to fix that on stable >>> branches, in a second time I'll help the fasteners project to fix the >>> root cause by reviewing and testing the proposed patch (lock based on >>> file offset). In next versions I hope we will use a patched fasteners >>> and so we could drop the cron based solution. >>> >>> Please can you give /reviews/feedbacks: >>> - https://review.opendev.org/688413 >>> - https://review.opendev.org/688414 >>> - https://review.opendev.org/688415 >> >> I'm rather hesitant to recommend this. 
It looks like the change is >> only removing the -delete lock files, which are a fraction of the >> total lock files created by Cinder, and I don't particularly want to >> encourage people to start monkeying with the lock files while a >> service is running. Even with this limited set of deletions, I think >> we need a Cinder person to look and verify that we aren't making bad >> assumptions about how the locks are used. > > I'm also not that happy with the cron way - but apparently it might > create some issues in some setup with the current way things are done: > - inodes aren't infinit on ext* FS (ext3, ext4, blah) - see bellow for > context > - perfs might be bad (see bellow for context) > > So one way or another, cleanup is needed. > >> >> In essence, I don't think this is going to meaningfully reduce the >> amount of leftover lock files and it sets a bad precedent for how to >> handle them. > > The filter is strict - on purpose, to address this specific issue. Of > course, we might want to loosen it, but... do we really want that? IF > we're to go with the cron thingy of course. Some more thinking is needed > I guess. > >> >> Personally, I'd rather see a boot-time service added for each >> OpenStack service that goes out and wipes the lock file directory >> before starting the service. > > Well.... former sysops here: don't count on regular reboot - once a year > is a fair average - and it's usually due to some power cut... Sad world, > I know ;). > So a "boot-time cleanup" will help. A little. And wouldn't hurt anyway. > So +1 for that idea, but I wouldn't rely only on it. And there might be > some issues (see bellow) I understand that it doesn't happen regularly, but it's the easiest to automate safe time to clean locks. It can also be done when the service is down for maintenance, but even then you need to be careful because if you wipe the cinder locks when cinder-volume gets restarted, but cinder-scheduler is still running you might wipe an in-use lock. I don't know if that specific scenario is possible, but the point is that any process that could hold a lock needs to be down before you clear the lock directory. Since most OpenStack services have multiple separate OS services running that complicates the process. > >> >> On a more general note, I'm going to challenge the assertion that >> "Customer file system growing slowly and so customer risk to facing some >> issues to file system usage after a long period." I have yet to hear >> an actual bug report from the leftover lock files. Every time this >> comes up it's because someone noticed a lot of lock files and thought >> we were leaking them. I've never heard anyone report an actual >> functional or performance problem as a result of the lock files. I >> don't think we should "fix" this until someone reports that it's >> actually broken. Especially because previous attempts have all >> resulted in very real bugs that did break people. > > I'm not on your side here. Waiting to get a fire before thinking about > correction and counter-measures isn't good. Since we know there's an > issue, and that, eventually, it might be a really big one, it would be > good to address it before it explodes in our face. If you have a solution that fixes the problem without introducing concurrency problems then I'm all ears. :-) Until then, I'm not comfortable fixing a hypothetical problem by creating very real new problems. > The disk *space* is probably not the issue. File with 1k, on a couple of > gigas, it's good. 
> But there are other concerns: > > - inodes. Yes, we're supposed to get things on XFS, and that dude > doesn't have inodes. But some ops might want to rely on the good(?) old > ext3, or ext4. There, we might get some issues, and pretty quickly > depending on the speed of lock creation (so, linked to cinder actions in > this case). Or it might be some NFS with an ext4 FS. > > - rm limits: you probably never ever hit "rm argument list limit". But > it does exist, and I did hit it (like 15 years ago - maybe it's sorted > out, but I have some doubts). This means that rm will NOT be able to > cope with the directory content after a certain amount (which is huge, > right, but still... we're talking about some long-lasting process > filling a directory). > Of course, "find" might be the right tool in such case, but it will take > a long, long time to delete (thinking about the "boot-time cleanup > proposal" for instance). > > - performances: it might impact the system, for instance if one has some > backup process running and wanting to eat the /var/lib/cinder directory: > if the op doesn't know about this issue, they might get some surprises > with long, long, loooong running backups. > With a target on some ext4 - hello Inodes! Sure, but these are all still theoretical problems. I've _never_ heard of anyone running into them, and we have some pretty big OpenStack deployments in the wild. I feel like at one point I did the math to figure out what it would take to hit an inode limit because of lock files, and it was fairly absurd. Like you would have to leave a deployment running for a decade under heavy use to actually get there. I don't still have those numbers handy, but it might be a useful exercise to look at that again. And this reminds me of another thing that has been suggested in the past to address the lock file cleanup issue (we should really write all of this down if we haven't already...), which is to put them on tmpfs. That way they get cleared automatically on reboot and you don't have to manage anything. Locks don't persist over reboots anyway so it doesn't matter that it's on volatile storage. The whole file thing is actually a consequence of Linux IPC being complete garbage, not because we want persistent storage of locks. > > Sooo... yeah. I think this issue should be addressed. Really. But I +1 > the fact that it should be done "the right way", whatever it is. The > "cron" might be the wrong one. Or not. We need some more feedbacks on > that :). Patches welcome. But fair warning: This problem is a lot harder than it looks on the surface. Many solutions have been proposed over the years, all of them were worse than what we have now. > >> >> Maybe we should have oslo.concurrency drop a file named _README (or >> something else likely to sort first in the file listing) into the >> configured lock_path that explains why the files are there and the >> proper way to deal with them. > > Hmmm... who reads README anyway? Like, really. Better getting some > cleanup un place to avoid questions ;). If it heads off even one of these threads that happen every few months then it will have been worth it. :-D > > Cheers, > > C. > >> >>> >>> Thanks >>> >>> >>> Le lun. 30 sept. 
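To illustrate, that approach is nothing more than a mount plus the existing lock_path knob; an untested sketch, with a made-up mount point and size:

  mount -t tmpfs -o size=64m,mode=0755 tmpfs /var/lib/cinder/locks

  # then point oslo.concurrency at it in the service config:
  #   [oslo_concurrency]
  #   lock_path = /var/lib/cinder/locks
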
2019 à 03:35, Rikimaru Honjo >>> > a >>> écrit : >>> >>>     On 2019/09/28 1:44, Ben Nemec wrote: >>>      > >>>      > >>>      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: >>>      >> Hi Eric, >>>      >> >>>      >> On 2019/09/20 23:10, Eric Harney wrote: >>>      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: >>>      >>>> Hi, >>>      >>>> >>>      >>>> I'm using Queens cinder with the following setting. >>>      >>>> >>>      >>>> --------------------------------- >>>      >>>> [coordination] >>>      >>>> backend_url = file://$state_path >>>      >>>> --------------------------------- >>>      >>>> >>>      >>>> As a result, the files like the following were remained under >>>     the state path after some operations.[1] >>>      >>>> >>>      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume >>>      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot >>>      >>>> >>>      >>>> In my understanding, these are lock-files created for >>>     synchronization by tooz. >>>      >>>> But, these lock-files were not deleted after finishing >>> operations. >>>      >>>> Is this behaviour correct? >>>      >>>> >>>      >>>> [1] >>>      >>>> e.g. Delete volume, Delete snapshot >>>      >>> >>>      >>> This is a known bug that's described here: >>>      >>> >>>      >>> https://github.com/harlowja/fasteners/issues/26 >>>      >>> >>>      >>> (The fasteners library is used by tooz, which is used by Cinder >>>     for managing these lock files.) >>>      >>> >>>      >>> There's an old Cinder bug for it here: >>>      >>> https://bugs.launchpad.net/cinder/+bug/1432387 >>>      >>> >>>      >>> but that's marked as "Won't Fix" because Cinder needs it to be >>>     fixed in the underlying libraries. >>>      >> Thank you for your explanation. >>>      >> I understood the state. >>>      >> >>>      >> But, I have one more question. >>>      >> Can I think this bug doesn't affect synchronization? >>>      > >>>      > It does not. In fact, it's important to not remove lock files >>>     while a service is running or you can end up with synchronization >>>     issues. >>>      > >>>      > To clean up the leftover lock files, we generally recommend >>>     clearing the lock_path for each service on reboot before the >>>     services have started. >>> >>>     Thank you for your information. >>>     I think that I understood this issue completely. 
>>> >>>     Best Regards, >>> >>> >>>      >> >>>      >> Best regards, >>>      >> >>>      >>> Thanks, >>>      >>> Eric >>>      >>> >>>      >> >>>      > >>> >>>     --     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ >>>     Rikimaru Honjo >>>     E-mail:honjo.rikimaru at ntt-tx.co.jp >>>     >>> >>> >>> >>> >>> -- >>> Hervé Beraud >>> Senior Software Engineer >>> Red Hat - Openstack Oslo >>> irc: hberaud >>> -----BEGIN PGP SIGNATURE----- >>> >>> wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ >>> Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ >>> RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP >>> F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G >>> 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g >>> glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw >>> m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ >>> hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 >>> qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y >>> F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 >>> B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O >>> v6rDpkeNksZ9fFSyoY2o >>> =ECSj >>> -----END PGP SIGNATURE----- >>> >> > From openstack at nemebean.com Tue Oct 15 17:13:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 12:13:35 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> Message-ID: <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> On 10/15/19 10:41 AM, Herve Beraud wrote: > > > Le mar. 15 oct. 2019 à 16:48, Ben Nemec > a écrit : > > > > On 10/15/19 7:48 AM, Herve Beraud wrote: > > I proposed some patches through heat templates and puppet-cinder to > > remove lock files older than 1 week and avoid file system growing. > > > > This is a solution based on a cron job, to fix that on stable > branches, > > in a second time I'll help the fasteners project to fix the root > cause > > by reviewing and testing the proposed patch (lock based on file > offset). > > In next versions I hope we will use a patched fasteners and so we > could > > drop the cron based solution. > > > > Please can you give /reviews/feedbacks: > > - https://review.opendev.org/688413 > > - https://review.opendev.org/688414 > > - https://review.opendev.org/688415 > > I'm rather hesitant to recommend this. It looks like the change is only > removing the -delete lock files, which are a fraction of the total lock > files created by Cinder, and I don't particularly want to encourage > people to start monkeying with the lock files while a service is > running. Even with this limited set of deletions, I think we need a > Cinder person to look and verify that we aren't making bad assumptions > about how the locks are used. > > > Yes these changes should be validated by the cinder team. > I chosen this approach to allow use to fix that on stable branches too, > and to avoid to introduce a non backportable new feature. > > > In essence, I don't think this is going to meaningfully reduce the > amount of leftover lock files and it sets a bad precedent for how to > handle them. > > Personally, I'd rather see a boot-time service added for each OpenStack > service that goes out and wipes the lock file directory before starting > the service. 
> > > I agree it can be an alternative to the proposed changes. > I guess it's related to some sort of puppet code too, I'm right? (the > boot-time service) That's probably how you'd implement it in TripleO. Or maybe Ansible now. Best to check with the TripleO team on that since my knowledge is quite out of date on that project now. > > > On a more general note, I'm going to challenge the assertion that > "Customer file system growing slowly and so customer risk to facing some > issues to file system usage after a long period." I have yet to hear an > actual bug report from the leftover lock files. Every time this > comes up > it's because someone noticed a lot of lock files and thought we were > leaking them. I've never heard anyone report an actual functional or > performance problem as a result of the lock files. I don't think we > should "fix" this until someone reports that it's actually broken. > Especially because previous attempts have all resulted in very real > bugs > that did break people. > > > Yes I agreee it's more an assumption than a reality, I never seen > anybody report a disk usage issue or things like this due to leftover > lock files. > > > Maybe we should have oslo.concurrency drop a file named _README (or > something else likely to sort first in the file listing) into the > configured lock_path that explains why the files are there and the > proper way to deal with them. > > > Good idea. > > Anyway, even if nobody facing a file system issue related to files > leftover, I think it's not a good thing to lets grow a FS, and we need > to try to address it to prevent potential file system issues related to > disk usage and lock files, but in a secure way to avoid to introduce > race conditions with cinder. > > Cinder people need to confirm that my proposed changes can fit well with > cinder's mechanismes or choose a better approach. I'm opposed in general to external solutions. If lock files are to be cleaned up, it needs to happen either when the service isn't running so there's no chance of deleting an in-use lock, or it needs to be done by the service itself when it knows that it is done with the lock. Any fixes outside the service run the risk of drifting from the implementation if, for example, Cinder made a change to its locking semantics such that locks you could safely remove previously no longer could be. I believe Neutron implements lock file cleanup using [0], which is really the only way runtime lock cleanup should be done IMHO. 0: https://github.com/openstack/oslo.concurrency/blob/85c341aced7b181724b68c9d883768b5c5f7e982/oslo_concurrency/lockutils.py#L194 > > > > > > Thanks > > > > > > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo > > > >> a écrit : > > > >     On 2019/09/28 1:44, Ben Nemec wrote: > >      > > >      > > >      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote: > >      >> Hi Eric, > >      >> > >      >> On 2019/09/20 23:10, Eric Harney wrote: > >      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote: > >      >>>> Hi, > >      >>>> > >      >>>> I'm using Queens cinder with the following setting. 
> >      >>>> > >      >>>> --------------------------------- > >      >>>> [coordination] > >      >>>> backend_url = file://$state_path > >      >>>> --------------------------------- > >      >>>> > >      >>>> As a result, the files like the following were remained > under > >     the state path after some operations.[1] > >      >>>> > >      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume > >      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot > >      >>>> > >      >>>> In my understanding, these are lock-files created for > >     synchronization by tooz. > >      >>>> But, these lock-files were not deleted after finishing > operations. > >      >>>> Is this behaviour correct? > >      >>>> > >      >>>> [1] > >      >>>> e.g. Delete volume, Delete snapshot > >      >>> > >      >>> This is a known bug that's described here: > >      >>> > >      >>> https://github.com/harlowja/fasteners/issues/26 > >      >>> > >      >>> (The fasteners library is used by tooz, which is used by > Cinder > >     for managing these lock files.) > >      >>> > >      >>> There's an old Cinder bug for it here: > >      >>> https://bugs.launchpad.net/cinder/+bug/1432387 > >      >>> > >      >>> but that's marked as "Won't Fix" because Cinder needs it > to be > >     fixed in the underlying libraries. > >      >> Thank you for your explanation. > >      >> I understood the state. > >      >> > >      >> But, I have one more question. > >      >> Can I think this bug doesn't affect synchronization? > >      > > >      > It does not. In fact, it's important to not remove lock files > >     while a service is running or you can end up with synchronization > >     issues. > >      > > >      > To clean up the leftover lock files, we generally recommend > >     clearing the lock_path for each service on reboot before the > >     services have started. > > > >     Thank you for your information. > >     I think that I understood this issue completely. 
> > > >     Best Regards, > > > > > >      >> > >      >> Best regards, > >      >> > >      >>> Thanks, > >      >>> Eric > >      >>> > >      >> > >      > > > > >     -- > >     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ > >     Rikimaru Honjo > > E-mail:honjo.rikimaru at ntt-tx.co.jp > > >      > > > > > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From jimmy at openstack.org Tue Oct 15 17:47:14 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 15 Oct 2019 12:47:14 -0500 Subject: OpenStack COA Exam Update Message-ID: <5DA60622.9040102@openstack.org> We wanted to circle back on this email thread and provide an update on the Certified OpenStack Administrator (COA) exam. We’ve listened to community feedback and so has the OpenStack ecosystem. We are excited to collaborate with Mirantis who has stepped up to donate resources, including the administration of the vendor-neutral OpenStack certification exam now running on OpenStack Rocky. We are planning to continue COA exam sales starting this Thursday, October 17. If you’re interested in becoming a COA, you will be able to buy an exam through the OpenStack website or through one of the many OpenStack training partners in the marketplace . We are excited to continue growing the market of certified OpenStack professionals and appreciate the community’s patience as we identified a solution moving forward. Please reach out if you have any questions and we will continue to update openstack.org/coa as the certification program evolves. Thanks, Jimmy -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Tue Oct 15 18:18:03 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 15 Oct 2019 13:18:03 -0500 Subject: [all][tc] Planning for dropping the Python2 support in OpenStack Message-ID: <16dd0a42b8d.e847dd3e124645.6364180516762707559@ghanshyammann.com> Hello Everyone, Python 2.7 is going to retire in Jan 2020 [1] and we planned to drop the python 2 support from OpenStack during the start of the Ussuri cycle[2]. Time has come now to start the planning on dropping the Python2. It needs to be coordinated among various Projects, libraries, vendors driver, third party CI and testing frameworks. * Preparation for the Plan & Schedule: Etherpad: https://etherpad.openstack.org/p/drop-python2-support We discussed it in TC to come up with the plan, execute it smoothly and avoid breaking any dependent projects. I have prepared an etherpad[3](mentioned above also) to capture all the points related to this topic and most importantly the draft schedule about who can drop the support and when. The schedule is in the draft state and not final yet. The most important points are if you are dropping the support then all your consumers (OpenStack Projects, Vendors drivers etc) are ready for that. For example, oslo, os-bricks, client lib, testing framework projects will keep the python2 support until we make sure all the consumers of those projects do not require py2 support. If anyone require then how long they can support py2. These libraries, testing frameworks will be the last one to drop py2. We have planned to have a dedicated discussion in TC office hours on the 24th Thursday #openstack-tc channel. We will discuss what all need to be done and the schedules. You do not have to drop it immediately and keep eyes on this ML thread till we get the consensus on the community-level plan and schedule. Meanwhile, you can always start pre-planning for your projects, for example, stephenfin has started for Nova[4] to migrate the third party CI etc. Cinder has coordinated with all vendor drivers & their CI to migrate from py2 to py3. * Projects want to keep the py2 support? There is no mandate that projects have to drop the py2 support right now. If you want to keep the support then key things to discuss are what all you need and does all your dependent projects/libs provide the support of py2. This is something needs to be discussed case by case. If any project wants to keep the support, add that in the etherpad with a brief reason which will be helpful to discuss the need and feasibility. Feel free to provide feedback or add the missing point on the etherpad. Do not forget to attend the 24th Oct 2019, TC office hour on Thursday at 1500 UTC in #openstack-tc. [1] https://pythonclock.org/ [2] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html [3] https://etherpad.openstack.org/p/drop-python2-support [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010109.html -gmann From emilien at redhat.com Tue Oct 15 19:47:58 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 15 Oct 2019 15:47:58 -0400 Subject: [tripleo] Deprecating paunch CLI? 
In-Reply-To: References: <4bcf45b6-d915-e6d0-694f-d4a5b883dc45@redhat.com> Message-ID: On Fri, Oct 11, 2019 at 10:55 AM James Slagle wrote: > An idea for a future improvement I would like to see as we move in > this direction is to switch from reading the container startup configs > from a single file per step > (/var/lib/tripleo-config/container-startup-config-step_{{ step > }}.json), to using a directory per step instead. It would look > something like: > > /var/lib/tripleo-config/container-startup-config/step1 > > /var/lib/tripleo-config/container-startup-config/step1/keystone-init-tasks.json > > /var/lib/tripleo-config/container-startup-config/step1/pacemaker-init-tasks.json > etc. > https://review.opendev.org/#/c/688779/ is WIP and will address this idea. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue Oct 15 22:15:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 17:15:19 -0500 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> Message-ID: In the interest of not having to start this discussion from scratch every time, I've done a bit of a brain dump into https://review.opendev.org/#/c/688825/ that covers why things are the way they are and what we recommend people do about it. Please take a look and let me know if you see any issues with it. Thanks. -Ben From openstack at nemebean.com Tue Oct 15 22:19:31 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Oct 2019 17:19:31 -0500 Subject: [oslo] On PTO rest of the week Message-ID: <959ee3bc-ca31-c2e8-ddef-9f9b8a394c41@nemebean.com> Hey, I'm not working for the rest of the week. \o/ I also realized that I haven't announced that here. /o\ Things are pretty quiet in Oslo right now so I don't anticipate that I'll be needed in the interim, but I should have internet the whole time I'm out and I'll _probably_ be checking email. If something comes up that can't wait until Monday just holler and hopefully I'll see it. Thanks. -Ben From iwienand at redhat.com Tue Oct 15 23:03:16 2019 From: iwienand at redhat.com (Ian Wienand) Date: Wed, 16 Oct 2019 10:03:16 +1100 Subject: CentOS 8 nodes available now Message-ID: <20191015230316.GA29186@fedora19.localdomain> Hello, I'm happy to say that CentOS 8 images are now live in OpenDev infra. Using a node label of "centos-8" will get you started. --- The python environment setup on these images is different to our other images. Firstly, some background: currently during image build we go to some effort to: a) install the latest pip/virtualenv/setuptools b) ensure standard behaviour: /usr/bin/python -> python2 /usr/bin/pip -> python2 install /usr/bin/pip3 -> python3 install /usr/bin/virtualenv -> creates python2 environment by default; python3 virtualenv package in sync This means a number of things; hosts always have python2, and because we overwrite the system pip/virtualenv/setuptools we put these packages on "hold" (depending on the package manager) so jobs don't re-install them and create (even more of) a mess. 
This made sense in the past, when we had versions of pip/setuptools in distributions that couldn't understand syntax in requirements files (and other bugs) and didn't have the current fantastic job inheritance and modularity that Zuul (v3) provides. However, it also introduces a range of problems for various users, and has a high maintenance overhead. Thus these new images are, by default, python3 only, and have upstream pip and virtualenv packages installed. You will have a default situation: /usr/bin/python -> not provided /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip" /usr/bin/virtualenv -> create python3 environment; provided by upstream python3-virtualenv package /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv), use /usr/bin/pyvenv-3 or "python3 -m venv". Ergo, the "standard behaviour" is not so standard any more. I would suggest if you wish to write somewhat portable jobs/roles etc., you do the following: * in general don't rely on "unversioned" calls of tools at all (python/pip/virtualenv) -- they can all mean different things on different platforms. * scripts should always be #!/usr/bin/python3 * use "python3 -m venv" for virtual environments (if you really need "virtualenv" because of one of the features it provides, use "-m virtualenv") * use "python3 -m pip" to install global pip packages; but try not too -- mixing packages and pip installs never works that well. * if you need python2 for some reason, use a bindep file+role to install it (don't assume it is there) --- For any Zuul admins, note that to use python3-only images similar to what we make, you'll need to set "python-path" to python3 in nodepool so that Ansible calls the correct remote binary. Keep an eye on [1] which will automate this for Ansible >=2.8 after things are merged and released. --- Most of the job setup has been tested (network configs, setting mirrors, adding swap etc.) but there's always a chance of issues with a new platform. Please bring up any issues in #openstack-infra and we'll be sure to get them fixed. --- If you're interested in the images, they are exported at https://nb01.openstack.org/images/ although they are rather large, because we pre-cache a lot. If you'd like to build your own, [2] might help with: DISTRO=centos-minimal DIB_RELEASE=8 --- Thanks, -i [1] https://review.opendev.org/#/c/682797/ [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh From smooney at redhat.com Wed Oct 16 00:35:48 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 16 Oct 2019 01:35:48 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <20191015230316.GA29186@fedora19.localdomain> References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: On Wed, 2019-10-16 at 10:03 +1100, Ian Wienand wrote: > Hello, > > I'm happy to say that CentOS 8 images are now live in OpenDev infra. > Using a node label of "centos-8" will get you started. > > --- > > The python environment setup on these images is different to our other > images. Firstly, some background: currently during image build we go > to some effort to: > > a) install the latest pip/virtualenv/setuptools > b) ensure standard behaviour: > /usr/bin/python -> python2 python on centos8 should be a link to python3 infact ideally python 2 should not be installed at all. 
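As an aside on not running code under Python 2 by mistake: a trivial, purely illustrative guard like the one below (standard library only) makes a job script fail fast no matter what the unversioned interpreter or a stale "python" alias points at.

    #!/usr/bin/python3
    # Illustrative guard: abort early if this script is accidentally
    # launched with a Python 2 interpreter.
    import sys

    if sys.version_info[0] < 3:
        sys.exit("This script requires Python 3; got %s" % sys.version.split()[0])

    print("Running under Python", sys.version.split()[0])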
> /usr/bin/pip -> python2 install > /usr/bin/pip3 -> python3 install > /usr/bin/virtualenv -> creates python2 environment by default; > python3 virtualenv package in sync > > This means a number of things; hosts always have python2, why would we want this. ideally we should try to ensure that there is non python 2 on the system so that we can ensure we are not using it bay mistake and can do an entirly python3 only install on centos8 ---- later: i realised read later that you are descibing how thigns work on other images here. > and because > we overwrite the system pip/virtualenv/setuptools we put these > packages on "hold" (depending on the package manager) so jobs don't > re-install them and create (even more of) a mess. > > This made sense in the past, when we had versions of pip/setuptools in > distributions that couldn't understand syntax in requirements files > (and other bugs) and didn't have the current fantastic job inheritance > and modularity that Zuul (v3) provides. However, it also introduces a > range of problems for various users, and has a high maintenance > overhead. > > Thus these new images are, by default, python3 only, and have upstream > pip and virtualenv packages installed. You will have a default > situation: > > /usr/bin/python -> not provided > /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip" > /usr/bin/virtualenv -> create python3 environment; > provided by upstream python3-virtualenv package > /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv), > use /usr/bin/pyvenv-3 or "python3 -m venv". > oh i see you were descirbinbg the standard behavior of our other envs before this is closer to what i was expecting alther i would personally prefer to have /usr/bin/python -> /user/bin/python3 linux distros seem to be a bit split on this i belive arch and maybe debian (i saw on of the other majory disto families adopt the same apparch) link python to python3 the redhat family of operating systems do not provide python any more an leave it to the user to either use only the versions specific pythons or user update-alternitives to set there default python > Ergo, the "standard behaviour" is not so standard any more. > > I would suggest if you wish to write somewhat portable jobs/roles > etc., you do the following: > > * in general don't rely on "unversioned" calls of tools at all > (python/pip/virtualenv) -- they can all mean different things on > different platforms. > * scripts should always be #!/usr/bin/python3 see i think that is an anti pattern they could be but i think /usr/bin/python should map to /usr/bin/python3 and you should assume that it now python3. if you dont do that hen every script that has ever been writtne or packaged needs to be updated to reference python3 explictly. there were much fewer user of python1 when that tansition happened but python became a link to the default systme python and eventully pointed to python2 i think we should continue to do that and after a decase of deprecating python2 we should reclaim the python symlink and point it to python3 > * use "python3 -m venv" for virtual environments (if you really need > "virtualenv" because of one of the features it provides, use "-m > virtualenv") > * use "python3 -m pip" to install global pip packages; but try not > too -- mixing packages and pip installs never works that well. well from a devstack point of view we almost exclucive install form pip so installing python packages form the disto is the anti pattern not installing form pip. 
that said we shoudl consider moving devstack to use --user at somepoint. > * if you need python2 for some reason, use a bindep file+role > to install it (don't assume it is there) +1 also dont assmue python will be python > > --- > > For any Zuul admins, note that to use python3-only images similar to > what we make, you'll need to set "python-path" to python3 in nodepool > so that Ansible calls the correct remote binary. Keep an eye on [1] > which will automate this for Ansible >=2.8 after things are merged and > released. > > --- > > Most of the job setup has been tested (network configs, setting > mirrors, adding swap etc.) but there's always a chance of issues with > a new platform. Please bring up any issues in #openstack-infra and > we'll be sure to get them fixed. > > --- > > If you're interested in the images, they are exported at > > https://nb01.openstack.org/images/ > > although they are rather large, because we pre-cache a lot. If you'd > like to build your own, [2] might help with: > > DISTRO=centos-minimal > DIB_RELEASE=8 > > --- > > Thanks, thanks for al the work on this. > > -i > > [1] https://review.opendev.org/#/c/682797/ > [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh > > From Richard.Pioso at dell.com Wed Oct 16 00:55:38 2019 From: Richard.Pioso at dell.com (Richard.Pioso at dell.com) Date: Wed, 16 Oct 2019 00:55:38 +0000 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt Message-ID: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Hi, The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". Does anyone have a suggestion on how to proceed? 
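As a purely illustrative aside on the pinning question above (using the general-purpose "packaging" library, not anything from Ironic itself), the difference between the two proposed pins is easy to see: the current entry admits the broken 1.9.0 release, while a ">=2.0.0" pin would not.

    # Quick illustration (assumes the "packaging" library is installed) of why
    # the current driver-requirements.txt entry can be satisfied by the broken
    # sushy 1.9.0 release, while a ">=2.0.0" pin could not be.
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    current_pin = SpecifierSet(">=1.9.0")   # as in driver-requirements.txt today
    proposed_pin = SpecifierSet(">=2.0.0")  # one of the resolutions suggested above

    for version in ("1.9.0", "2.0.0"):
        v = Version(version)
        print(version,
              "| satisfies >=1.9.0:", v in current_pin,
              "| satisfies >=2.0.0:", v in proposed_pin)

    # Expected output:
    # 1.9.0 | satisfies >=1.9.0: True  | satisfies >=2.0.0: False
    # 2.0.0 | satisfies >=1.9.0: True  | satisfies >=2.0.0: True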
Thank you, Rick [1] https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 [2] https://review.opendev.org/#/c/666253/ [3] https://review.opendev.org/#/c/668936/ [4] https://review.opendev.org/#/c/669889/ [5] https://review.opendev.org/#/c/688551/ [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 From cboylan at sapwetik.org Wed Oct 16 01:42:52 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 15 Oct 2019 18:42:52 -0700 Subject: CentOS 8 nodes available now In-Reply-To: References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: <18bfb37e-d448-4b8a-a6c8-5c4f4ee57107@www.fastmail.com> On Tue, Oct 15, 2019, at 5:35 PM, Sean Mooney wrote: > On Wed, 2019-10-16 at 10:03 +1100, Ian Wienand wrote: > > Hello, > > > > I'm happy to say that CentOS 8 images are now live in OpenDev infra. > > Using a node label of "centos-8" will get you started. > > > > --- > > > > The python environment setup on these images is different to our other > > images. Firstly, some background: currently during image build we go > > to some effort to: > > > > a) install the latest pip/virtualenv/setuptools > > b) ensure standard behaviour: > > /usr/bin/python -> python2 > python on centos8 should be a link to python3 > infact ideally python 2 should not be installed at all. > > /usr/bin/pip -> python2 install > > /usr/bin/pip3 -> python3 install > > /usr/bin/virtualenv -> creates python2 environment by default; > > python3 virtualenv package in sync > > > > This means a number of things; hosts always have python2, > why would we want this. ideally we should try to ensure that > there is non python 2 on the system so that we can ensure we are not > using > it bay mistake and can do an entirly python3 only install on centos8 > ---- > later: i realised read later that you are descibing how thigns work on > other images > here. See below but being specific about the version of python we want is one way to help ensure we test with the correct python. Also, some of our platforms don't have python3 (so we will continue to install python2 there). > > and because > > we overwrite the system pip/virtualenv/setuptools we put these > > packages on "hold" (depending on the package manager) so jobs don't > > re-install them and create (even more of) a mess. > > > > This made sense in the past, when we had versions of pip/setuptools in > > distributions that couldn't understand syntax in requirements files > > (and other bugs) and didn't have the current fantastic job inheritance > > and modularity that Zuul (v3) provides. However, it also introduces a > > range of problems for various users, and has a high maintenance > > overhead. > > > > Thus these new images are, by default, python3 only, and have upstream > > pip and virtualenv packages installed. You will have a default > > situation: > > > > /usr/bin/python -> not provided > > /usr/bin/pip -> not provided, use /usr/bin/pip3 or "python3 -m pip" > > /usr/bin/virtualenv -> create python3 environment; > > provided by upstream python3-virtualenv package > > /usr/bin/pyvenv -> not provided (is provided by Ubuntu python3-venv), > > use /usr/bin/pyvenv-3 or "python3 -m venv". 
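As an illustration of the version-explicit pattern recommended earlier in the thread ("python3 -m venv", "python3 -m pip"), the same thing can be done with the standard library so that nothing depends on what the unversioned python, pip or virtualenv happen to point at; the environment path and package name below are examples only.

    #!/usr/bin/python3
    # Rough sketch of version-explicit tooling: build a virtualenv with the
    # interpreter this script runs under, then install a package with that
    # environment's own pip rather than any unversioned "pip".
    import os
    import subprocess
    import sys
    import venv

    env_dir = os.path.expanduser("~/example-venv")  # illustrative path

    # Equivalent to "python3 -m venv ~/example-venv"
    venv.create(env_dir, with_pip=True)

    # Equivalent to running "python -m pip install pbr" with the venv's own
    # interpreter; "pbr" is just an example package.
    env_python = os.path.join(env_dir, "bin", "python")
    subprocess.check_call([env_python, "-m", "pip", "install", "pbr"])

    print("Created", env_dir, "using Python", sys.version.split()[0])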
> > > oh i see you were descirbinbg the standard behavior of our other envs before > this is closer to what i was expecting alther i would personally prefer to have > /usr/bin/python -> /user/bin/python3 > > linux distros seem to be a bit split on this > i belive arch and maybe debian (i saw on of the other majory disto > families adopt the same apparch) link > python to python3 > > the redhat family of operating systems do not provide python any more > an leave it to the user to either use > only the versions specific pythons or user update-alternitives to set > there default python This is actually one of the recommendations from PEP 394, https://www.python.org/dev/peps/pep-0394/#for-python-runtime-distributors. For our purposes I think it works well. Our jobs should be explicit about which version of python they use so there is no ambiguity in testing, but if jobs want to set up the alias they can opt into doing so. In general though I expect we'll stick to the various distro expectations for each distro as we build images for them. This avoids confusion when people discover `python` is something other than what you get if you download the image from the cloud provider. For this reason I don't think we should alias `python` to `python3` on CentOS8. > > Ergo, the "standard behaviour" is not so standard any more. > > > > I would suggest if you wish to write somewhat portable jobs/roles > > etc., you do the following: > > > > * in general don't rely on "unversioned" calls of tools at all > > (python/pip/virtualenv) -- they can all mean different things on > > different platforms. > > * scripts should always be #!/usr/bin/python3 > see i think that is an anti pattern they could be but i think > /usr/bin/python should map to /usr/bin/python3 and you should assume > that it now python3. if you dont do that hen > every script that has ever been > writtne or packaged needs to be updated to reference python3 > explictly. Or simply execute it with the interpreter you need: python3 /usr/local/bin/pbr freeze Is a common invocation for me. > there were much fewer user of python1 when that tansition happened > but > python became a link to the default systme python and eventully > pointed to python2 > i think we should continue to do that and after a decase of > deprecating python2 we > should reclaim the python symlink and point it to python3 > > * use "python3 -m venv" for virtual environments (if you really need > > "virtualenv" because of one of the features it provides, use "-m > > virtualenv") > > * use "python3 -m pip" to install global pip packages; but try not > > too -- mixing packages and pip installs never works that well. > well from a devstack point of view we almost exclucive install form > pip so installing python packages form the disto is the anti pattern > not installing form pip. that said we shoudl consider moving devstack to use > --user at somepoint. I actually resurrected the install into virtualenv idea when pip 10 (I think that was the version) happened as it refused to uninstall distutils installed packages. It is mostly doable though there are a few corner cases that kept it from happening. > > * if you need python2 for some reason, use a bindep file+role > > to install it (don't assume it is there) > +1 also dont assmue python will be python > > > > --- > > > > For any Zuul admins, note that to use python3-only images similar to > > what we make, you'll need to set "python-path" to python3 in nodepool > > so that Ansible calls the correct remote binary. 
Keep an eye on [1] > > which will automate this for Ansible >=2.8 after things are merged and > > released. > > > > --- > > > > Most of the job setup has been tested (network configs, setting > > mirrors, adding swap etc.) but there's always a chance of issues with > > a new platform. Please bring up any issues in #openstack-infra and > > we'll be sure to get them fixed. > > > > --- > > > > If you're interested in the images, they are exported at > > > > https://nb01.openstack.org/images/ > > > > although they are rather large, because we pre-cache a lot. If you'd > > like to build your own, [2] might help with: > > > > DISTRO=centos-minimal > > DIB_RELEASE=8 > > > > --- > > > > Thanks, > thanks for al the work on this. > > > > -i > > > > [1] https://review.opendev.org/#/c/682797/ > > [2] https://opendev.org/openstack/project-config/src/branch/master/tools/build-image.sh > > > > > > > From fungi at yuggoth.org Wed Oct 16 02:43:40 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 16 Oct 2019 02:43:40 +0000 Subject: CentOS 8 nodes available now In-Reply-To: References: <20191015230316.GA29186@fedora19.localdomain> Message-ID: <20191016024339.s7s24wpcprra7f3x@yuggoth.org> On 2019-10-16 01:35:48 +0100 (+0100), Sean Mooney wrote: [...] > i would personally prefer to have /usr/bin/python -> > /user/bin/python3 > > linux distros seem to be a bit split on this i belive arch and > maybe debian (i saw on of the other majory disto families adopt > the same apparch) link python to python3 [...] Debian definitely does not. The current plan for when Debian stops shipping Python 2.7 is that it will have no /usr/bin/python installed. The unversioned /usr/bin/python is and has long been an interpreter for the Python 2 language. Python 3 is a different language, and its interpreter should not by default assume the command name of the Python 2 interpreter. I think Arch Linux made a huge mistake in pretending they were the same thing, and sincerely hope no other distribution does the same. > i think /usr/bin/python should map to /usr/bin/python3 and you > should assume that it now python3. I think that's a disaster waiting to happen. > if you dont do that hen every script that has ever been writtne or > packaged needs to be updated to reference python3 explictly. Yep, that has to happen anyway in most cases to address Python 2 vs 3 language compatibility differences. Being explicit about which language a script is written in is a good thing. > there were much fewer user of python1 when that tansition happened > but python became a link to the default systme python and > eventully pointed to python2 i think we should continue to do that > and after a decase of deprecating python2 we should reclaim the > python symlink and point it to python3 [...] The language did not change in significantly backward-incompatible ways with 2.0. On the other hand "Python 3000" (3.0) was essentially meant as a redesign of the language where backward-incompatibility was a tool to abandon broken paradigms. It's possible to write software which will run under both interpreters (and we have in fact, rather a lot even), but random scripts written for Python 2 without concerns with forward-compatibility usually won't work on a Python 3 interpreter. > well from a devstack point of view we almost exclucive install > form pip so installing python packages form the disto is the anti > pattern not installing form pip. that said we shoudl consider > moving devstack to use --user at somepoint. [...] 
It's hard not to call https://review.openstack.org/562884 an anti-pattern. The pip maintainers these days basically don't want to have to continue supporting system-context installs, as responses on https://github.com/pypa/pip/issues/4805 clearly demonstrate. DevStack's been working around that for a year and a half now, as have our image builds (until Ian's efforts to stop doing that for the centos-8 images). Yes doing --user or venv installs is likely the core of the solution for DevStack but it needs more folks actually working to make it happen, and the ugly hack has been in place for so long I have doubts we'll see a major overhaul like that any time soon. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emiller at genesishosting.com Wed Oct 16 04:50:19 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Tue, 15 Oct 2019 23:50:19 -0500 Subject: [Octavia] Amphora build issues Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> Hi, It seems that every build I have attempted of an amphora fails in some way. I have tried CentOS 7, Ubuntu Bionic, Xenial, and Trusty. Note that we are running Stein. I will concentrate on Ubuntu issues for now. I first create a fresh VM that is used to install the diskimage-create tool, then run (after sudo'ing to root): apt update apt -y upgrade apt-get -y install qemu qemu-system-common uuid-runtime curl kpartx git jq python-pip debootstrap libguestfs-tools pip install 'networkx==2.2' pip install argparse Babel dib-utils PyYAML git clone -b stable/stein https://github.com/openstack/octavia.git git clone https://git.openstack.org/openstack/diskimage-builder.git cd diskimage-builder pip install -r requirements.txt cd ../octavia/diskimage-create/ pip install -r requirements.txt # And finally, I run the diskimage-create script, specifying the image's OS, so ONE of these, depending on the OS: ./diskimage-create.sh -d bionic # or to use Xenial: ./diskimage-create.sh -d xenial # Note that when selecting Trusty, diskimage-create.sh error's, and so never finishes successfully. # Somewhat expected since it is quite old and unsupported. ./diskimage-create.sh -d trusty The amphorae launch when creating a load balancer, but the amphora agent fails to start, and thus is not responsive on TCP Port 9443. The log from inside the amphora is below. Has anyone successfully created an image? Am I missing something? Thanks! Eric Amphora agent fails to start inside amphora - this is logged when running the agent from the command line: 2019-10-16 03:41:04.835 1119 INFO octavia.common.config [-] /usr/local/bin/amphora-agent version 5.1.0.dev20 2019-10-16 03:41:04.835 1119 DEBUG octavia.common.config [-] command line: /usr/local/bin/amphora-agent --config-file /etc/octavia/amphora-agent.conf setup_logging /opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/common/confi g.py:779 2019-10-16 03:41:05.036 1124 INFO octavia.amphorae.backends.health_daemon.health_daemon [-] Health Manager Sender starting. 
2019-10-16 03:41:05.084 1119 CRITICAL octavia [-] Unhandled error: FileNotFoundError: [Errno 2] No such file or directory 2019-10-16 03:41:05.084 1119 ERROR octavia Traceback (most recent call last): 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/local/bin/amphora-agent", line 8, in 2019-10-16 03:41:05.084 1119 ERROR octavia sys.exit(main()) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p y", line 89, in main 2019-10-16 03:41:05.084 1119 ERROR octavia AmphoraAgent(server_instance.app, options).run() 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/app/base.p y", line 72, in run 2019-10-16 03:41:05.084 1119 ERROR octavia Arbiter(self).run() 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/arbiter.py ", line 60, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self.setup(app) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/arbiter.py ", line 95, in setup 2019-10-16 03:41:05.084 1119 ERROR octavia self.log = self.cfg.logger_class(app.cfg) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 200, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self.setup(cfg) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 227, in setup 2019-10-16 03:41:05.084 1119 ERROR octavia self.error_log, cfg, self.syslog_fmt, "error" 2019-10-16 03:41:05.084 1119 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/gunicorn/glogging.p y", line 449, in _set_syslog_handler 2019-10-16 03:41:05.084 1119 ERROR octavia facility=facility, socktype=socktype) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/lib/python3.5/logging/handlers.py", line 806, in __init__ 2019-10-16 03:41:05.084 1119 ERROR octavia self._connect_unixsocket(address) 2019-10-16 03:41:05.084 1119 ERROR octavia File "/usr/lib/python3.5/logging/handlers.py", line 823, in _connect_unixsocket 2019-10-16 03:41:05.084 1119 ERROR octavia self.socket.connect(address) 2019-10-16 03:41:05.084 1119 ERROR octavia FileNotFoundError: [Errno 2] No such file or directory -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalyst.net.nz Wed Oct 16 05:04:41 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Wed, 16 Oct 2019 18:04:41 +1300 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: Message-ID: I tried to get a community goal to do project deletion per project, but we ended up deciding that a community goal wasn't ideal unless we did build a bulk delete API in each service: https://review.opendev.org/#/c/639010/ https://etherpad.openstack.org/p/community-goal-project-deletion https://etherpad.openstack.org/p/DEN-Deletion-of-resources https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming What we decided on, but didn't get a chance to work on, was building into the OpenstackSDK OS-purge like functionality, as well as reporting functionality (of all project resources to be deleted). That way we could have per project per resource deletion logic, and all of that defined in the SDK. 
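For readers unfamiliar with the idea, a very rough sketch of what "per project, per resource" reporting and deletion via openstacksdk could look like follows; this is an illustration only, not the interface that was agreed on, and the cloud name, dry_run flag and resource coverage are made up for the example.

    # Very rough sketch of SDK-driven project cleanup: report what a project
    # owns, then optionally delete it. Assumes a clouds.yaml entry named
    # "example" scoped to the project being purged.
    import openstack

    def purge_project(cloud_name="example", dry_run=True):
        conn = openstack.connect(cloud=cloud_name)

        # Report phase: list the resources that would be removed.
        servers = list(conn.compute.servers())
        volumes = list(conn.block_storage.volumes())
        print("servers:", [s.name for s in servers])
        print("volumes:", [v.name for v in volumes])

        if dry_run:
            return

        # Delete phase: per-resource deletion logic lives in one place.
        for server in servers:
            conn.compute.delete_server(server)
        for volume in volumes:
            conn.block_storage.delete_volume(volume)

    if __name__ == "__main__":
        purge_project()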
I was up for doing some of the work, but ended up swamped with internal work and just didn't drive or push for the deletion work upstream. If you want to do something useful, don't pursue OS-Purge, help us add that official functionality to the SDK, and then we can push for bulk deletion APIs in each project to make resource deletion more pleasant. I'd be happy to help with the work, and Monty on the SDK team will most likely be happy to as well. :) Cheers, Adrian On 1/10/19 11:48 am, Adam Harwell wrote: > I haven't seen much activity on this project in a while, and it's been > moved to opendev/x since the opendev migration... Who is the current > owner of this project? Is there anyone who actually is maintaining it, > or would mind if others wanted to adopt the project to move it forward? > > Thanks, >    --Adam Harwell From yu.chengde at 99cloud.net Wed Oct 16 10:59:58 2019 From: yu.chengde at 99cloud.net (yu.chengde at 99cloud.net) Date: Wed, 16 Oct 2019 18:59:58 +0800 Subject: [nova] Which nova container service that nova/conf/compute.py map to Message-ID: Hi, I have deployed a stein version openstack on server thought Kolla-ansible method. Then, I git clone the nova code, and ready to do coding in " nova/nova/conf/compute.py" However, many of nova containers include this file. So, I want to know that I should modify them all, or just pick a specific one. Thanks [root at chantyu kolla-ansible]# docker ps | grep nova 05f72e539974 kolla/centos-source-nova-compute:stein "dumb-init --single-…" 28 hours ago Up 2 hours nova_compute 7393a7d566ee kolla/centos-source-nova-libvirt:stein "dumb-init --single-…" 28 hours ago Up 5 hours nova_libvirt 9d8357cfa334 kolla/centos-source-nova-scheduler:stein "dumb-init --single-…" 32 hours ago Up 3 hours nova_scheduler 085b9da918df kolla/centos-source-nova-api:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_api b80e9503e93e kolla/centos-source-nova-serialproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_serialproxy c15d41823a22 kolla/centos-source-nova-novncproxy:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_novncproxy c30e47cd56c6 kolla/centos-source-nova-consoleauth:stein "dumb-init --single-…" 6 days ago Up 3 hours nova_consoleauth b7d5e9ba1f11 kolla/centos-source-nova-ssh:stein "dumb-init --single-…" 7 days ago Up 5 hours nova_ssh 3f81cd0a97ce kolla/centos-source-nova-conductor:stein "dumb-init --single-…" 7 days ago Up 3 hours nova_conductor [root at chantyu kolla-ansible]# From smooney at redhat.com Wed Oct 16 11:05:20 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 16 Oct 2019 12:05:20 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <20191016024339.s7s24wpcprra7f3x@yuggoth.org> References: <20191015230316.GA29186@fedora19.localdomain> <20191016024339.s7s24wpcprra7f3x@yuggoth.org> Message-ID: <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> TL;DR ok the way the image will work makes sense :) all i really care about is /usr/bin/python should not be a symlink to /usr/bin/python2 on python3 "only" distors to ensure we dont execute code under python 2 by mistake. On Wed, 2019-10-16 at 02:43 +0000, Jeremy Stanley wrote: > On 2019-10-16 01:35:48 +0100 (+0100), Sean Mooney wrote: > [...] > > i would personally prefer to have /usr/bin/python -> > > /user/bin/python3 > > > > linux distros seem to be a bit split on this i belive arch and > > maybe debian (i saw on of the other majory disto families adopt > > the same apparch) link python to python3 > > [...] > > Debian definitely does not. 
The current plan for when Debian stops > shipping Python 2.7 is that it will have no /usr/bin/python > installed. The unversioned /usr/bin/python is and has long been an > interpreter for the Python 2 language. Python 3 is a different > language, and its interpreter should not by default assume the > command name of the Python 2 interpreter. I think Arch Linux made a > huge mistake in pretending they were the same thing, and sincerely > hope no other distribution does the same. perhaps though i would argue that the code name of the python2 interperatr was always python2 or python2.7 no python. python was the name of system python the fact that they happen to be the same thin was a historical acident fo rthe last decade but if i exend your argument then we never should have had #!/usr/bin/python at all as a script entry point which in hindsight may have been correct. https://www.python.org/dev/peps/pep-0394/#for-python-runtime-distributors allow effectivly all of the possible options so there is no wrong answer just different tradeoffs. > > > i think /usr/bin/python should map to /usr/bin/python3 and you > > should assume that it now python3. > > I think that's a disaster waiting to happen. perhaps the only thin i hope we really avoid going forward is a python that maps to python2 that silntly allows things to work when we think we are running python 3 only. e.g. i prefer spipts that can run under python3 sliently doing so then over silently running on python2. if on python3 systems distro ensure that python does not map to python2 even when python 2 is installed unless the system admin expcitly sets up the symlink i guess that acive the same goal. it seams that is the path debain an rhel are taking e.g. dont provide "python" via packages so that it is only created if the sytem admin creates it. > > > if you dont do that hen every script that has ever been writtne or > > packaged needs to be updated to reference python3 explictly. > > Yep, that has to happen anyway in most cases to address Python 2 vs > 3 language compatibility differences. Being explicit about which > language a script is written in is a good thing. i guess but it feels kind of sad to say that forever we will have to type python3 in stead of python just because legacy script coudl break. i woudl prefer them to break and have the convinece. the is partly because most python2 i have encounterd has been vaild python3 but i know that that was not the case for alot of scripts. > > > there were much fewer user of python1 when that tansition happened > > but python became a link to the default systme python and > > eventully pointed to python2 i think we should continue to do that > > and after a decase of deprecating python2 we should reclaim the > > python symlink and point it to python3 > > [...] > > The language did not change in significantly backward-incompatible > ways with 2.0. On the other hand "Python 3000" (3.0) was essentially > meant as a redesign of the language where backward-incompatibility > was a tool to abandon broken paradigms. It's possible to write > software which will run under both interpreters (and we have in > fact, rather a lot even), but random scripts written for Python 2 > without concerns with forward-compatibility usually won't work on a > Python 3 interpreter. > > > well from a devstack point of view we almost exclucive install > > form pip so installing python packages form the disto is the anti > > pattern not installing form pip. 
that said we shoudl consider > > moving devstack to use --user at somepoint. > > [...] > > It's hard not to call https://review.openstack.org/562884 an > anti-pattern. i ment when using devstack prefering distro pacagke over pip would be an anti patteren as part of the reason we install form pip in the first place is to normalise the install so that it is contolled and as similar as possible beteen distors. the fact we have to do that in devstack ya does feel like a hack i was not aware we did that. > The pip maintainers these days basically don't want to > have to continue supporting system-context installs, as responses on > https://github.com/pypa/pip/issues/4805 clearly demonstrate. yep i was just trying to suggest that we shoudl avoid installing the disto version and basicaly only install the interpreter form the distro to avoid mixing packages as much as possible. > DevStack's been working around that for a year and a half now, as > have our image builds (until Ian's efforts to stop doing that for > the centos-8 images). Yes doing --user or venv installs is likely > the core of the solution for DevStack i do think its worth revisiting devstack venv install capablity at some point i know its there but have never really used it but hte fact that devstack installs but the requirement.txt and test-requriement.txt gloablly has caused issue in the past. i tried to remove install the test-requirement.txt in the past but some jobs depend on that to install optional packages so we cant. > but it needs more folks > actually working to make it happen, and the ugly hack has been in > place for so long I have doubts we'll see a major overhaul like that > any time soon. well the main thing that motivated me to even comment on this thread was the fact we currently have a hack that with lib_from_git where if you enable python 3 i will install the lib under python 2 and python3. the problem is the interperter line at the top of the entry point will be replaced with the python2 version due to the order of installs. so if you use libs_form _git with nova or with a lib that provides a setup tools console script entrypoint you can get into a situation where your python 3 only build can end up trying to run "python2" scripts. this has lead to some interesting errors to debug in the past. anyway i was looking forward to having a python3 only disto to not have to deal with that in the future with it looks like From radoslaw.piliszek at gmail.com Wed Oct 16 11:50:52 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 16 Oct 2019 13:50:52 +0200 Subject: [nova] Which nova container service that nova/conf/compute.py map to In-Reply-To: References: Message-ID: Hi Yu, you want to read: https://docs.openstack.org/kolla-ansible/latest/contributor/kolla-for-openstack-development.html In your case you should set: nova_dev_mode: yes in globals.yml Kind regards, Radek śr., 16 paź 2019 o 13:10 yu.chengde at 99cloud.net napisał(a): > Hi, > I have deployed a stein version openstack on server thought > Kolla-ansible method. > Then, I git clone the nova code, and ready to do coding in " > nova/nova/conf/compute.py" > However, many of nova containers include this file. > So, I want to know that I should modify them all, or just pick a > specific one. 
> Thanks > > > [root at chantyu kolla-ansible]# docker ps | grep nova > 05f72e539974 kolla/centos-source-nova-compute:stein > "dumb-init --single-…" 28 hours ago Up 2 hours > nova_compute > 7393a7d566ee kolla/centos-source-nova-libvirt:stein > "dumb-init --single-…" 28 hours ago Up 5 hours > nova_libvirt > 9d8357cfa334 kolla/centos-source-nova-scheduler:stein > "dumb-init --single-…" 32 hours ago Up 3 hours > nova_scheduler > 085b9da918df kolla/centos-source-nova-api:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_api > b80e9503e93e kolla/centos-source-nova-serialproxy:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_serialproxy > c15d41823a22 kolla/centos-source-nova-novncproxy:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_novncproxy > c30e47cd56c6 kolla/centos-source-nova-consoleauth:stein > "dumb-init --single-…" 6 days ago Up 3 hours > nova_consoleauth > b7d5e9ba1f11 kolla/centos-source-nova-ssh:stein > "dumb-init --single-…" 7 days ago Up 5 hours > nova_ssh > 3f81cd0a97ce kolla/centos-source-nova-conductor:stein > "dumb-init --single-…" 7 days ago Up 3 hours > nova_conductor > [root at chantyu kolla-ansible]# > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed Oct 16 14:22:30 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 16 Oct 2019 09:22:30 -0500 Subject: OpenStack Train is officially released! Message-ID: <20191016142230.GB13004@sm-workstation> The official OpenStack Train release announcement has been sent out: http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html Thanks to all who were part of making the Train series a success! This marks the official opening of the releases repo for Ussuri, and freezes are now lifted. Train is now a full stable branch. Thanks! Sean From juliaashleykreger at gmail.com Wed Oct 16 14:26:33 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 16 Oct 2019 07:26:33 -0700 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: I'm okay if we just change driver-requirements.txt at this point and go ahead and cut new release for ironic. I actually feel like we should have bumped driver-requirements.txt after releasing sushy 2.0.0 anyway. The bottom line is we need to focus on the user experience of using the software. For ironic, if a vendor's class of gear just doesn't work with a possible combination, then we should try and take the least resistance and greatest impact path to remedying that situation. As for the "prevents a prior release" portion of policy, That is likely written for the projects that perform release candidates and not projects that do not. At least, that is my current feeling. That seems super counter-intuitive for ironic's release model if there is a major bug that is identified that needs to be fixed in the software we have shipped. -Julia On Tue, Oct 15, 2019 at 6:21 PM wrote: > > Hi, > > The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. 
Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. > > A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. > > Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? > > Thank you, > Rick > > > [1] https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > [2] https://review.opendev.org/#/c/666253/ > [3] https://review.opendev.org/#/c/668936/ > [4] https://review.opendev.org/#/c/669889/ > [5] https://review.opendev.org/#/c/688551/ > [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > From openstack at fried.cc Wed Oct 16 14:59:51 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 16 Oct 2019 09:59:51 -0500 Subject: [nova][ptg] Ussuri scope containment In-Reply-To: References: <82c4ceb2-8bd1-75a8-1ded-f598746957aa@fried.cc> <502c53bc-1936-e369-cfbe-1950a37eb052@gmail.com> <6946cded-cc11-d4d8-d2f2-620aab76b054@fried.cc> <8e2abdab-281b-5665-3220-a3b46704fa28@fried.cc> Message-ID: <3aceecad-626b-99de-3ba5-512b178a941a@fried.cc> Update: > the nova-specs patch introducing > the "Core Liaison" concept [1]. This is merged (it's now called "Feature Liaison"). Here's the new spec template section [2] and the FAQ [3]. Thanks to those who helped shape it. > (A) Note that the idea of capping the number of specs is (mostly) > unrelated, and we still haven't closed on it. I feel like we've agreed > to have a targeted discussion around spec freeze time where we decide > whether to defer features for resource reasons. That would be a new (and > good, IMO) thing. But it's still TBD whether "30 approved for 25 > completed" will apply, and/or what criteria would be used to decide what > gets cut. Nothing new here. 
efried > [1] https://review.opendev.org/#/c/685857 [2] http://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/ussuri-template.html#feature-liaison [3] http://specs.openstack.org/openstack/nova-specs/readme.html#feature-liaison-faq From Arkady.Kanevsky at dell.com Wed Oct 16 15:14:18 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 16 Oct 2019 15:14:18 +0000 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: <9a68c819b4f34bff8bba7eeb2e862180@AUSX13MPS308.AMER.DELL.COM> Julia, I am for it also. But with Train just released, from Sean email, how does it get into Train? -----Original Message----- From: Julia Kreger Sent: Wednesday, October 16, 2019 9:27 AM To: Pioso, Richard Cc: openstack-discuss Subject: Re: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt [EXTERNAL EMAIL] I'm okay if we just change driver-requirements.txt at this point and go ahead and cut new release for ironic. I actually feel like we should have bumped driver-requirements.txt after releasing sushy 2.0.0 anyway. The bottom line is we need to focus on the user experience of using the software. For ironic, if a vendor's class of gear just doesn't work with a possible combination, then we should try and take the least resistance and greatest impact path to remedying that situation. As for the "prevents a prior release" portion of policy, That is likely written for the projects that perform release candidates and not projects that do not. At least, that is my current feeling. That seems super counter-intuitive for ironic's release model if there is a major bug that is identified that needs to be fixed in the software we have shipped. -Julia On Tue, Oct 15, 2019 at 6:21 PM wrote: > > Hi, > > The Ironic Train release can be broken due to an entry in its driver-requirements.txt. driver-requirements.txt defines a dependency on the sushy package [1] which can be satisfied by version 1.9.0. Unfortunately, that version contains a few bugs which prevent Ironic from being able to manage Dell EMC and perhaps other vendors' bare metal hardware with its Redfish hardware type (driver). The fixes to them [2][3][4] were merged into master before the creation of stable/train. Therefore, they are available on stable/train and in the last sushy release created during the Train cycle, 2.0.0, the only other version which can satisfy the dependency today. However, consumers -- packagers, operators, and users -- could, fighting time constraints or lacking solid visibility into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the dependency, but, in so doing, unknowingly render the package or installation severely broken. > > A change [5] has been proposed as part of a prospective solution to this issue. It creates a new release of sushy from the change which fixes the first bug [2]. Review comments [6] discuss basing the new release on a more recent stable/train change to pick up other bug fixes and, less importantly, backward compatible feature modifications and enhancements which merged before the change from which 2.0.0 was created. Backward compatible feature modifications and enhancements are interspersed in time among the bug fixes. Once a new release is available, the sushy entry in driver-requirements.txt on stable/train would be updated. 
However, apparently, the stable branch policy prevents releases from being done at a point earlier than the last release within a given cycle [6], which was 2.0.0. > > Another possible resolution which comes to mind is to change the definition of the sushy dependency in driver-requirements.txt [1] from "sushy>=1.9.0" to "sushy>=2.0.0". > > Does anyone have a suggestion on how to proceed? > > Thank you, > Rick > > > [1] > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4 > a507e9a8b3a19e8a58/driver-requirements.txt#L14 > [2] https://review.opendev.org/#/c/666253/ > [3] https://review.opendev.org/#/c/668936/ > [4] https://review.opendev.org/#/c/669889/ > [5] https://review.opendev.org/#/c/688551/ > [6] https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > From jim at jimrollenhagen.com Wed Oct 16 15:14:12 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Wed, 16 Oct 2019 11:14:12 -0400 Subject: [ironic][release][stable] Ironic Train release can be broken due to entry in driver-requirements.txt In-Reply-To: References: <7ed91a32b6fa4c8da1a82fc6e18f604b@ausx13mpc123.AMER.DELL.COM> Message-ID: On Wed, Oct 16, 2019 at 10:27 AM Julia Kreger wrote: > I'm okay if we just change driver-requirements.txt at this point and > go ahead and cut new release for ironic. I actually feel like we > should have bumped driver-requirements.txt after releasing sushy 2.0.0 > anyway. > Yeah, I think I agree here. The options I see are: * Ironic train depends on 2.0.0. This breaks stable policy. * We release 1.10.0 and depend on that. This breaks stable and release team policy. * We keep our dependencies the same and document that 1.9.0 is broken. This breaks no policy. The last option follows the letter of the law best, but doesn't actually help our users. If they need to use 2.0.0 to have a working system anyway, then the effect on users is the same as the first option, but in a backwards way. Let's just bring ironic to 2.0.0 and fix any breakage that comes with it. // jim > The bottom line is we need to focus on the user experience of using > the software. For ironic, if a vendor's class of gear just doesn't > work with a possible combination, then we should try and take the > least resistance and greatest impact path to remedying that situation. > > As for the "prevents a prior release" portion of policy, That is > likely written for the projects that perform release candidates and > not projects that do not. At least, that is my current feeling. That > seems super counter-intuitive for ironic's release model if there is a > major bug that is identified that needs to be fixed in the software we > have shipped. > > -Julia > > On Tue, Oct 15, 2019 at 6:21 PM wrote: > > > > Hi, > > > > The Ironic Train release can be broken due to an entry in its > driver-requirements.txt. driver-requirements.txt defines a dependency on > the sushy package [1] which can be satisfied by version 1.9.0. > Unfortunately, that version contains a few bugs which prevent Ironic from > being able to manage Dell EMC and perhaps other vendors' bare metal > hardware with its Redfish hardware type (driver). The fixes to them > [2][3][4] were merged into master before the creation of stable/train. > Therefore, they are available on stable/train and in the last sushy release > created during the Train cycle, 2.0.0, the only other version which can > satisfy the dependency today. 
However, consumers -- packagers, operators, > and users -- could, fighting time constraints or lacking solid visibility > into Ironic, package or install Ironic with sushy 1.9.0 to satisfy the > dependency, but, in so doing, unknowingly render the package or > installation severely broken. > > > > A change [5] has been proposed as part of a prospective solution to this > issue. It creates a new release of sushy from the change which fixes the > first bug [2]. Review comments [6] discuss basing the new release on a more > recent stable/train change to pick up other bug fixes and, less > importantly, backward compatible feature modifications and enhancements > which merged before the change from which 2.0.0 was created. Backward > compatible feature modifications and enhancements are interspersed in time > among the bug fixes. Once a new release is available, the sushy entry in > driver-requirements.txt on stable/train would be updated. However, > apparently, the stable branch policy prevents releases from being done at a > point earlier than the last release within a given cycle [6], which was > 2.0.0. > > > > Another possible resolution which comes to mind is to change the > definition of the sushy dependency in driver-requirements.txt [1] from > "sushy>=1.9.0" to "sushy>=2.0.0". > > > > Does anyone have a suggestion on how to proceed? > > > > Thank you, > > Rick > > > > > > [1] > https://opendev.org/openstack/ironic/src/commit/b8ae681b37eec617736ac4a507e9a8b3a19e8a58/driver-requirements.txt#L14 > > [2] https://review.opendev.org/#/c/666253/ > > [3] https://review.opendev.org/#/c/668936/ > > [4] https://review.opendev.org/#/c/669889/ > > [5] https://review.opendev.org/#/c/688551/ > > [6] > https://review.opendev.org/#/c/688551/1/deliverables/train/sushy.yaml at 14 > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Wed Oct 16 17:44:31 2019 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Wed, 16 Oct 2019 17:44:31 +0000 Subject: [stable][EM] Extended Maintenance - Queens Message-ID: <1ceccd2d-a95c-8b72-c5a0-88ce44689bc0@est.tech> Hi, As it was agreed during PTG, the planned date of Extended Maintenance transition of Queens is around two weeks after Train release (a less busy period) [1]. Now that Train is released, it is a good opportunity for teams to go through the list of open and unreleased changes in Queens [2] and schedule a final release for Queens if needed. Feel free to use / edit / modify the lists (I've generated the lists for repositories which have 'follows-policy' tag). I hope this helps. [1] https://releases.openstack.org/ [2] https://etherpad.openstack.org/p/queens-final-release-before-em Thanks, Előd From kennelson11 at gmail.com Wed Oct 16 19:02:45 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 16 Oct 2019 12:02:45 -0700 Subject: [PTL] PTG Team Photos In-Reply-To: References: Message-ID: Wanted to bring this to the top of people's inboxes as a reminder :) Definitely not required, but we have lots of slots left if your team is interested! -Kendall (diablo_rojo) On Wed, Oct 9, 2019 at 11:06 AM Kendall Nelson wrote: > Hello Everyone! > > We are excited to see you in a few weeks at the PTG and wanted to share > that we will be taking team photos again! > > Here is an ethercalc signup for the available time slots [1]. We will be > providing time on Thursday Morning/Afternoon and Friday morning to come as > a team to get your photo taken. 
Slots are only ten minutes so its *important > that everyone be on time*! > > The location is TBD at this point, but it will likely be in the > prefunction space near registration. > > Thanks, > > -Kendall Nelson (diablo_rojo) > > [1] https://ethercalc.openstack.org/lnupu1sx6ljl > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Wed Oct 16 18:51:46 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 16 Oct 2019 11:51:46 -0700 Subject: OpenStack Train is officially released! In-Reply-To: <20191016142230.GB13004@sm-workstation> References: <20191016142230.GB13004@sm-workstation> Message-ID: Woohoo! Onward to Ussuri! [image: image.png] -Kendall (diablo_rojo) On Wed, Oct 16, 2019 at 7:23 AM Sean McGinnis wrote: > The official OpenStack Train release announcement has been sent out: > > > http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html > > Thanks to all who were part of making the Train series a success! > > This marks the official opening of the releases repo for Ussuri, and > freezes > are now lifted. Train is now a full stable branch. > > Thanks! > Sean > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 173430 bytes Desc: not available URL: From fungi at yuggoth.org Wed Oct 16 19:09:55 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 16 Oct 2019 19:09:55 +0000 Subject: [tc] Feedback on Airship pilot project Message-ID: <20191016190954.wscdgflttnfxvhlm@yuggoth.org> Hi TC members, The Airship project will start its confirmation process with the OSF Board of Directors at the Board meeting[1] Tuesday next week. A draft of the slide deck[2] they plan to present is available for reference. Per the confirmation guidelines[3], the OSF Board of directors will take into account the feedback from representative bodies of existing confirmed Open Infrastructure Projects (OpenStack, Zuul and Kata) when evaluating Airship for confirmation. Particularly worth calling out, guideline #4 "Open collaboration" asserts the following: Project behaves as a good neighbor to other confirmed and pilot projects. If you (our community at large, not just TC members) have any observations/interactions with the Airship project which could serve as useful examples for how these projects do or do not meet this and other guidelines, please provide them on the etherpad[4] ASAP. If possible, include a citation with links to substantiate your feedback. If a TC representative can assemble this feedback and send it to the Board (for example, to the foundation mailing list) for consideration before the meeting next week, that would be appreciated. Apologies for the short notice. [1] http://lists.openstack.org/pipermail/foundation/2019-October/002800.html [2] https://www.airshipit.org/collateral/AirshipConfirmation-Review-for-the-OSF-Board.pdf [3] https://wiki.openstack.org/wiki/Governance/Foundation/OSFProjectConfirmationGuidelines [4] https://etherpad.openstack.org/p/openstack-tc-airship-confirmation-feedback -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From zigo at debian.org Wed Oct 16 19:33:02 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 16 Oct 2019 21:33:02 +0200 Subject: Debian OpenStack Train packages are officially released! [was: OpenStack Train is officially released!] In-Reply-To: <20191016142230.GB13004@sm-workstation> References: <20191016142230.GB13004@sm-workstation> Message-ID: On 10/16/19 4:22 PM, Sean McGinnis wrote: > The official OpenStack Train release announcement has been sent out: > > http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html > > Thanks to all who were part of making the Train series a success! > > This marks the official opening of the releases repo for Ussuri, and freezes > are now lifted. Train is now a full stable branch. > > Thanks! > Sean Same, thanks everyone! Train packages for Debian have all been uploaded today, either to Sid when it was Horizon and its plugins, or to Experimental for everything else. A subsequent upload to Debian Sid will follow, but will take some time. For those willing to use Train on Buster, the usual repository scheme applies: deb http://buster-train.debian.net/debian buster-train-backports main deb-src http://buster-train.debian.net/debian buster-train-backports main deb http://buster-train.debian.net/debian buster-train-backports-nochange main deb-src http://buster-train.debian.net/debian buster-train-backports-nochange main Please report back any issue you see through the Debian bug tracker as usual. I still haven't had time to run tempest on this, but I could install Train in Buster and it worked. Cheers, Thomas Goirand (zigo) From flux.adam at gmail.com Wed Oct 16 19:48:02 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Wed, 16 Oct 2019 12:48:02 -0700 Subject: [ospurge] looking for project owners / considering adoption In-Reply-To: References: Message-ID: That's interesting -- we have already started working to add features and improve ospurge, and it seems like a plenty useful tool for our needs, but I think I agree that it would be nice to have that functionality built into the sdk. I might be able to help with both, since one is immediately useful and we (like everyone) have deadlines to meet, and the other makes sense to me as a possible future direction that could be more widely supported. Will you or someone else be hosting and discussion about this at the Shanghai summit? I'll be there and would be happy to join and discuss. --Adam On Tue, Oct 15, 2019, 22:04 Adrian Turjak wrote: > I tried to get a community goal to do project deletion per project, but > we ended up deciding that a community goal wasn't ideal unless we did > build a bulk delete API in each service: > https://review.opendev.org/#/c/639010/ > https://etherpad.openstack.org/p/community-goal-project-deletion > https://etherpad.openstack.org/p/DEN-Deletion-of-resources > https://etherpad.openstack.org/p/DEN-Train-PublicCloudWG-brainstorming > > What we decided on, but didn't get a chance to work on, was building > into the OpenstackSDK OS-purge like functionality, as well as reporting > functionality (of all project resources to be deleted). That way we > could have per project per resource deletion logic, and all of that > defined in the SDK. > > I was up for doing some of the work, but ended up swamped with internal > work and just didn't drive or push for the deletion work upstream. 
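To make the SDK idea above concrete, a very rough sketch of the per-project purge flow could look like the following (nothing here exists as a single SDK feature yet; the resource coverage, ordering and cloud name are only illustrative):

    # Rough sketch only -- not an existing openstacksdk feature.
    # Deletes the obvious resources owned by the project the connection is scoped to.
    import openstack

    conn = openstack.connect(cloud='mycloud')  # placeholder clouds.yaml entry

    # Order matters: servers hold ports and volumes, so remove them first.
    for server in conn.list_servers():
        conn.delete_server(server.id, wait=True)

    for volume in conn.list_volumes():
        conn.delete_volume(volume.id)

    for fip in conn.list_floating_ips():
        conn.delete_floating_ip(fip.id)

    for network in conn.list_networks(filters={'router:external': False}):
        conn.delete_network(network.id)

The hard part is covering every resource type and the dependencies between them, plus reporting what would be deleted, which is exactly why doing it once in the SDK beats every deployment reinventing it.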
> > If you want to do something useful, don't pursue OS-Purge, help us add > that official functionality to the SDK, and then we can push for bulk > deletion APIs in each project to make resource deletion more pleasant. > > I'd be happy to help with the work, and Monty on the SDK team will most > likely be happy to as well. :) > > Cheers, > Adrian > > On 1/10/19 11:48 am, Adam Harwell wrote: > > I haven't seen much activity on this project in a while, and it's been > > moved to opendev/x since the opendev migration... Who is the current > > owner of this project? Is there anyone who actually is maintaining it, > > or would mind if others wanted to adopt the project to move it forward? > > > > Thanks, > > --Adam Harwell > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Wed Oct 16 20:05:45 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 15:05:45 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Just an update on this. It appears that the diskimage-create script is pulling the master version of the Octavia amphora agent, instead of the Stein branch. I took a closer look at the first error line: 2019-10-16 19:47:46.389 1160 ERROR octavia File "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p y", line 89, in main 2019-10-16 19:47:46.389 1160 ERROR octavia AmphoraAgent(server_instance.app, options).run() and it references line 89, which doesn't exist in agent.py except in the master branch. I had cloned the Stein branch of Octavia here: git clone -b stable/stein https://github.com/openstack/octavia.git I will keep looking... Eric From johnsomor at gmail.com Wed Oct 16 20:50:50 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 16 Oct 2019 13:50:50 -0700 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, You are correct, diskimage-builder defaults to pulling the master version of the amphora agent. You want to set the following variables: export DIB_REPOREF_amphora_agent=stable/stein Then run the diskimage-create script. See the guide for more information: https://docs.openstack.org/octavia/latest/admin/amphora-image-build.html#environment-variables Michael On Wed, Oct 16, 2019 at 1:09 PM Eric K. Miller wrote: > > Just an update on this. > > It appears that the diskimage-create script is pulling the master > version of the Octavia amphora agent, instead of the Stein branch. > > I took a closer look at the first error line: > > 2019-10-16 19:47:46.389 1160 ERROR octavia File > "/opt/amphora-agent-venv/lib/python3.5/site-packages/octavia/cmd/agent.p > y", line 89, in main > 2019-10-16 19:47:46.389 1160 ERROR octavia > AmphoraAgent(server_instance.app, options).run() > > and it references line 89, which doesn't exist in agent.py except in the > master branch. > > I had cloned the Stein branch of Octavia here: > git clone -b stable/stein https://github.com/openstack/octavia.git > > I will keep looking... 
> > Eric > > > > From emiller at genesishosting.com Wed Oct 16 21:32:25 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 16:32:25 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A04@gmsxchsvr01.thecreation.com> Thank you Michael! After doing this, a different problem occurs, which is logged in the octavia_worker.log. This also happened with all of my tests with CentOS amphorae. See the log snippet below. Note that creation of the load balancer fails with a status of ERROR and amphorae are deleted right after, so I wasn't able to login to the amphorae. Also note that this was deployed with Kolla Ansible 8.0.2, in case that helps. Eric 2019-10-16 16:27:49.399 23 DEBUG octavia.controller.worker.controller_worker [-] Task 'octavia.controller.worker.tasks.lifecycle_tasks.LoadBalancerIDToErrorOnRevertTask' (0af6f3fc-83d0-4093-b5e4-9cb955bf3397) transitioned into state 'REVERTING' from state 'SUCCESS' _task_receiver /var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/listeners/logging.py:194 2019-10-16 16:27:49.404 23 WARNING octavia.controller.worker.controller_worker [-] Task 'octavia.controller.worker.tasks.lifecycle_tasks.LoadBalancerIDToErrorOnRevertTask' (0af6f3fc-83d0-4093-b5e4-9cb955bf3397) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None' 2019-10-16 16:27:49.415 23 WARNING octavia.controller.worker.controller_worker [-] Flow 'octavia-create-loadbalancer-flow' (337640bf-0f7c-4c9e-b903-994f2c5827dd) transitioned into state 'REVERTED' from state 'RUNNING' 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server [-] Exception during message handling: WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 166, in _process_incoming 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/octavia/controller/queue/endpoint.py", line 45, in create_load_balancer 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server self.worker.create_load_balancer(load_balancer_id, flavor) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 292, in wrapped_f 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self.call(f, *args, **kw) 2019-10-16 16:27:49.416 23 ERROR 
oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 358, in call 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server do = self.iter(retry_state=retry_state) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 319, in iter 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return fut.result() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/concurrent/futures/_base.py", line 455, in result 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server return self.__get_result() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/tenacity/__init__.py", line 361, in call 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server result = fn(*args, **kwargs) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/octavia/controller/worker/controller_worker.py", line 343, in create_load_balancer 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server create_lb_tf.run() 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 247, in run 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server for _state in self.run_iter(timeout=timeout): 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/engines/action_engine/engine.py", line 340, in run_iter 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server failure.Failure.reraise_if_any(er_failures) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/taskflow/types/failure.py", line 341, in reraise_if_any 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server raise exc.WrappedFailure(failures) 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server ~ From emiller at genesishosting.com Wed Oct 16 21:43:16 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 16:43:16 -0500 Subject: [Octavia] Amphora build issues References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Looking at the error, it appears it can't find the exceptions.py script, but it appears to be in the octavia worker container: 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] (octavia-worker)[root at controller001 haproxy]# cd /var/lib/kolla/venv/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy (octavia-worker)[root at controller001 haproxy]# ls -al total 84 drwxr-xr-x. 2 root root 186 Sep 29 08:12 . drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc -rw-r--r--. 
1 root root 2298 Sep 29 08:12 exceptions.py -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc Eric From Arkady.Kanevsky at dell.com Wed Oct 16 21:52:52 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 16 Oct 2019 21:52:52 +0000 Subject: OpenStack Train is officially released! In-Reply-To: References: <20191016142230.GB13004@sm-workstation> Message-ID: <13b25b69a064487cb5b7f0ccabe23dc9@AUSX13MPS308.AMER.DELL.COM> Indeed! From: Kendall Nelson Sent: Wednesday, October 16, 2019 1:52 PM To: Sean McGinnis Cc: OpenStack Discuss Subject: Re: OpenStack Train is officially released! [EXTERNAL EMAIL] Woohoo! Onward to Ussuri! [image.png] -Kendall (diablo_rojo) On Wed, Oct 16, 2019 at 7:23 AM Sean McGinnis > wrote: The official OpenStack Train release announcement has been sent out: http://lists.openstack.org/pipermail/openstack-announce/2019-October/002024.html Thanks to all who were part of making the Train series a success! This marks the official opening of the releases repo for Ussuri, and freezes are now lifted. Train is now a full stable branch. Thanks! Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 173430 bytes Desc: image001.png URL: From johnsomor at gmail.com Wed Oct 16 22:36:00 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 16 Oct 2019 15:36:00 -0700 Subject: [Octavia] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: This is caused by the controller version being older than the image version. Our upgrade strategy requires the control plane be updated before the image. A few lines above in the log you will see it is attempting to connect to /0.5 and not finding it. You have two options: 1. update the controllers to use the latest stable/stein verison 2. build a new image, but limit it to an older version of stein (such as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) We would recommend you run the latest minor release of stable/stein. (4.1.0 is the latest) https://releases.openstack.org/stein/index.html#octavia I'm not sure why kolla would install an old release. I'm not very familiar with it. Michael On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller wrote: > > Looking at the error, it appears it can't find the exceptions.py script, but it appears to be in the octavia worker container: > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server WrappedFailure: WrappedFailure: [Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found] > > (octavia-worker)[root at controller001 haproxy]# cd /var/lib/kolla/venv/lib/python2.7/site-packages/octavia/amphorae/drivers/haproxy > (octavia-worker)[root at controller001 haproxy]# ls -al > total 84 > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > -rw-r--r--. 
1 root root 3400 Sep 29 08:12 data_models.py > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > Eric From emiller at genesishosting.com Wed Oct 16 22:39:37 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 16 Oct 2019 17:39:37 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A08@gmsxchsvr01.thecreation.com> Thanks Michael! I will configure Kolla Ansible to pull the latest Stein release of the controller components and rebuild/install the containers and get back to you. Much appreciated for the assistance. Eric From emiller at genesishosting.com Thu Oct 17 05:14:58 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 17 Oct 2019 00:14:58 -0500 Subject: [Octavia] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. 
Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric From radoslaw.piliszek at gmail.com Thu Oct 17 06:14:41 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 17 Oct 2019 08:14:41 +0200 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): > Success! > > The current Kolla Ansible release simply has the 4.0.1 version specified > in kolla-build.conf, which can be easily updated to 4.1.0. > > So, I adjusted the kolla-build.conf file, re-built the Octavia containers, > deleted the containers from the controllers, re-deployed Octavia with Kolla > Ansible, and tested load balancer creation, and everything succeeded. > > Again, much appreciated for the assistance. > > Eric > > > -----Original Message----- > > From: Michael Johnson [mailto:johnsomor at gmail.com] > > Sent: Wednesday, October 16, 2019 5:36 PM > > To: Eric K. Miller > > Cc: openstack-discuss > > Subject: Re: [Octavia] Amphora build issues > > > > This is caused by the controller version being older than the image > > version. Our upgrade strategy requires the control plane be updated > > before the image. > > A few lines above in the log you will see it is attempting to connect > > to /0.5 and not finding it. > > > > You have two options: > > 1. update the controllers to use the latest stable/stein verison > > 2. 
build a new image, but limit it to an older version of stein (such > > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > > > We would recommend you run the latest minor release of stable/stein. > > (4.1.0 is the latest) > > https://releases.openstack.org/stein/index.html#octavia > > > > I'm not sure why kolla would install an old release. I'm not very > > familiar with it. > > > > Michael > > > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > > wrote: > > > > > > Looking at the error, it appears it can't find the exceptions.py > script, but it > > appears to be in the octavia worker container: > > > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > > WrappedFailure: WrappedFailure: [Failure: > > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > > Found] > > > > > > (octavia-worker)[root at controller001 haproxy]# cd > > /var/lib/kolla/venv/lib/python2.7/site- > > packages/octavia/amphorae/drivers/haproxy > > > (octavia-worker)[root at controller001 haproxy]# ls -al > > > total 84 > > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindhugauri1 at gmail.com Thu Oct 17 06:07:10 2019 From: sindhugauri1 at gmail.com (Gauri Sindhu) Date: Thu, 17 Oct 2019 11:37:10 +0530 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh Message-ID: Hi all, As per the Rocky release notes , *cpu_util and *.rate meters are deprecated and will be removed in future release in favor of the Gnocchi rate calculation equivalent.* I have two doubts regarding this. Firstly, if the 'cpu_util' metric has been deprecated then why has it been used as an example in the documentation in the 'Using Alarms' section? I've attached an image of the same. Secondly, I'm using OpenStack Stein and want to use the cpu_util or its equivalent to create an alarm. If this metric is no longer available then what do I pass onto Aodh to create the alarm? There seems to be no documentation that to help me out with this. Additionally, even if Gnocchi rate calculation is to be used, how am I supposed to transfer the result to Aodh? I also cannot seem to find the Gnocchi documentation. Regards, Gauri Sindhu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cpu_util example in 'Using Alarms' section.PNG Type: image/png Size: 30607 bytes Desc: not available URL: From emiller at genesishosting.com Thu Oct 17 06:18:27 2019 From: emiller at genesishosting.com (Eric K. 
Miller) Date: Thu, 17 Oct 2019 01:18:27 -0500 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0C@gmsxchsvr01.thecreation.com> Hi Radek, In case this was useful, Kolla Ansible pulls the files (when using the "source" option) from: http://tarballs.openstack.org/octavia/ So, I just adjusted the version it pulled for Octavia. Eric From: Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] Sent: Thursday, October 17, 2019 1:15 AM To: Eric K. Miller Cc: Michael Johnson; openstack-discuss Subject: Re: [Octavia][Kolla] Amphora build issues Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 
6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Thu Oct 17 06:20:15 2019 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 17 Oct 2019 01:20:15 -0500 Subject: [Octavia][Kolla] Amphora build issues References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> Correction - I meant "Kolla" (not Kolla Ansible) Eric From: Eric K. Miller Sent: Thursday, October 17, 2019 1:18 AM To: 'Radosław Piliszek' Cc: Michael Johnson; openstack-discuss Subject: RE: [Octavia][Kolla] Amphora build issues Hi Radek, In case this was useful, Kolla Ansible pulls the files (when using the "source" option) from: http://tarballs.openstack.org/octavia/ So, I just adjusted the version it pulled for Octavia. Eric From: Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] Sent: Thursday, October 17, 2019 1:15 AM To: Eric K. Miller Cc: Michael Johnson; openstack-discuss Subject: Re: [Octavia][Kolla] Amphora build issues Hi Eric, Octavia has recently been bumped up to 4.1.0 [1] This applies to the cases when you are either using our (upstream, in-registry) images or kolla from git. Released kolla is behind. @kolla Stable branches are really stable so I am not sure whether we should not just point people willing to build their own images to use the version from git rather than PyPI, especially since our images are built from branch, not release. [1] https://review.opendev.org/688426 Kind regards, Radek czw., 17 paź 2019 o 07:23 Eric K. Miller napisał(a): Success! The current Kolla Ansible release simply has the 4.0.1 version specified in kolla-build.conf, which can be easily updated to 4.1.0. So, I adjusted the kolla-build.conf file, re-built the Octavia containers, deleted the containers from the controllers, re-deployed Octavia with Kolla Ansible, and tested load balancer creation, and everything succeeded. Again, much appreciated for the assistance. Eric > -----Original Message----- > From: Michael Johnson [mailto:johnsomor at gmail.com] > Sent: Wednesday, October 16, 2019 5:36 PM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Octavia] Amphora build issues > > This is caused by the controller version being older than the image > version. Our upgrade strategy requires the control plane be updated > before the image. > A few lines above in the log you will see it is attempting to connect > to /0.5 and not finding it. > > You have two options: > 1. update the controllers to use the latest stable/stein verison > 2. 
build a new image, but limit it to an older version of stein (such > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > We would recommend you run the latest minor release of stable/stein. > (4.1.0 is the latest) > https://releases.openstack.org/stein/index.html#octavia > > I'm not sure why kolla would install an old release. I'm not very > familiar with it. > > Michael > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > wrote: > > > > Looking at the error, it appears it can't find the exceptions.py script, but it > appears to be in the octavia worker container: > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > WrappedFailure: WrappedFailure: [Failure: > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > Found] > > > > (octavia-worker)[root at controller001 haproxy]# cd > /var/lib/kolla/venv/lib/python2.7/site- > packages/octavia/amphorae/drivers/haproxy > > (octavia-worker)[root at controller001 haproxy]# ls -al > > total 84 > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Thu Oct 17 06:32:39 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Thu, 17 Oct 2019 14:32:39 +0800 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Gauri, We received a lot of feedback about cpu_utils, And we had plan to add cpu_utils back in U release. Gauri Sindhu 于2019年10月17日 周四14:20写道: > Hi all, > > As per the Rocky release notes > , *cpu_util > and *.rate meters are deprecated and will be removed in future release in > favor of the Gnocchi rate calculation equivalent.* > > I have two doubts regarding this. > > Firstly, if the 'cpu_util' metric has been deprecated then why has it been > used as an example in the documentation > in > the 'Using Alarms' section? I've attached an image of the same. > > Secondly, I'm using OpenStack Stein and want to use the cpu_util or its > equivalent to create an alarm. If this metric is no longer available then > what do I pass onto Aodh to create the alarm? There seems to be no > documentation that to help me out with this. Additionally, even if Gnocchi > rate calculation is to be used, how am I supposed to transfer the result to > Aodh? I also cannot seem to find the Gnocchi documentation. > > Regards, > Gauri Sindhu > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radoslaw.piliszek at gmail.com Thu Oct 17 06:42:06 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 17 Oct 2019 08:42:06 +0200 Subject: [Octavia][Kolla] Amphora build issues In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04661A00@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A03@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A07@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0A@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04661A0D@gmsxchsvr01.thecreation.com> Message-ID: Hi Eric, that's exactly what the change I mentioned has done. :-) Kind regards, Radek czw., 17 paź 2019 o 08:20 Eric K. Miller napisał(a): > Correction - I meant "Kolla" (not Kolla Ansible) > > > > Eric > > > > *From:* Eric K. Miller > *Sent:* Thursday, October 17, 2019 1:18 AM > *To:* 'Radosław Piliszek' > *Cc:* Michael Johnson; openstack-discuss > *Subject:* RE: [Octavia][Kolla] Amphora build issues > > > > Hi Radek, > > > > In case this was useful, Kolla Ansible pulls the files (when using the > "source" option) from: > > http://tarballs.openstack.org/octavia/ > > > > So, I just adjusted the version it pulled for Octavia. > > > > Eric > > > > > > *From:* Radosław Piliszek [mailto:radoslaw.piliszek at gmail.com] > *Sent:* Thursday, October 17, 2019 1:15 AM > *To:* Eric K. Miller > *Cc:* Michael Johnson; openstack-discuss > *Subject:* Re: [Octavia][Kolla] Amphora build issues > > > > Hi Eric, > > > > Octavia has recently been bumped up to 4.1.0 [1] > > This applies to the cases when you are either using our (upstream, > in-registry) images or kolla from git. > > Released kolla is behind. > > > > @kolla > > Stable branches are really stable so I am not sure whether we should not > just point people willing to build their own images to use the version from > git rather than PyPI, especially since our images are built from branch, > not release. > > > > [1] https://review.opendev.org/688426 > > > > Kind regards, > > Radek > > > > czw., 17 paź 2019 o 07:23 Eric K. Miller > napisał(a): > > Success! > > The current Kolla Ansible release simply has the 4.0.1 version specified > in kolla-build.conf, which can be easily updated to 4.1.0. > > So, I adjusted the kolla-build.conf file, re-built the Octavia containers, > deleted the containers from the controllers, re-deployed Octavia with Kolla > Ansible, and tested load balancer creation, and everything succeeded. > > Again, much appreciated for the assistance. > > Eric > > > -----Original Message----- > > From: Michael Johnson [mailto:johnsomor at gmail.com] > > Sent: Wednesday, October 16, 2019 5:36 PM > > To: Eric K. Miller > > Cc: openstack-discuss > > Subject: Re: [Octavia] Amphora build issues > > > > This is caused by the controller version being older than the image > > version. Our upgrade strategy requires the control plane be updated > > before the image. > > A few lines above in the log you will see it is attempting to connect > > to /0.5 and not finding it. > > > > You have two options: > > 1. update the controllers to use the latest stable/stein verison > > 2. build a new image, but limit it to an older version of stein (such > > as commit 15358a71e4aeffee8a3283ed08137e3b6daab52e) > > > > We would recommend you run the latest minor release of stable/stein. 
> > (4.1.0 is the latest) > > https://releases.openstack.org/stein/index.html#octavia > > > > I'm not sure why kolla would install an old release. I'm not very > > familiar with it. > > > > Michael > > > > On Wed, Oct 16, 2019 at 2:43 PM Eric K. Miller > > wrote: > > > > > > Looking at the error, it appears it can't find the exceptions.py > script, but it > > appears to be in the octavia worker container: > > > > > > 2019-10-16 16:27:49.416 23 ERROR oslo_messaging.rpc.server > > WrappedFailure: WrappedFailure: [Failure: > > octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not Found, > > Failure: octavia.amphorae.drivers.haproxy.exceptions.NotFound: Not > > Found] > > > > > > (octavia-worker)[root at controller001 haproxy]# cd > > /var/lib/kolla/venv/lib/python2.7/site- > > packages/octavia/amphorae/drivers/haproxy > > > (octavia-worker)[root at controller001 haproxy]# ls -al > > > total 84 > > > drwxr-xr-x. 2 root root 186 Sep 29 08:12 . > > > drwxr-xr-x. 6 root root 156 Sep 29 08:12 .. > > > -rw-r--r--. 1 root root 3400 Sep 29 08:12 data_models.py > > > -rw-r--r--. 1 root root 4797 Sep 29 08:12 data_models.pyc > > > -rw-r--r--. 1 root root 2298 Sep 29 08:12 exceptions.py > > > -rw-r--r--. 1 root root 3326 Sep 29 08:12 exceptions.pyc > > > -rw-r--r--. 1 root root 572 Sep 29 08:12 __init__.py > > > -rw-r--r--. 1 root root 163 Sep 29 08:12 __init__.pyc > > > -rw-r--r--. 1 root root 26572 Sep 29 08:12 rest_api_driver.py > > > -rw-r--r--. 1 root root 25269 Sep 29 08:12 rest_api_driver.pyc > > > > > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindhugauri1 at gmail.com Thu Oct 17 08:34:26 2019 From: sindhugauri1 at gmail.com (Gauri Sindhu) Date: Thu, 17 Oct 2019 14:04:26 +0530 Subject: [all][ceilometer][aodh][[docs] Possible error in Stein's Aodh documentation and how to configure cpu_util and pass on value to Aodh In-Reply-To: References: Message-ID: Hi Rong, Is there any workaround for this at the moment? Is there any other replacement for the cpu_util metric so that we can transfer the metric or data to the alarm? Regards, Gauri Sindhu On Thu, Oct 17, 2019 at 12:02 PM Rong Zhu wrote: > Hi Gauri, > > We received a lot of feedback about cpu_utils, And we had plan to add > cpu_utils back in U release. > > > Gauri Sindhu 于2019年10月17日 周四14:20写道: > >> Hi all, >> >> As per the Rocky release notes >> , *cpu_util >> and *.rate meters are deprecated and will be removed in future release in >> favor of the Gnocchi rate calculation equivalent.* >> >> I have two doubts regarding this. >> >> Firstly, if the 'cpu_util' metric has been deprecated then why has it >> been used as an example in the documentation >> in >> the 'Using Alarms' section? I've attached an image of the same. >> >> Secondly, I'm using OpenStack Stein and want to use the cpu_util or its >> equivalent to create an alarm. If this metric is no longer available then >> what do I pass onto Aodh to create the alarm? There seems to be no >> documentation that to help me out with this. Additionally, even if Gnocchi >> rate calculation is to be used, how am I supposed to transfer the result to >> Aodh? I also cannot seem to find the Gnocchi documentation. >> >> Regards, >> Gauri Sindhu >> > -- > Thanks, > Rong Zhu > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tetsuro.nakamura.bc at hco.ntt.co.jp Thu Oct 17 09:31:53 2019 From: tetsuro.nakamura.bc at hco.ntt.co.jp (Tetsuro Nakamura) Date: Thu, 17 Oct 2019 18:31:53 +0900 Subject: [placement][ptg] Ussuri Placement Topics Message-ID: <9db858d3-7fdd-fb69-a27e-2d9af0f86dfa@hco.ntt.co.jp_1> Hi Placementers, We won't have a specific meeting space for Shanghai PTG, but I'd like to have retrospective on Stein, and gather and note work items we have. Please put your ideas on the etherpad [1]. [1] https://etherpad.openstack.org/p/placement-shanghai-ptg -- Tetsuro Nakamura NTT Network Service Systems Laboratories TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11, Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan From sfinucan at redhat.com Thu Oct 17 09:34:42 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 17 Oct 2019 10:34:42 +0100 Subject: CentOS 8 nodes available now In-Reply-To: <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> References: <20191015230316.GA29186@fedora19.localdomain> <20191016024339.s7s24wpcprra7f3x@yuggoth.org> <3c28024b026f7f3fe2fb39dfc56687864df53be0.camel@redhat.com> Message-ID: <84ef448aba6e07132ba2662fe0e7cd6fecf8df8b.camel@redhat.com> On Wed, 2019-10-16 at 12:05 +0100, Sean Mooney wrote: > > but it needs more folks > > actually working to make it happen, and the ugly hack has been in > > place for so long I have doubts we'll see a major overhaul like that > > any time soon. > well the main thing that motivated me to even comment on this thread > was the fact we currently have a hack that with lib_from_git where if you enable > python 3 i will install the lib under python 2 and python3. the problem is the > interperter line at the top of the entry point will be replaced with the python2 version > due to the order of installs. so if you use libs_form _git with nova or with a lib that provides > a setup tools console script entrypoint you can get into a situation where your python 3 only > build can end up trying to run "python2" scripts. this has lead to some interesting > errors to debug in the past. anyway i was looking forward to having a python3 only disto > to not have to deal with that in the future with it looks like You should probably look at https://review.opendev.org/#/c/687585/ so Stephen From hberaud at redhat.com Thu Oct 17 09:51:07 2019 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 17 Oct 2019 11:51:07 +0200 Subject: [cinder][tooz]Lock-files are remained In-Reply-To: References: <88881fd9-22f3-a4df-c5a9-e5346255ef4b@redhat.com> <05583de8-e593-e4e0-5d0f-05dc5e49ad5c@ntt-tx.co.jp_1> <0bc4efb7-5308-bcce-bcb0-3bc38eafc4e6@nemebean.com> <0f06d375-4796-e839-f1c6-737ca08f320e@nemebean.com> Message-ID: Thanks Ben for your feedbacks. I already tried to follow the `remove_external_lock_file` few months ago, but unfortunately, I don't think we can goes like this with Cinder... As Gorka has explained to me few months ago: > Those are not the only type of locks we use in Cinder. Those are the > ones we call "Global locks" and use TooZ so the DLM can be configured > for Cinder Active-Active. > > We also use Oslo's synchronized locks. > > More information is available in the Cinder HA dev ref I wrote last > year. It has a section dedicated to the topic of mutual exclusion and > the 4 types we currently have in Cinder [1]: > > - Database locking using resource states. > - Process locks. > - Node locks. > - Global locks. 
>
> As for calling the remove_external_lock_file_with_prefix directly on
> delete, I don't think that's something we can do, as the locks may still
> be in use. Example:
>
> - Start deleting volume -> get lock
> - Try to clone volume -> wait for lock
> - Finish deleting volume -> release and delete lock
> - Cloning recreates the lock when acquiring it
> - Cloning fails because the volume no longer exists but leaves the lock

So the Cinder workflow and mechanisms seem to rule out using the removal features of oslo.concurrency... Also, as discussed on the review (https://review.opendev.org/#/c/688413), this issue can't be fixed in the underlying libraries. If we want to fix it on stable branches, Cinder needs to address it directly, with code that is triggered only when needed and in a safe manner; in other words, only Cinder can really remove these files safely. See the discussion extract from the review (https://review.opendev.org/#/c/688413):

> Thanks Gorka for your feedback, then in view of all the discussions
> about this topic I suppose only Cinder can really address it safely
> on stable branches.
>
> > It is not a safe assumption that *-delete_volume file locks can be
> > removed just because they have not been used in a couple of days.
> > A new volume clone could come in that would use it and then we
> > could have a race condition if the cron job was running.
> >
> > The only way to be sure that it can be removed is checking in the
> > Cinder DB and making sure that the volume has been deleted or it
> > doesn't even exist (DB has been purged).
> >
> > Same thing with detach_volume, delete_snapshot, and those that are
> > directly volume ids locks.
>
> I definitely think that it can't be fixed in the underlying
> libraries like Eric has suggested [1], indeed, as you has explained
> only Cinder can know if a lock file can be removed safely.
>
> > In my opinion the fix should be done in fasteners, or we should add
> > code in Cinder that cleans up all locks related to a volume or
> > snapshot when this one is deleted.
>
> I agree the most better solution is to fix the root cause and so to
> fix fasteners, but I don't think it's can be backported to stable
> branches because we will need to bump a requirement version on
> stable branche in this case and also because it'll introduce new
> features, so I guess Cinder need to add some code to remove these
> files and possibly backport it to stable branches.
>
> [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009563.html

The fasteners fix, IMHO, can only be used by future versions of OpenStack, because of the version bump and the new features it introduces; realistically it could only be available from Ussuri or a later cycle such as V. The main goal of the cron approach was to settle this topic for good instead of unearthing it every six months, to address it on stable branches, and to take care of the file system usage even if it is mostly a theoretical issue. But given the feedback and warnings from the Cinder team, I don't think this track is worth following any more. Definitely, this is not an oslo.concurrency bug.

Anyway, your proposed "Administrator Guide" is a must-have, to track things in one place, inform users and avoid spending time explaining the same things again and again about this topic... so it's worth it. I'll review it and add my related knowledge on this topic.
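For anyone following the thread, here is a minimal illustration (the lock name and path below are only examples, this is not Cinder code) of why the files stay around: an external oslo.concurrency lock creates its file under lock_path the first time it is acquired, and releasing the lock never deletes the file.

    # Minimal illustration, not Cinder code.
    from oslo_concurrency import lockutils

    # The first acquisition creates a lock file under lock_path and locks it;
    # releasing the lock only unlocks the file, it never unlinks it.
    @lockutils.synchronized('delete_volume-9cbf4b53', external=True,
                            lock_path='/var/lib/cinder/tmp')
    def delete_volume():
        pass

    # The only removal helper is an explicit call, and calling it blindly is
    # exactly what is unsafe here, because another operation may be about to
    # take the same lock:
    # lockutils.remove_external_lock_file('delete_volume-9cbf4b53',
    #                                     lock_path='/var/lib/cinder/tmp')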
oslo.concurrency can't address this safely because we would risk introducing race conditions and worse situations than the leftover lock files. So, given all these elements, only Cinder can address it for the moment, including fixing it on stable branches.

On Wed, Oct 16, 2019 at 00:15, Ben Nemec wrote:
> In the interest of not having to start this discussion from scratch every time, I've done a bit of a brain dump into https://review.opendev.org/#/c/688825/ that covers why things are the way they are and what we recommend people do about it. Please take a look and let me know if you see any issues with it.
>
> Thanks.
>
> -Ben

--
Hervé Beraud
Senior Software Engineer
Red Hat - Openstack Oslo
irc: hberaud

From geguileo at redhat.com Thu Oct 17 10:24:19 2019
From: geguileo at redhat.com (Gorka Eguileor)
Date: Thu, 17 Oct 2019 12:24:19 +0200
Subject: [nova][cinder][ops] question/confirmation of legacy vol attachment migration
In-Reply-To:
References: <37e953ee-f3c8-9797-446f-f3e3db9dcad6@gmail.com>
 <20191010100050.hn546tikeihaho7e@localhost>
Message-ID: <20191017102419.pa3qqlqgrlp2b7qx@localhost>

On 10/10, Matt Riedemann wrote:
> On 10/10/2019 5:00 AM, Gorka Eguileor wrote:
> > > 1. Yeah if the existing legacy attachment record doesn't have a connector I was worried about not properly cleaning up for that old connection, which is something I mentioned before, but also as mentioned we potentially have that case when a server is deleted and we can't get to the compute host to get the host connector, right?
> >
> > Hi,
> >
> > Not really... In that case we still have the BDM info in the DB, so we can just make the 3 Cinder REST API calls ourselves (begin_detaching, terminate_connection and detach) to have the volume unmapped, the export removed, and the volume returned to available as usual, without needing to go to the storage array manually.
>
> I'm not sure what you mean. Yes, we have the BDM in nova, but if it's really old it won't have the host connector stashed away in the connection_info dict and we won't be able to pass that to the terminate_connection API:
>
> https://github.com/openstack/nova/blob/19.0.0/nova/compute/api.py#L2186
>
> Are you talking about something else? I realize ^ is a very edge case since we've been storing the connector in the BDM.connection_info since I think at least Liberty or Mitaka.

Hi,

I didn't know that Nova didn't use to store the connector... For those cases it is definitely going to be a problem.
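For context, the "3 Cinder REST API calls" fallback quoted above corresponds roughly to the following python-cinderclient sketch; this is an assumed illustration, not the actual migration code, and it presumes the old host connector is still available from the BDM's connection_info, which is exactly what very old records may lack:

    from cinderclient import client as cinder_client

    def legacy_detach(session, volume_id, connector):
        # Sketch of the legacy (pre-attachments-API) detach flow.
        cinder = cinder_client.Client('3', session=session)

        # Move the volume into 'detaching' so the state machine stays consistent.
        cinder.volumes.begin_detaching(volume_id)
        # Remove the export/mapping on the backend; without the original host
        # connector this step cannot clean up properly, which is the edge case
        # being discussed here.
        cinder.volumes.terminate_connection(volume_id, connector)
        # Finally return the volume to 'available'.
        cinder.volumes.detach(volume_id)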
If you have one such case, and the Nova compute node is down (so you cannot get the connector info), then we should just wait until the node is back up to do the migration.

> > > 2. If I were to use os-terminate_connection, I seem to have a tricky situation on the migration flow because right now I'm doing:
> > >
> > > a) create new attachment with host connector
> > > b) complete new attachment (put the volume back to in-use status)
> > >    - if this fails I attempt to delete the new attachment
> > > c) delete the legacy attachment - I intentionally left this until the end to make sure (a) and (b) were successful.
> > >
> > > If I change (c) to be os-terminate_connection, will that screw up the accounting on the attachment created in (a)?
> > >
> > > If I did the terminate_connection first (before creating a new attachment), could that leave a window of time where the volume is shown as not attached/in-use? Maybe not since it's not the begin_detaching/os-detach API... I'm fuzzy on the cinder volume state machine here.
> > >
> > > Or maybe the flow would become:
> > >
> > > a) create new attachment with host connector
> >
> > This is a good idea in itself, but it's not taking into account weird behaviors that some Cinder drivers may have when you call them twice to initialize the connection on the same host. Some drivers end up creating a different mapping for the volume instead of returning the existing one; we've had bugs like this before, and that's why Nova made a change in its live instance migration code to not call initialize_connection on the source host to get the connection_info for detaching.
>
> Huh... I thought attachments in cinder were a dime a dozen and you could create/delete them as needed, or that was the idea behind the new v3 attachments stuff. It seems to at least be what I remember John Griffith always saying we should be able to do.

Sure, you can create them freely, but the old and new APIs were not meant to be mixed, which is what we would be doing here. The more I look at this, the more I think it is a bad idea. First, when you create the new attachment on a volum