From renat.akhmerov at gmail.com Mon Feb 1 07:52:08 2021 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Mon, 1 Feb 2021 14:52:08 +0700 Subject: Mistral Maintenance In-Reply-To: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> Message-ID: <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> Hi, Just for those who don’t know me, I’m Renat Akhmerov who originally started the project called Mistral (Workflow Service) and I’ve been the PTL of the project most of the time. The project was started in 2013 when I was at Mirantis, then in 2016 I moved to Nokia and continued to work on the project. My activities at Nokia were related with Mistral on 90%. On 25th of January I joined the company called Tailify where I won’t be contributing to Mistral officially anymore. On that note, I want to say that for now I’ll try to provide minimally required maintenance (reviews, releases, CI fixes) and consulting but I don’t think I’ll continue to be the major contributor of the project. And, most likely, I’ll stop working on the project completely at some point. Don’t know when though, it depends on my new load, my capacity and desire. I still care about the project and want it to live further. So, if anybody wants to step up as the PTL or just take over the development, it would be great and I could provide all the required help and guidance. In this case, I won’t be announcing my candidacy in the upcoming election. Thanks Renat Akhmerov @Tailify -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Mon Feb 1 07:53:11 2021 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Mon, 1 Feb 2021 14:53:11 +0700 Subject: [mistral][ptl] Fwd: Mistral Maintenance In-Reply-To: <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> Message-ID: <79479ebd-76e1-4319-b310-dc29c414f933@Spark> Just resending with the right tags... Thanks ---------- Forwarded message ---------- From: Renat Akhmerov Date: 1 Feb 2021, 14:52 +0700 To: OpenStack Discuss Subject: Mistral Maintenance > Hi, > > Just for those who don’t know me, I’m Renat Akhmerov who originally started the project called Mistral (Workflow Service) and I’ve been the PTL of the project most of the time. The project was started in 2013 when I was at Mirantis, then in 2016 I moved to Nokia and continued to work on the project. My activities at Nokia were related with Mistral on 90%. On 25th of January I joined the company called Tailify where I won’t be contributing to Mistral officially anymore. On that note, I want to say that for now I’ll try to provide minimally required maintenance (reviews, releases, CI fixes) and consulting but I don’t think I’ll continue to be the major contributor of the project. And, most likely, I’ll stop working on the project completely at some point. Don’t know when though, it depends on my new load, my capacity and desire. > > I still care about the project and want it to live further. So, if anybody wants to step up as the PTL or just take over the development, it would be great and I could provide all the required help and guidance. In this case, I won’t be announcing my candidacy in the upcoming election. > > > Thanks > > Renat Akhmerov > @Tailify > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amoralej at redhat.com Mon Feb 1 09:05:16 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Mon, 1 Feb 2021 10:05:16 +0100 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <6241612007690@mail.yandex.ru> References: <6e817a0e-aaa7-9444-fca3-6c5ae8ed2ae7@debian.org> <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> Message-ID: We updated kombu and amqp on Jan 28th in RDO https://review.rdoproject.org/r/#/c/31661/ so it may be related to it. Could you point me to some logs about the failure? Best regards. Alfredo On Sat, Jan 30, 2021 at 1:15 PM Dmitriy Rabotyagov wrote: > Yeah, they do: > [root at centos-distro openstack-ansible]# rpm -qa | egrep "amqp|kombu" > python3-kombu-5.0.2-1.el8.noarch > python3-amqp-5.0.3-1.el8.noarch > [root at centos-distro openstack-ansible]# > > But not sure about keystoneauth1 since I see this at the point in > oslo.messaging. Full error in systemd looks like this: > Jan 30 11:51:04 aio1 nova-conductor[97314]: 2021-01-30 11:51:04.543 97314 > ERROR oslo.messaging._drivers.impl_rabbit > [req-61609624-b577-475d-996e-bc8f9899eae0 - - - - -] Connection failed: > [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897) > > > 30.01.2021, 12:42, "Thomas Goirand" : > > On 1/30/21 10:47 AM, Dmitriy Rabotyagov wrote: > >> In the meanwhile we see that most of the services fail to interact > with rabbitmq over self-signed SSL in case RDO packages are used even with > Python 3.6. > >> We don't see this happening when installing things with pip packages > though. Both rdo and pip version of eventlet we used was 0.30.0. > >> > >> RDO started failing for us several days back with: > >> ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897) > >> > >> Not sure, maybe it's not related directly to eventlet, but sounds like > it might be. > > > > Does RDO has version 5.0.3 of AMQP and version 5.0.2 of Kombu? That's > > what I had to do in Debian to pass this stage. > > > > Though the next issue is what I wrote, when a service tries to validate > > a keystone token (ie: keystoneauth1 calls requests that calls urllib3, > > which in turns calls Python 3.9 SSL, and then crash with maximum > > recursion depth exceeded). I'm no 100% sure the problem is in Eventlet, > > but it really looks like it, as it's similar to another SSL crash we had > > in Python 3.7. > > > > Cheers, > > > > Thomas Goirand (zigo) > > > -- > Kind Regards, > Dmitriy Rabotyagov > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Feb 1 10:00:49 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 1 Feb 2021 11:00:49 +0100 Subject: Mistral Maintenance In-Reply-To: <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> Message-ID: Renat Akhmerov wrote: > Just for those who don’t know me, I’m Renat Akhmerov who originally > started the project called Mistral (Workflow Service) and I’ve been the > PTL of the project most of the time. The project was started in 2013 > when I was at Mirantis, then in 2016 I moved to Nokia and continued to > work on the project. My activities at Nokia were related with Mistral on > 90%. 
On 25th of January I joined the company called Tailify where I > won’t be contributing to Mistral officially anymore. On that note, I > want to say that for now I’ll try to provide minimally required > maintenance (reviews, releases, CI fixes) and consulting but I don’t > think I’ll continue to be the major contributor of the project. And, > most likely, I’ll stop working on the project completely at some point. > Don’t know when though, it depends on my new load, my capacity and desire. > > I still care about the project and want it to live further. So, if > anybody wants to step up as the PTL or just take over the development, > it would be great and I could provide all the required help and > guidance. In this case, I won’t be announcing my candidacy in the > upcoming election. Thanks for your efforts leading Mistral, and ensuring a smooth transition as you focus on other things, Renat ! -- Thierry Carrez (ttx) From renat.akhmerov at gmail.com Mon Feb 1 10:15:41 2021 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Mon, 1 Feb 2021 17:15:41 +0700 Subject: Mistral Maintenance In-Reply-To: References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> Message-ID: Thank you Thierry :) Feb 1st. 2021 г., 17:01 +0700, Thierry Carrez , wrote: > Renat Akhmerov wrote: > > Just for those who don’t know me, I’m Renat Akhmerov who originally > > started the project called Mistral (Workflow Service) and I’ve been the > > PTL of the project most of the time. The project was started in 2013 > > when I was at Mirantis, then in 2016 I moved to Nokia and continued to > > work on the project. My activities at Nokia were related with Mistral on > > 90%. On 25th of January I joined the company called Tailify where I > > won’t be contributing to Mistral officially anymore. On that note, I > > want to say that for now I’ll try to provide minimally required > > maintenance (reviews, releases, CI fixes) and consulting but I don’t > > think I’ll continue to be the major contributor of the project. And, > > most likely, I’ll stop working on the project completely at some point. > > Don’t know when though, it depends on my new load, my capacity and desire. > > > > I still care about the project and want it to live further. So, if > > anybody wants to step up as the PTL or just take over the development, > > it would be great and I could provide all the required help and > > guidance. In this case, I won’t be announcing my candidacy in the > > upcoming election. > > Thanks for your efforts leading Mistral, and ensuring a smooth > transition as you focus on other things, Renat ! > > -- > Thierry Carrez (ttx) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Mon Feb 1 12:21:45 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Mon, 01 Feb 2021 14:21:45 +0200 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: References: <6e817a0e-aaa7-9444-fca3-6c5ae8ed2ae7@debian.org> <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> Message-ID: <88041612182037@mail.yandex.ru> Yes, I can confirm that amqp of version 5.0.3 and later does not accept self-signed certificates in case root ca has not been provided. It has been bumped to 5.0.5 in u-c recently which made things fail for us everywhere now. 
However, in case of adding root CA into the system things continue working properly. 01.02.2021, 11:05, "Alfredo Moralejo Alonso" : > We updated kombu and amqp on Jan 28th in RDO https://review.rdoproject.org/r/#/c/31661/ so it may be related to it. > > Could you point me to some logs about the failure? > > Best regards. > > Alfredo > > On Sat, Jan 30, 2021 at 1:15 PM Dmitriy Rabotyagov wrote: >> Yeah, they do: >> [root at centos-distro openstack-ansible]# rpm -qa | egrep "amqp|kombu" >> python3-kombu-5.0.2-1.el8.noarch >> python3-amqp-5.0.3-1.el8.noarch >> [root at centos-distro openstack-ansible]# >> >> But not sure about keystoneauth1 since I see this at the point in oslo.messaging. Full error in systemd looks like this: >> Jan 30 11:51:04 aio1 nova-conductor[97314]: 2021-01-30 11:51:04.543 97314 ERROR oslo.messaging._drivers.impl_rabbit [req-61609624-b577-475d-996e-bc8f9899eae0 - - - - -] Connection failed: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897) >> >> 30.01.2021, 12:42, "Thomas Goirand" : >>> On 1/30/21 10:47 AM, Dmitriy Rabotyagov wrote: >>>>  In the meanwhile we see that most of the services fail to interact with rabbitmq over self-signed SSL in case RDO packages are used even with Python 3.6. >>>>  We don't see this happening when installing things with pip packages though. Both rdo and pip version of eventlet we used was 0.30.0. >>>> >>>>  RDO started failing for us several days back with: >>>>  ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897) >>>> >>>>  Not sure, maybe it's not related directly to eventlet, but sounds like it might be. >>> >>> Does RDO has version 5.0.3 of AMQP and version 5.0.2 of Kombu? That's >>> what I had to do in Debian to pass this stage. >>> >>> Though the next issue is what I wrote, when a service tries to validate >>> a keystone token (ie: keystoneauth1 calls requests that calls urllib3, >>> which in turns calls Python 3.9 SSL, and then crash with maximum >>> recursion depth exceeded). I'm no 100% sure the problem is in Eventlet, >>> but it really looks like it, as it's similar to another SSL crash we had >>> in Python 3.7. >>> >>> Cheers, >>> >>> Thomas Goirand (zigo) >> >> -- >> Kind Regards, >> Dmitriy Rabotyagov --  Kind Regards, Dmitriy Rabotyagov From geguileo at redhat.com Mon Feb 1 12:57:17 2021 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 1 Feb 2021 13:57:17 +0100 Subject: [dev][cinder][keystone] Properly consuming system-scope in cinder In-Reply-To: <1774f7582e2.126a1dcb261735.4477287504407985916@ghanshyammann.com> References: <20210129172347.7wi3cv3gnneb46dj@localhost> <1774f7582e2.126a1dcb261735.4477287504407985916@ghanshyammann.com> Message-ID: <20210201125717.imyp5vyzhn5t44fj@localhost> On 29/01, Ghanshyam Mann wrote: > ---- On Fri, 29 Jan 2021 11:23:47 -0600 Gorka Eguileor wrote ---- > > On 28/01, Lance Bragstad wrote: > > > Hey folks, > > > > > > As I'm sure some of the cinder folks are aware, I'm updating cinder > > > policies to include support for some default personas keystone ships with. > > > Some of those personas use system-scope (e.g., system-reader and > > > system-admin) and I've already proposed a series of patches that describe > > > what those changes look like from a policy perspective [0]. > > > > > > The question now is how we test those changes. To help guide that decision, > > > I worked on three different testing approaches. The first was to continue > > > testing policy using unit tests in cinder with mocked context objects. 
The > > > second was to use DDT with keystonemiddleware mocked to remove a dependency > > > on keystone. The third also used DDT, but included changes to update > > > NoAuthMiddleware so that it wasn't as opinionated about authentication or > > > authorization. I brought each approach in the cinder meeting this week > > > where we discussed a fourth approach, doing everything in tempest. I > > > summarized all of this in an etherpad [1] > > > > > > Up to yesterday morning, the only approach I hadn't tinkered with manually > > > was tempest. I spent some time today figuring that out, resulting in a > > > patch to cinderlib [2] to enable a protection test job, and > > > cinder_tempest_plugin [3] that adds the plumbing and some example tests. > > > > > > In the process of implementing support for tempest testing, I noticed that > > > service catalogs for system-scoped tokens don't contain cinder endpoints > > > [4]. This is because the cinder endpoint contains endpoint templating in > > > the URL [5], which keystone will substitute with the project ID of the > > > token, if and only if the catalog is built for a project-scoped token. > > > System and domain-scoped tokens do not have a reasonable project ID to use > > > in this case, so the templating is skipped, resulting in a cinder service > > > in the catalog without endpoints [6]. > > > > > > This cascades in the client, specifically tempest's volume client, because > > > it can't find a suitable endpoint for request to the volume service [7]. > > > > > > Initially, my testing approaches were to provide examples for cinder > > > developers to assess the viability of each approach before committing to a > > > protection testing strategy. But, the tempest approach highlighted a larger > > > issue for how we integrate system-scope support into cinder because of the > > > assumption there will always be a project ID in the path (for the majority > > > of the cinder API). I can think of two ways to approach the problem, but > > > I'm hoping others have more. > > > > > > > Hi Lance, > > > > Sorry to hear that the Cinder is giving you such trouble. > > > > > First, we remove project IDs from cinder's API path. > > > > > > This would be similar to how nova (and I assume other services) moved away > > > from project-specific URLs (e.g., /v3/%{project_id}s/volumes would become > > > /v3/volumes). This would obviously require refactoring to remove any > > > assumptions cinder has about project IDs being supplied on the request > > > path. But, this would force all authorization information to come from the > > > context object. Once a deployer removes the endpoint URL templating, the > > > endpoints will populate in the cinder entry of the service catalog. Brian's > > > been helping me understand this and we're unsure if this is something we > > > could even do with a microversion. I think nova did it moving from /v2/ to > > > /v2.0/, which was technically classified as a major bump? This feels like a > > > moon shot. > > > > > > > In my opinion such a change should not be treated as a microversion and > > would require us to go into v4, which is not something that is feasible > > in the short term. > > We can do it by supporting both URL with and without project_id. Nova did the same way Hi, I was not doubting that this was technically possible, I was arguing that a change that affects every single API endpoint in Cinder would not be described as "micro" and doing so could be considered a bit of abuse to the microversion infrastructure. 
This is just my opinion, not the Cinder official position. > in Mitaka cycle and also bumped the microversion but just for > notification. It was done in 2.18 microversion[1]. > > That way you can request compute API with or without project_id and later is recommended. > I think the same approach Cinder can consider. > Thanks for this information. It will definitely come in handy knowing where we have code references if we decide to go with the microversion route. Cheers, Gorka. > [1] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id16 > > -gmann > > > > > > > > Second, we update cinder's clients, including tempest, to put the project > > > ID on the URL. > > > > > > After we update the clients to append the project ID for cinder endpoints, > > > we should be able to remove the URL templating in keystone, allowing cinder > > > endpoints to appear in system-scoped service catalogs (just like the first > > > approach). Clients can use the base URL from the catalog and append the > > > > I'm not familiar with keystone catalog entries, so maybe I'm saying > > something stupid, but couldn't we have multiple entries? A > > project-specific URL and another one for the project and system scoped > > requests? > > > > I know it sounds kind of hackish, but if we add them in the right order, > > first the project one and then the new one, it would probably be > > backward compatible, as older clients would get the first endpoint and > > new clients would be able to select the right one. > > > > > admin project ID before putting the request on the wire. Even though the > > > request has a project ID in the path, cinder would ignore it for > > > system-specific APIs. This is already true for users with an admin role on > > > a project because cinder will allow you to get volumes in one project if > > > you have a token scoped to another with the admin role [8]. One potential > > > side-effect is that cinder clients would need *a* project ID to build a > > > request, potentially requiring another roundtrip to keystone. > > > > What would happen in this additional roundtrip? Would we be converting > > provided project's name into its UUID? > > > > If that's the case then it wouldn't happen when UUIDs are being > > provided, so for cases where this extra request means a performance > > problem they could just provide the UUID. > > > > > > > > Thoughts? > > > > Truth is that I would love to see the Cinder API move into URLs without > > the project id as well as move out everything from contrib, but that > > doesn't seem like a realistic piece of work we can bite right now. > > > > So I think your second proposal is the way to go. > > > > Thanks for all the work you are putting into this. > > > > Cheers, > > Gorka. 
> > > > > > > > > > [0] https://review.opendev.org/q/project:openstack/cinder+topic:secure-rbac > > > [1] https://etherpad.opendev.org/p/cinder-secure-rbac-protection-testing > > > [2] https://review.opendev.org/c/openstack/cinderlib/+/772770 > > > [3] https://review.opendev.org/c/openstack/cinder-tempest-plugin/+/772915 > > > [4] http://paste.openstack.org/show/802117/ > > > [5] http://paste.openstack.org/show/802097/ > > > [6] > > > https://opendev.org/openstack/keystone/src/commit/c239cc66615b41a0c09e031b3e268c82678bac12/keystone/catalog/backends/sql.py > > > [7] http://paste.openstack.org/show/802092/ > > > [8] http://paste.openstack.org/show/802118/ > > > > > > > From jean-francois.taltavull at elca.ch Mon Feb 1 13:44:02 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Mon, 1 Feb 2021 13:44:02 +0000 Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using keycloak as IDP Message-ID: <4b328f90066149db85d0a006fb7ea01b@elca.ch> Hello, In order to implement identity federation, I've deployed (with OSA) keystone (Ussuri) as Service Provider and Keycloak as IDP. As one can read at [1], "groups" can have multiple values and each value must be separated by a ";" But, in the OpenID token sent by keycloak, groups are represented with a JSON list and keystone fails to parse it well (only the first group of the list is mapped). Have any of you already faced this problem ? Thanks ! Jean-François [1] https://docs.openstack.org/keystone/ussuri/admin/federation/mapping_combinations.html From C-Albert.Braden at charter.com Mon Feb 1 14:03:24 2021 From: C-Albert.Braden at charter.com (Braden, Albert) Date: Mon, 1 Feb 2021 14:03:24 +0000 Subject: How to get volume quota usage Message-ID: <1e4cf3b3c8d64f59b3f21da2aa291d15@ncwmexgp009.CORP.CHARTERCOM.com> I'm auditing quota usage for all of our projects, and checking them all in Horizon would be a lot of work so I wrote a little script to pull them all and build a spreadsheet. I can get network and compute usage with "openstack quota list -detail" but when I try to get volume usage it demurs: (openstack) [root at chrnc-area51-build-01 openstack]# openstack quota list --detail --volume --project test Volume service doesn't provide detailed quota information I can see the volume quota usage in Horizon. How can I get it on the CLI? I apologize for the nonsense below. So far I have not been able to stop it from being attached to my external emails. I'm working on it. E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Feb 1 14:36:50 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 1 Feb 2021 06:36:50 -0800 Subject: [ironic] Reducing EM stable branch testing Message-ID: Greetings everyone, Last week, during our Wallaby mid-cycle[0] meeting, we discussed removing testing on older (extended, and approaching extended) maintenance branches. 
In part because of the number of integration jobs which we have and the overhead being so high to keep them operational at this point in time across every repository we use. The consensus we reached was to begin peeling back fragile and resource intensive jobs on these older branches. Mainly "multinode" and grenade type jobs where we risk running the test virtual machine out of memory. We may opt to remove additional older integration jobs and minimize the scenarios as seems relevant, but we will approach that when that time comes. Please let us know if you have any questions or concerns. Thanks, Julia [0]: https://etherpad.opendev.org/p/ironic-wallaby-midcycle From florian at datalounges.com Mon Feb 1 14:40:47 2021 From: florian at datalounges.com (Florian Rommel) Date: Mon, 1 Feb 2021 16:40:47 +0200 Subject: How to get volume quota usage In-Reply-To: <1e4cf3b3c8d64f59b3f21da2aa291d15@ncwmexgp009.CORP.CHARTERCOM.com> References: <1e4cf3b3c8d64f59b3f21da2aa291d15@ncwmexgp009.CORP.CHARTERCOM.com> Message-ID: I think.... and I can be very wrong here, but it used to be a cinder cli feature... not yet available in the openstack cli... I had the same issue and I wrote a script that queries cinder. But things could have changed already. //FR > On 1. Feb 2021, at 16.07, Braden, Albert wrote: > >  > I’m auditing quota usage for all of our projects, and checking them all in Horizon would be a lot of work so I wrote a little script to pull them all and build a spreadsheet. I can get network and compute usage with “openstack quota list –detail” but when I try to get volume usage it demurs: > > (openstack) [root at chrnc-area51-build-01 openstack]# openstack quota list --detail --volume --project test > Volume service doesn't provide detailed quota information > > I can see the volume quota usage in Horizon. How can I get it on the CLI? > > I apologize for the nonsense below. So far I have not been able to stop it from being attached to my external emails. I'm working on it. > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From C-Albert.Braden at charter.com Mon Feb 1 14:52:10 2021 From: C-Albert.Braden at charter.com (Braden, Albert) Date: Mon, 1 Feb 2021 14:52:10 +0000 Subject: [EXTERNAL] Re: How to get volume quota usage In-Reply-To: References: <1e4cf3b3c8d64f59b3f21da2aa291d15@ncwmexgp009.CORP.CHARTERCOM.com> Message-ID: <65f1472ca56643bbb9fad99711b694f5@ncwmexgp009.CORP.CHARTERCOM.com> Ah yes, “cinder quota-usage ” works. Thank you! From: Florian Rommel Sent: Monday, February 1, 2021 9:41 AM To: Braden, Albert Cc: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Re: How to get volume quota usage CAUTION: The e-mail below is from an external source. Please exercise caution before opening attachments, clicking links, or following guidance. I think.... and I can be very wrong here, but it used to be a cinder cli feature... not yet available in the openstack cli... 
I had the same issue and I wrote a script that queries cinder. But things could have changed already. //FR On 1. Feb 2021, at 16.07, Braden, Albert > wrote:  I’m auditing quota usage for all of our projects, and checking them all in Horizon would be a lot of work so I wrote a little script to pull them all and build a spreadsheet. I can get network and compute usage with “openstack quota list –detail” but when I try to get volume usage it demurs: (openstack) [root at chrnc-area51-build-01 openstack]# openstack quota list --detail --volume --project test Volume service doesn't provide detailed quota information I can see the volume quota usage in Horizon. How can I get it on the CLI? I apologize for the nonsense below. So far I have not been able to stop it from being attached to my external emails. I'm working on it. The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Feb 1 15:09:47 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 01 Feb 2021 09:09:47 -0600 Subject: Mistral Maintenance In-Reply-To: <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> Message-ID: <1775e24843b.c3f4e00210676.2091416672087927064@ghanshyammann.com> ---- On Mon, 01 Feb 2021 01:52:08 -0600 Renat Akhmerov wrote ---- > Hi, > > Just for those who don’t know me, I’m Renat Akhmerov who originally started the project called Mistral (Workflow Service) and I’ve been the PTL of the project most of the time. The project was started in 2013 when I was at Mirantis, then in 2016 I moved to Nokia and continued to work on the project. My activities at Nokia were related with Mistral on 90%. On 25th of January I joined the company called Tailify where I won’t be contributing to Mistral officially anymore. On that note, I want to say that for now I’ll try to provide minimally required maintenance (reviews, releases, CI fixes) and consulting but I don’t think I’ll continue to be the major contributor of the project. And, most likely, I’ll stop working on the project completely at some point. Don’t know when though, it depends on my new load, my capacity and desire. > > I still care about the project and want it to live further. 
So, if anybody wants to step up as the PTL or just take over the development, it would be great and I could provide all the required help and guidance. In this case, I won’t be announcing my candidacy in the upcoming election. Thanks, Renat for all your hard work and leading Mistral since starting. I will bring this to the TC meeting too if we can help find maintainers. Tacker uses Mistral internally, maybe the Tacker team needs to give thought to this. -gmann > > Thanks > > Renat Akhmerov > @Tailify > From hberaud at redhat.com Mon Feb 1 15:30:34 2021 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 1 Feb 2021 16:30:34 +0100 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core Message-ID: Hi, Daniel has been working on Oslo for quite some time now and during that time he was a great help on Oslo. I think he would make a good addition to the general Oslo core team. He helped us by proposing patches to fix issues, maintaining stable branches by backporting changes, and managing our releases by assuming parts of the release liaison role. Existing Oslo team members (and anyone else we co-own libraries with) please respond with +1/-1. If there are no objections we'll add him to the ACL soon. :-) Thanks. -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Mon Feb 1 15:34:59 2021 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Mon, 1 Feb 2021 22:34:59 +0700 Subject: Mistral Maintenance In-Reply-To: <1775e24843b.c3f4e00210676.2091416672087927064@ghanshyammann.com> References: <9b9c894a-65fb-493a-9d1c-a81b7991de2c@Spark> <5403a167-345a-4055-90ef-0d16fdb4ca40@Spark> <1775e24843b.c3f4e00210676.2091416672087927064@ghanshyammann.com> Message-ID: <9fd3f135-d61c-45b7-aa9a-feb2a7cec869@Spark> That sounds good, thank you! Renat On 1 Feb 2021, 22:09 +0700, Ghanshyam Mann , wrote: > ---- On Mon, 01 Feb 2021 01:52:08 -0600 Renat Akhmerov wrote ---- > > Hi, > > > > Just for those who don’t know me, I’m Renat Akhmerov who originally started the project called Mistral (Workflow Service) and I’ve been the PTL of the project most of the time. The project was started in 2013 when I was at Mirantis, then in 2016 I moved to Nokia and continued to work on the project. My activities at Nokia were related with Mistral on 90%. On 25th of January I joined the company called Tailify where I won’t be contributing to Mistral officially anymore. 
On that note, I want to say that for now I’ll try to provide minimally required maintenance (reviews, releases, CI fixes) and consulting but I don’t think I’ll continue to be the major contributor of the project. And, most likely, I’ll stop working on the project completely at some point. Don’t know when though, it depends on my new load, my capacity and desire. > > > > I still care about the project and want it to live further. So, if anybody wants to step up as the PTL or just take over the development, it would be great and I could provide all the required help and guidance. In this case, I won’t be announcing my candidacy in the upcoming election. > > Thanks, Renat for all your hard work and leading Mistral since starting. I will bring this to the TC meeting too > if we can help find maintainers. > > Tacker uses Mistral internally, maybe the Tacker team needs to give thought to this. > > -gmann > > > > > Thanks > > > > Renat Akhmerov > > @Tailify > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Mon Feb 1 17:30:04 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 1 Feb 2021 18:30:04 +0100 Subject: [neutron] bug deputy report for week of 2021-01-25 Message-ID: Hello Neutrinos: This is the bug report from 2021-01-25 to 2021-01-31. Critical: - https://bugs.launchpad.net/neutron/+bug/1913718: “Enforcer perform deprecation logic unnecessarily”. Assigned to Slawek. High: - https://bugs.launchpad.net/neutron/+bug/1913401: “Timeout during creation of interface in functional tests”. Unassigned - https://bugs.launchpad.net/neutron/+bug/1913572: “[CI] Error in "tox-docs" job due to an error in library amqp”. Assigned to Rodolfo. - https://bugs.launchpad.net/neutron/+bug/1914037: “scenario tests tempest.scenario.test_network_v6.TestGettingAddress fails”. Unassigned. Medium: - https://bugs.launchpad.net/neutron/+bug/1913297: “Network creation fails when enable_security_group = False with error "Unknown quota resources ['security_group_rule’]”. Assigned to Slawek - https://bugs.launchpad.net/neutron/+bug/1913269: “DHCP entries for floating IP ports are not needed really”. Assigned to Slawek. - https://bugs.launchpad.net/neutron/+bug/1913723: “[DHCP] Race condition during port processing events in DHCP agent”. Assigned to Rodolfo. Low: - https://bugs.launchpad.net/neutron/+bug/1913280: “Keepalived unit tests don't mocks properly _is_keepalived_use_no_track_supported() function”. Assigned to Slawek - https://bugs.launchpad.net/neutron/+bug/1913180: “QoS policy with minimum bandwidth rules can be assigned to a VXLAN network if the network doesn't have any ports”. Unassigned. - https://bugs.launchpad.net/neutron/+bug/1913664: “[CI] neutron multinode jobs does not run neutron_tempest_plugin scenario cases”. Assigned to Mamatisa. Incomplete: - https://bugs.launchpad.net/neutron/+bug/1913038: “Port remains in status DOWN when fields "Device ID" and "Device owner" are filled from dashboard”. Undecided/unconfirmed: - https://bugs.launchpad.net/neutron/+bug/1913621: “Permant ARP entries not added to DVR qrouter when connected to two Networks”. - https://bugs.launchpad.net/neutron/+bug/1913646: “DVR router ARP traffic broken for networks containing multiple subnets”. Probably a duplicate of LP#1913621. Regards. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aditi.Dukle at ibm.com Mon Feb 1 11:35:23 2021 From: aditi.Dukle at ibm.com (aditi Dukle) Date: Mon, 1 Feb 2021 11:35:23 +0000 Subject: Integration tests failing on master branch for ppc64le Message-ID: An HTML attachment was scrubbed... URL: From goel.nitish10 at gmail.com Mon Feb 1 12:22:31 2021 From: goel.nitish10 at gmail.com (Nitish Goel) Date: Mon, 1 Feb 2021 17:52:31 +0530 Subject: Failed to Install Ceilometer using devstack pike Message-ID: Hi , Please help.! I try to install heat ,ceilometer and aodh openstack using the devstack pike version. But getting the below error: *no matching distribution found for oslo.log>=4.3.0* pip version 9.x python version 2.7 Next Attempt: Upgraded the pip to 20.x but still gets the same error. Next Attempt: *Updated the python 3.6(pip with the old version) but then gets the below error:* Traceback (most recent call last): File "/opt/stack/devstack/tools/outfilter.py", line 85, in sys.exit(main()) File "/opt/stack/devstack/tools/outfilter.py", line 53, in main outfile = open(opts.outfile, 'a', 0) ValueError: can't have unbuffered text I/O Next Attempt: Initially heat was installed successfully, but after the above errors from ceilometer and aodh, I commented them and executed stack.sh but now heat installation also fails with below error: *(pip with 9.x and python 2.7)* CRITICAL keystone [None req-ddd57555-0e66-4e94-8349-044e79573d01 None None] Unhandled error: TypeError: __init__() got an unexpected keyword argument 'encoding' 2021-02-01 11:43:37.083 | ERROR keystone Traceback (most recent call last): 2021-02-01 11:43:37.083 | ERROR keystone File "/usr/local/bin/keystone-manage", line 10, in 2021-02-01 11:43:37.083 | ERROR keystone sys.exit(main()) 2021-02-01 11:43:37.083 | ERROR keystone File "/opt/stack/keystone/keystone/cmd/manage.py", line 45, in main 2021-02-01 11:43:37.083 | ERROR keystone cli.main(argv=sys.argv, config_files=config_files) 2021-02-01 11:43:37.083 | ERROR keystone File "/opt/stack/keystone/keystone/cmd/cli.py", line 1331, in main 2021-02-01 11:43:37.083 | ERROR keystone CONF.command.cmd_class.main() 2021-02-01 11:43:37.083 | ERROR keystone File "/opt/stack/keystone/keystone/cmd/cli.py", line 380, in main 2021-02-01 11:43:37.083 | ERROR keystone klass.do_bootstrap() 2021-02-01 11:43:37.083 | ERROR keystone File "/opt/stack/keystone/keystone/cmd/cli.py", line 190, in do_bootstrap 2021-02-01 11:43:37.083 | ERROR keystone domain=default_domain) 2021-02-01 11:43:37.083 | ERROR keystone File "/opt/stack/keystone/keystone/common/manager.py", line 110, in wrapped 2021-02-01 11:43:37.084 | ERROR keystone __ret_val = __f(*args, **kwargs) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/resource/core.py", line 723, in create_domain 2021-02-01 11:43:37.084 | ERROR keystone domain_id, project_from_domain, initiator) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/common/manager.py", line 110, in wrapped 2021-02-01 11:43:37.084 | ERROR keystone __ret_val = __f(*args, **kwargs) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/resource/core.py", line 221, in create_project 2021-02-01 11:43:37.084 | ERROR keystone ret['domain_id']) 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/cache/region.py", line 1239, in set_ 2021-02-01 11:43:37.084 | ERROR keystone self.set(key, value) 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/cache/region.py", line 983, in set 
2021-02-01 11:43:37.084 | ERROR keystone key = self.key_mangler(key) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/common/cache/core.py", line 87, in key_mangler 2021-02-01 11:43:37.084 | ERROR keystone key = '%s:%s' % (key, invalidation_manager.region_id) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/common/cache/core.py", line 45, in region_id 2021-02-01 11:43:37.084 | ERROR keystone self._region_key, self._generate_new_id, expiration_time=-1) 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/cache/region.py", line 833, in get_or_create 2021-02-01 11:43:37.084 | ERROR keystone async_creator) as value: 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/lock.py", line 154, in __enter__ 2021-02-01 11:43:37.084 | ERROR keystone return self._enter() 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/lock.py", line 87, in _enter 2021-02-01 11:43:37.084 | ERROR keystone value = value_fn() 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/dogpile/cache/region.py", line 788, in get_value 2021-02-01 11:43:37.084 | ERROR keystone value = self.backend.get(key) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/common/cache/_context_cache.py", line 72, in get 2021-02-01 11:43:37.084 | ERROR keystone value = self._get_local_cache(key) 2021-02-01 11:43:37.084 | ERROR keystone File "/opt/stack/keystone/keystone/common/cache/_context_cache.py", line 57, in _get_local_cache 2021-02-01 11:43:37.084 | ERROR keystone value = msgpackutils.loads(value) 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/oslo_serialization/msgpackutils.py", line 487, in loads 2021-02-01 11:43:37.084 | ERROR keystone return msgpack.unpackb(s, ext_hook=ext_hook, encoding='utf-8') 2021-02-01 11:43:37.084 | ERROR keystone File "/usr/local/lib/python2.7/dist-packages/msgpack/fallback.py", line 126, in unpackb 2021-02-01 11:43:37.084 | ERROR keystone unpacker = Unpacker(None, max_buffer_size=len(packed), **kwargs) 2021-02-01 11:43:37.084 | ERROR keystone TypeError: __init__() got an unexpected keyword argument 'encoding' 2021-02-01 11:43:37.084 | ERROR keystone Thanks, Nitish Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From zoharcloud at gmail.com Mon Feb 1 14:37:11 2021 From: zoharcloud at gmail.com (Zohar Mamedov) Date: Mon, 1 Feb 2021 21:37:11 +0700 Subject: [cinder] kioxia ready for driver and connector reviews Message-ID: Hi Cinder team, The Kioxia Kumoscale NVMeOF driver is ready for reviews. Gate tests are passing and "KIOXIA CI" is leaving comments on review threads. Here is an example of last response it posted: https://review.opendev.org/c/openstack/cinder/+/765304/2#message-6ca11d3b3c8a28666c74c65aeece090acfe26f9b All changes that the CI responded to can be found here: http://104.254.65.37/ (each will also have a gerrit_event.json which will have a ['change']['url'] with a link to the review page) Part of this contribution is a major expansion to the nvmeof connector in os-brick. This was initially discussed in the PTG meetings, and mentioned in some follow up IRC meetings. The main change is adding (optional) support for mdraid replication of nvmeof volumes. Target connection sharing is also implemented (removing the need for a separate nvmeof target per volume.) 
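For reviewers who have not worked with the replication piece before, the sketch below is a rough host-side picture of what a replicated attach amounts to: connect to one namespace per backend, then mirror them with mdraid. It is illustrative only - the transport, addresses, NQNs and device names are made up, and the connector drives the equivalent steps through os-brick privileged calls rather than by shelling out like this.

# Illustrative only: addresses, NQNs and device names are placeholders, and the
# real connector performs the equivalent steps through os-brick, not the CLI.
import subprocess

replicas = [
    ("10.0.0.11", "nqn.2021-02.example:vol-0-replica-a"),
    ("10.0.0.12", "nqn.2021-02.example:vol-0-replica-b"),
]

# Connect one NVMe-oF namespace per backend.
for address, nqn in replicas:
    subprocess.run(
        ["nvme", "connect", "-t", "tcp", "-a", address, "-s", "4420", "-n", nqn],
        check=True,
    )

# Mirror the resulting block devices so the volume survives losing one backend.
# Device names depend on enumeration order on the host.
subprocess.run(
    ["mdadm", "--create", "/dev/md/vol-0", "--level=1", "--raid-devices=2",
     "/dev/nvme0n1", "/dev/nvme1n1"],
    check=True,
)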
Additionally, a concurrency / race condition issue in the existing nvmeof connector (nvmeof.py line 120) is resolved as part of these changes. Though this change is a big rework, it is in line with the spec, which had an in depth review and was approved: https://review.opendev.org/c/openstack/cinder-specs/+/766730 We are dedicated to addressing possible regressions, however, following a review of existing CIs using or testing the nvmeof connector, none were found to be using it. All suspected CIs seem to pass or be using a different connector. This is to the best of our knowledge, we suspect that Kioxia may be the first CI to be testing with NVMeOF protocol and driver. If we missed anything, please let us know with a review comment. Kioxia Kumoscale driver review: https://review.opendev.org/c/openstack/cinder/+/768574 NVMeOF connector review: https://review.opendev.org/c/openstack/os-brick/+/768575 Thank you! Zohar -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalystcloud.nz Tue Feb 2 00:43:01 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Tue, 2 Feb 2021 13:43:01 +1300 Subject: [keystone][osc]Strange behaviour of OSC in keystone MFA context In-Reply-To: References: <27cda0ba41634425b5c4d688381d6107@elca.ch> <18d48a3d208317e9c9b220ff20ae2f46a2442ef0.camel@redhat.com> Message-ID: <55f07666-3315-d02d-5075-13e0f677420c@catalystcloud.nz> *puts up hand* You can blame me for this. When I implemented this I didn't (and still don't) fully understand how the loading stuff works in Keystoneauth and how it works with other things like OSC. I was more focused on getting the direct auth/session side working because the loading stuff isn't that useful in OSC really (see below why). If anyone does know the loader side of keystoneauth better please chime in and save me from making too much of a fool of myself! I think adding `.split(',')` is probably enough, but you'd want to also update the help text clarify that the option is a 'comma separated list'. The biggest issue, and why this area never got much testing, is because it is effectively useless since you'd have to supply your MFA values EVERY command. Imagine how awful that would be for TOTP. The whole point of the MFA process in keystone with auth-receipt was a dynamic interactive login. Supplying the MFA upfront isn't that useful. What the OSC really needs is a `login` command, that goes through a login process using the auth-receipts workflow from keystone (asks for password/totp) and sets some sort of state file. We can't set the environment variables of the parent shell process, so we'd have to go with a state/session file. But to avoid it clashing with other state files and terminal sessions we'd need some way to tag them by the parent process ID so you can login to more than one cloud/project/user/etc across multiple terminals. In addition it would be really nice if the OSC had some way of reusing a scoped token/catalog rather than having to fetch it every time, but I feel that would have to include changes in Keystoneauth to supply it some cached data which tells it to not attempt to reauthenticate but rather trust the catalog/token supplied (does keystoneauth support that?). Because that reauth every command via Keystoneauth is actually what often takes longer than the actual API calls... We can also just throw catalog into that state/session file as json/yaml. On 29/01/21 7:03 am, Stephen Finucane wrote: > The definition for those opts can be found at [1]. 
As Sean thought it might be, > that is using the default type defined in the parent 'Opt' class of 'str' [2]. > We don't expose argparse's 'action' parameter that would allow us to use the > 'append' action, so you'd have to fix this by parsing whatever the user provided > after the fact. I suspect you could resolve the immediate issue by changing this > line [3] from: > > self._methods = kwargs['auth_methods'] > > to: > > self._methods = kwargs['auth_methods'].split(',') > > However, I assume there's likely more to this issue. I don't have an environment > to hand to validate this fix, unfortunately. > > If you do manage to test that change and it works, I'd be happy to help you in > getting a patch proposed to 'keystoneauth'. > > Hope this helps, > Stephen > > [1] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/_plugins/identity/v3.py#L316-L330 > [2] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/opts.py#L65 > [3] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/_plugins/identity/v3.py#L338 > >>> Jean-François > > From feilong at catalyst.net.nz Tue Feb 2 02:22:55 2021 From: feilong at catalyst.net.nz (feilong) Date: Tue, 2 Feb 2021 15:22:55 +1300 Subject: [magnum][heat] Rolling system upgrades In-Reply-To: References: <14a045a4-8705-438d-942f-f11f2d0258b2@www.fastmail.com> <09e337c8-5fed-40aa-6835-ea5f64d6d943@catalyst.net.nz> Message-ID: <04e1eaf4-ab10-ef90-2e3f-a0743b2f7413@catalyst.net.nz> HI Krzysztof, For the first point, at least for new node, we probably need to update the image ID of the Heat template. But for existing node, there is no way to update the image ID because it's a readonly attribute from Nova perspective. And I agree a non-invasive way would be good. I'm going to discuss with Heat team to understand if there is a pre-update hook we can leverage to drain before updating the OS. I will keep you posted. On 26/01/21 9:42 pm, Krzysztof Klimonda wrote: > Hi Feilong, > > Regarding first point, could you share your idea on how to fix it? I haven't yet put much thought into that, but solving system-level upgrades for nodes (and getting it to work with auto healing/auto scaling etc.) is something I'll have to tackle myself if we want to go into production with full feature-set, and I'd be happy to put work into that. > > Regarding image updates, perhaps that has been fixed since ussuri? I'm testing it on some ussuri snapshot, my nodes use images for root disk, and I can see that the updated image property from cluster template is not populated into heat stack itself. > I see your point about lack of communication from magnum to the cluster, but perhaps that could be handled in a similar way as OS::Heat::Software* updates, with an agent running on nodes? Perhaps heat's pre-update hook could be used, with agent clearing it after node has been drained. I'm not overly familiar with heat inner workings, and I was hoping that someone with more heat experience could chime in and give some idea how that could be handled. > Perhaps OS::Heat::Software* resources already provide a way to handle this (although I'm not sure how could that work given that they are probably updated only after server resource update is processed). > > I feel like getting images to update in a non-invasive way would be a cleaner and safer way of handling OS-level upgrades, although I'm not sure how feasible it is in the end. 
> -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ From midhunlaln66 at gmail.com Tue Feb 2 05:00:21 2021 From: midhunlaln66 at gmail.com (Midhunlal Nb) Date: Tue, 2 Feb 2021 10:30:21 +0530 Subject: LDAP integration with openstack Message-ID: Hi, I tried to integrate ldap with my openstack. I followed an openstack document and completed the ldap integration,after that I am getting a lot of errors,I am not able to run any openstack commands. -->I followed below document https://docs.openstack.org/keystone/rocky/admin/identity-integrate-with-ldap.html#integrate-identity-backend-ldap --->I am getting below errors root at controller:~/client-scripts# openstack image list The request you have made requires authentication. (HTTP 401) (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) --->so I checked keystone log POST http://controller:5000/v3/auth/tokens 2021-01-29 11:16:36.881 28558 WARNING keystone.auth.plugins.core [req-cf013eff-6e1e-43c4-a6ae-9f91f4fe48f9 - - - - -] Could not find user: neutron.: UserNotFound: Could not find user: neutron. 2021-01-29 11:16:36.881 28558 WARNING keystone.common.wsgi [req-cf013eff-6e1e-43c4-a6ae-9f91f4fe48f9 - - - - -] Authorization failed. The request you have made requires authentication. from192.168.xxx.xx: Unauthorized: The request you have made requires authentication. 2021-01-29 11:17:22.009 28556 INFO keystone.common.wsgi [req-a2a480a7-2ee1-4e11-8a48-dcf93ffb96db - - - - -] POSthttp://controller:5000/v3/auth/tokens 2021-01-29 11:17:22.039 28556 WARNING keystone.auth.plugins.core [req-a2a480a7-2ee1-4e11-8a48-dcf93ffb96db - - - - -] Could not find user: placement.: UserNotFound: Could not find user: placement. Anyone please help me on this issue. Thanks & Regards Midhunlal N B +918921245637 -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Feb 2 06:57:55 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 2 Feb 2021 07:57:55 +0100 Subject: LDAP integration with openstack In-Reply-To: References: Message-ID: On Tue, Feb 2, 2021 at 6:01 AM Midhunlal Nb wrote: > > Hi, Hi Midhunlal, > I tried to integrate ldap with my openstack. I followed an openstack > document and completed the ldap integration,after that I am getting > a lot of errors,I am not able to run any openstack commands. > -->I followed below document > https://docs.openstack.org/keystone/rocky/admin/identity-integrate-with-ldap.html#integrate-identity-backend-ldap > > --->I am getting below errors > root at controller:~/client-scripts# openstack image list > The request you have made requires authentication. (HTTP 401) > (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) > > --->so I checked keystone log > > POST http://controller:5000/v3/auth/tokens > 2021-01-29 11:16:36.881 28558 WARNING keystone.auth.plugins.core > [req-cf013eff-6e1e-43c4-a6ae-9f91f4fe48f9 - - - - -] Could not find > user: neutron.: UserNotFound: Could not find user: neutron. > 2021-01-29 11:16:36.881 28558 WARNING keystone.common.wsgi > [req-cf013eff-6e1e-43c4-a6ae-9f91f4fe48f9 - - - - -] Authorization > failed. The request you have made requires authentication. > from192.168.xxx.xx: Unauthorized: The request you have made requires > authentication. 
> 2021-01-29 11:17:22.009 28556 INFO keystone.common.wsgi > [req-a2a480a7-2ee1-4e11-8a48-dcf93ffb96db - - - - -] > POSthttp://controller:5000/v3/auth/tokens > 2021-01-29 11:17:22.039 28556 WARNING keystone.auth.plugins.core > [req-a2a480a7-2ee1-4e11-8a48-dcf93ffb96db - - - - -] Could not find > user: placement.: UserNotFound: Could not find user: placement. That is because, if you switch the main domain from SQL to LDAP, it will no longer "see" the users defined in the SQL database. You can either define them again in LDAP or use LDAP with a different domain. I find the latter a much cleaner solution. -yoctozepto From adriant at catalystcloud.nz Tue Feb 2 06:58:50 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Tue, 2 Feb 2021 19:58:50 +1300 Subject: [All][StoryBoard] Angular.js Alternatives In-Reply-To: References: Message-ID: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> Sorry for being late to the thread, but it might be worth touching base with Horizon peeps as well, because there could potentially be some useful knowledge/contributor sharing if we all stick to similar libraries for front end when the eventually Horizon rewrite starts. I'm sadly also out of the loop there so no clue if they even know what direction they plan to go in. Although given the good things I've heard about Vue.js I can't say it's a bad choice (but neither would have been react). On 29/01/21 7:33 am, Kendall Nelson wrote: > To circle back to this, the StoryBoard team has decided on Vue, given > some contributors previous experience with it and a POC already started. > > Thank you everyone for your valuable input! We really do appreciate it! > > -Kendall (diablo_rojo) > > On Thu, Jan 21, 2021 at 1:23 PM Kendall Nelson > wrote: > > Hello Everyone! > > The StoryBoard team is looking at alternatives to Angular.js since > its going end of life. After some research, we've boiled all the > options down to two possibilities: > > Vue.js > > or > > React.js > > I am diving more deeply into researching those two options this > week, but any opinions or feedback on your experiences with either > of them would be helpful! > > Here is the etherpad with our research so far[3]. > > Feel free to add opinions there or in response to this thread! > > -Kendall Nelson (diablo_rojo) & The StoryBoard Team > > [1] https://vuejs.org/ > [2] https://reactjs.org/ > [3] > https://etherpad.opendev.org/p/replace-angularjs-storyboard-research > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalystcloud.nz Tue Feb 2 07:40:22 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Tue, 2 Feb 2021 20:40:22 +1300 Subject: Ospurge or "project purge" - What's the right approach to cleanup projects prior to deletion In-Reply-To: <5C651C9C-0D00-4CB8-9992-4AC23D92FE38@gmail.com> References: <76498a8c-c8a5-9488-0223-3f47ac4486df@inovex.de> <0CC2DFF7-5721-4106-A06B-6FC2970AC07B@gmail.com> <7237beb7-a68a-0398-f779-aef76fbc0e82@debian.org> <10C08D43-B4E6-4423-B561-183A4336C488@gmail.com> <9f408ffe-4046-76e0-bbdf-57ee94191738@inovex.de> <5C651C9C-0D00-4CB8-9992-4AC23D92FE38@gmail.com> Message-ID: OH! As someone who tried to champion project termination at the API layer as a community goal during the Berlin summit, I love that someone ran with our idea of implementing it in the SDK as a starting point! I still think we should champion a purge API for each service, but the SDK is definitely the place to handle dependencies, and I love that you are doing it in threads! 
I'd be happy to review/test any code related to this both in the OSC and the SDK. I sadly haven't had time to help implement anything like this, but I will somehow find the time to review/test! And I look forward to one day throwing away our internal termination logic in favor of what's in the SDK! On 20/01/21 2:16 am, Artem Goncharov wrote: > Hi Christian. > > Actually the patch stuck due to lack of reviewers. Idea here was not > to replace “openstack project purge”, but to add a totally new > implementation (hopefully later dropping project purge as such). From > my POV at the moment there is nothing else I was thinking to > mandatorily implement on OSC side (sure, for future I would like to > give possibility to limit services to cleanup, to add cleanup of > key-airs, etc). SDK part is completely independent of that. Here we > definitely need to add dropping of private images. Also on DNS side we > can do cleanup. Other services are tricky (while swift we can still > implement relatively easy). > > All in all - we can merge the PR in it’s current form (assuming we get > some positive reviews). > > BG, > Artem > >> On 19. Jan 2021, at 13:33, Christian Rohmann >> > wrote: >> >> Hey Artem, >> >> thank you very much for your quick reply and pointer to the patchset >> you work on! >> >> >> >> On 18/01/2021 20:14, Artem Goncharov wrote: >>> Ha, thats exactly the case, the whole logic sits in sdk and is >>> spread across the supported services: >>> - >>> https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/compute/v2/_proxy.py#L1798 >>>  - >>> for compute. KeyPairs not dropped, since they belong to user, and >>> not to the “project”; >>> - >>> https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/block_storage/v3/_proxy.py#L547 >>>  - >>> block storage; >>> - >>> https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/orchestration/v1/_proxy.py#L490 >>> >>> - >>> https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/network/v2/_proxy.py#L4130 >>>  - >>> the most complex one in order to give possibility to clean “old” >>> resource without destroying everything else >>> >>> Adding image is few lines of code (never had enough time to add it), >>> identity is a bit tricky, since also here mostly resources does not >>> belong to Project. DNS would be also easy to do. OSC here is only >>> providing I/F, while the logic sits in SDK and can be very easy >>> extended for other services. >> >>>> On 18. Jan 2021, at 19:52, Thomas Goirand >>> > wrote: >>>> >>>> On 1/18/21 6:56 PM, Artem Goncharov wrote: >>>>> What do you mean it doesn’t implement anything at all? It does >>>>> clean up compute, network, block_storage, orchestrate resources. >>>>> Moreover it gives you possibility to clean “old” resources >>>>> (created before or last updated before). >>>> >>>> Oh really? With that few lines of code? I'll re-read the patch then, >>>> sorry for my bad assumptions. >>>> >>>> Can you point at the part that's actually deleting the resources? >> >> If I understood correctly, the cleanup relies on the SDK >> functionality / requirement for each resource type to provide a >> corresponding function( >> https://github.com/openstack/openstacksdk/blob/master/openstack/cloud/openstackcloud.py#L762) >> ? >> >> Reading through the (SDK) code this even covers depending resources, >> nice! >> >> >> I certainly will leave some feedback and comments in your change >> (https://review.opendev.org/c/openstack/python-openstackclient/+/734485). 
>> But what are your immediate plans moving forward on with this then, >> Artem? >> >> There is a little todo list in the description on your change .. is >> there anything you yourself know that is still missing before taking >> this to a full review and finally merging it? >> >> Only code that is shipped and then actively used will improve further >> and people will notice other required functionality or switches for >> later iterations. With the current state of having a somewhat working >> but unmaintained ospurge and a non feature complete "project purge"  >> (supports only Block Storage v1, v2; Compute v2; Image v1, v2) this >> will only cause people to start hacking away on the ospurge codebase >> or worse building their own tools and scripts to implement project >> cleanup for their environments over and over again. >> >> >> >> Regards, >> >> >> Christian >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Tue Feb 2 08:06:33 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 02 Feb 2021 09:06:33 +0100 Subject: [nova][placement] adding nova-core to placement-core in gerrit In-Reply-To: References: Message-ID: Hi, I've now added nova-core to the placement-core group[1] [1] https://review.opendev.org/admin/groups/93c2b262ebfe0b3270c0b7ad60de887b02aaba9d,members On Wed, Jan 27, 2021 at 09:49, Stephen Finucane wrote: > On Tue, 2021-01-26 at 17:14 +0100, Balazs Gibizer wrote: >> Hi, >> >> Placement got back under nova governance but so far we haven't >> consolidated the core teams yet. Stephen pointed out to me that >> given >> the ongoing RBAC works it would be beneficial if more nova cores, >> with >> API and RBAC experience, could approve such patches. So I'm >> proposing >> to add nova-core group to the placement-core group in gerrit. This >> means Ghanshyam, John, Lee, and Melanie would get core rights in the >> placement related repositories. >> >> @placement-core, @nova-core members: Please let me know if you have >> any >> objection to such change until end of this week. > > I brought it up and obviously think it's a sensible idea, so it's an > easy +1 > from me. > > Stephen > >> cheers, >> gibi > > > From sathlang at redhat.com Tue Feb 2 11:07:17 2021 From: sathlang at redhat.com (Sofer Athlan-Guyot) Date: Tue, 02 Feb 2021 12:07:17 +0100 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core In-Reply-To: References: Message-ID: <87r1lyramy.fsf@redhat.com> Hi, as one of his colleague, I'll definitively support his nomination even though I have no authority on oslo lib whatsoever :) Herve Beraud writes: > Hi, > > Daniel has been working on Oslo for quite some time now and during that time > he was a great help on Oslo. I think he would make a good addition to the > general Oslo core team. > > He helped us by proposing patches to fix issues, maintaining stable branches > by backporting changes, and managing our releases by assuming parts of the > release liaison role. > > Existing Oslo team members (and anyone else we > co-own libraries with) please respond with +1/-1. If there are no > objections we'll add him to the ACL soon. :-) > > Thanks. 
Thanks, -- Sofer Athlan-Guyot chem on #irc at rhos-upgrades DFG:Upgrades Squad:Update From zigo at debian.org Tue Feb 2 11:32:29 2021 From: zigo at debian.org (Thomas Goirand) Date: Tue, 2 Feb 2021 12:32:29 +0100 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <88041612182037@mail.yandex.ru> References: <6e817a0e-aaa7-9444-fca3-6c5ae8ed2ae7@debian.org> <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> Message-ID: <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> On 2/1/21 1:21 PM, Dmitriy Rabotyagov wrote: > Yes, I can confirm that amqp of version 5.0.3 and later does not accept self-signed certificates in case root ca has not been provided. > It has been bumped to 5.0.5 in u-c recently which made things fail for us everywhere now. > > However, in case of adding root CA into the system things continue working properly. I'm making a bit of progress over here. I found out that downgrading to python3-dnspython 1.16.0 made swift-proxy (and probably others) back to working. What version of dnspython are you using in RDO? Cheers, Thomas Goirand (zigo) From amoralej at redhat.com Tue Feb 2 11:53:48 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Tue, 2 Feb 2021 12:53:48 +0100 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> References: <6e817a0e-aaa7-9444-fca3-6c5ae8ed2ae7@debian.org> <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> Message-ID: On Tue, Feb 2, 2021 at 12:36 PM Thomas Goirand wrote: > On 2/1/21 1:21 PM, Dmitriy Rabotyagov wrote: > > Yes, I can confirm that amqp of version 5.0.3 and later does not accept > self-signed certificates in case root ca has not been provided. > > It has been bumped to 5.0.5 in u-c recently which made things fail for > us everywhere now. > > > > However, in case of adding root CA into the system things continue > working properly. > > I'm making a bit of progress over here. > > I found out that downgrading to python3-dnspython 1.16.0 made > swift-proxy (and probably others) back to working. > > What version of dnspython are you using in RDO? > > The current build bundles 1.16.0. Regards, Alfredo > Cheers, > > Thomas Goirand (zigo) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Tue Feb 2 12:20:41 2021 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 2 Feb 2021 13:20:41 +0100 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core In-Reply-To: <87r1lyramy.fsf@redhat.com> References: <87r1lyramy.fsf@redhat.com> Message-ID: Le mar. 2 févr. 2021 à 12:13, Sofer Athlan-Guyot a écrit : > Hi, > > as one of his colleague, I'll definitively support his nomination even > though I have no authority on oslo lib whatsoever :) > It reinforces my feelings concerning Daniel's good job! Thank you Sofer! > Herve Beraud writes: > > > Hi, > > > > Daniel has been working on Oslo for quite some time now and during that > time > > he was a great help on Oslo. I think he would make a good addition to the > > general Oslo core team. 
> > > > He helped us by proposing patches to fix issues, maintaining stable > branches > > by backporting changes, and managing our releases by assuming parts of > the > > release liaison role. > > > > Existing Oslo team members (and anyone else we > > co-own libraries with) please respond with +1/-1. If there are no > > objections we'll add him to the ACL soon. :-) > > > > Thanks. > > Thanks, > -- > Sofer Athlan-Guyot > chem on #irc at rhos-upgrades > DFG:Upgrades Squad:Update > > > -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Tue Feb 2 13:30:45 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 2 Feb 2021 08:30:45 -0500 Subject: [tc] weekly meeting Message-ID: Hi everyone, Here’s an update on what happened in the OpenStack TC this week. You can get more information by checking for changes in the openstack/governance repository. 
# Patches ## Open Reviews - Remove Karbor project team https://review.opendev.org/c/openstack/governance/+/767056 - Create ansible-role-pki repo https://review.opendev.org/c/openstack/governance/+/773383 - Add assert:supports-api-interoperability tag to neutron https://review.opendev.org/c/openstack/governance/+/773090 - Add rbd-iscsi-client to cinder project https://review.opendev.org/c/openstack/governance/+/772597 - Cool-down cycle goal https://review.opendev.org/c/openstack/governance/+/770616 - Define Xena release testing runtime https://review.opendev.org/c/openstack/governance/+/770860 - monasca-log-api & monasca-ceilometer does not make releases https://review.opendev.org/c/openstack/governance/+/771785 - Define 2021 upstream investment opportunities https://review.opendev.org/c/openstack/governance/+/771707 - [manila] add assert:supports-api-interoperability https://review.opendev.org/c/openstack/governance/+/770859 ## Project Updates - Setting Ke Chen as Watcher's PTL https://review.opendev.org/c/openstack/governance/+/770913 - Move openstack-tempest-skiplist to release-management: none https://review.opendev.org/c/openstack/governance/+/771488 - js-openstack-lib does not make releases https://review.opendev.org/c/openstack/governance/+/771789 ## General Changes - Drop openSUSE from commonly tested distro list https://review.opendev.org/c/openstack/governance/+/770855 - Add Resolution of TC stance on the OpenStackClient https://review.opendev.org/c/openstack/governance/+/759904 ## Abandoned Projects - WIP NO MERGE Move os-*-config to Heat project governance https://review.opendev.org/c/openstack/governance/+/770285 # Other Reminders - Our next [TC] Weekly meeting is scheduled for February 4th at 1500 UTC. If you would like to add topics for discussion, please go to https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting and fill out your suggestions by Wednesday, February 3rd, at 2100 UTC. Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From rosmaita.fossdev at gmail.com Tue Feb 2 13:45:22 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 2 Feb 2021 08:45:22 -0500 Subject: [cinder] wallaby R-9 mid-cycle poll In-Reply-To: <8edeadaf-baba-616a-6db8-2c45bab0271d@gmail.com> References: <8edeadaf-baba-616a-6db8-2c45bab0271d@gmail.com> Message-ID: Reminder: the poll closes in 7 hours. Vote now if you have a preference for when the midcycle is held. On 1/27/21 10:04 PM, Brian Rosmaita wrote: > Hello Cinder team and anyone interested in Cinder, > > The second wallaby mid-cycle meeting will be held during week R-9 (the > week of 8 February 2021).  It will be 2 hours long. > > Please indicate your availability on the following poll: > >   https://doodle.com/poll/9vqri7855cab858d > > Please respond before 21:00 UTC on Tuesday 2 February 2021. > > > thanks, > brian From amotoki at gmail.com Tue Feb 2 14:21:30 2021 From: amotoki at gmail.com (Akihiro Motoki) Date: Tue, 2 Feb 2021 23:21:30 +0900 Subject: [horizon][i18n][infra] Renaming Chinese locales in Django from zh-cn/zh-tw to zh-hans/zh-hant Message-ID: Hi, The horizon team is planning to switch Chinese language codes in Django codes from zh-cn/zh-tw to zh-hans/zh-hant. Django, a framework used in horizon, recommends to use them since more than 5 years ago [1][2]. This change touches Chinese locales in the dashbaord codes of horizon and its plugins only. It does not change Chinese locales in other translations like documentations and non-Django python codes. 
This is to minimize the impact to other translations and the translation platform. ### What are/are not changed in repositories * horizon and horizon plugins * locales in the dashboard codes are renamed from zh-cn/zh-tw to zh-hans/zh-hant * locales in doc/ and releasenotes/ are not changed * other repositories * no locale change happens NOTE: * This leads to a situation that we have two different locales in horizon and plugin repositories (zh-hans/hant in the code folders and zh-cn/tw in doc and releasenotes folders), but it affects only developers and does not affect horizon consumers (operators/users) and translators. * In addition, documentations are translated in OpenStack-wide (not only in horizon and plugins). By keeping locales in docs, locales in documentation translations will be consistent. ### Impact on Zanata In Zanata (the translation platform), zh-cn/zh-tw continue to be used, so no change is visible to translators. The infra job proposes zh-cn/zh-tw GUI translatoins as zh-hans/zh-hant translations to horizon and plugin repositories. NOTE: The alternative is to create the corresponding language teams (zh-hans/zh-hant) in Zanata, but it affects Chinese translators a lot. They need to join two language teams to translate horizon (zh-hans/zh-hant) and docs (zh-cn/zh-tw). It makes translator workflow complicated. The proposed way has no impact on translators and they can continue the current translation process and translate both horizon and docs under a single language code. ### Changes in the infra scripts Converting Chinese locales of dashboard translations from zh-cn/zh-tw to zh-hans/zh-hant is handled by the periodic translation job. propose_translation_update job is responsible for this. [propose_translation_update.sh] * Move zh-cn/zh-tw translations related to Django codes in horizon and its plugins from zanata to zh-hans/hant directory. * This should happen in the master branch (+ future stable branhces such as stable/wallaby). ### Additional Remarks I18n SIG respects all language team coordinators & members, and is looking forward to seeing discussions and/or active contributions from the language teams. Currently all language codes follow the ISO 639-1 standard (language codes with relevant country codes), but the change introduces new language code forms like zh-hans/zh-hant. This follows IETF BCP 47 which recommends a combination of language codes and ISO 15924 script code (four letters). We now have two different language codes in OpenStack world. This is just to minimize the impact on the existing translations. It is not ideal. We are open for further discussion on language codes and translation support. ### References [1] https://code.djangoproject.com/ticket/18419 [2] https://www.djbook.ru/rel1.7/releases/1.7.html#language-codes-zh-cn-zh-tw-and-fy-nl Thanks, Akihiro Motoki (irc: amotoki) From fungi at yuggoth.org Tue Feb 2 15:57:35 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 2 Feb 2021 15:57:35 +0000 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> References: <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> Message-ID: <20210202155734.ym423voolru6voxn@yuggoth.org> On 2021-02-02 12:32:29 +0100 (+0100), Thomas Goirand wrote: [...] 
> I found out that downgrading to python3-dnspython 1.16.0 made
> swift-proxy (and probably others) back to working.
[...]

If memory serves, dnspython and eventlet both monkey-patch the
stdlib in potentially conflicting ways, and we've seen them interact
badly in the past.
--
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: 

From smooney at redhat.com  Tue Feb 2 16:35:17 2021
From: smooney at redhat.com (Sean Mooney)
Date: Tue, 02 Feb 2021 16:35:17 +0000
Subject: [all] Eventlet broken again with SSL, this time under Python 3.9
In-Reply-To: <20210202155734.ym423voolru6voxn@yuggoth.org>
References: <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com>
 <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com>
 <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru>
 <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org>
 <20210202155734.ym423voolru6voxn@yuggoth.org>
Message-ID: <4f56323c6fc4df53a007a1c2483d387f7d0c0045.camel@redhat.com>

On Tue, 2021-02-02 at 15:57 +0000, Jeremy Stanley wrote:
> On 2021-02-02 12:32:29 +0100 (+0100), Thomas Goirand wrote:
> [...]
> > I found out that downgrading to python3-dnspython 1.16.0 made
> > swift-proxy (and probably others) back to working.
> [...]
>
> If memory serves, dnspython and eventlet both monkey-patch the
> stdlib in potentially conflicting ways, and we've seen them interact
> badly in the past.

Upstream eventlet forces 1.16.0 to be used via its requirements files, in
response to us filing an upstream bug after 2.0.0 was released, so it is
known that you can't use dnspython 2.0.0 with eventlet currently. Part of
the issue, however, is that this was not communicated to distros well, so
Fedora, for example, ships eventlet and dnspython 2.0.0 in f33. They are
technically incompatible, but since the upper limit is not in the eventlet
spec file the packagers were not aware of that. Eventlet has fixed some of
the incompatibilities in the last few months, but not all of them.

From stephenfin at redhat.com  Tue Feb 2 16:42:50 2021
From: stephenfin at redhat.com (Stephen Finucane)
Date: Tue, 02 Feb 2021 16:42:50 +0000
Subject: Integration tests failing on master branch for ppc64le
In-Reply-To: 
References: 
Message-ID: 

On Mon, 2021-02-01 at 11:35 +0000, aditi Dukle wrote:
> Hi,
>
> We recently observed failures on master branch jobs for ppc64le. The VMs fail
> to boot and this is the error I observed in libvirt logs -
> "qemuDomainUSBAddressAddHubs:2997 : unsupported configuration: USB is disabled
> for this domain, but USB devices are present in the domain XML"
>
> Detailed logs -
> https://oplab9.parqtec.unicamp.br/pub/ppc64el/openstack/nova/96/758396/7/check/tempest-dsvm-full-focal-py3/04fc9e7/job-output.txt
>
> Could you please let me know if there was any recent configuration that was
> done to disable USB configuration in the instance domain?
>
> Thanks,
> Aditi Dukle

Thanks for bringing this up. I've proposed a change [1] that should resolve
this issue. I'll wait and see what the CI says before we merge it.
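For anyone curious about the error itself: libvirt raises it when the guest XML disables the USB controller but still carries USB-attached devices, i.e. roughly the combination sketched below. This is an illustrative, hand-written fragment rather than XML taken from the failing job:

    <!-- illustrative fragment only, not taken from the failing job -->
    <devices>
      <!-- USB is explicitly disabled for the domain... -->
      <controller type='usb' model='none'/>
      <!-- ...yet a USB-attached device is still present -->
      <input type='tablet' bus='usb'/>
    </devices>

Presumably any fix comes down to not generating that combination in the first place.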
Cheers, Stephen [1] https://review.opendev.org/c/openstack/nova/+/773728 From C-Albert.Braden at charter.com Tue Feb 2 17:18:51 2021 From: C-Albert.Braden at charter.com (Braden, Albert) Date: Tue, 2 Feb 2021 17:18:51 +0000 Subject: [kolla][adjutant] Granular quota settings with Adjutant Message-ID: We have been experimenting with Adjutant, and it offers a menu of sizes (default small, medium and large). It's easy to add more sizes, but we would like to allow customers to request granular changes. For example, if a customer needs to add 10 cores and 5 instances to his existing quota. This would require some changes to Horizon, and an Adjutant plugin. Is anyone already doing this, or working in that direction? I apologize for the nonsense below. So far I have not been able to stop it from being attached to my external emails. I'm working on it. E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Tue Feb 2 17:37:13 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 2 Feb 2021 17:37:13 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> Message-ID: <20210202173713.GA14971@sync> Hey all, I will start the answers :) At OVH, our hard limit is around 1500 hypervisors on a region. It also depends a lot on number of instances (and neutron ports). The effects if we try to go above this number: - load on control plane (db/rabbit) is increasing a lot - "burst" load is hard to manage (e.g. restart of all neutron agent or nova computes is putting a high pressure on control plane) - and of course, failure domain is bigger Note that we dont use cells. We are deploying multiple regions, but this is painful to manage / understand for our clients. We are looking for a solution to unify the regions, but we did not find anything which could fit our needs for now. Cheers, -- Arnaud Morin On 28.01.21 - 14:24, Thierry Carrez wrote: > Hi everyone, > > As part of the Large Scale SIG[1] activities, I'd like to quickly poll our > community on the following question: > > How many compute nodes do you feel comfortable fitting in a single-cluster > deployment of OpenStack, before you need to scale it out to multiple > regions/cells/.. ? > > Obviously this depends on a lot of deployment-dependent factors (type of > activity, choice of networking...) so don't overthink it: a rough number is > fine :) > > [1] https://wiki.openstack.org/wiki/Large_Scale_SIG > > Thanks in advance, > > -- > Thierry Carrez (ttx) > From smooney at redhat.com Tue Feb 2 17:50:27 2021 From: smooney at redhat.com (Sean Mooney) Date: Tue, 02 Feb 2021 17:50:27 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? 
In-Reply-To: <20210202173713.GA14971@sync>
References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org>
 <20210202173713.GA14971@sync>
Message-ID: <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com>

On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote:
> Hey all,
>
> I will start the answers :)
>
> At OVH, our hard limit is around 1500 hypervisors on a region.
> It also depends a lot on number of instances (and neutron ports).
> The effects if we try to go above this number:
> - load on control plane (db/rabbit) is increasing a lot
> - "burst" load is hard to manage (e.g. restart of all neutron agent or
>   nova computes is putting a high pressure on control plane)
> - and of course, failure domain is bigger
>
> Note that we dont use cells.
> We are deploying multiple regions, but this is painful to manage /
> understand for our clients.
> We are looking for a solution to unify the regions, but we did not find
> anything which could fit our needs for now.

I assume you do not see cells v2 as a replacement for multiple regions
because they do not provide independent fault domains, and also because
they are only a nova feature, so they do not solve the scaling issues in
other services like neutron, which are stretched across all cells. Cells
are a scaling mechanism, but the larger the cloud, the harder it is to
upgrade, and cells do not help with that; in fact, by adding more
controllers they hinder upgrades. Separate regions can all be upgraded
independently and can be fault tolerant if you don't share services
between regions and use federation to avoid sharing keystone.

Glad to hear you can manage 1500 compute nodes, by the way. The old value
of 500 nodes max has not been true for a very long time; rabbitmq and the
db still tend to be the bottleneck to scaling beyond 1500 nodes, however,
outside of the operational overhead.

>
> Cheers,
>

From artem.goncharov at gmail.com  Tue Feb 2 18:22:18 2021
From: artem.goncharov at gmail.com (Artem Goncharov)
Date: Tue, 2 Feb 2021 19:22:18 +0100
Subject: [keystone][osc]Strange behaviour of OSC in keystone MFA context
In-Reply-To: <55f07666-3315-d02d-5075-13e0f677420c@catalystcloud.nz>
References: <27cda0ba41634425b5c4d688381d6107@elca.ch>
 <18d48a3d208317e9c9b220ff20ae2f46a2442ef0.camel@redhat.com>
 <55f07666-3315-d02d-5075-13e0f677420c@catalystcloud.nz>
Message-ID: 

Hi

> On 2. Feb 2021, at 01:43, Adrian Turjak wrote:
>
> *puts up hand*
>
> You can blame me for this. When I implemented this I didn't (and still don't) fully understand how the loading stuff works in Keystoneauth and how it works with other things like OSC. I was more focused on getting the direct auth/session side working because the loading stuff isn't that useful in OSC really (see below why).
>
> If anyone does know the loader side of keystoneauth better please chime in and save me from making too much of a fool of myself!
>
> I think adding `.split(',')` is probably enough, but you'd want to also update the help text to clarify that the option is a 'comma separated list'.
>
>
> The biggest issue, and why this area never got much testing, is because it is effectively useless since you'd have to supply your MFA values EVERY command. Imagine how awful that would be for TOTP. The whole point of the MFA process in keystone with auth-receipt was a dynamic interactive login. Supplying the MFA upfront isn't that useful.
>
> What the OSC really needs is a `login` command, that goes through a login process using the auth-receipts workflow from keystone (asks for password/totp) and sets some sort of state file.
We can't set the environment variables of the parent shell process, so we'd have to go with a state/session file. But to avoid it clashing with other state files and terminal sessions we'd need some way to tag them by the parent process ID so you can login to more than one cloud/project/user/etc across multiple terminals. I guess we can do something about that. Recently Monty started and I took over the patch for adding token caching in the keyring[1]. As such it will not really help, but together with [2] and [3] we can use authorisation caching on the OSC side. I was never really giving priority to this, since in a regular use case it perhaps saves .5 - 1 second, what is not really noticeable (most time is wasted on initialization). However in this context it might become really handy. Feel free to trigger discussion if that looks important. And yes, I totally agree on the fact, that TOTP/MFA for scripting is a total disaster, therefore nobody really uses it. > > In addition it would be really nice if the OSC had some way of reusing a scoped token/catalog rather than having to fetch it every time, but I feel that would have to include changes in Keystoneauth to supply it some cached data which tells it to not attempt to reauthenticate but rather trust the catalog/token supplied (does keystoneauth support that?). Because that reauth every command via Keystoneauth is actually what often takes longer than the actual API calls... We can also just throw catalog into that state/session file as json/yaml. > > On 29/01/21 7:03 am, Stephen Finucane wrote: >> The definition for those opts can be found at [1]. As Sean thought it might be, >> that is using the default type defined in the parent 'Opt' class of 'str' [2]. >> We don't expose argparse's 'action' parameter that would allow us to use the >> 'append' action, so you'd have to fix this by parsing whatever the user provided >> after the fact. I suspect you could resolve the immediate issue by changing this >> line [3] from: >> >> self._methods = kwargs['auth_methods'] >> >> to: >> >> self._methods = kwargs['auth_methods'].split(',') >> >> However, I assume there's likely more to this issue. I don't have an environment >> to hand to validate this fix, unfortunately. >> >> If you do manage to test that change and it works, I'd be happy to help you in >> getting a patch proposed to 'keystoneauth'. >> >> Hope this helps, >> Stephen >> >> [1] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/_plugins/identity/v3.py#L316-L330 >> [2] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/opts.py#L65 >> [3] https://github.com/openstack/keystoneauth/blob/4.3.0/keystoneauth1/loading/_plugins/identity/v3.py#L338 >> >>>> Jean-François >> >> > [1] https://review.opendev.org/c/openstack/openstacksdk/+/735352 [2] https://review.opendev.org/c/openstack/python-openstackclient/+/765652 [3] https://review.opendev.org/c/openstack/osc-lib/+/765650 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zigo at debian.org Tue Feb 2 18:52:45 2021 From: zigo at debian.org (Thomas Goirand) Date: Tue, 2 Feb 2021 19:52:45 +0100 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <4f56323c6fc4df53a007a1c2483d387f7d0c0045.camel@redhat.com> References: <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> <20210202155734.ym423voolru6voxn@yuggoth.org> <4f56323c6fc4df53a007a1c2483d387f7d0c0045.camel@redhat.com> Message-ID: <7118c48e-4a31-bbd4-65b5-16fc87ef54bc@debian.org> On 2/2/21 5:35 PM, Sean Mooney wrote: > On Tue, 2021-02-02 at 15:57 +0000, Jeremy Stanley wrote: >> On 2021-02-02 12:32:29 +0100 (+0100), Thomas Goirand wrote: >> [...] >>> I found out that downgrading to python3-dnspython 1.16.0 made >>> swift-proxy (and probably others) back to working. >> [...] >> >> If memory serves, dnspython and eventlet both monkey-patch the >> stdlib in potentially conflicting ways, and we've seen them interact >> badly in the past. > upstream eventlet force 1.16.0 to be used via there requirement files > in responce to us filing upstream bug after 2.0.0 was released > so its known that you cant use dnspython 2.0.0 with eventlest currently Setting such a upper bound is just a timebomb, that should be desactivated as fast as possible. > eventlet have fixt some fo the incompatibilte in the last few months but not all of them I wonder where / how dnspython is doing the monkey patching of the SSL library. Is everything located in query.py ? Cheers, Thomas Goirand (zigo) From smooney at redhat.com Tue Feb 2 19:53:34 2021 From: smooney at redhat.com (Sean Mooney) Date: Tue, 02 Feb 2021 19:53:34 +0000 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <7118c48e-4a31-bbd4-65b5-16fc87ef54bc@debian.org> References: <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> <20210202155734.ym423voolru6voxn@yuggoth.org> <4f56323c6fc4df53a007a1c2483d387f7d0c0045.camel@redhat.com> <7118c48e-4a31-bbd4-65b5-16fc87ef54bc@debian.org> Message-ID: On Tue, 2021-02-02 at 19:52 +0100, Thomas Goirand wrote: > On 2/2/21 5:35 PM, Sean Mooney wrote: > > On Tue, 2021-02-02 at 15:57 +0000, Jeremy Stanley wrote: > > > On 2021-02-02 12:32:29 +0100 (+0100), Thomas Goirand wrote: > > > [...] > > > > I found out that downgrading to python3-dnspython 1.16.0 made > > > > swift-proxy (and probably others) back to working. > > > [...] > > > > > > If memory serves, dnspython and eventlet both monkey-patch the > > > stdlib in potentially conflicting ways, and we've seen them interact > > > badly in the past. > > upstream eventlet force 1.16.0 to be used via there requirement files > > in responce to us filing upstream bug after 2.0.0 was released > > so its known that you cant use dnspython 2.0.0 with eventlest currently > > Setting such a upper bound is just a timebomb, that should be > desactivated as fast as possible. yes it was done because dnspython broke backwards comaptiatbly and reved a new major version the eventlet mainatienr were not aware of it and it was capped to give them time to fix it. 
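for anyone carrying this downstream, the cap is literally just a requirements bound along these lines (the exact bounds below are what i remember eventlet using, so treat them as an assumption and check its requirements file before copying them):

    # illustrative pin mirroring eventlet's interim cap on dnspython;
    # verify against eventlet's own requirements before relying on it
    dnspython >= 1.15.0, < 2.0.0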
they have merges some patches to make it work but i think some of it need to be fixed in dnspython too. this is the bug https://github.com/eventlet/eventlet/issues/619 the pin was put in place in august https://github.com/eventlet/eventlet/issues/619#issuecomment-681480014 but the fix submitted to eventlet https://github.com/eventlet/eventlet/pull/639 didnt actully fully fix it https://github.com/eventlet/eventlet/issues/619#issuecomment-689903897 https://github.com/rthalley/dnspython/issues/559 was the dnspython bug but that seams to be closed. https://github.com/rthalley/dnspython/issues/557 and https://github.com/rthalley/dnspython/issues/558 are also closed but its still not actully fixed. > > > eventlet have fixt some fo the incompatibilte in the last few months but not all of them > > I wonder where / how dnspython is doing the monkey patching of the SSL > library. Is everything located in query.py ? > > Cheers, > > Thomas Goirand (zigo) > From eandersson at blizzard.com Tue Feb 2 20:10:27 2021 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Tue, 2 Feb 2021 20:10:27 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync>, <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> Message-ID: > the old value of 500 nodes max has not been true for a very long time rabbitmq and the db still tends to be the bottelneck to scale however beyond 1500 nodes outside of the operational overhead. We manage our scale with regions as well. With 1k nodes our RabbitMQ isn't breaking a sweat, and no signs that the database would be hitting any limits. Our issues have been limited to scaling Neutron and VM scheduling on Nova mostly due to, NUMA pinning. ________________________________ From: Sean Mooney Sent: Tuesday, February 2, 2021 9:50 AM To: openstack-discuss at lists.openstack.org Subject: Re: [ops][largescale-sig] How many compute nodes in a single cluster ? On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > Hey all, > > I will start the answers :) > > At OVH, our hard limit is around 1500 hypervisors on a region. > It also depends a lot on number of instances (and neutron ports). > The effects if we try to go above this number: > - load on control plane (db/rabbit) is increasing a lot > - "burst" load is hard to manage (e.g. restart of all neutron agent or > nova computes is putting a high pressure on control plane) > - and of course, failure domain is bigger > > Note that we dont use cells. > We are deploying multiple regions, but this is painful to manage / > understand for our clients. > We are looking for a solution to unify the regions, but we did not find > anything which could fit our needs for now. i assume you do not see cells v2 as a replacment for multipel regions because they do not provide indepente falut domains and also because they are only a nova feature so it does not solve sclaing issue in other service like neutorn which are streached acrooss all cells. cells are a scaling mechinm but the larger the cloud the harder it is to upgrade and cells does not help with that infact by adding more contoler it hinders upgrades. seperate regoins can all be upgraded indepently and can be fault tolerant if you dont share serviecs between regjions and use fedeeration to avoid sharing keystone. glad to hear you can manage 1500 compute nodes by the way. 
the old value of 500 nodes max has not been true for a very long time rabbitmq and the db still tends to be the bottelneck to scale however beyond 1500 nodes outside of the operational overhead. > > Cheers, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalystcloud.nz Wed Feb 3 00:45:05 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Wed, 3 Feb 2021 13:45:05 +1300 Subject: [keystone][osc]Strange behaviour of OSC in keystone MFA context In-Reply-To: References: <27cda0ba41634425b5c4d688381d6107@elca.ch> <18d48a3d208317e9c9b220ff20ae2f46a2442ef0.camel@redhat.com> <55f07666-3315-d02d-5075-13e0f677420c@catalystcloud.nz> Message-ID: <9a7cc1b6-919c-7202-6a30-af45bb3b124d@catalystcloud.nz> On 3/02/21 7:22 am, Artem Goncharov wrote: > Hi >> On 2. Feb 2021, at 01:43, Adrian Turjak > > wrote: >> >> The biggest issue, and why this area never got much testing, is >> because it is effectively useless since you'd have to supply your MFA >> values EVERY command. Imagine how awful that would be for TOTP. The >> whole point of the MFA process in keystone with auth-receipt was a >> dynamic interactive login. Supplying the MFA upfront isn't that useful. >> >> What the OSC really needs is a `login` command, that goes through a >> login process using the auth-receipts workflow from keystone (asks >> for password/totp) and sets some sort of state file. We can't set the >> environment variables of the parent shell process, so we'd have to go >> with a state/session file. But to avoid it clashing with other state >> files and terminal sessions we'd need some way to tag them by the >> parent process ID so you can login to more than one >> cloud/project/user/etc across multiple terminals. > > I guess we can do something about that. Recently Monty started and I > took over the patch for adding token caching in the keyring[1]. As > such it will not really help, but together with [2] and [3] we can use > authorisation caching on the OSC side. I was never really giving > priority to this, since in a regular use case it perhaps saves .5 - 1 > second, what is not really noticeable (most time is wasted on > initialization). However in this context it might become really handy. > Feel free to trigger discussion if that looks important. > > And yes, I totally agree on the fact, that TOTP/MFA for scripting is a > total disaster, therefore nobody really uses it. I definitely do think it is important, but then again so would getting MFA working in Horizon which I was planning to do, but circumstances beyond my control stopped me from doing that, and I didn't work on finding someone else to implement it. If we did get it working on the CLI, then there might be more push to get it working on Horizon as well. How auth-receipts and MFA work is documented fairly well from memory, and we have a very clear error thrown that lets you build an interactive workflow for asking for the missing pieces of auth: https://docs.openstack.org/keystoneauth/latest/authentication-plugins.html#multi-factor-with-v3-identity-plugins I can't find the time to implement anything here right now because of so much internal work, but anything related to MFA feel free to ping me or just outright add me as reviewer! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From adriant at catalystcloud.nz Wed Feb 3 01:02:45 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Wed, 3 Feb 2021 14:02:45 +1300 Subject: [kolla][adjutant] Granular quota settings with Adjutant In-Reply-To: References: Message-ID: <9de32ac6-6650-e460-6b3f-78219897abc5@catalystcloud.nz> Hey Albert! At present there isn't any plan to add that feature, nor would I or the other devs working on Adjutant likely have the time to build it. That said if you have a developer who wants to work on such a feature, I'd be happy to help work with them to design it so it fits into what we currently have. The original design was going to be closer to that (allow customers to ask for a specific quota changes), but we kept finding that they'd ask for one quota, then another because they didn't realise they were related. So we opted for defined sizes as default. I'm not at all against finding a way to offer both options in a way that works together. :) Future plans though do include splitting the quota parts of Adjutant into their own plugin (and a second horizon plugin) so we can keep some of that logic out of the core service as well but maintain it as a 'core' plugin. So this could be a good time to fit that work in together and deprecate the quota parts in Adjutant itself for removal later. Feel free to bug me about anything as you need, and do come say hello in #openstack-adjutant on freenode! Cheers, Adrian On 3/02/21 6:18 am, Braden, Albert wrote: > > We have been experimenting with Adjutant, and it offers a menu of > sizes (default small, medium and large). It’s easy to add more sizes, > but we would like to allow customers to request granular changes. For > example, if a customer needs to add 10 cores and 5 instances to his > existing quota. This would require some changes to Horizon, and an > Adjutant plugin. Is anyone already doing this, or working in that > direction? > > I apologize for the nonsense below. So far I have not been able to > stop it from being attached to my external emails. I'm working on it. > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sxmatch1986 at gmail.com Wed Feb 3 02:14:26 2021 From: sxmatch1986 at gmail.com (hao wang) Date: Wed, 3 Feb 2021 10:14:26 +0800 Subject: [zaqar][stable]Proposing Hao Wang as a stable branches core reviewer Message-ID: Hi, Thierry I want to propose myself(wanghao) to be a new core reviewer of the Zaqar stable core team. I have been PTL in Zaqar for almost two years. I also want to help the stable branches better. Would you help me to add myself to the zaqar stable core team? Thank you. 
From arne.wiebalck at cern.ch Wed Feb 3 07:53:48 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 3 Feb 2021 08:53:48 +0100 Subject: [baremetal-sig][ironic] Tue Jan 12, 2021, 2pm UTC: 'Multi-Tenancy in Ironic: Of Owners and Lessees' In-Reply-To: References: <20210112145852.i3yyh2u7lnqonqjt@barron.net> <6a3c789f-1413-51c6-81ff-4755ccdbe26a@cern.ch> Message-ID: <1e7a18e5-d25a-caa8-725d-030034c2b93a@cern.ch> Arkady, Tom, all, The first "Topic of the day" videos from the meetings of the bare metal SIG are now available from https://www.youtube.com/playlist?list=PLKqaoAnDyfgoBFAjUvZGjKXQjogWZBLL_ Upcoming videos will also be uploaded to this playlist. Cheers, Arne On 13.01.21 00:25, Kanevsky, Arkady wrote: > Thanks Arne. Looking forward to recording. > > -----Original Message----- > From: Arne Wiebalck > Sent: Tuesday, January 12, 2021 9:12 AM > To: Tom Barron > Cc: openstack-discuss > Subject: Re: [baremetal-sig][ironic] Tue Jan 12, 2021, 2pm UTC: 'Multi-Tenancy in Ironic: Of Owners and Lessees' > > > [EXTERNAL EMAIL] > > On 12.01.21 15:58, Tom Barron wrote: >> On 08/01/21 16:39 +0100, Arne Wiebalck wrote: >>> Dear all, >>> >>> Happy new year! >>> >>> The Bare Metal SIG will continue its monthly meetings and start again >>> on >>> >>> Tue Jan 12, 2021, at 2pm UTC. >>> >>> This time there will be a 10 minute "topic-of-the-day" >>> presentation by Tzu-Mainn Chen (tzumainn) on >>> >>> 'Multi-Tenancy in Ironic: Of Owners and Lessees' >>> >>> So, if you would like to learn how this relatively recent addition to >>> Ironic works, you can find all the details for this meeting on the >>> SIG's etherpad: >>> >>> https://etherpad.opendev.org/p/bare-metal-sig >>> >>> Everyone is welcome, don't miss out! >>> >>> Cheers, >>> Arne >>> >> >> Had to miss Multi-Tenancy presentation today but look forward to any >> artefacts (recording, minutes, slides) if these become available. >> >> -- Tom >> > Hi Tom, > > We have recorded the presentation and will make it (and the ones of the previous presentations) available as soon as we find the time to edit the videos. > > The links will be made available on the bare metal SIG etherpad, and we will also send something to the list when they are ready! > > Cheers, > Arne > > From mdulko at redhat.com Wed Feb 3 09:25:30 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 03 Feb 2021 10:25:30 +0100 Subject: [infra][kuryr][launchpad] Message-ID: Hi, Seems like the bugs integration with Launchpad for Kuryr is broken since the Gerrit upgrade - the patches with Closes-Bug: are no longer picked up by Launchpad and bug statuses are not updated. Is it a known problem or are there any extra steps we need to as a Kuryr team do to reenable the integration? Thanks, Michał From radoslaw.piliszek at gmail.com Wed Feb 3 09:31:03 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 3 Feb 2021 10:31:03 +0100 Subject: [infra][kuryr][launchpad] In-Reply-To: References: Message-ID: On Wed, Feb 3, 2021 at 10:26 AM Michał Dulko wrote: > > Hi, Hi, Michał! > Seems like the bugs integration with Launchpad for Kuryr is broken > since the Gerrit upgrade - the patches with Closes-Bug: are no > longer picked up by Launchpad and bug statuses are not updated. > > Is it a known problem or are there any extra steps we need to as a > Kuryr team do to reenable the integration? It is a known problem and it affects every project. It was expected to happen as the old integration is not compatible with the new Gerrit. 
The upside is that the OpenStack release process still works with Closes-Bug tags and will mark bugs as 'Fix Released' when it is due. :-) Also note the blueprint integration is similarly broken - no new patches get mentioned on the blueprint. -yoctozepto From balazs.gibizer at est.tech Wed Feb 3 09:43:12 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Wed, 03 Feb 2021 10:43:12 +0100 Subject: [nova][placement] adding nova-core to placement-core in gerrit In-Reply-To: References: Message-ID: <0O5YNQ.4J6G712XMY0K@est.tech> On Tue, Feb 2, 2021 at 09:06, Balazs Gibizer wrote: > Hi, > > I've now added nova-core to the placement-core group[1] It turned out that there is a separate placement-stable-maint group which does not have the nova-stable-maint included. Is there any objection to add nova-stable-maint to placement-stable-maint? Cheers, gibi > > [1] > https://review.opendev.org/admin/groups/93c2b262ebfe0b3270c0b7ad60de887b02aaba9d,members > > On Wed, Jan 27, 2021 at 09:49, Stephen Finucane > wrote: >> On Tue, 2021-01-26 at 17:14 +0100, Balazs Gibizer wrote: >>> Hi, >>> >>> Placement got back under nova governance but so far we haven't >>> consolidated the core teams yet. Stephen pointed out to me that >>> given >>> the ongoing RBAC works it would be beneficial if more nova cores, >>> with >>> API and RBAC experience, could approve such patches. So I'm >>> proposing >>> to add nova-core group to the placement-core group in gerrit. This >>> means Ghanshyam, John, Lee, and Melanie would get core rights in >>> the >>> placement related repositories. >>> >>> @placement-core, @nova-core members: Please let me know if you >>> have any >>> objection to such change until end of this week. >> >> I brought it up and obviously think it's a sensible idea, so it's an >> easy +1 >> from me. >> >> Stephen >> >>> cheers, >>> gibi >> >> >> > > > From mdulko at redhat.com Wed Feb 3 09:58:03 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 03 Feb 2021 10:58:03 +0100 Subject: [infra][kuryr][launchpad] In-Reply-To: References: Message-ID: On Wed, 2021-02-03 at 10:31 +0100, Radosław Piliszek wrote: > On Wed, Feb 3, 2021 at 10:26 AM Michał Dulko wrote: > > > > Hi, > > Hi, Michał! > > > Seems like the bugs integration with Launchpad for Kuryr is broken > > since the Gerrit upgrade - the patches with Closes-Bug: are no > > longer picked up by Launchpad and bug statuses are not updated. > > > > Is it a known problem or are there any extra steps we need to as a > > Kuryr team do to reenable the integration? > > It is a known problem and it affects every project. > It was expected to happen as the old integration is not compatible > with the new Gerrit. > > The upside is that the OpenStack release process still works with > Closes-Bug tags and will mark bugs as 'Fix Released' when it is due. > :-) Oh well. Thanks for explanations! > Also note the blueprint integration is similarly broken - no new > patches get mentioned on the blueprint. 
> > -yoctozepto > From jean-francois.taltavull at elca.ch Wed Feb 3 10:03:53 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Wed, 3 Feb 2021 10:03:53 +0000 Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using keycloak as IDP In-Reply-To: <4b328f90066149db85d0a006fb7ea01b@elca.ch> References: <4b328f90066149db85d0a006fb7ea01b@elca.ch> Message-ID: Hello, Actually, the solution is to add this line to Apache configuration: OIDCClaimDelimiter ";" The problem is that this configuration variable does not exist in OSA keystone role and its apache configuration template (https://opendev.org/openstack/openstack-ansible-os_keystone/src/branch/master/templates/keystone-httpd.conf.j2). Jean-Francois > -----Original Message----- > From: Taltavull Jean-Francois > Sent: lundi, 1 février 2021 14:44 > To: openstack-discuss at lists.openstack.org > Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using > keycloak as IDP > > Hello, > > In order to implement identity federation, I've deployed (with OSA) keystone > (Ussuri) as Service Provider and Keycloak as IDP. > > As one can read at [1], "groups" can have multiple values and each value must > be separated by a ";" > > But, in the OpenID token sent by keycloak, groups are represented with a JSON > list and keystone fails to parse it well (only the first group of the list is mapped). > > Have any of you already faced this problem ? > > Thanks ! > > Jean-François > > [1] > https://docs.openstack.org/keystone/ussuri/admin/federation/mapping_combi > nations.html From thierry at openstack.org Wed Feb 3 10:11:56 2021 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 3 Feb 2021 11:11:56 +0100 Subject: [zaqar][stable]Proposing Hao Wang as a stable branches core reviewer In-Reply-To: References: Message-ID: <112818e5-9f84-5753-5cb9-d054805d4b3a@openstack.org> hao wang wrote: > I want to propose myself(wanghao) to be a new core reviewer of the > Zaqar stable core team. > I have been PTL in Zaqar for almost two years. I also want to help the > stable branches better. Thanks for volunteering! Let's wait a couple of days for feedback from other stable-maint-core members, and I'll add you in. In the meantime, please review the stable branch policy, and let us know if you have any questions: https://docs.openstack.org/project-team-guide/stable-branches.html -- Thierry Carrez (ttx) From moguimar at redhat.com Wed Feb 3 10:16:00 2021 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Wed, 3 Feb 2021 11:16:00 +0100 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core In-Reply-To: References: <87r1lyramy.fsf@redhat.com> Message-ID: +1 \o/ On Tue, Feb 2, 2021 at 1:21 PM Herve Beraud wrote: > > > Le mar. 2 févr. 2021 à 12:13, Sofer Athlan-Guyot a > écrit : > >> Hi, >> >> as one of his colleague, I'll definitively support his nomination even >> though I have no authority on oslo lib whatsoever :) >> > > It reinforces my feelings concerning Daniel's good job! Thank you Sofer! > > >> Herve Beraud writes: >> >> > Hi, >> > >> > Daniel has been working on Oslo for quite some time now and during that >> time >> > he was a great help on Oslo. I think he would make a good addition to >> the >> > general Oslo core team. >> > >> > He helped us by proposing patches to fix issues, maintaining stable >> branches >> > by backporting changes, and managing our releases by assuming parts of >> the >> > release liaison role. 
>> > >> > Existing Oslo team members (and anyone else we >> > co-own libraries with) please respond with +1/-1. If there are no >> > objections we'll add him to the ACL soon. :-) >> > >> > Thanks. >> >> Thanks, >> -- >> Sofer Athlan-Guyot >> chem on #irc at rhos-upgrades >> DFG:Upgrades Squad:Update >> >> >> > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Wed Feb 3 11:01:42 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 3 Feb 2021 11:01:42 +0000 Subject: [nova][placement] adding nova-core to placement-core in gerrit In-Reply-To: <0O5YNQ.4J6G712XMY0K@est.tech> References: <0O5YNQ.4J6G712XMY0K@est.tech> Message-ID: <20210203110142.2noi7v2b4q57fozy@lyarwood-laptop.usersys.redhat.com> On 03-02-21 10:43:12, Balazs Gibizer wrote: > > > On Tue, Feb 2, 2021 at 09:06, Balazs Gibizer > wrote: > > Hi, > > > > I've now added nova-core to the placement-core group[1] > > It turned out that there is a separate placement-stable-maint group which > does not have the nova-stable-maint included. Is there any objection to add > nova-stable-maint to placement-stable-maint? None from me as a member of nova-stable-maint. Thanks again for cleaning this up! -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From stephenfin at redhat.com Wed Feb 3 11:13:44 2021 From: stephenfin at redhat.com (Stephen Finucane) Date: Wed, 03 Feb 2021 11:13:44 +0000 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core In-Reply-To: References: Message-ID: <9c1cd70910714546faa5ea07d0d0d7abfa4e6ce5.camel@redhat.com> On Mon, 2021-02-01 at 16:30 +0100, Herve Beraud wrote: > Hi, > > Daniel has been working on Oslo for quite some time now and during that time > he was a great help on Oslo. I think he would make a good addition to the > general Oslo core team. > > He helped us by proposing patches to fix issues, maintaining stable branches > by backporting changes, and managing our releases by assuming parts of the > release liaison role. > > Existing Oslo team members (and anyone else we > co-own libraries with) please respond with +1/-1. If there are no > objections we'll add him to the ACL soon. :-) > > Thanks. I'm +0, simply because I haven't really interacted with Daniel much on my oslo review journey so far. 
I will however defer to the experiences of those who have to vouch for him :) Cheers, Stephen From ssbarnea at redhat.com Wed Feb 3 12:07:04 2021 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Wed, 3 Feb 2021 04:07:04 -0800 Subject: Ospurge or "project purge" - What's the right approach to cleanup projects prior to deletion In-Reply-To: References: <76498a8c-c8a5-9488-0223-3f47ac4486df@inovex.de> <0CC2DFF7-5721-4106-A06B-6FC2970AC07B@gmail.com> <7237beb7-a68a-0398-f779-aef76fbc0e82@debian.org> <10C08D43-B4E6-4423-B561-183A4336C488@gmail.com> <9f408ffe-4046-76e0-bbdf-57ee94191738@inovex.de> <5C651C9C-0D00-4CB8-9992-4AC23D92FE38@gmail.com> Message-ID: That reminded me of my old osclean bash script which did some work in parallel, but I would see the purge as a very useful sdk feature. -- /sorin On 2 Feb 2021 at 07:40:22, Adrian Turjak wrote: > OH! As someone who tried to champion project termination at the API layer > as a community goal during the Berlin summit, I love that someone ran with > our idea of implementing it in the SDK as a starting point! > > I still think we should champion a purge API for each service, but the SDK > is definitely the place to handle dependencies, and I love that you are > doing it in threads! > > I'd be happy to review/test any code related to this both in the OSC and > the SDK. I sadly haven't had time to help implement anything like this, but > I will somehow find the time to review/test! And I look forward to one day > throwing away our internal termination logic in favor of what's in the SDK! > > On 20/01/21 2:16 am, Artem Goncharov wrote: > > Hi Christian. > > Actually the patch stuck due to lack of reviewers. Idea here was not to > replace “openstack project purge”, but to add a totally new implementation > (hopefully later dropping project purge as such). From my POV at the moment > there is nothing else I was thinking to mandatorily implement on OSC side > (sure, for future I would like to give possibility to limit services to > cleanup, to add cleanup of key-airs, etc). SDK part is completely > independent of that. Here we definitely need to add dropping of private > images. Also on DNS side we can do cleanup. Other services are tricky > (while swift we can still implement relatively easy). > > All in all - we can merge the PR in it’s current form (assuming we get > some positive reviews). > > BG, > Artem > > On 19. Jan 2021, at 13:33, Christian Rohmann > wrote: > > Hey Artem, > > thank you very much for your quick reply and pointer to the patchset you > work on! > > > > On 18/01/2021 20:14, Artem Goncharov wrote: > > Ha, thats exactly the case, the whole logic sits in sdk and is spread > across the supported services: > - > https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/compute/v2/_proxy.py#L1798 - > for compute. KeyPairs not dropped, since they belong to user, and not to > the “project”; > - > https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/block_storage/v3/_proxy.py#L547 - > block storage; > - > https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/orchestration/v1/_proxy.py#L490 > - > https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/network/v2/_proxy.py#L4130 - > the most complex one in order to give possibility to clean “old” resource > without destroying everything else > > Adding image is few lines of code (never had enough time to add it), > identity is a bit tricky, since also here mostly resources does not belong > to Project. 
DNS would be also easy to do. OSC here is only providing I/F, > while the logic sits in SDK and can be very easy extended for other > services. > > > On 18. Jan 2021, at 19:52, Thomas Goirand wrote: > > On 1/18/21 6:56 PM, Artem Goncharov wrote: > > What do you mean it doesn’t implement anything at all? It does clean up > compute, network, block_storage, orchestrate resources. Moreover it gives > you possibility to clean “old” resources (created before or last updated > before). > > > Oh really? With that few lines of code? I'll re-read the patch then, > sorry for my bad assumptions. > > Can you point at the part that's actually deleting the resources? > > If I understood correctly, the cleanup relies on the SDK functionality / > requirement for each resource type to provide a corresponding function( > https://github.com/openstack/openstacksdk/blob/master/openstack/cloud/openstackcloud.py#L762) > ? > > Reading through the (SDK) code this even covers depending resources, nice! > > > I certainly will leave some feedback and comments in your change ( > https://review.opendev.org/c/openstack/python-openstackclient/+/734485). > But what are your immediate plans moving forward on with this then, Artem? > > There is a little todo list in the description on your change .. is there > anything you yourself know that is still missing before taking this to a > full review and finally merging it? > > Only code that is shipped and then actively used will improve further and > people will notice other required functionality or switches for later > iterations. With the current state of having a somewhat working but > unmaintained ospurge and a non feature complete "project purge" (supports > only Block Storage v1, v2; Compute v2; Image v1, v2) this will only cause > people to start hacking away on the ospurge codebase or worse building > their own tools and scripts to implement project cleanup for their > environments over and over again. > > > > Regards, > > > Christian > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Wed Feb 3 13:13:39 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 3 Feb 2021 08:13:39 -0500 Subject: [cinder] wallaby R-9 virtual mid-cycle 10 Feb at 14:00 UTC Message-ID: The poll results are in (9 responses) and there was only one option where everyone could attend (apologies to the two people who marked that "as need be"): DATE: Wednesday, 10 February 2021 TIME: 1400-1600 UTC (2 hours) LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Please add topics to the planning etherpad: https://etherpad.opendev.org/p/cinder-wallaby-mid-cycles Note that there will be no cinder weekly meeting on 10 February. cheers, brian From ankit at aptira.com Wed Feb 3 12:40:02 2021 From: ankit at aptira.com (Ankit Goel) Date: Wed, 3 Feb 2021 12:40:02 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Message-ID: Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what's the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.j.ivey at gmail.com Wed Feb 3 14:05:30 2021 From: david.j.ivey at gmail.com (David Ivey) Date: Wed, 3 Feb 2021 09:05:30 -0500 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> Message-ID: I am not sure simply going off the number of compute nodes is a good representation of scaling issues. I think it has a lot more to do with density/networks/ports and the rate of churn in the environment, but I could be wrong. For example, I only have 80 high density computes (64 or 128 CPU's with ~400 instances per compute) and I run into the same scaling issues that are described in the Large Scale Sig and have to do a lot of tuning to keep the environment stable. My environment is also kinda unique in the way mine gets used as I have 2k to 4k instances torn down and rebuilt within an hour or two quite often so my API's are constantly bombarded. On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson < eandersson at blizzard.com> wrote: > > the old value of 500 nodes max has not been true for a very long time > rabbitmq and the db still tends to be the bottelneck to scale however > beyond 1500 nodes > outside of the operational overhead. > > We manage our scale with regions as well. With 1k nodes our RabbitMQ > isn't breaking a sweat, and no signs that the database would be hitting any > limits. Our issues have been limited to scaling Neutron and VM scheduling > on Nova mostly due to, NUMA pinning. > ------------------------------ > *From:* Sean Mooney > *Sent:* Tuesday, February 2, 2021 9:50 AM > *To:* openstack-discuss at lists.openstack.org < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single > cluster ? > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > > Hey all, > > > > I will start the answers :) > > > > At OVH, our hard limit is around 1500 hypervisors on a region. > > It also depends a lot on number of instances (and neutron ports). > > The effects if we try to go above this number: > > - load on control plane (db/rabbit) is increasing a lot > > - "burst" load is hard to manage (e.g. restart of all neutron agent or > > nova computes is putting a high pressure on control plane) > > - and of course, failure domain is bigger > > > > Note that we dont use cells. > > We are deploying multiple regions, but this is painful to manage / > > understand for our clients. > > We are looking for a solution to unify the regions, but we did not find > > anything which could fit our needs for now. 
> > i assume you do not see cells v2 as a replacment for multipel regions > because they > do not provide indepente falut domains and also because they are only a > nova feature > so it does not solve sclaing issue in other service like neutorn which are > streached acrooss > all cells. > > cells are a scaling mechinm but the larger the cloud the harder it is to > upgrade and cells does not > help with that infact by adding more contoler it hinders upgrades. > > seperate regoins can all be upgraded indepently and can be fault tolerant > if you dont share serviecs > between regjions and use fedeeration to avoid sharing keystone. > > > glad to hear you can manage 1500 compute nodes by the way. > > the old value of 500 nodes max has not been true for a very long time > rabbitmq and the db still tends to be the bottelneck to scale however > beyond 1500 nodes > outside of the operational overhead. > > > > > Cheers, > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-francois.taltavull at elca.ch Wed Feb 3 14:08:17 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Wed, 3 Feb 2021 14:08:17 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: Message-ID: <3cb755495b994352aaadf0d31ad295f3@elca.ch> Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From kalle.happonen at csc.fi Wed Feb 3 14:13:08 2021 From: kalle.happonen at csc.fi (Kalle Happonen) Date: Wed, 3 Feb 2021 16:13:08 +0200 (EET) Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: Message-ID: <1297683591.2146.1612361588263.JavaMail.zimbra@csc.fi> Hi, I found this commit which says it was removed. https://opendev.org/openstack/rally/commit/9811aa9726c9a9befbe3acb6610c1c93c9924948 We use ansible to install Rally on CentOS7. The relevant roles are linked here. http://www.9bitwizard.eu/rally-and-tempest Although I think the roles are ours so they are not tested in all scenarios, so YMMV. 
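For anyone who prefers the pip route mentioned earlier in the thread, a minimal sketch looks something like this (assuming Python 3 with venv available; the package names are the ones given above, and the last steps assume the rally CLI still provides a "rally db create" subcommand):

    # install rally plus the OpenStack plugins into an isolated virtualenv
    python3 -m venv ~/rally-venv
    source ~/rally-venv/bin/activate
    pip install --upgrade pip
    pip install rally-openstack   # pulls in the base "rally" package as a dependency

    # initialise rally's internal database and sanity-check the install
    rally db create
    rally --version
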
Cheers, Kalle ----- Original Message ----- > From: "Ankit Goel" > To: openstack-dev at lists.openstack.org > Sent: Wednesday, 3 February, 2021 14:40:02 > Subject: Rally - Unable to install rally - install_rally.sh is not available in repo > Hello Experts, > > I was trying to install Openstack rally on centos 7 VM but the link provided in > the Openstack doc to download the install_rally.sh is broken. > > Latest Rally Doc link - > > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation > > Rally Install Script -> > https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > > This is broken > > After searching on internet I could reach to the Openstack rally Repo - > > https://opendev.org/openstack/rally but here I am not seeing the install_ > rally.sh script and according to all the information available on internet it > says we need install_ rally.sh. > > Thus can you please let me know what's the latest procedure to install Rally. > > Awaiting for your response. > > Thanks, > Ankit Goel From kgiusti at gmail.com Wed Feb 3 14:15:47 2021 From: kgiusti at gmail.com (Ken Giusti) Date: Wed, 3 Feb 2021 09:15:47 -0500 Subject: [oslo] Proposing Daniel Bengtsson for Oslo Core In-Reply-To: References: Message-ID: +1 for Daniel! On Mon, Feb 1, 2021 at 10:31 AM Herve Beraud wrote: > Hi, > > Daniel has been working on Oslo for quite some time now and during that > time > he was a great help on Oslo. I think he would make a good addition to the > general Oslo core team. > > He helped us by proposing patches to fix issues, maintaining stable > branches > by backporting changes, and managing our releases by assuming parts of the > release liaison role. > > Existing Oslo team members (and anyone else we > co-own libraries with) please respond with +1/-1. If there are no > objections we'll add him to the ACL soon. :-) > > Thanks. > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Wed Feb 3 14:24:05 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Wed, 3 Feb 2021 14:24:05 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? 
In-Reply-To: References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> Message-ID: <20210203142405.GB14971@sync> Yes, totally agree with that, on our side we are used to monitor the number of neutron ports (and espacially the number of ports in BUILD state). As usually an instance is having one port in our cloud, number of instances is closed to number of ports. About the cellsv2, we are mostly struggling on neutron side, so cells are not helping us. -- Arnaud Morin On 03.02.21 - 09:05, David Ivey wrote: > I am not sure simply going off the number of compute nodes is a good > representation of scaling issues. I think it has a lot more to do with > density/networks/ports and the rate of churn in the environment, but I > could be wrong. For example, I only have 80 high density computes (64 or > 128 CPU's with ~400 instances per compute) and I run into the same scaling > issues that are described in the Large Scale Sig and have to do a lot of > tuning to keep the environment stable. My environment is also kinda unique > in the way mine gets used as I have 2k to 4k instances torn down and > rebuilt within an hour or two quite often so my API's are constantly > bombarded. > > On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson < > eandersson at blizzard.com> wrote: > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > We manage our scale with regions as well. With 1k nodes our RabbitMQ > > isn't breaking a sweat, and no signs that the database would be hitting any > > limits. Our issues have been limited to scaling Neutron and VM scheduling > > on Nova mostly due to, NUMA pinning. > > ------------------------------ > > *From:* Sean Mooney > > *Sent:* Tuesday, February 2, 2021 9:50 AM > > *To:* openstack-discuss at lists.openstack.org < > > openstack-discuss at lists.openstack.org> > > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single > > cluster ? > > > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > > > Hey all, > > > > > > I will start the answers :) > > > > > > At OVH, our hard limit is around 1500 hypervisors on a region. > > > It also depends a lot on number of instances (and neutron ports). > > > The effects if we try to go above this number: > > > - load on control plane (db/rabbit) is increasing a lot > > > - "burst" load is hard to manage (e.g. restart of all neutron agent or > > > nova computes is putting a high pressure on control plane) > > > - and of course, failure domain is bigger > > > > > > Note that we dont use cells. > > > We are deploying multiple regions, but this is painful to manage / > > > understand for our clients. > > > We are looking for a solution to unify the regions, but we did not find > > > anything which could fit our needs for now. > > > > i assume you do not see cells v2 as a replacment for multipel regions > > because they > > do not provide indepente falut domains and also because they are only a > > nova feature > > so it does not solve sclaing issue in other service like neutorn which are > > streached acrooss > > all cells. > > > > cells are a scaling mechinm but the larger the cloud the harder it is to > > upgrade and cells does not > > help with that infact by adding more contoler it hinders upgrades. 
> > > > seperate regoins can all be upgraded indepently and can be fault tolerant > > if you dont share serviecs > > between regjions and use fedeeration to avoid sharing keystone. > > > > > > glad to hear you can manage 1500 compute nodes by the way. > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > > > > > Cheers, > > > > > > > > > > > From smooney at redhat.com Wed Feb 3 14:40:03 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 03 Feb 2021 14:40:03 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> Message-ID: <136f186f665d4dc1b4cb897a1fb7855d10e40730.camel@redhat.com> On Wed, 2021-02-03 at 09:05 -0500, David Ivey wrote: > I am not sure simply going off the number of compute nodes is a good > representation of scaling issues. I think it has a lot more to do with > density/networks/ports and the rate of churn in the environment, but I > could be wrong. For example, I only have 80 high density computes (64 or > 128 CPU's with ~400 instances per compute) and I run into the same scaling > issues that are described in the Large Scale Sig and have to do a lot of > tuning to keep the environment stable. My environment is also kinda unique > in the way mine gets used as I have 2k to 4k instances torn down and > rebuilt within an hour or two quite often so my API's are constantly > bombarded. actully your envionment sound like a pretty typical CI cloud where you often have short lifetimes for instance, oftten have high density and large turnover. but you are correct compute node scalse alone is not a good indictor. port,volume,instance count are deffinetly factors as is the workload profile im just assuming your cloud is a ci cloud but interms of generic workload profiles that would seam to be the closes aproximation im aware off to that type of creation and deleteion in a n hour period. 400 instance per comput ewhile a lot is really not that unreasonable assuming your typical host have 1+TB of ram and you have typically less than 4-8 cores per guests with only 128 CPUs going much above that would be over subscitbing the cpus quite hevially we generally dont recommend exceeding more then about 4x oversubsiption for cpus even though the default is 16 based on legacy reason that assume effectvly website hosts type workloads where the botelneck is not on cpu but disk and network io. with 400 instance per host that also equatest to at least 400 neutrop ports if you are using ipatable thats actully at least 1200 ports on the host which definetly has scalining issues on agent restart or host reboot. usign the python binding for ovs can help a lot as well as changing to the ovs firewall driver as that removes the linux bridge and veth pair created for each nueton port when doing hybrid plug. > > On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson < > eandersson at blizzard.com> wrote: > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > We manage our scale with regions as well. 
With 1k nodes our RabbitMQ > > isn't breaking a sweat, and no signs that the database would be hitting any > > limits. Our issues have been limited to scaling Neutron and VM scheduling > > on Nova mostly due to, NUMA pinning. > > ------------------------------ > > *From:* Sean Mooney > > *Sent:* Tuesday, February 2, 2021 9:50 AM > > *To:* openstack-discuss at lists.openstack.org < > > openstack-discuss at lists.openstack.org> > > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single > > cluster ? > > > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > > > Hey all, > > > > > > I will start the answers :) > > > > > > At OVH, our hard limit is around 1500 hypervisors on a region. > > > It also depends a lot on number of instances (and neutron ports). > > > The effects if we try to go above this number: > > > - load on control plane (db/rabbit) is increasing a lot > > > - "burst" load is hard to manage (e.g. restart of all neutron agent or > > >   nova computes is putting a high pressure on control plane) > > > - and of course, failure domain is bigger > > > > > > Note that we dont use cells. > > > We are deploying multiple regions, but this is painful to manage / > > > understand for our clients. > > > We are looking for a solution to unify the regions, but we did not find > > > anything which could fit our needs for now. > > > > i assume you do not see cells v2 as a replacment for multipel regions > > because they > > do not provide indepente falut domains and also because they are only a > > nova feature > > so it does not solve sclaing issue in other service like neutorn which are > > streached acrooss > > all cells. > > > > cells are a scaling mechinm but the larger the cloud the harder it is to > > upgrade and cells does not > > help with that infact by adding more contoler it hinders upgrades. > > > > seperate regoins can all be upgraded indepently and can be fault tolerant > > if you dont share serviecs > > between regjions and use fedeeration to avoid sharing keystone. > > > > > > glad to hear you can manage 1500 compute nodes by the way. > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > > > > > Cheers, > > > > > > > > > > > From smooney at redhat.com Wed Feb 3 14:55:06 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 03 Feb 2021 14:55:06 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: <20210203142405.GB14971@sync> References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> <20210203142405.GB14971@sync> Message-ID: On Wed, 2021-02-03 at 14:24 +0000, Arnaud Morin wrote: > Yes, totally agree with that, on our side we are used to monitor the > number of neutron ports (and espacially the number of ports in BUILD > state). > > As usually an instance is having one port in our cloud, number of > instances is closed to number of ports. > > About the cellsv2, we are mostly struggling on neutron side, so cells > are not helping us. > ack, that makes sense. there are some things you can do to help scale neutron. one semi simple step is if you are usign ml2/ovs, ml2/linux-bridge or ml2/sriov-nic-agent is to move neutron to its own rabbitmq instance. 
neutron using the default ML2 drivers tends to be quite chatty, so placing those agents on their own rabbit instance can help. While it conflicts with HA requirements, ensuring that clustering is not used, and instead load balancing with something like Pacemaker to a single RabbitMQ server, can also help. RabbitMQ's clustering ability, while improving HA by removing a single point of failure, decreases the performance of rabbit, so if you have good monitoring and simply restart or redeploy rabbit quickly using k8s, or use something else like an active/backup deployment mediated by Pacemaker, that can work much better than actually clustering.

If you use ml2/ovn, that allows you to remove the need for the DHCP agent and L3 agent, as well as the L2 agent per compute host. That significantly reduces neutron RPC impact; however, OVN does have some parity gaps and scaling issues of its own. If it works for you, and you can use a new enough version that allows the OVN southbound process on the compute nodes to subscribe to a subset of the north/southbound DB updates relevant to just that node, it can help with scaling neutron. I'm not sure how usage of features like DVR or routed provider networks impacts this, as I mostly work on nova now, but at least from a data plane point of view it can reduce contention on the networking nodes (where the L3 agents run) to do routing and NAT on behalf of all compute nodes.

At some point it might make sense for neutron to take a similar cells approach in its own architecture, but given its ability to delegate some or all of the networking to an external network controller like OVN/ODL, it has never been clear that an in-tree sharding mechanism like cells was actually required.

One thing that I hope someone will have time to investigate at some point is whether we can replace RabbitMQ in general with NATS. This general topic comes up with different technologies from time to time. NATS, however, looks like it would actually be a good match in terms of features and intended use, while being much lighter weight than RabbitMQ and actually improving in performance the more NATS server instances you cluster, since that was a design constraint from the start. I don't actually think neutron's architecture, or nova's for that matter, is inherently flawed, but a more modern messaging bus might help all distributed services scale with fewer issues than they have today.

> 

From david.j.ivey at gmail.com  Wed Feb 3 15:03:24 2021
From: david.j.ivey at gmail.com (David Ivey)
Date: Wed, 3 Feb 2021 15:03:24 -0500
Subject: [ops][largescale-sig] How many compute nodes in a single cluster ?
In-Reply-To: <136f186f665d4dc1b4cb897a1fb7855d10e40730.camel@redhat.com>
References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org>
 <20210202173713.GA14971@sync>
 <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com>
 <136f186f665d4dc1b4cb897a1fb7855d10e40730.camel@redhat.com>
Message-ID: 
I never thought about it being like a CI cloud, but it would be very similar in usage. I should clarify that it is actually physical cores (AMD EPYC) so it's 128 and 256 threads, and yes, at least 1TB of ram each, with ceph shared storage. That 400 is actually capped out at about 415 instances per compute (same cap for 64 and 128 CPUs) where I run into kernel/libvirt issues and nf_conntrack hits limits and crashes. I don't have specifics to give at the moment regarding that issue, I will have to try and recreate/reproduce that when I get my other environment freed up to allow me to test that again.
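For reference, the conntrack ceiling that tends to get hit in that situation can be inspected and raised with sysctl; the numbers below are purely illustrative, not a recommendation:

    # current usage versus the configured limit
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

    # raise the limit (put it in /etc/sysctl.d/ to persist across reboots)
    sysctl -w net.netfilter.nf_conntrack_max=1048576
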
I was in a hurry last time that happened to me and did not get a chance to gather all the information for a bug. Switching to python binding with ovs and some tuning of mariadb, rabbit, haproxy and memcached is how I got to be able to accommodate that rate of turnover. On Wed, Feb 3, 2021 at 9:40 AM Sean Mooney wrote: > On Wed, 2021-02-03 at 09:05 -0500, David Ivey wrote: > > I am not sure simply going off the number of compute nodes is a good > > representation of scaling issues. I think it has a lot more to do with > > density/networks/ports and the rate of churn in the environment, but I > > could be wrong. For example, I only have 80 high density computes (64 or > > 128 CPU's with ~400 instances per compute) and I run into the same > scaling > > issues that are described in the Large Scale Sig and have to do a lot of > > tuning to keep the environment stable. My environment is also kinda > unique > > in the way mine gets used as I have 2k to 4k instances torn down and > > rebuilt within an hour or two quite often so my API's are constantly > > bombarded. > actully your envionment sound like a pretty typical CI cloud > where you often have short lifetimes for instance, oftten have high density > and large turnover. > but you are correct compute node scalse alone is not a good indictor. > port,volume,instance count are deffinetly factors as is the workload > profile > > im just assuming your cloud is a ci cloud but interms of generic workload > profiles > that would seam to be the closes aproximation im aware off to that type of > creation > and deleteion in a n hour period. > > 400 instance per comput ewhile a lot is really not that unreasonable > assuming your > typical host have 1+TB of ram and you have typically less than 4-8 cores > per guests > with only 128 CPUs going much above that would be over subscitbing the > cpus quite hevially > we generally dont recommend exceeding more then about 4x oversubsiption > for cpus even though > the default is 16 based on legacy reason that assume effectvly website > hosts type workloads > where the botelneck is not on cpu but disk and network io. > > with 400 instance per host that also equatest to at least 400 neutrop ports > if you are using ipatable thats actully at least 1200 ports on the host > which definetly has > scalining issues on agent restart or host reboot. > > usign the python binding for ovs can help a lot as well as changing to the > ovs firewall driver > as that removes the linux bridge and veth pair created for each nueton > port when doing hybrid plug. > > > > > On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson < > > eandersson at blizzard.com> wrote: > > > > > > the old value of 500 nodes max has not been true for a very long time > > > rabbitmq and the db still tends to be the bottelneck to scale however > > > beyond 1500 nodes > > > outside of the operational overhead. > > > > > > We manage our scale with regions as well. With 1k nodes our RabbitMQ > > > isn't breaking a sweat, and no signs that the database would be > hitting any > > > limits. Our issues have been limited to scaling Neutron and VM > scheduling > > > on Nova mostly due to, NUMA pinning. > > > ------------------------------ > > > *From:* Sean Mooney > > > *Sent:* Tuesday, February 2, 2021 9:50 AM > > > *To:* openstack-discuss at lists.openstack.org < > > > openstack-discuss at lists.openstack.org> > > > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single > > > cluster ? 
> > > > > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > > > > Hey all, > > > > > > > > I will start the answers :) > > > > > > > > At OVH, our hard limit is around 1500 hypervisors on a region. > > > > It also depends a lot on number of instances (and neutron ports). > > > > The effects if we try to go above this number: > > > > - load on control plane (db/rabbit) is increasing a lot > > > > - "burst" load is hard to manage (e.g. restart of all neutron agent > or > > > > nova computes is putting a high pressure on control plane) > > > > - and of course, failure domain is bigger > > > > > > > > Note that we dont use cells. > > > > We are deploying multiple regions, but this is painful to manage / > > > > understand for our clients. > > > > We are looking for a solution to unify the regions, but we did not > find > > > > anything which could fit our needs for now. > > > > > > i assume you do not see cells v2 as a replacment for multipel regions > > > because they > > > do not provide indepente falut domains and also because they are only a > > > nova feature > > > so it does not solve sclaing issue in other service like neutorn which > are > > > streached acrooss > > > all cells. > > > > > > cells are a scaling mechinm but the larger the cloud the harder it is > to > > > upgrade and cells does not > > > help with that infact by adding more contoler it hinders upgrades. > > > > > > seperate regoins can all be upgraded indepently and can be fault > tolerant > > > if you dont share serviecs > > > between regjions and use fedeeration to avoid sharing keystone. > > > > > > > > > glad to hear you can manage 1500 compute nodes by the way. > > > > > > the old value of 500 nodes max has not been true for a very long time > > > rabbitmq and the db still tends to be the bottelneck to scale however > > > beyond 1500 nodes > > > outside of the operational overhead. > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Wed Feb 3 15:28:06 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 3 Feb 2021 10:28:06 -0500 Subject: [cinder][ops] Backup drivers issue with the container parameter In-Reply-To: <20210129172923.3dk3wqfkfu6xtriy@localhost> References: <20210129172923.3dk3wqfkfu6xtriy@localhost> Message-ID: <5a2cbe0b-4c12-b8d8-151f-e88f1aa73afc@gmail.com> Re-sending this because the PTL somehow missed this agenda item for today's Cinder meeting (which just ended). We'll discuss this at next week's cinder virtual R-9 mid-cycle (wednesday 10 Feb 1400-1600 UTC): https://etherpad.opendev.org/p/cinder-wallaby-mid-cycles On 1/29/21 12:29 PM, Gorka Eguileor wrote: > Hi all, > > In the next Cinder meeting I'll bring a Backup driver issue up for > discussion, and this email hopefully provides the necessary context to > have a fruitful discussion. > > The issue is around the `container` optional parameter in backup > creation, and its user and administrator unfriendliness. > > The interpretation of the `container` parameter is driver dependent, and > it's being treated as: > > - A bucket in Google Cloud Storage and the new S3 driver > - A container in Swift > - A pool in Ceph > - A directory in NFS and Posix > > Currently the only way to prevent cloud users from selecting a different > `container` is by restricting what the storage user configured in Cinder > backup can do. 
> > For Ceph we can make the storage user unable to access any other > existing pools, for Swift, GCS, and S3 we can remove permissions to > create buckets/containers from the storage user. > > This achieves the administrator's objective of not allowing them to > change the `container`, but cloud users will have a bad experience, > because the API will accept the request but the backup will go into > `error` state and they won't see any additional information. > > And this solution is an all or nothing approach, as we cannot allow just > some cloud users select the container while preventing others from doing > so. For example we may want some cloud users to be able to do backups > on a specific RBD pool that is replicated to a remote location. > > I think we can solve all these issues if we: > > - Create a policy for accepting the `container` parameter on the API > (defaulting to allow for backward compatibility). > > - Add a new configuration option `backup_container_regex` to control > acceptable values for the `container` (defaults to `.*` for backward > compatibility). > > This option would be used by the backup manager (not the drivers > themselves) on backup creation, and would result in a user message if > the provided container was not empty and failed the regex check. > > I think this summarizes the situation and my view on the matter. > > Feedback is welcome here or in the next Cinder meeting. > > Cheers, > Gorka. > > From mjturek at linux.vnet.ibm.com Wed Feb 3 17:03:26 2021 From: mjturek at linux.vnet.ibm.com (Michael Turek) Date: Wed, 3 Feb 2021 12:03:26 -0500 Subject: Integration tests failing on master branch for ppc64le In-Reply-To: References: Message-ID: Hello all, It seems that nova patch 756549 [0] seems to have broken us but it's not yet clear why. Aditi provided a guest XML from a failed test [1], but there are no defined USB devices despite what the libvirt error suggests. Has anyone else seen this? We'll continue to investigate. We could be hitting a libvirt bug, though I haven't found one open yet. Thanks, Mike Turek [0] https://review.opendev.org/c/openstack/nova/+/756549 [1] http://paste.openstack.org/show/802283/ On 2/1/21 6:35 AM, aditi Dukle wrote: > Hi, We recently observed failures on master branch jobs for ppc64le.... > This Message Is From an External Sender > This message came from outside your organization. > > Hi, > > We recently observed failures on master branch jobs for ppc64le. The > VMs fail to boot and this is the error I observed in libvirt logs - > "qemuDomainUSBAddressAddHubs:2997 : unsupported configuration: USB is > disabled for this domain, but USB devices are present in the domain XML" > Detailed logs - > https://oplab9.parqtec.unicamp.br/pub/ppc64el/openstack/nova/96/758396/7/check/tempest-dsvm-full-focal-py3/04fc9e7/job-output.txt > > > Could you please let me know if there was any recent configuration > that was done to disable USB configuration in the instance domain? > > Thanks, > Aditi Dukle > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raubvogel at gmail.com Wed Feb 3 17:40:10 2021 From: raubvogel at gmail.com (Mauricio Tavares) Date: Wed, 3 Feb 2021 12:40:10 -0500 Subject: Cannot specify host in availability zone Message-ID: Easy peasy question: According to https://docs.openstack.org/nova/rocky/admin/availability-zones.html, I can specify the host I want to use by following the format --availability-zone ZONE:HOST So, let's get the hostnames for the compute nodes. 
[raub at test-hn ~(keystone_admin)]$ openstack compute service list --service nova-compute +----+--------------+-----------------------+------+---------+-------+----------------------------+ | ID | Binary | Host | Zone | Status | State | Updated At | +----+--------------+-----------------------+------+---------+-------+----------------------------+ | 11 | nova-compute | compute02.example.com | nova | enabled | up | 2021-02-03T17:25:53.000000 | | 12 | nova-compute | compute03.example.com | nova | enabled | up | 2021-02-03T17:25:53.000000 | | 13 | nova-compute | compute01.example.com | nova | enabled | up | 2021-02-03T17:25:52.000000 | +----+--------------+-----------------------+------+---------+-------+----------------------------+ [raub at test-hn ~(keystone_admin)]$ I want to use compute01.example.com, which I can resolve to 10.1.1.11. But, when I try to create server with (running as admin): openstack server create \ --image default_centos_8 \ --flavor m1.small \ --key-name raubkey \ --availability-zone nova:compute01.example.com \ --nic net-id=LONG_BORING-ID \ raub-netest I get the error message (from openstack server show raub-netest|grep fault): 'message': 'No valid host was found. No such host - host: compute01.example.com node: None ' What am I doing wrong here? From jonathan.rosser at rd.bbc.co.uk Wed Feb 3 18:27:12 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Wed, 3 Feb 2021 18:27:12 +0000 Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using keycloak as IDP In-Reply-To: References: <4b328f90066149db85d0a006fb7ea01b@elca.ch> Message-ID: <0917ee6e-f918-8a2e-7abb-6de42724ba5c@rd.bbc.co.uk> Hi Jean-Francois, I made a patch to the openstack-ansible keystone role which will hopefully address this. It would be really helpful if you are able to test the patch and provide some feedback. https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/773978 Regards, Jonathan. On 03/02/2021 10:03, Taltavull Jean-Francois wrote: > Hello, > > Actually, the solution is to add this line to Apache configuration: > OIDCClaimDelimiter ";" > > The problem is that this configuration variable does not exist in OSA keystone role and its apache configuration template (https://opendev.org/openstack/openstack-ansible-os_keystone/src/branch/master/templates/keystone-httpd.conf.j2). > > > Jean-Francois > >> -----Original Message----- >> From: Taltavull Jean-Francois >> Sent: lundi, 1 février 2021 14:44 >> To: openstack-discuss at lists.openstack.org >> Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using >> keycloak as IDP >> >> Hello, >> >> In order to implement identity federation, I've deployed (with OSA) keystone >> (Ussuri) as Service Provider and Keycloak as IDP. >> >> As one can read at [1], "groups" can have multiple values and each value must >> be separated by a ";" >> >> But, in the OpenID token sent by keycloak, groups are represented with a JSON >> list and keystone fails to parse it well (only the first group of the list is mapped). >> >> Have any of you already faced this problem ? >> >> Thanks ! >> >> Jean-François >> >> [1] >> https://docs.openstack.org/keystone/ussuri/admin/federation/mapping_combi >> nations.html > From eandersson at blizzard.com Wed Feb 3 20:13:33 2021 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Wed, 3 Feb 2021 20:13:33 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? 
In-Reply-To: References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> <136f186f665d4dc1b4cb897a1fb7855d10e40730.camel@redhat.com> Message-ID: We also have three different types of Clouds deployments. 1. Large deployment with 12,000+ nodes. 2. Smaller deployment with a lot higher density VMs (over-provisioned). 3. Small deployment with high number of security groups. All three have very different issues. In deployment one the major issue is just the sheer number of updates from the nova-compute and neutron agents. In deployment two we suffer more from just the sheer number of changes to things like neutron ports. The third deployment struggles with the scalability of security groups. Also worth mentioning that things like Kubernetes (and high parallel Terraform deployments to some degree) deployments posed new issues for our deployments as either one of those can trigger millions of API calls per day, especially in cases where Kubernetes has gone “rogue” trying to recover from an unexpected state (e.g. bad volume, bad load balancer). Best Regards, Erik Olof Gunnar Andersson Technical Lead, Senior Cloud Engineer From: David Ivey Sent: Wednesday, February 3, 2021 7:03 AM To: Sean Mooney Cc: Erik Olof Gunnar Andersson ; openstack-discuss at lists.openstack.org Subject: Re: [ops][largescale-sig] How many compute nodes in a single cluster ? I never thought about it being like a CI cloud, but it would be very similar in usage. I should clarify that it is actually physical cores (AMD Epics) so it's 128 and 256 threads and yes at least 1TB ram per with ceph shared storage. That 400 is actually capped out at about 415 instances per compute (same cap for 64 and 128 cpu's) where I run into kernel/libvirt issues and nfconntrack hits limits and crashes. I don't have specifics to give at the moment regarding that issue, I will have to try and recreate/reproduce that when I get my other environment freed up to allow me to test that again. I was in a hurry last time that happened to me and did not get a chance to gather all the information for a bug. Switching to python binding with ovs and some tuning of mariadb, rabbit, haproxy and memcached is how I got to be able to accommodate that rate of turnover. On Wed, Feb 3, 2021 at 9:40 AM Sean Mooney > wrote: On Wed, 2021-02-03 at 09:05 -0500, David Ivey wrote: > I am not sure simply going off the number of compute nodes is a good > representation of scaling issues. I think it has a lot more to do with > density/networks/ports and the rate of churn in the environment, but I > could be wrong. For example, I only have 80 high density computes (64 or > 128 CPU's with ~400 instances per compute) and I run into the same scaling > issues that are described in the Large Scale Sig and have to do a lot of > tuning to keep the environment stable. My environment is also kinda unique > in the way mine gets used as I have 2k to 4k instances torn down and > rebuilt within an hour or two quite often so my API's are constantly > bombarded. actully your envionment sound like a pretty typical CI cloud where you often have short lifetimes for instance, oftten have high density and large turnover. but you are correct compute node scalse alone is not a good indictor. 
port,volume,instance count are deffinetly factors as is the workload profile im just assuming your cloud is a ci cloud but interms of generic workload profiles that would seam to be the closes aproximation im aware off to that type of creation and deleteion in a n hour period. 400 instance per comput ewhile a lot is really not that unreasonable assuming your typical host have 1+TB of ram and you have typically less than 4-8 cores per guests with only 128 CPUs going much above that would be over subscitbing the cpus quite hevially we generally dont recommend exceeding more then about 4x oversubsiption for cpus even though the default is 16 based on legacy reason that assume effectvly website hosts type workloads where the botelneck is not on cpu but disk and network io. with 400 instance per host that also equatest to at least 400 neutrop ports if you are using ipatable thats actully at least 1200 ports on the host which definetly has scalining issues on agent restart or host reboot. usign the python binding for ovs can help a lot as well as changing to the ovs firewall driver as that removes the linux bridge and veth pair created for each nueton port when doing hybrid plug. > > On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson < > eandersson at blizzard.com> wrote: > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > We manage our scale with regions as well. With 1k nodes our RabbitMQ > > isn't breaking a sweat, and no signs that the database would be hitting any > > limits. Our issues have been limited to scaling Neutron and VM scheduling > > on Nova mostly due to, NUMA pinning. > > ------------------------------ > > *From:* Sean Mooney > > > *Sent:* Tuesday, February 2, 2021 9:50 AM > > *To:* openstack-discuss at lists.openstack.org < > > openstack-discuss at lists.openstack.org> > > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single > > cluster ? > > > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote: > > > Hey all, > > > > > > I will start the answers :) > > > > > > At OVH, our hard limit is around 1500 hypervisors on a region. > > > It also depends a lot on number of instances (and neutron ports). > > > The effects if we try to go above this number: > > > - load on control plane (db/rabbit) is increasing a lot > > > - "burst" load is hard to manage (e.g. restart of all neutron agent or > > > nova computes is putting a high pressure on control plane) > > > - and of course, failure domain is bigger > > > > > > Note that we dont use cells. > > > We are deploying multiple regions, but this is painful to manage / > > > understand for our clients. > > > We are looking for a solution to unify the regions, but we did not find > > > anything which could fit our needs for now. > > > > i assume you do not see cells v2 as a replacment for multipel regions > > because they > > do not provide indepente falut domains and also because they are only a > > nova feature > > so it does not solve sclaing issue in other service like neutorn which are > > streached acrooss > > all cells. > > > > cells are a scaling mechinm but the larger the cloud the harder it is to > > upgrade and cells does not > > help with that infact by adding more contoler it hinders upgrades. 
> > > > seperate regoins can all be upgraded indepently and can be fault tolerant > > if you dont share serviecs > > between regjions and use fedeeration to avoid sharing keystone. > > > > > > glad to hear you can manage 1500 compute nodes by the way. > > > > the old value of 500 nodes max has not been true for a very long time > > rabbitmq and the db still tends to be the bottelneck to scale however > > beyond 1500 nodes > > outside of the operational overhead. > > > > > > > > Cheers, > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From moreira.belmiro.email.lists at gmail.com Wed Feb 3 20:40:04 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Wed, 3 Feb 2021 21:40:04 +0100 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> Message-ID: Hi, at CERN we have 3 regions regions with a total of 75 cells (>8000 compute nodes). In the past we had a cell with almost 2000 compute nodes. Now, we try to not have more than 200 compute nodes per cell. We prefer to manage more but smaller cells. Belmiro CERN On Thu, Jan 28, 2021 at 2:29 PM Thierry Carrez wrote: > Hi everyone, > > As part of the Large Scale SIG[1] activities, I'd like to quickly poll > our community on the following question: > > How many compute nodes do you feel comfortable fitting in a > single-cluster deployment of OpenStack, before you need to scale it out > to multiple regions/cells/.. ? > > Obviously this depends on a lot of deployment-dependent factors (type of > activity, choice of networking...) so don't overthink it: a rough number > is fine :) > > [1] https://wiki.openstack.org/wiki/Large_Scale_SIG > > Thanks in advance, > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Wed Feb 3 23:05:41 2021 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 3 Feb 2021 15:05:41 -0800 Subject: [All][StoryBoard] Angular.js Alternatives In-Reply-To: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> References: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> Message-ID: Horizon has a similar issue with the Angular.js components. Since there is documentation[1] for migrating from Angular.js (the version reaching end-of-life) to Angular(2/4) I had assumed we would be migrating Horizon and the plugins to Angular. It would be nice to align all of the OpenInfra components to the same framework. I agree with Adrian, I would like to include feedback from the Horizon folks. Michael [1] https://angular.io/guide/upgrade On Mon, Feb 1, 2021 at 11:07 PM Adrian Turjak wrote: > > Sorry for being late to the thread, but it might be worth touching base with Horizon peeps as well, because there could potentially be some useful knowledge/contributor sharing if we all stick to similar libraries for front end when the eventually Horizon rewrite starts. I'm sadly also out of the loop there so no clue if they even know what direction they plan to go in. > > Although given the good things I've heard about Vue.js I can't say it's a bad choice (but neither would have been react). > > On 29/01/21 7:33 am, Kendall Nelson wrote: > > To circle back to this, the StoryBoard team has decided on Vue, given some contributors previous experience with it and a POC already started. > > Thank you everyone for your valuable input! 
We really do appreciate it! > > -Kendall (diablo_rojo) > > On Thu, Jan 21, 2021 at 1:23 PM Kendall Nelson wrote: >> >> Hello Everyone! >> >> The StoryBoard team is looking at alternatives to Angular.js since its going end of life. After some research, we've boiled all the options down to two possibilities: >> >> Vue.js >> >> or >> >> React.js >> >> I am diving more deeply into researching those two options this week, but any opinions or feedback on your experiences with either of them would be helpful! >> >> Here is the etherpad with our research so far[3]. >> >> Feel free to add opinions there or in response to this thread! >> >> -Kendall Nelson (diablo_rojo) & The StoryBoard Team >> >> [1] https://vuejs.org/ >> [2] https://reactjs.org/ >> [3] https://etherpad.opendev.org/p/replace-angularjs-storyboard-research From adriant at catalystcloud.nz Wed Feb 3 23:43:50 2021 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Thu, 4 Feb 2021 12:43:50 +1300 Subject: [All][StoryBoard] Angular.js Alternatives In-Reply-To: References: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> Message-ID: <4b35907b-f53e-4812-1140-ab8abb770af2@catalystcloud.nz> From most of my front end dev friends I've heard that a migration away from AngularJS is basically a rewrite, so switching to Vue.js or React.js now if we think that's the best idea isn't a stretch at all. I just think that if we stick to one frontend framework for most of OpenStack it can make it easier to share resources. :) On 4/02/21 12:05 pm, Michael Johnson wrote: > Horizon has a similar issue with the Angular.js components. > > Since there is documentation[1] for migrating from Angular.js (the > version reaching end-of-life) to Angular(2/4) I had assumed we would > be migrating Horizon and the plugins to Angular. > > It would be nice to align all of the OpenInfra components to the same > framework. I agree with Adrian, I would like to include feedback from > the Horizon folks. > > Michael > > [1] https://angular.io/guide/upgrade > > On Mon, Feb 1, 2021 at 11:07 PM Adrian Turjak wrote: >> Sorry for being late to the thread, but it might be worth touching base with Horizon peeps as well, because there could potentially be some useful knowledge/contributor sharing if we all stick to similar libraries for front end when the eventually Horizon rewrite starts. I'm sadly also out of the loop there so no clue if they even know what direction they plan to go in. >> >> Although given the good things I've heard about Vue.js I can't say it's a bad choice (but neither would have been react). >> >> On 29/01/21 7:33 am, Kendall Nelson wrote: >> >> To circle back to this, the StoryBoard team has decided on Vue, given some contributors previous experience with it and a POC already started. >> >> Thank you everyone for your valuable input! We really do appreciate it! >> >> -Kendall (diablo_rojo) >> >> On Thu, Jan 21, 2021 at 1:23 PM Kendall Nelson wrote: >>> Hello Everyone! >>> >>> The StoryBoard team is looking at alternatives to Angular.js since its going end of life. After some research, we've boiled all the options down to two possibilities: >>> >>> Vue.js >>> >>> or >>> >>> React.js >>> >>> I am diving more deeply into researching those two options this week, but any opinions or feedback on your experiences with either of them would be helpful! >>> >>> Here is the etherpad with our research so far[3]. >>> >>> Feel free to add opinions there or in response to this thread! 
>>> >>> -Kendall Nelson (diablo_rojo) & The StoryBoard Team >>> >>> [1] https://vuejs.org/ >>> [2] https://reactjs.org/ >>> [3] https://etherpad.opendev.org/p/replace-angularjs-storyboard-research From mnaser at vexxhost.com Thu Feb 4 01:08:38 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 3 Feb 2021 20:08:38 -0500 Subject: [tc] weekly meeting Message-ID: Hi everyone, Here’s the agenda for our weekly TC meeting. It will happen tomorrow (Thursday the 4th) at 1500 UTC in #openstack-tc and I will be your chair. If you can’t attend, please put your name in the “Apologies for Absence” section. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting # ACTIVE INITIATIVES * Follow up on past action items * Audit SIG list and chairs (diablo_rojo) - https://etherpad.opendev.org/p/2021-SIG-Updates - http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019994.html * Gate performance and heavy job configs (dansmith) - http://paste.openstack.org/show/jD6kAP9tHk7PZr2nhv8h/ * Dropping lower-constraints testing from all projects (gmann) - http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019672.html * Mistral Maintenance (gmann) - Any early step we can take to get help in Mistral maintenance? - http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020137.html * Open Reviews - https://review.opendev.org/q/project:openstack/governance+is:open Thanks, Mohammed -- Mohammed Naser VEXXHOST, Inc. From pfb29 at cam.ac.uk Thu Feb 4 08:05:43 2021 From: pfb29 at cam.ac.uk (Paul Browne) Date: Thu, 4 Feb 2021 08:05:43 +0000 Subject: Cannot specify host in availability zone In-Reply-To: References: Message-ID: Hi Mauricio, What happens if instead of the node name you use the node UUID, as returned from; openstack hypervisor list ? On Wed, 3 Feb 2021, 17:46 Mauricio Tavares, wrote: > Easy peasy question: According to > https://docs.openstack.org/nova/rocky/admin/availability-zones.html, I > can specify the host I want to use by following the format > > --availability-zone ZONE:HOST > > So, let's get the hostnames for the compute nodes. > > [raub at test-hn ~(keystone_admin)]$ openstack compute service list > --service nova-compute > > +----+--------------+-----------------------+------+---------+-------+----------------------------+ > | ID | Binary | Host | Zone | Status | State | > Updated At | > > +----+--------------+-----------------------+------+---------+-------+----------------------------+ > | 11 | nova-compute | compute02.example.com | nova | enabled | up | > 2021-02-03T17:25:53.000000 | > | 12 | nova-compute | compute03.example.com | nova | enabled | up | > 2021-02-03T17:25:53.000000 | > | 13 | nova-compute | compute01.example.com | nova | enabled | up | > 2021-02-03T17:25:52.000000 | > > +----+--------------+-----------------------+------+---------+-------+----------------------------+ > [raub at test-hn ~(keystone_admin)]$ > > I want to use compute01.example.com, which I can resolve to 10.1.1.11. > But, when I try to create server with (running as admin): > > openstack server create \ > --image default_centos_8 \ > --flavor m1.small \ > --key-name raubkey \ > --availability-zone nova:compute01.example.com \ > --nic net-id=LONG_BORING-ID \ > raub-netest > > I get the error message (from openstack server show raub-netest|grep > fault): > > 'message': 'No valid host was found. No such host - host: > compute01.example.com node: None ' > > What am I doing wrong here? 
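(A hedged aside on the UUID suggestion above: with the default compute API version, "openstack hypervisor list" only prints integer IDs. Hypervisor UUIDs are exposed from compute API microversion 2.53 onwards, so, assuming the cloud and client support that version, a listing such as

openstack --os-compute-api-version 2.53 hypervisor list

should show the UUIDs in the ID column.)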
> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Thu Feb 4 08:50:19 2021 From: zigo at debian.org (Thomas Goirand) Date: Thu, 4 Feb 2021 09:50:19 +0100 Subject: [all] Eventlet broken again with SSL, this time under Python 3.9 In-Reply-To: <20210202155734.ym423voolru6voxn@yuggoth.org> References: <6f653877fa1da9b2d191b3d3818307f9b29f60bb.camel@redhat.com> <7441d18e5313af6d76a28cabb3866e05dad6f6d5.camel@redhat.com> <666451611999668@mail.yandex.ru> <6241612007690@mail.yandex.ru> <88041612182037@mail.yandex.ru> <9fd8defd-4bcc-2980-5d0e-0e4f696dfbf9@debian.org> <20210202155734.ym423voolru6voxn@yuggoth.org> Message-ID: <8a08ad50-f5df-0db6-400f-153f025d4b3d@debian.org> On 2/2/21 4:57 PM, Jeremy Stanley wrote: > On 2021-02-02 12:32:29 +0100 (+0100), Thomas Goirand wrote: > [...] >> I found out that downgrading to python3-dnspython 1.16.0 made >> swift-proxy (and probably others) back to working. > [...] > > If memory serves, dnspython and eventlet both monkey-patch the > stdlib in potentially conflicting ways, and we've seen them interact > badly in the past. According to upstream dnspython author, no, dnspython does not monkey-patch the SSL std lib. However, Eventlet monkey-patches dnspython, in a way which is incompatible with version 2.0.0. See Bob's comment on the Eventlet issue: https://github.com/eventlet/eventlet/issues/619#issuecomment-660250478 Cheers, Thomas Goirand From xin-ran.wang at intel.com Thu Feb 4 09:00:03 2021 From: xin-ran.wang at intel.com (Wang, Xin-ran) Date: Thu, 4 Feb 2021 09:00:03 +0000 Subject: [cyborg]No meeting on Feb 11 Message-ID: Hi all, With Chinese new year approaching and people being out, we are going to cancel the cyborg IRC meeting on Feb 11. Have a Happy Holiday and Happy Chinese New Year. 😊 Thanks, Xin-Ran -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Thu Feb 4 09:12:41 2021 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 4 Feb 2021 10:12:41 +0100 Subject: [All][StoryBoard] Angular.js Alternatives In-Reply-To: <4b35907b-f53e-4812-1140-ab8abb770af2@catalystcloud.nz> References: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> <4b35907b-f53e-4812-1140-ab8abb770af2@catalystcloud.nz> Message-ID: <78e2ff18-0321-0c32-4aa0-b8f77971c870@openstack.org> Adrian Turjak wrote: > [...] > I just think that if we stick to one frontend framework for most of > OpenStack it can make it easier to share resources. :) I agree on principle... But unfortunately in the StoryBoard experience adopting the same framework as Horizon did not magically make Horizon people invest time improving StoryBoard. All other things being equal, I would indeed recommend alignment on the same framework. But teams existing familiarity with the framework chosen (or ease of learning said chosen framework, or desirable features for the specific use case) probably rank higher in the list of criteria. 
-- Thierry Carrez (ttx) From midhunlaln66 at gmail.com Thu Feb 4 09:42:21 2021 From: midhunlaln66 at gmail.com (Midhunlal Nb) Date: Thu, 4 Feb 2021 15:12:21 +0530 Subject: Authentication error after configuring LDAP integration with openstack Message-ID: Hi all, Before ldap integration openstack working properly but if i set "driver = ldap" in keystone.conf under [identity] section nothing is working for me,I am not able run any openstack command and also not able to create any project or domain or user.If remove "driver = ldap" entry everything working back normally please help me on this issue. If i run admin-openrc file I am getting below error; root at controller:~/client-scripts# openstack image list The request you have made requires authentication. (HTTP 401) (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) Please help me .Where I am making mistakes? Thanks & Regards Midhunlal N B +918921245637 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wchy1001 at gmail.com Thu Feb 4 10:11:34 2021 From: wchy1001 at gmail.com (W Ch) Date: Thu, 4 Feb 2021 18:11:34 +0800 Subject: [nova][octavia][kolla] failed to create loadbalancer on Centos8 system Message-ID: Hi: Recently, we added a CI task[0] for octavia in the kolla project. and we tested octavia based on the ubuntu and centos systems. The ubuntu system worked as expected but Centos did not work. I tried to debug it and result is following: Octavia can not create a load balancer on centos8. because octavia-worker failed to plug a vip port on amphora vm.[1] 2021-01-31 08:20:12.065 22 ERROR octavia.compute.drivers.nova_driver [-] Error attaching network None with ip None and port 26a39187-e95a-4131-91e7-24289e777f36 to amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) : novaclient.exceptions.ClientException: Unknown Error (HTTP 504) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs [-] Error plugging amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) into port 26a39187-e95a-4131-91e7-24289e777f36.: octavia.common.exceptions.ComputeUnknownException: Unknown exception from the compute driver: Unknown Error (HTTP 504). 
2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs Traceback (most recent call last): 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/compute/drivers/nova_driver.py", line 318, in attach_network_or_port 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs port_id=port_id) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/api_versions.py", line 393, in substitution 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return methods[-1].func(obj, *args, **kwargs) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/v2/servers.py", line 2063, in interface_attach 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs obj_class=NetworkInterface) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/base.py", line 363, in _create 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs resp, body = self.api.client.post(url, body=body) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 401, in post 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return self.request(url, 'POST', **kwargs) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/client.py", line 78, in request 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs raise exceptions.from_response(resp, body, url, method) 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs novaclient.exceptions.ClientException: Unknown Error (HTTP 504) Octavia-work called Neutron API to create a port. And called nova-api to attach the vip to amphora. 
Neutron created port successfully, but nova failed to attach the port to instance.[2] 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [req-ab0e8d9b-664b-430f-8006-cad713b0c826 401ba22da5f8427fbda5fce24600041b 8ce8b97f710f43d7af2b8f9b1e0463c8 - default default] [instance: a210ec88-b554-487f-a125-30b5c7473060] attaching network adapter failed.: libvirt.libvirtError: Unable to read from monitor: Connection reset by peer 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] Traceback (most recent call last): 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 2149, in attach_interface 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] guest.attach_device(cfg, persistent=True, live=live) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 304, in attach_device 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] self._domain.attachDeviceFlags(device_xml, flags=flags) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] result = proxy_call(self._autowrap, f, *args, **kwargs) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = execute(f, *args, **kwargs) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] six.reraise(c, e, tb) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] raise value 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = meth(*args, **kwargs) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib64/python3.6/site-packages/libvirt.py", line 630, in attachDeviceFlags 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self) 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] libvirt.libvirtError: Unable to read from monitor: Connection reset by peer 
2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] Nova-compute called libvirt to attach the device. And libvirt also failed to attach the device[3] 2021-01-31 08:20:23.884+0000: 86663: error : qemuMonitorIORead:491 : Unable to read from monitor: Connection reset by peer 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:618 : Error on monitor Unable to read from monitor: Connection reset by peer 2021-01-31 08:20:23.884+0000: 86663: info : virObjectRef:402 : OBJECT_REF: obj=0x7f004c00b610 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:649 : Triggering error callback 2021-01-31 08:20:23.884+0000: 86663: debug : qemuProcessHandleMonitorError:346 : Received error on 0x7f004c0095b0 'instance-00000001' 2021-01-31 08:20:23.884+0000: 64768: debug : qemuMonitorSend:958 : Send command resulted in error Unable to read from monitor: Connection reset by peer I also tried to use kolla/ubuntu-source-nova-libvirt instead of kolla/centos-source-nova-libvirt. and it worked as expected. I think the root cause is that libvirt failed to attach a network device. but i don't know how to resolve this problem. could anyone help me? Thanks Wuchunyang [0]: https://review.opendev.org/c/openstack/kolla-ansible/+/754285 [1]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/octavia/octavia-worker.txt#1116 [2]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/nova/nova-compute.txt#2546 [3]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/libvirt/libvirtd.txt#194472 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.rosser at rd.bbc.co.uk Thu Feb 4 10:25:13 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Thu, 4 Feb 2021 10:25:13 +0000 Subject: [nova][octavia][kolla][openstack-ansible] failed to create loadbalancer on Centos8 system In-Reply-To: References: Message-ID: I'm seeing very similar errors on patches we are trying to merge in openstack-ansible. Same sort of trouble with Centos breaking and Ubuntu working. References to logs with errors like yours are in the comments on this patch https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/768514 I was trying to find something specific we are doing wrong on Centos in openstack-ansible deployments but feels like there maybe some common Centos related factor here. On 04/02/2021 10:11, W Ch wrote: > Hi: > Recently, we added a CI task[0] for octavia in the kolla project. and > we tested octavia based on the ubuntu and centos systems. > The ubuntu system worked as expected but Centos did not work. > I tried to debug it and result is following: > Octavia can not create a load balancer on centos8. 
because > octavia-worker failed to plug a vip port on amphora vm.[1] > 2021-01-31 08:20:12.065 22 ERROR octavia.compute.drivers.nova_driver > [-] Error attaching network None with ip None and port > 26a39187-e95a-4131-91e7-24289e777f36 to amphora (compute_id: > a210ec88-b554-487f-a125-30b5c7473060) : > novaclient.exceptions.ClientException: Unknown Error (HTTP 504) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs [-] Error > plugging amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) > into port 26a39187-e95a-4131-91e7-24289e777f36.: > octavia.common.exceptions.ComputeUnknownException: Unknown exception > from the compute driver: Unknown Error (HTTP 504). > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs Traceback (most > recent call last): > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/compute/drivers/nova_driver.py", > line 318, in attach_network_or_port > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs port_id=port_id) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/api_versions.py", > line 393, in substitution > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs return > methods[-1].func(obj, *args, **kwargs) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/v2/servers.py", > line 2063, in interface_attach > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs > obj_class=NetworkInterface) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/base.py", > line 363, in _create > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs resp, body = > self.api.client.post (url, body=body) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/keystoneauth1/adapter.py", > line 401, in post > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs return > self.request(url, 'POST', **kwargs) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs File > "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/client.py", > line 78, in request > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs raise > exceptions.from_response(resp, body, url, method) > 2021-01-31 08:20:12.066 22 ERROR > octavia.network.drivers.neutron.allowed_address_pairs > novaclient.exceptions.ClientException: Unknown Error (HTTP 504) > Octavia-work called Neutron API to create a port. And called nova-api > to attach the vip to amphora. 
> Neutron created port successfully, but nova failed to attach the port > to instance.[2] > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver > [req-ab0e8d9b-664b-430f-8006-cad713b0c826 > 401ba22da5f8427fbda5fce24600041b 8ce8b97f710f43d7af2b8f9b1e0463c8 - > default default] [instance: a210ec88-b554-487f-a125-30b5c7473060] > attaching network adapter failed.: libvirt.libvirtError: Unable to > read from monitor: Connection reset by peer > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] Traceback (most recent call last): > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", > line 2149, in attach_interface > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] guest.attach_device(cfg, > persistent=True, live=live) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", > line 304, in attach_device > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] > self._domain.attachDeviceFlags(device_xml, flags=flags) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", > line 190, in doit > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] result = > proxy_call(self._autowrap, f, *args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", > line 148, in proxy_call > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] rv = execute(f, *args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", > line 129, in execute > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] six.reraise(c, e, tb) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] raise value > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", > line 83, in tworker > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] rv = meth(*args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] File > "/usr/lib64/python3.6/site-packages/libvirt.py", line 630, in > attachDeviceFlags > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] if ret == -1: raise libvirtError > ('virDomainAttachDeviceFlags() failed', dom=self) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver 
[instance: > a210ec88-b554-487f-a125-30b5c7473060] libvirt.libvirtError: Unable to > read from monitor: Connection reset by peer > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: > a210ec88-b554-487f-a125-30b5c7473060] > Nova-compute called libvirt to attach the device. And libvirt also > failed to attach the device[3] > 2021-01-31 08:20:23.884+0000: 86663: error : qemuMonitorIORead:491 : > Unable to read from monitor: Connection reset by peer > 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:618 : Error > on monitor Unable to read from monitor: Connection reset by peer > 2021-01-31 08:20:23.884+0000: 86663: info : virObjectRef:402 : > OBJECT_REF: obj=0x7f004c00b610 > 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:649 : > Triggering error callback > 2021-01-31 08:20:23.884+0000: 86663: debug : > qemuProcessHandleMonitorError:346 : Received error on 0x7f004c0095b0 > 'instance-00000001' > 2021-01-31 08:20:23.884+0000: 64768: debug : qemuMonitorSend:958 : > Send command resulted in error Unable to read from monitor: Connection > reset by peer > I also tried to use kolla/ubuntu-source-nova-libvirt instead of > kolla/centos-source-nova-libvirt. and it worked as expected. > I think the root cause is that libvirt failed to attach a network > device. but i don't know how to resolve this problem. > could anyone help me? > Thanks > Wuchunyang > [0]: https://review.opendev.org/c/openstack/kolla-ansible/+/754285 > > [1]: > https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/octavia/octavia-worker.txt#1116 > > [2]: > https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/nova/nova-compute.txt#2546 > > [3]: > https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/libvirt/libvirtd.txt#194472 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Thu Feb 4 10:25:16 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Thu, 04 Feb 2021 12:25:16 +0200 Subject: [nova][octavia][kolla] failed to create loadbalancer on Centos8 system In-Reply-To: References: Message-ID: <445561612434112@mail.yandex.ru> An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Thu Feb 4 10:48:59 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Thu, 04 Feb 2021 12:48:59 +0200 Subject: [nova][octavia][kolla][openstack-ansible] failed to create loadbalancer on Centos8 system In-Reply-To: References: Message-ID: <62281612435222@mail.yandex.ua> I just wanted to add that we started seeing this in mid December. The last patch landed that was passing CentOS 8 CI was 14th of December. 04.02.2021, 12:40, "Jonathan Rosser" : > I'm seeing very similar errors on patches we are trying to merge in openstack-ansible. Same sort of trouble with Centos breaking and Ubuntu working. References to logs with errors like yours are in the comments on this patch https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/768514 > > I was trying to find something specific we are doing wrong on Centos in openstack-ansible deployments but feels like there maybe some common Centos related factor here. > On 04/02/2021 10:11, W Ch wrote: >> Hi: >> >> Recently, we added a CI task[0] for octavia in the kolla project. and we tested octavia based on the ubuntu and centos systems. >> >> The ubuntu system worked as expected but Centos did not work. 
>> >> I tried to debug it and result is following: >> >> Octavia can not create a load balancer on centos8. because octavia-worker failed to plug a vip port on amphora vm.[1] >> >> 2021-01-31 08:20:12.065 22 ERROR octavia.compute.drivers.nova_driver [-] Error attaching network None with ip None and port 26a39187-e95a-4131-91e7-24289e777f36 to amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) : novaclient.exceptions.ClientException: Unknown Error (HTTP 504) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs [-] Error plugging amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) into port 26a39187-e95a-4131-91e7-24289e777f36.: octavia.common.exceptions.ComputeUnknownException: Unknown exception from the compute driver: Unknown Error (HTTP 504). >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs Traceback (most recent call last): >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/compute/drivers/nova_driver.py", line 318, in attach_network_or_port >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs port_id=port_id) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/api_versions.py", line 393, in substitution >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return methods[-1].func(obj, *args, **kwargs) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/v2/servers.py", line 2063, in interface_attach >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs obj_class=NetworkInterface) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/base.py", line 363, in _create >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs resp, body = self.api.client.post(url, body=body) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 401, in post >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return self.request(url, 'POST', **kwargs) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/client.py", line 78, in request >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs raise exceptions.from_response(resp, body, url, method) >> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs novaclient.exceptions.ClientException: Unknown Error (HTTP 504) >> >> Octavia-work called Neutron API to create a port. And called nova-api to attach the vip to amphora. 
>> >> Neutron created port successfully, but nova failed to attach the port to instance.[2] >> >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [req-ab0e8d9b-664b-430f-8006-cad713b0c826 401ba22da5f8427fbda5fce24600041b 8ce8b97f710f43d7af2b8f9b1e0463c8 - default default] [instance: a210ec88-b554-487f-a125-30b5c7473060] attaching network adapter failed.: libvirt.libvirtError: Unable to read from monitor: Connection reset by peer >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] Traceback (most recent call last): >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 2149, in attach_interface >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] guest.attach_device(cfg, persistent=True, live=live) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 304, in attach_device >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] self._domain.attachDeviceFlags(device_xml, flags=flags) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] result = proxy_call(self._autowrap, f, *args, **kwargs) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = execute(f, *args, **kwargs) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] six.reraise(c, e, tb) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] raise value >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = meth(*args, **kwargs) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib64/python3.6/site-packages/libvirt.py", line 630, in attachDeviceFlags >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self) >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] 
libvirt.libvirtError: Unable to read from monitor: Connection reset by peer >> 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] >> >> Nova-compute called libvirt to attach the device. And libvirt also failed to attach the device[3] >> >> 2021-01-31 08:20:23.884+0000: 86663: error : qemuMonitorIORead:491 : Unable to read from monitor: Connection reset by peer >> 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:618 : Error on monitor Unable to read from monitor: Connection reset by peer >> 2021-01-31 08:20:23.884+0000: 86663: info : virObjectRef:402 : OBJECT_REF: obj=0x7f004c00b610 >> 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:649 : Triggering error callback >> 2021-01-31 08:20:23.884+0000: 86663: debug : qemuProcessHandleMonitorError:346 : Received error on 0x7f004c0095b0 'instance-00000001' >> 2021-01-31 08:20:23.884+0000: 64768: debug : qemuMonitorSend:958 : Send command resulted in error Unable to read from monitor: Connection reset by peer >> >> I also tried to use kolla/ubuntu-source-nova-libvirt instead of kolla/centos-source-nova-libvirt. and it worked as expected. >> >> I think the root cause is that libvirt failed to attach a network device. but i don't know how to resolve this problem. >> >> could anyone help me? >> >> Thanks >> >> Wuchunyang >> >> [0]: https://review.opendev.org/c/openstack/kolla-ansible/+/754285 >> [1]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/octavia/octavia-worker.txt#1116 >> [2]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/nova/nova-compute.txt#2546 >> [3]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/libvirt/libvirtd.txt#194472 --  Kind Regards, Dmitriy Rabotyagov From syedammad83 at gmail.com Thu Feb 4 11:29:31 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Thu, 4 Feb 2021 16:29:31 +0500 Subject: Trove Multi-Tenancy Message-ID: Hi, I have deployed trove and database instance deployment is successful. But the problem is all the database servers are being created in service account i.e openstack instance list shows the database instances in admin user but when I check openstack server list the database instance won't show up here, its visible in trove service account. Can you please advise how the servers will be visible in admin account ? I want to enable multi-tenancy. 
Below is the configuration [DEFAULT] log_dir = /var/log/trove # RabbitMQ connection info transport_url = rabbit://openstack:password at controller control_exchange = trove trove_api_workers = 5 network_driver = trove.network.neutron.NeutronDriver taskmanager_manager = trove.taskmanager.manager.Manager default_datastore = mysql cinder_volume_type = database_storage reboot_time_out = 300 usage_timeout = 900 agent_call_high_timeout = 1200 nova_keypair = trove-key debug = true trace = true # MariaDB connection info [database] connection = mysql+pymysql://trove:password at mariadb01/trove [mariadb] tcp_ports = 3306,4444,4567,4568 [mysql] tcp_ports = 3306 [postgresql] tcp_ports = 5432 [redis] tcp_ports = 6379,16379 # Keystone auth info [keystone_authtoken] www_authenticate_uri = http://controller:5000 auth_url = http://controller:5000 memcached_servers = controller:11211 auth_type = password project_domain_name = default user_domain_name = default project_name = service username = trove password = servicepassword [service_credentials] auth_url = http://controller:5000 region_name = RegionOne project_domain_name = default user_domain_name = default project_name = service username = trove password = servicepassword -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From erkki.peura at nokia.com Thu Feb 4 11:47:58 2021 From: erkki.peura at nokia.com (Peura, Erkki (Nokia - FI/Espoo)) Date: Thu, 4 Feb 2021 11:47:58 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> Message-ID: Hi, in our case limit is 280 compute nodes, maybe bit less would be more comfortable but that depends on usage profile Br, - Eki - > -----Original Message----- > From: Thierry Carrez > Sent: Thursday, January 28, 2021 3:24 PM > To: openstack-discuss at lists.openstack.org > Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? > > Hi everyone, > > As part of the Large Scale SIG[1] activities, I'd like to quickly poll our > community on the following question: > > How many compute nodes do you feel comfortable fitting in a single-cluster > deployment of OpenStack, before you need to scale it out to multiple > regions/cells/.. ? > > Obviously this depends on a lot of deployment-dependent factors (type of > activity, choice of networking...) 
so don't overthink it: a rough number is fine > :) > > [1] https://wiki.openstack.org/wiki/Large_Scale_SIG > > Thanks in advance, > > -- > Thierry Carrez (ttx) From mnasiadka at gmail.com Thu Feb 4 11:55:02 2021 From: mnasiadka at gmail.com (=?UTF-8?Q?Micha=C5=82_Nasiadka?=) Date: Thu, 4 Feb 2021 12:55:02 +0100 Subject: [magnum] [neutron] [ovn] No inter-node pod-to-pod communication due to missing ACLs in OVN In-Reply-To: <20201216115736.wtnpszo3m4dlv6ki@p1.localdomain> References: <20201215211809.fi25gel5n7pjuhfs@p1.localdomain> <20201216115736.wtnpszo3m4dlv6ki@p1.localdomain> Message-ID: Hello, Have the same issue as Krzysztof, did some more digging and it seems that: - Magnum adds a CIDR network to allowed_address_pairs (10.100.0.0/16 by default) - OVN does not support adding CIDR to OVN NB LSP addresses field (so it makes it at least harder to reach feature parity with ML2/OVS in that sense) I've been able to work this around by changing Magnum code to add additional SG rules to pass traffic with remote_ip: 10.100.0.0/16 ( https://review.opendev.org/c/openstack/magnum/+/773923/1/magnum/drivers/k8s_fedora_coreos_v1/templates/kubecluster.yaml ). Unfortunately disabling allowed_address_pairs (which I wanted to propose in the same change) effects in 10.100.0.0/16 not being added to OVN NB LSP port_security field - and then it stops working. Are there some additional SG entries needed that might allow that traffic (to facilitate the disablement of allowed_address_pairs and improve scalability)? I'll post another thread on ovs-discuss to discuss if adding CIDRs to addresses field as a feature is technically feasible. Michal śr., 16 gru 2020 o 13:00 Slawek Kaplonski napisał(a): > Hi, > > On Wed, Dec 16, 2020 at 12:23:02PM +0100, Daniel Alvarez Sanchez wrote: > > On Tue, Dec 15, 2020 at 10:57 PM Daniel Alvarez > wrote: > > > > > > > > > > > > > > > On 15 Dec 2020, at 22:18, Slawek Kaplonski > wrote: > > > > > > > > Hi, > > > > > > > >> On Tue, Dec 15, 2020 at 05:14:29PM +0100, Krzysztof Klimonda wrote: > > > >>> On Tue, Dec 15, 2020, at 16:59, Daniel Alvarez Sanchez wrote: > > > >>> Hi Chris, thanks for moving this here. > > > >>> > > > >>> On Tue, Dec 15, 2020 at 4:22 PM Krzysztof Klimonda < > > > kklimonda at syntaxhighlighted.com> wrote: > > > >>>> Hi, > > > >>>> > > > >>>> This email is a follow-up to a discussion I've openened on > > > ovs-discuss ML[1] regarding lack of TCP/UDP connectivity between pods > > > deployed on magnum-managed k8s cluster with calico CNI and IPIP > tunneling > > > disabled (calico_ipv4pool_ipip label set to a default value of Off). > > > >>>> > > > >>>> As a short introduction, during magnum testing in ussuri > deployment > > > with ml2/ovn neutron driver I've noticed lack of communication between > pods > > > deployed on different nodes as part of magnum deployment with calico > > > configured to *not* encapsulate traffic in IPIP tunnel, but route it > > > directly between nodes. In theory, magnum configures adds defined pod > > > network to k8s nodes ports' allowed_address_pairs[2] and then security > > > group is created allowing for ICMP and TCP/UDP traffic between ports > > > belonging to that security group[3]. This doesn't work with ml2/ovn as > > > TCP/UDP traffic between IP addresses in pod network is not matching > ACLs > > > defined in OVN. 
> > > >>>> > > > >>>> I can't verify this behaviour under ml2/ovs for the next couple of > > > weeks, as I'm taking them off for holidays, but perhaps someone knows > if > > > that specific usecase (security group rules with remote groups used > with > > > allowed address pairs) is supposed to be working, or should magnum use > pod > > > network cidr to allow traffic between nodes instead. > > > >>> > > > >>> In ML2/OVN we're adding the allowed address pairs to the > 'addresses' > > > field only when the MAC address of the pair is the same as the port > MAC [0]. > > > >>> I think that we can change the code to accomplish what you want > (if it > > > matches ML2/OVS which I think it does) by adding all IP-MAC pairs of > the > > > allowed-address pairs to the 'addresses' column. E.g: > > > >>> > > > >>> addresses = [ MAC1 IP1, AP_MAC1 AP_IP1, AP_MAC2 AP_IP2 ] (right > now > > > it's just addresses = [ MAC1 IP1 ]) > > > >>> port_security column will be kept as it is today. > > > >> > > > >> How does [AP_MAC1 AP_IP1 AP_MAC2 AP_IP2] scale with a number of IP > > > addresses set in allowed_address_pairs? Given how default pod network > is > > > 10.100.0.0/16 will that generate 65k flows in ovs, or is it not a 1:1 > > > mapping? > > > > > > It will use conjunctive flows but yes it will be huge no matter what. > If > > > we follow the approach of adding match conditions to the ACLs for each > > > address pair it is going to be even worse when expanded by > ovn-controller. > > > >> > > > >> If ml2/ovs is also having scaling issues when remote groups are > used, > > > perhaps magnum should switch to defining remote-ip in its security > groups > > > instead, even if the underlying issue on ml2/ovn is fixed? > > > > > > > > IIRC Kuryr moved already to such solution as they had problems with > > > scaling on > > > > ML2/OVS when remote_group ids where used. > > > > > > > @Slaweq, ML2/OVS accounts for allowed address pairs for remote security > > groups but not for FIPs right? I wonder why the distinction. > > Documentation is not clear but I'm certain that FIPs are not accounted > for > > by remote groups. > > Right. FIPs aren't added to the list of allowed IPs in the ipset. > > > > > If we decide to go ahead and implement this in ML2/OVN, the same thing > can > > be applied for FIPs adding the FIP to the 'addresses' field but there > might > > be scaling issues. > > > > > > > That’s right. Remote groups are expensive in any case. > > > > > > Mind opening a launchpad bug for OVN though? > > > > > > Thanks! > > > > > > > >> > > > >>> > > > >>> This way, when ovn-northd generates the Address_Set in the SB > database > > > for the corresponding remote group, the allowed-address pairs IP > addresses > > > will be added to it and honored by the security groups. > > > >>> > > > >>> +Numan Siddique to confirm that this > > > doesn't have any unwanted side effects. 
> > > >>> > > > >>> [0] > > > > https://opendev.org/openstack/neutron/src/commit/6a8fa65302b45f32958e7fc2b73614715780b997/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L122-L125 > > > >>>> > > > >>>> [1] > > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2020-December/050836.html > > > >>>> [2] > > > > https://github.com/openstack/magnum/blob/c556b8964fab129f33e766b1c33908b2eb001df4/magnum/drivers/k8s_fedora_coreos_v1/templates/kubeminion.yaml > > > >>>> [3] > > > > https://github.com/openstack/magnum/blob/c556b8964fab129f33e766b1c33908b2eb001df4/magnum/drivers/k8s_fedora_coreos_v1/templates/kubecluster.yaml#L1038 > > > >>>> > > > >>>> -- > > > >>>> Best Regards, > > > >>>> - Chris > > > >>>> > > > > > > > > -- > > > > Slawek Kaplonski > > > > Principal Software Engineer > > > > Red Hat > > > > > > > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat > > > -- Michał Nasiadka mnasiadka at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From midhunlaln66 at gmail.com Thu Feb 4 12:10:25 2021 From: midhunlaln66 at gmail.com (Midhunlal Nb) Date: Thu, 4 Feb 2021 17:40:25 +0530 Subject: LDAP integration with penstack Message-ID: Hi all, Before ldap integration openstack working properly but if i set "driver = ldap" in keystone.conf under [identity] section nothing is working for me,I am not able run any openstack command and also not able to create any project or domain or user.If remove "driver = ldap" entry everything working back normally please help me on this issue. If i run admin-openrc file I am getting below error; root at controller:~/client-scripts# openstack image list The request you have made requires authentication. (HTTP 401) (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) Please help me .Where I am making mistakes? Thanks & Regards Midhunlal N B +918921245637 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openinfradn at gmail.com Thu Feb 4 12:38:19 2021 From: openinfradn at gmail.com (open infra) Date: Thu, 4 Feb 2021 18:08:19 +0530 Subject: Issue when accessing Openstack Message-ID: Hi, I am new to both starlingx and openstack. I have deployed StarlingX and now trying to get access OpenStack. I highly appreciate your help on troubleshooting this issue. http://paste.openstack.org/show/802321/ Log: http://paste.openstack.org/show/802322/ Regards, Danishka -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Thu Feb 4 13:40:04 2021 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 4 Feb 2021 08:40:04 -0500 Subject: LDAP integration with penstack In-Reply-To: References: Message-ID: Default all group/role/project/user information in SQL but when you say use LDAP then it’s trying to find those information in LDAP, do you have all those information in LDAP? ( assuming not that is why you getting that error) You should tell your openstack use LDAP for only authentication for user information and look for remaining roles/project etc in SQL That is what I’m running in my cloud and everything works. Full LDAP integration is little complicated that is why I pick partial method. 
Sent from my iPhone > On Feb 4, 2021, at 7:16 AM, Midhunlal Nb wrote: > >  > Hi all, > > Before ldap integration openstack working properly but if i set "driver = ldap" in keystone.conf under [identity] section nothing is working for me,I am not able run any openstack command and also not able to create any project or domain or user.If remove "driver = ldap" entry everything working back normally > please help me on this issue. > > If i run admin-openrc file I am getting below error; > > root at controller:~/client-scripts# openstack image list > The request you have made requires authentication. (HTTP 401) > (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) > > Please help me .Where I am making mistakes? > > Thanks & Regards > Midhunlal N B > +918921245637 From satish.txt at gmail.com Thu Feb 4 14:28:26 2021 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 4 Feb 2021 09:28:26 -0500 Subject: LDAP integration with penstack In-Reply-To: References: Message-ID: This is what i have In /etc/keystone/keystone.conf [identity] driver = sql domain_config_dir = /etc/keystone/domains domain_specific_drivers_enabled = True In /etc/keystone/domains/keystone.myldapdomain.conf [identity] driver = ldap [ldap] group_allow_create = False group_allow_delete = False group_allow_update = False group_id_attribute = cn ... ... ... <> On Thu, Feb 4, 2021 at 9:10 AM Midhunlal Nb wrote: > > Hi satish, > Thank you so much for your response!Here I am pasting my ldap configuration what i done in keystone.conf,please check and let me know what changes i need to make,also please tell me what are all the new entries i need to add in LDAP. > I have been struggling with this issue for the last 2 month,please help me. > > 1.[identity] > driver = ldap > 2.[ldap] > url = ldap://192.168.x.xx > user = cn=admin,dc=blr,dc=ind,dc=company,dc=com > password = xxxxx > suffix = dc=company,dc=com > query_scope = sub > page_size = 2000 > alias_dereferencing = default > #chase_referrals = false > chase_referrals = false > debug_level = 0 > use_pool = true > pool_size = 10 > pool_retry_max = 3 > pool_retry_delay = 0.1 > pool_connection_timeout = -1 > pool_connection_lifetime = 600 > use_auth_pool = false > auth_pool_size = 100 > auth_pool_connection_lifetime = 60 > user_id_attribute = cn > user_name_attribute = sn > user_mail_attribute = mail > user_pass_attribute = userPassword > user_enabled_attribute = userAccountControl > user_enabled_mask = 2 > user_enabled_invert = false > user_enabled_default = 512 > user_default_project_id_attribute = > user_additional_attribute_mapping = > > group_id_attribute = cn > group_name_attribute = ou > group_member_attribute = member > group_desc_attribute = description > group_additional_attribute_mapping = > > > user_tree_dn = ou=people,dc=blr,dc=ind,dc=company,dc=com > user_objectclass = inetOrgPerson > > group_tree_dn = ou=group,dc=blr,dc=ind,dc=company,dc=com > group_objectclass = organizationalUnit > > This is the configuration I have in my keystone.conf file for ldap integration. > > Thanks & Regards > Midhunlal N B > +918921245637 > > > On Thu, Feb 4, 2021 at 7:10 PM Satish Patel wrote: >> >> Default all group/role/project/user information in SQL but when you say use LDAP then it’s trying to find those information in LDAP, do you have all those information in LDAP? 
( assuming not that is why you getting that error) >> >> You should tell your openstack use LDAP for only authentication for user information and look for remaining roles/project etc in SQL That is what I’m running in my cloud and everything works. >> >> Full LDAP integration is little complicated that is why I pick partial method. >> >> Sent from my iPhone >> >> > On Feb 4, 2021, at 7:16 AM, Midhunlal Nb wrote: >> > >> >  >> > Hi all, >> > >> > Before ldap integration openstack working properly but if i set "driver = ldap" in keystone.conf under [identity] section nothing is working for me,I am not able run any openstack command and also not able to create any project or domain or user.If remove "driver = ldap" entry everything working back normally >> > please help me on this issue. >> > >> > If i run admin-openrc file I am getting below error; >> > >> > root at controller:~/client-scripts# openstack image list >> > The request you have made requires authentication. (HTTP 401) >> > (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) >> > >> > Please help me .Where I am making mistakes? >> > >> > Thanks & Regards >> > Midhunlal N B >> > +918921245637 From raubvogel at gmail.com Thu Feb 4 14:35:08 2021 From: raubvogel at gmail.com (Mauricio Tavares) Date: Thu, 4 Feb 2021 09:35:08 -0500 Subject: Cannot specify host in availability zone In-Reply-To: References: Message-ID: On Thu, Feb 4, 2021 at 3:05 AM Paul Browne wrote: > > Hi Mauricio, > > What happens if instead of the node name you use the node UUID, as returned from; > > openstack hypervisor list > I must be doing something wrong, as my output to openstack hypervisor list does not spit out the UUID: +----+---------------------------+-----------------+-----------+-------+ | ID | Hypervisor Hostname | Hypervisor Type | Host IP | State | +----+---------------------------+-----------------+-----------+-------+ | 1 | compute02.example.com | QEMU | 10.1.1.12 | up | | 2 | compute03.example.com | QEMU | 10.1.1.13 | up | | 3 | compute01.example.com | QEMU | 10.1.1.11 | up | +----+---------------------------+-----------------+-----------+-------+ I guess this install was set up to list the FQDN as the 'Hypervisor Hostname' instead of UUID. Then I tried openstack hypervisor show compute01.example.com So I then tried nova hypervisor-list and got 064097ce-af32-45f0-9f5b-7b21434f3cbf as the ID associated with compute01.example.com; I take that is the UUID. However, when I tried it openstack server create \ --image default_centos_8 \ --flavor m1.small \ --key-name raubkey \ --availability-zone nova:064097ce-af32-45f0-9f5b-7b21434f3cbf \ --nic net-id=LONG_BORING-ID \ raub-netest I got | fault | {'code': 500, 'created': '2021-02-04T14:29:09Z', 'message': 'No valid host was found. No such host - host: 064097ce-af32-45f0-9f5b-7b21434f3cbf node: None ', 'details': [...] > ? > > On Wed, 3 Feb 2021, 17:46 Mauricio Tavares, wrote: >> >> Easy peasy question: According to >> https://docs.openstack.org/nova/rocky/admin/availability-zones.html, I >> can specify the host I want to use by following the format >> >> --availability-zone ZONE:HOST >> >> So, let's get the hostnames for the compute nodes. 
>> >> [raub at test-hn ~(keystone_admin)]$ openstack compute service list >> --service nova-compute >> +----+--------------+-----------------------+------+---------+-------+----------------------------+ >> | ID | Binary | Host | Zone | Status | State | >> Updated At | >> +----+--------------+-----------------------+------+---------+-------+----------------------------+ >> | 11 | nova-compute | compute02.example.com | nova | enabled | up | >> 2021-02-03T17:25:53.000000 | >> | 12 | nova-compute | compute03.example.com | nova | enabled | up | >> 2021-02-03T17:25:53.000000 | >> | 13 | nova-compute | compute01.example.com | nova | enabled | up | >> 2021-02-03T17:25:52.000000 | >> +----+--------------+-----------------------+------+---------+-------+----------------------------+ >> [raub at test-hn ~(keystone_admin)]$ >> >> I want to use compute01.example.com, which I can resolve to 10.1.1.11. >> But, when I try to create server with (running as admin): >> >> openstack server create \ >> --image default_centos_8 \ >> --flavor m1.small \ >> --key-name raubkey \ >> --availability-zone nova:compute01.example.com \ >> --nic net-id=LONG_BORING-ID \ >> raub-netest >> >> I get the error message (from openstack server show raub-netest|grep fault): >> >> 'message': 'No valid host was found. No such host - host: >> compute01.example.com node: None ' >> >> What am I doing wrong here? >> From mark at stackhpc.com Thu Feb 4 14:54:13 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 4 Feb 2021 14:54:13 +0000 Subject: [nova][octavia][kolla] failed to create loadbalancer on Centos8 system In-Reply-To: <445561612434112@mail.yandex.ru> References: <445561612434112@mail.yandex.ru> Message-ID: On Thu, 4 Feb 2021 at 10:25, Dmitriy Rabotyagov wrote: > > Hi, > > We see pretty much the same for OpenStack-Ansible CI since mid December I guess. And it's specificly CentOS 8 job, while ubuntu bionic and focal and debian buster are working properly. > In the meanwhile nova and all other project tests can create instances on centos 8. The operation that fails is attaching a network interface to the amphora. I'd guess it's related to CentOS 8.3 which was released on 8th December. Mark > > 04.02.2021, 12:17, "W Ch" : > > > Hi: > > Recently, we added a CI task[0] for octavia in the kolla project. and we tested octavia based on the ubuntu and centos systems. > > The ubuntu system worked as expected but Centos did not work. > > I tried to debug it and result is following: > > Octavia can not create a load balancer on centos8. because octavia-worker failed to plug a vip port on amphora vm.[1] > > 2021-01-31 08:20:12.065 22 ERROR octavia.compute.drivers.nova_driver [-] Error attaching network None with ip None and port 26a39187-e95a-4131-91e7-24289e777f36 to amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) : novaclient.exceptions.ClientException: Unknown Error (HTTP 504) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs [-] Error plugging amphora (compute_id: a210ec88-b554-487f-a125-30b5c7473060) into port 26a39187-e95a-4131-91e7-24289e777f36.: octavia.common.exceptions.ComputeUnknownException: Unknown exception from the compute driver: Unknown Error (HTTP 504). 
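A hedged sketch related to the host-targeting question above, not taken from the thread itself: the ZONE:HOST form expects the nova-compute service host name, i.e. the "Host" column of "openstack compute service list", rather than a hypervisor UUID. Assuming a deployment and python-openstackclient new enough to support compute API microversion 2.74, the target can also be named directly (typically admin-only), reusing the placeholder names from the thread:

openstack --os-compute-api-version 2.74 server create \
  --image default_centos_8 \
  --flavor m1.small \
  --key-name raubkey \
  --nic net-id=LONG_BORING-ID \
  --hypervisor-hostname compute01.example.com \
  raub-netest

At the same microversion, --host takes the compute service host name instead of the hypervisor hostname.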
> 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs Traceback (most recent call last): > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/octavia/compute/drivers/nova_driver.py", line 318, in attach_network_or_port > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs port_id=port_id) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/api_versions.py", line 393, in substitution > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return methods[-1].func(obj, *args, **kwargs) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/v2/servers.py", line 2063, in interface_attach > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs obj_class=NetworkInterface) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/base.py", line 363, in _create > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs resp, body = self.api.client.post(url, body=body) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 401, in post > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs return self.request(url, 'POST', **kwargs) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs File "/var/lib/kolla/venv/lib/python3.6/site-packages/novaclient/client.py", line 78, in request > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs raise exceptions.from_response(resp, body, url, method) > 2021-01-31 08:20:12.066 22 ERROR octavia.network.drivers.neutron.allowed_address_pairs novaclient.exceptions.ClientException: Unknown Error (HTTP 504) > > Octavia-work called Neutron API to create a port. And called nova-api to attach the vip to amphora. 
> > Neutron created port successfully, but nova failed to attach the port to instance.[2] > > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [req-ab0e8d9b-664b-430f-8006-cad713b0c826 401ba22da5f8427fbda5fce24600041b 8ce8b97f710f43d7af2b8f9b1e0463c8 - default default] [instance: a210ec88-b554-487f-a125-30b5c7473060] attaching network adapter failed.: libvirt.libvirtError: Unable to read from monitor: Connection reset by peer > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] Traceback (most recent call last): > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 2149, in attach_interface > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] guest.attach_device(cfg, persistent=True, live=live) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 304, in attach_device > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] self._domain.attachDeviceFlags(device_xml, flags=flags) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] result = proxy_call(self._autowrap, f, *args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = execute(f, *args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] six.reraise(c, e, tb) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] raise value > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] rv = meth(*args, **kwargs) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] File "/usr/lib64/python3.6/site-packages/libvirt.py", line 630, in attachDeviceFlags > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self) > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] libvirt.libvirtError: Unable to read 
from monitor: Connection reset by peer > 2021-01-31 08:20:23.945 8 ERROR nova.virt.libvirt.driver [instance: a210ec88-b554-487f-a125-30b5c7473060] > > > > Nova-compute called libvirt to attach the device. And libvirt also failed to attach the device[3] > > 2021-01-31 08:20:23.884+0000: 86663: error : qemuMonitorIORead:491 : Unable to read from monitor: Connection reset by peer > 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:618 : Error on monitor Unable to read from monitor: Connection reset by peer > 2021-01-31 08:20:23.884+0000: 86663: info : virObjectRef:402 : OBJECT_REF: obj=0x7f004c00b610 > 2021-01-31 08:20:23.884+0000: 86663: debug : qemuMonitorIO:649 : Triggering error callback > 2021-01-31 08:20:23.884+0000: 86663: debug : qemuProcessHandleMonitorError:346 : Received error on 0x7f004c0095b0 'instance-00000001' > 2021-01-31 08:20:23.884+0000: 64768: debug : qemuMonitorSend:958 : Send command resulted in error Unable to read from monitor: Connection reset by peer > > I also tried to use kolla/ubuntu-source-nova-libvirt instead of kolla/centos-source-nova-libvirt. and it worked as expected. > > I think the root cause is that libvirt failed to attach a network device. but i don't know how to resolve this problem. > > > could anyone help me? > > Thanks > > Wuchunyang > > > [0]: https://review.opendev.org/c/openstack/kolla-ansible/+/754285 > [1]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/octavia/octavia-worker.txt#1116 > [2]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/nova/nova-compute.txt#2546 > [3]: https://zuul.opendev.org/t/openstack/build/e4b8c62c44a64b96bc287ba2ba2315f0/log/primary/logs/kolla/libvirt/libvirtd.txt#194472 > > > > > > > > > > > > > -- > Kind Regards, > Dmitriy Rabotyagov > From fungi at yuggoth.org Thu Feb 4 15:51:04 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Feb 2021 15:51:04 +0000 Subject: Authentication error after configuring LDAP integration with openstack In-Reply-To: References: Message-ID: <20210204155103.uya7lprxtvg2abrg@yuggoth.org> On 2021-02-04 15:12:21 +0530 (+0530), Midhunlal Nb wrote: > Before ldap integration openstack working properly but if i set > "driver = ldap" in keystone.conf under [identity] section nothing > is working for me,I am not able run any openstack command and also > not able to create any project or domain or user.If remove "driver > = ldap" entry everything working back normally please help me on > this issue. > > If i run admin-openrc file I am getting below error; > > root at controller:~/client-scripts# openstack image list > The request you have made requires authentication. (HTTP 401) > (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) > > Please help me .Where I am making mistakes? Did you see Radosław's reply to you on Tuesday? http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020159.html -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From gmann at ghanshyammann.com Thu Feb 4 16:23:13 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 04 Feb 2021 10:23:13 -0600 Subject: [nova][placement] adding nova-core to placement-core in gerrit In-Reply-To: <20210203110142.2noi7v2b4q57fozy@lyarwood-laptop.usersys.redhat.com> References: <0O5YNQ.4J6G712XMY0K@est.tech> <20210203110142.2noi7v2b4q57fozy@lyarwood-laptop.usersys.redhat.com> Message-ID: <1776ddad336.b34fff61221304.5838817347277468216@ghanshyammann.com> ---- On Wed, 03 Feb 2021 05:01:42 -0600 Lee Yarwood wrote ---- > On 03-02-21 10:43:12, Balazs Gibizer wrote: > > > > > > On Tue, Feb 2, 2021 at 09:06, Balazs Gibizer > > wrote: > > > Hi, > > > > > > I've now added nova-core to the placement-core group[1] > > > > It turned out that there is a separate placement-stable-maint group which > > does not have the nova-stable-maint included. Is there any objection to add > > nova-stable-maint to placement-stable-maint? +1, that makes sense. -gmann > > None from me as a member of nova-stable-maint. > > Thanks again for cleaning this up! > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 > From DHilsbos at performair.com Thu Feb 4 16:39:23 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Thu, 4 Feb 2021 16:39:23 +0000 Subject: [neutron][victoria] Front / Back Routers Message-ID: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local> All; My team and I have been working through the tutorials on Server World (server-world.com/ean/), in order to learn and build an OpenStack cluster. We've also been looking at the official documentation to attempt to increase our knowledge of the subject. I have a question about Neutron though. All the examples that I remember have Neutron setup with a single router. The router is part of a "provider" network, and subnet on the outside, and one or more "tenant" networks on the inside. Floating IPS, then appear to be IP addresses belonging to the "provider" subnet, that are applied to the router, and which the router then NATs. These setups look like this: Physrouter1 (physical router) subnet: 192.168.0.0/24, IP address: 192.168.0.1 | Physnet1 (192.168.0.0/24)(ovs network definition) | Router1 (ovs router)(allocation pool: 192.168.0.100 - 192.168.0.254) <-- Floating IPs are "owned" by this, and are in the range of the allocation pool | Tenant network(s) This has the advantage of being easy, fast, secure, and simple to setup. What if you wanted something where you could route whole subnet into your OpenStack cluster. Physrouter1 (physical router) subnet: 172.16.255.0/24, IP address: 172.16.255.1 | Physnet1 (172.16.255.0/24)(ovs network definition) | Router1 (ovs router)(fixed IP addresses: 172.16.255.2 & 172.16.254.1/24 + static / dynamic routing) | Network (17216.254.0/24) | Router2(ovs router)(allocation pool: 172.16.254.5 - 172.16.254.254) <-- Floating IPs are "owned" by this, and are in the range of the allocation pool | Tenant network(s) Is my understanding accurate? Are there advantages of one over the other? What commands are used to specify static IPs for ovs routers, and static routing rules? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. 
DHilsbos at PerformAir.com www.PerformAir.com From ignaziocassano at gmail.com Thu Feb 4 16:58:48 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 4 Feb 2021 17:58:48 +0100 Subject: [stein][manila] share migration misconfiguration ? Message-ID: Hello All, I am trying to migrate a share between a netapp backend to another. Both backends are configured in my manila.conf. I am able to create share on both, but I am not able to migrate share between them. I am using DSSH=False. I did not understand how host and driver assisted migration work and what "data_node_access_ip" means. The share I want to migrate is on a network (10.102.186.0/24) that I can reach by my management controllers network (10.102.184.0/24). I Can mount share from my controllers and I can mount also the netapp SVM where the share is located. So in the data_node_access_ip I wrote the list of my controllers management ips. During the migrate phase I checked if my controller where manila is running mounts the share or the netapp SVM but It does not happen. Please, what is my mistake ? Thanks Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Thu Feb 4 17:28:47 2021 From: dms at danplanet.com (Dan Smith) Date: Thu, 04 Feb 2021 09:28:47 -0800 Subject: [all] Gate resources and performance Message-ID: Hi all, I have become increasingly concerned with CI performance lately, and have been raising those concerns with various people. Most specifically, I'm worried about our turnaround time or "time to get a result", which has been creeping up lately. Right after the beginning of the year, we had a really bad week where the turnaround time was well over 24 hours. That means if you submit a patch on Tuesday afternoon, you might not get a test result until Thursday. That is, IMHO, a real problem and massively hurts our ability to quickly merge priority fixes as well as just general velocity and morale. If people won't review my code until they see a +1 from Zuul, and that is two days after I submitted it, that's bad. Things have gotten a little better since that week, due in part to getting past a rush of new year submissions (we think) and also due to some job trimming in various places (thanks Neutron!). However, things are still not great. Being in almost the last timezone of the day, the queue is usually so full when I wake up that it's quite often I don't get to see a result before I stop working that day. I would like to ask that projects review their jobs for places where they can cut out redundancy, as well as turn their eyes towards optimizations that can be made. I've been looking at both Nova and Glance jobs and have found some things I think we can do less of. I also wanted to get an idea of who is "using too much" in the way of resources, so I've been working on trying to characterize the weight of the jobs we run for a project, based on the number of worker nodes required to run all the jobs, as well as the wall clock time of how long we tie those up. The results are interesting, I think, and may help us to identify where we see some gains. The idea here is to figure out[1] how many "node hours" it takes to run all the normal jobs on a Nova patch compared to, say, a Neutron one. If the jobs were totally serialized, this is the number of hours a single computer (of the size of a CI worker) would take to do all that work. If the number is 24 hours, that means a single computer could only check *one* patch in a day, running around the clock. 
I chose the top five projects in terms of usage[2] to report here, as they represent 70% of the total amount of resources consumed. The next five only add up to 13%, so the "top five" seems like a good target group. Here are the results, in order of total consumption: Project % of total Node Hours Nodes ------------------------------------------ 1. TripleO 38% 31 hours 20 2. Neutron 13% 38 hours 32 3. Nova 9% 21 hours 25 4. Kolla 5% 12 hours 18 5. OSA 5% 22 hours 17 What that means is that a single computer (of the size of a CI worker) couldn't even process the jobs required to run on a single patch for Neutron or TripleO in a 24-hour period. Now, we have lots of workers in the gate, of course, but there is also other potential overhead involved in that parallelism, like waiting for nodes to be available for dependent jobs. And of course, we'd like to be able to check more than patch per day. Most projects have smaller gate job sets than check, but assuming they are equivalent, a Neutron patch from submission to commit would undergo 76 hours of testing, not including revisions and not including rechecks. That's an enormous amount of time and resource for a single patch! Now, obviously nobody wants to run fewer tests on patches before they land, and I'm not really suggesting that we take that approach necessarily. However, I think there are probably a lot of places that we can cut down the amount of *work* we do. Some ways to do this are: 1. Evaluate whether or not you need to run all of tempest on two configurations of a devstack on each patch. Maybe having a stripped-down tempest (like just smoke) to run on unique configs, or even specific tests. 2. Revisit your "irrelevant_files" lists to see where you might be able to avoid running heavy jobs on patches that only touch something small. 3. Consider moving some jobs to the experimental queue and run them on-demand for patches that touch particular subsystems or affect particular configurations. 4. Consider some periodic testing for things that maybe don't need to run on every single patch. 5. Re-examine tests that take a long time to run to see if something can be done to make them more efficient. 6. Consider performance improvements in the actual server projects, which also benefits the users. If you're a project that is not in the top ten then your job configuration probably doesn't matter that much, since your usage is dwarfed by the heavy projects. If the heavy projects would consider making changes to decrease their workload, even small gains have the ability to multiply into noticeable improvement. The higher you are on the above list, the more impact a small change will have on the overall picture. Also, thanks to Neutron and TripleO, both of which have already addressed this in some respect, and have other changes on the horizon. Thanks for listening! --Dan 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ From satish.txt at gmail.com Thu Feb 4 18:24:46 2021 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 4 Feb 2021 13:24:46 -0500 Subject: LDAP integration with openstack In-Reply-To: References: Message-ID: check out my blog for full deployment of LDAP - https://satishdotpatel.github.io/openstack-ldap-integration/ On Thu, Feb 4, 2021 at 10:35 AM Midhunlal Nb wrote: > > Hi sathish, > Once you are free,please reply to my doubts,I believe that I can solve this issue with your solution. 
> > Thanks & Regards > Midhunlal N B > +918921245637 > > > On Thu, Feb 4, 2021 at 8:14 PM Midhunlal Nb wrote: >> >> Hi Satish, >> I have some doubt in your configuration >> 1.In keystone, "domains" directory and "keystone.myldapdomain.conf "file i need to create right? >> >> 2.In [ldap] section >> url = ldap://192.168.x.xx >> user = cn=admin,dc=blr,dc=ind,dc=company,dc=com >> password = xxxxx >> suffix = dc=company,dc=com >> I need to add or no need?if not how does openstack connect to my ldap?please reply me. >> >> Thanks & Regards >> Midhunlal N B >> +918921245637 >> >> >> On Thu, Feb 4, 2021 at 7:58 PM Satish Patel wrote: >>> >>> This is what i have >>> >>> In /etc/keystone/keystone.conf >>> >>> [identity] >>> driver = sql >>> domain_config_dir = /etc/keystone/domains >>> domain_specific_drivers_enabled = True >>> >>> In /etc/keystone/domains/keystone.myldapdomain.conf >>> >>> [identity] >>> driver = ldap >>> >>> [ldap] >>> group_allow_create = False >>> group_allow_delete = False >>> group_allow_update = False >>> group_id_attribute = cn >>> ... >>> ... >>> ... >>> <> >>> >>> >>> On Thu, Feb 4, 2021 at 9:10 AM Midhunlal Nb wrote: >>> > >>> > Hi satish, >>> > Thank you so much for your response!Here I am pasting my ldap configuration what i done in keystone.conf,please check and let me know what changes i need to make,also please tell me what are all the new entries i need to add in LDAP. >>> > I have been struggling with this issue for the last 2 month,please help me. >>> > >>> > 1.[identity] >>> > driver = ldap >>> > 2.[ldap] >>> > url = ldap://192.168.x.xx >>> > user = cn=admin,dc=blr,dc=ind,dc=company,dc=com >>> > password = xxxxx >>> > suffix = dc=company,dc=com >>> > query_scope = sub >>> > page_size = 2000 >>> > alias_dereferencing = default >>> > #chase_referrals = false >>> > chase_referrals = false >>> > debug_level = 0 >>> > use_pool = true >>> > pool_size = 10 >>> > pool_retry_max = 3 >>> > pool_retry_delay = 0.1 >>> > pool_connection_timeout = -1 >>> > pool_connection_lifetime = 600 >>> > use_auth_pool = false >>> > auth_pool_size = 100 >>> > auth_pool_connection_lifetime = 60 >>> > user_id_attribute = cn >>> > user_name_attribute = sn >>> > user_mail_attribute = mail >>> > user_pass_attribute = userPassword >>> > user_enabled_attribute = userAccountControl >>> > user_enabled_mask = 2 >>> > user_enabled_invert = false >>> > user_enabled_default = 512 >>> > user_default_project_id_attribute = >>> > user_additional_attribute_mapping = >>> > >>> > group_id_attribute = cn >>> > group_name_attribute = ou >>> > group_member_attribute = member >>> > group_desc_attribute = description >>> > group_additional_attribute_mapping = >>> > >>> > >>> > user_tree_dn = ou=people,dc=blr,dc=ind,dc=company,dc=com >>> > user_objectclass = inetOrgPerson >>> > >>> > group_tree_dn = ou=group,dc=blr,dc=ind,dc=company,dc=com >>> > group_objectclass = organizationalUnit >>> > >>> > This is the configuration I have in my keystone.conf file for ldap integration. >>> > >>> > Thanks & Regards >>> > Midhunlal N B >>> > +918921245637 >>> > >>> > >>> > On Thu, Feb 4, 2021 at 7:10 PM Satish Patel wrote: >>> >> >>> >> Default all group/role/project/user information in SQL but when you say use LDAP then it’s trying to find those information in LDAP, do you have all those information in LDAP? 
( assuming not that is why you getting that error) >>> >> >>> >> You should tell your openstack use LDAP for only authentication for user information and look for remaining roles/project etc in SQL That is what I’m running in my cloud and everything works. >>> >> >>> >> Full LDAP integration is little complicated that is why I pick partial method. >>> >> >>> >> Sent from my iPhone >>> >> >>> >> > On Feb 4, 2021, at 7:16 AM, Midhunlal Nb wrote: >>> >> > >>> >> >  >>> >> > Hi all, >>> >> > >>> >> > Before ldap integration openstack working properly but if i set "driver = ldap" in keystone.conf under [identity] section nothing is working for me,I am not able run any openstack command and also not able to create any project or domain or user.If remove "driver = ldap" entry everything working back normally >>> >> > please help me on this issue. >>> >> > >>> >> > If i run admin-openrc file I am getting below error; >>> >> > >>> >> > root at controller:~/client-scripts# openstack image list >>> >> > The request you have made requires authentication. (HTTP 401) >>> >> > (Request-ID: req-bdcde4be-5b62-4454-9084-19324603d0ce) >>> >> > >>> >> > Please help me .Where I am making mistakes? >>> >> > >>> >> > Thanks & Regards >>> >> > Midhunlal N B >>> >> > +918921245637 From ruslanas at lpic.lt Thu Feb 4 18:50:05 2021 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Thu, 4 Feb 2021 20:50:05 +0200 Subject: Cannot specify host in availability zone In-Reply-To: References: Message-ID: Hi Mauricio, Faced similar issue. Maybe my issue and idea will help you. First try same steps from horizon. Second. In cli, i used same name as hypervizor list output. In most cases for me it is: $stackName-$roleName-$index.localdomain if forget to change localdomain value ;)) My faced issues were related, tah one compute can be in one zone, and multiple aggregation groups in same zone. Note: Interesting that you use nova blah-blah, try openstack aggregate create zonename --zone zonename And same with openstack aggregate help see help how to add it with openstack. Hope this helps at least a bit. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rodrigo.barbieri2010 at gmail.com Thu Feb 4 20:13:37 2021 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Thu, 4 Feb 2021 17:13:37 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello Ignazio, If you are attempting to migrate between 2 NetApp backends, then you shouldn't need to worry about correctly setting the data_node_access_ip. Your ideal migration scenario is a driver-assisted-migration, since it is between 2 NetApp backends. If that fails due to misconfiguration, it will fallback to a host-assisted migration, which will use the data_node_access_ip and the host will attempt to mount both shares. This is not what you want for this scenario, as this is useful for different backends, not your case. if you specify "manila migration-start --preserve-metadata True" it will prevent the fallback to host-assisted, so it is easier for you to narrow down the issue with the host-assisted migration out of the way. I used to be familiar with the NetApp driver set up to review your case, however that was a long time ago. I believe the current NetApp driver maintainers will be able to more accurately review your case and spot the problem. 
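For reference, a forced driver-assisted attempt would look roughly like the
sketch below - the share ID and the destination host@backend#pool are
placeholders on my side, and the exact set of required flags is worth
confirming with "manila help migration-start" on your release:

    # rough sketch only; substitute values from your own environment
    manila migration-start <share-id> <dest-host>@<dest-backend>#<pool> \
        --preserve-metadata True --preserve-snapshots False \
        --writable False --nondisruptive False

    # then poll progress, and once the first phase reports done:
    manila migration-get-progress <share-id>
    manila migration-complete <share-id>

If that fails right away instead of falling back, the manila scheduler and
share service logs should show why the driver-assisted path was rejected.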
If you could share some info about your scenario such as: 1) the 2 backends config groups in manila.conf (sanitized, without passwords) 2) a "manila show" of the share you are trying to migrate (sanitized if needed) 3) the "manila migration-start" command you are using and its parameters. Regards, On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano wrote: > Hello All, > I am trying to migrate a share between a netapp backend to another. > Both backends are configured in my manila.conf. > I am able to create share on both, but I am not able to migrate share > between them. > I am using DSSH=False. > I did not understand how host and driver assisted migration work and what > "data_node_access_ip" means. > The share I want to migrate is on a network (10.102.186.0/24) that I can > reach by my management controllers network (10.102.184.0/24). I Can mount > share from my controllers and I can mount also the netapp SVM where the > share is located. > So in the data_node_access_ip I wrote the list of my controllers > management ips. > During the migrate phase I checked if my controller where manila is > running mounts the share or the netapp SVM but It does not happen. > Please, what is my mistake ? > Thanks > Ignazio > > > > -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Thu Feb 4 20:29:35 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Thu, 04 Feb 2021 22:29:35 +0200 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: <276821612469975@mail.yandex.ru> Hi! For OSA huge issue is how zuul clones required-projects. Just this single action takes for us from 6 to 10 minutes. It's not _so_ big amount of time but pretty fair, considering that we can decrease it for each CI job. Moreover, I think we're not alone who has more several repos in required-projects And maybe we have some kind of solution, which is ansible module [1] for parallel git clone. It speeds up process dramatically from what we see in our non-ci deployments. But it needs some time and resources for integration into zuul and I don't think we will be able to spend a lot of time on it during this cycle. Also we can probably decrease coverage for some operating systems, but we're about testing minimum of the required stuff and user scenarios out of possible amount of them. I will still try to drop something to the experimental pipeline though. [1] https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/library/git_requirements.py 04.02.2021, 19:35, "Dan Smith" : > Hi all, > > I have become increasingly concerned with CI performance lately, and > have been raising those concerns with various people. Most specifically, > I'm worried about our turnaround time or "time to get a result", which > has been creeping up lately. Right after the beginning of the year, we > had a really bad week where the turnaround time was well over 24 > hours. That means if you submit a patch on Tuesday afternoon, you might > not get a test result until Thursday. That is, IMHO, a real problem and > massively hurts our ability to quickly merge priority fixes as well as > just general velocity and morale. If people won't review my code until > they see a +1 from Zuul, and that is two days after I submitted it, > that's bad. 
> > Things have gotten a little better since that week, due in part to > getting past a rush of new year submissions (we think) and also due to > some job trimming in various places (thanks Neutron!). However, things > are still not great. Being in almost the last timezone of the day, the > queue is usually so full when I wake up that it's quite often I don't > get to see a result before I stop working that day. > > I would like to ask that projects review their jobs for places where > they can cut out redundancy, as well as turn their eyes towards > optimizations that can be made. I've been looking at both Nova and > Glance jobs and have found some things I think we can do less of. I also > wanted to get an idea of who is "using too much" in the way of > resources, so I've been working on trying to characterize the weight of > the jobs we run for a project, based on the number of worker nodes > required to run all the jobs, as well as the wall clock time of how long > we tie those up. The results are interesting, I think, and may help us > to identify where we see some gains. > > The idea here is to figure out[1] how many "node hours" it takes to run > all the normal jobs on a Nova patch compared to, say, a Neutron one. If > the jobs were totally serialized, this is the number of hours a single > computer (of the size of a CI worker) would take to do all that work. If > the number is 24 hours, that means a single computer could only check > *one* patch in a day, running around the clock. I chose the top five > projects in terms of usage[2] to report here, as they represent 70% of > the total amount of resources consumed. The next five only add up to > 13%, so the "top five" seems like a good target group. Here are the > results, in order of total consumption: > >     Project % of total Node Hours Nodes >     ------------------------------------------ >     1. TripleO 38% 31 hours 20 >     2. Neutron 13% 38 hours 32 >     3. Nova 9% 21 hours 25 >     4. Kolla 5% 12 hours 18 >     5. OSA 5% 22 hours 17 > > What that means is that a single computer (of the size of a CI worker) > couldn't even process the jobs required to run on a single patch for > Neutron or TripleO in a 24-hour period. Now, we have lots of workers in > the gate, of course, but there is also other potential overhead involved > in that parallelism, like waiting for nodes to be available for > dependent jobs. And of course, we'd like to be able to check more than > patch per day. Most projects have smaller gate job sets than check, but > assuming they are equivalent, a Neutron patch from submission to commit > would undergo 76 hours of testing, not including revisions and not > including rechecks. That's an enormous amount of time and resource for a > single patch! > > Now, obviously nobody wants to run fewer tests on patches before they > land, and I'm not really suggesting that we take that approach > necessarily. However, I think there are probably a lot of places that we > can cut down the amount of *work* we do. Some ways to do this are: > > 1. Evaluate whether or not you need to run all of tempest on two >    configurations of a devstack on each patch. Maybe having a >    stripped-down tempest (like just smoke) to run on unique configs, or >    even specific tests. > 2. Revisit your "irrelevant_files" lists to see where you might be able >    to avoid running heavy jobs on patches that only touch something >    small. > 3. 
Consider moving some jobs to the experimental queue and run them >    on-demand for patches that touch particular subsystems or affect >    particular configurations. > 4. Consider some periodic testing for things that maybe don't need to >    run on every single patch. > 5. Re-examine tests that take a long time to run to see if something can >    be done to make them more efficient. > 6. Consider performance improvements in the actual server projects, >    which also benefits the users. > > If you're a project that is not in the top ten then your job > configuration probably doesn't matter that much, since your usage is > dwarfed by the heavy projects. If the heavy projects would consider > making changes to decrease their workload, even small gains have the > ability to multiply into noticeable improvement. The higher you are on > the above list, the more impact a small change will have on the overall > picture. > > Also, thanks to Neutron and TripleO, both of which have already > addressed this in some respect, and have other changes on the horizon. > > Thanks for listening! > > --Dan > > 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c > 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ --  Kind Regards, Dmitriy Rabotyagov From mark at stackhpc.com Thu Feb 4 20:39:03 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 4 Feb 2021 20:39:03 +0000 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: On Thu, 4 Feb 2021, 17:29 Dan Smith, wrote: > Hi all, > > I have become increasingly concerned with CI performance lately, and > have been raising those concerns with various people. Most specifically, > I'm worried about our turnaround time or "time to get a result", which > has been creeping up lately. Right after the beginning of the year, we > had a really bad week where the turnaround time was well over 24 > hours. That means if you submit a patch on Tuesday afternoon, you might > not get a test result until Thursday. That is, IMHO, a real problem and > massively hurts our ability to quickly merge priority fixes as well as > just general velocity and morale. If people won't review my code until > they see a +1 from Zuul, and that is two days after I submitted it, > that's bad. > Thanks for looking into this Dan, it's definitely an important issue and can introduce a lot of friction into and already heavy development process. > > Things have gotten a little better since that week, due in part to > getting past a rush of new year submissions (we think) and also due to > some job trimming in various places (thanks Neutron!). However, things > are still not great. Being in almost the last timezone of the day, the > queue is usually so full when I wake up that it's quite often I don't > get to see a result before I stop working that day. > > I would like to ask that projects review their jobs for places where > they can cut out redundancy, as well as turn their eyes towards > optimizations that can be made. I've been looking at both Nova and > Glance jobs and have found some things I think we can do less of. I also > wanted to get an idea of who is "using too much" in the way of > resources, so I've been working on trying to characterize the weight of > the jobs we run for a project, based on the number of worker nodes > required to run all the jobs, as well as the wall clock time of how long > we tie those up. The results are interesting, I think, and may help us > to identify where we see some gains. 
> > The idea here is to figure out[1] how many "node hours" it takes to run > all the normal jobs on a Nova patch compared to, say, a Neutron one. If > the jobs were totally serialized, this is the number of hours a single > computer (of the size of a CI worker) would take to do all that work. If > the number is 24 hours, that means a single computer could only check > *one* patch in a day, running around the clock. I chose the top five > projects in terms of usage[2] to report here, as they represent 70% of > the total amount of resources consumed. The next five only add up to > 13%, so the "top five" seems like a good target group. Here are the > results, in order of total consumption: > > Project % of total Node Hours Nodes > ------------------------------------------ > 1. TripleO 38% 31 hours 20 > 2. Neutron 13% 38 hours 32 > 3. Nova 9% 21 hours 25 > 4. Kolla 5% 12 hours 18 > 5. OSA 5% 22 hours 17 > Acknowledging Kolla is in the top 5. Deployment projects certainly tend to consume resources. I'll raise this at our next meeting and see what we can come up with. What that means is that a single computer (of the size of a CI worker) > couldn't even process the jobs required to run on a single patch for > Neutron or TripleO in a 24-hour period. Now, we have lots of workers in > the gate, of course, but there is also other potential overhead involved > in that parallelism, like waiting for nodes to be available for > dependent jobs. And of course, we'd like to be able to check more than > patch per day. Most projects have smaller gate job sets than check, but > assuming they are equivalent, a Neutron patch from submission to commit > would undergo 76 hours of testing, not including revisions and not > including rechecks. That's an enormous amount of time and resource for a > single patch! > > Now, obviously nobody wants to run fewer tests on patches before they > land, and I'm not really suggesting that we take that approach > necessarily. However, I think there are probably a lot of places that we > can cut down the amount of *work* we do. Some ways to do this are: > > 1. Evaluate whether or not you need to run all of tempest on two > configurations of a devstack on each patch. Maybe having a > stripped-down tempest (like just smoke) to run on unique configs, or > even specific tests. > 2. Revisit your "irrelevant_files" lists to see where you might be able > to avoid running heavy jobs on patches that only touch something > small. > 3. Consider moving some jobs to the experimental queue and run them > on-demand for patches that touch particular subsystems or affect > particular configurations. > 4. Consider some periodic testing for things that maybe don't need to > run on every single patch. > 5. Re-examine tests that take a long time to run to see if something can > be done to make them more efficient. > 6. Consider performance improvements in the actual server projects, > which also benefits the users. 7. Improve the reliability of jobs. Especially voting and gating ones. Rechecks increase resource usage and time to results/merge. I found querying the zuul API for failed jobs in the gate pipeline is a good way to find unexpected failures. 8. Reduce the node count in multi node jobs. If you're a project that is not in the top ten then your job > configuration probably doesn't matter that much, since your usage is > dwarfed by the heavy projects. 
If the heavy projects would consider > making changes to decrease their workload, even small gains have the > ability to multiply into noticeable improvement. The higher you are on > the above list, the more impact a small change will have on the overall > picture. > > Also, thanks to Neutron and TripleO, both of which have already > addressed this in some respect, and have other changes on the horizon. > > Thanks for listening! > > --Dan > > 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c > 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From viroel at gmail.com Thu Feb 4 20:39:11 2021 From: viroel at gmail.com (Douglas) Date: Thu, 4 Feb 2021 17:39:11 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Rodrigo, Thanks for your help on this. We were helping Ignazio in #openstack-manila channel. He wants to migrate a share across ONTAP clusters, which isn't supported in the current implementation of the driver-assisted-migration with NetApp driver. So, instead of using migration methods, we suggested using share-replication to create a copy in the destination, which will use the storage technologies to copy the data faster. Ignazio didn't try that out yet, since it was late in his timezone. We should continue tomorrow or in the next few days. Best regards, On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < rodrigo.barbieri2010 at gmail.com> wrote: > Hello Ignazio, > > If you are attempting to migrate between 2 NetApp backends, then you > shouldn't need to worry about correctly setting the data_node_access_ip. > Your ideal migration scenario is a driver-assisted-migration, since it is > between 2 NetApp backends. If that fails due to misconfiguration, it will > fallback to a host-assisted migration, which will use the > data_node_access_ip and the host will attempt to mount both shares. This is > not what you want for this scenario, as this is useful for different > backends, not your case. > > if you specify "manila migration-start --preserve-metadata True" it will > prevent the fallback to host-assisted, so it is easier for you to narrow > down the issue with the host-assisted migration out of the way. > > I used to be familiar with the NetApp driver set up to review your case, > however that was a long time ago. I believe the current NetApp driver > maintainers will be able to more accurately review your case and spot the > problem. > > If you could share some info about your scenario such as: > > 1) the 2 backends config groups in manila.conf (sanitized, without > passwords) > 2) a "manila show" of the share you are trying to migrate (sanitized if > needed) > 3) the "manila migration-start" command you are using and its parameters. > > Regards, > > > On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano > wrote: > >> Hello All, >> I am trying to migrate a share between a netapp backend to another. >> Both backends are configured in my manila.conf. >> I am able to create share on both, but I am not able to migrate share >> between them. >> I am using DSSH=False. >> I did not understand how host and driver assisted migration work and >> what "data_node_access_ip" means. >> The share I want to migrate is on a network (10.102.186.0/24) that I can >> reach by my management controllers network (10.102.184.0/24). I Can >> mount share from my controllers and I can mount also the netapp SVM where >> the share is located. 
>> So in the data_node_access_ip I wrote the list of my controllers >> management ips. >> During the migrate phase I checked if my controller where manila is >> running mounts the share or the netapp SVM but It does not happen. >> Please, what is my mistake ? >> Thanks >> Ignazio >> >> >> >> > > -- > Rodrigo Barbieri > MSc Computer Scientist > OpenStack Manila Core Contributor > Federal University of São Carlos > > -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Thu Feb 4 20:49:02 2021 From: dms at danplanet.com (Dan Smith) Date: Thu, 04 Feb 2021 12:49:02 -0800 Subject: [all] Gate resources and performance In-Reply-To: (Mark Goddard's message of "Thu, 4 Feb 2021 20:39:03 +0000") References: Message-ID: > Acknowledging Kolla is in the top 5. Deployment projects certainly > tend to consume resources. I'll raise this at our next meeting and see > what we can come up with. Thanks - at least knowing and acknowledging is a great first step :) > 7. Improve the reliability of jobs. Especially voting and gating > ones. Rechecks increase resource usage and time to results/merge. I > found querying the zuul API for failed jobs in the gate pipeline is a > good way to find unexpected failures. For sure, and thanks for pointing this out. As mentioned in the Neutron example, 70some hours becomes 140some hours if the patch needs a couple rechecks. Rechecks due to spurious job failures reduce capacity and increase latency for everyone. > 8. Reduce the node count in multi node jobs. Yeah, I hope that people with three or more nodes in a job are doing so with lots of good reasoning, but this is an important point. Multi-node jobs consume N nodes for the full job runtime, but could be longer. If only some of the nodes are initially available, I believe zuul will spin those workers up and then wait for more, which means you are just burning node time not doing anything. I'm sure job configuration and other zuul details cause this to vary a lot (and I'm not an expert here), but it's good to note that fewer node counts will reduce the likelihood of the problem. --Dan From anlin.kong at gmail.com Thu Feb 4 20:48:59 2021 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 5 Feb 2021 09:48:59 +1300 Subject: Trove Multi-Tenancy In-Reply-To: References: Message-ID: Hi Syed, What's the trove version you've deployed? >From your configuration, once a trove instance is created, a nova server is created in the "service" project, as trove user, you can only show the trove instance. --- Lingxian Kong Senior Cloud Engineer (Catalyst Cloud) Trove PTL (OpenStack) OpenStack Cloud Provider Co-Lead (Kubernetes) On Fri, Feb 5, 2021 at 12:40 AM Ammad Syed wrote: > Hi, > > I have deployed trove and database instance deployment is successful. But > the problem is all the database servers are being created in service > account i.e openstack instance list shows the database instances in admin > user but when I check openstack server list the database instance won't > show up here, its visible in trove service account. > > Can you please advise how the servers will be visible in admin account ? I > want to enable multi-tenancy. 
> > Below is the configuration > > [DEFAULT] > log_dir = /var/log/trove > # RabbitMQ connection info > transport_url = rabbit://openstack:password at controller > control_exchange = trove > trove_api_workers = 5 > network_driver = trove.network.neutron.NeutronDriver > taskmanager_manager = trove.taskmanager.manager.Manager > default_datastore = mysql > cinder_volume_type = database_storage > reboot_time_out = 300 > usage_timeout = 900 > agent_call_high_timeout = 1200 > > nova_keypair = trove-key > > debug = true > trace = true > > # MariaDB connection info > [database] > connection = mysql+pymysql://trove:password at mariadb01/trove > > [mariadb] > tcp_ports = 3306,4444,4567,4568 > > [mysql] > tcp_ports = 3306 > > [postgresql] > tcp_ports = 5432 > > [redis] > tcp_ports = 6379,16379 > > # Keystone auth info > [keystone_authtoken] > www_authenticate_uri = http://controller:5000 > auth_url = http://controller:5000 > memcached_servers = controller:11211 > auth_type = password > project_domain_name = default > user_domain_name = default > project_name = service > username = trove > password = servicepassword > > [service_credentials] > auth_url = http://controller:5000 > region_name = RegionOne > project_domain_name = default > user_domain_name = default > project_name = service > username = trove > password = servicepassword > > -- > Regards, > > > Syed Ammad Ali > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Feb 4 20:50:08 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 04 Feb 2021 21:50:08 +0100 Subject: [neutron] Drivers meeting 05.02.2021 Message-ID: <1613953136.p7z3pmM7U7@p1> Hi, Due to lack of the topics to discuss, lets cancel tomorrow's drivers meeting. Please instead spent some time reviewing some opened specs :) Have a great weekend and see You online. -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From noonedeadpunk at ya.ru Thu Feb 4 20:51:55 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Thu, 04 Feb 2021 22:51:55 +0200 Subject: [all] Gate resources and performance In-Reply-To: <276821612469975@mail.yandex.ru> References: <276821612469975@mail.yandex.ru> Message-ID: <276311612471595@mail.yandex.ru> An HTML attachment was scrubbed... URL: From dms at danplanet.com Thu Feb 4 20:59:36 2021 From: dms at danplanet.com (Dan Smith) Date: Thu, 04 Feb 2021 12:59:36 -0800 Subject: [all] Gate resources and performance In-Reply-To: <276311612471595@mail.yandex.ru> (Dmitriy Rabotyagov's message of "Thu, 04 Feb 2021 22:51:55 +0200") References: <276821612469975@mail.yandex.ru> <276311612471595@mail.yandex.ru> Message-ID: > Another thing that I think may help us saving some CI time and that > affects most of the projects is pyenv build. There was a change made > to the zuul jobs that implements usage of stow. So we spend time on > building all major python version in images and doing instant select > of valid binary in jobs, rather then waiting for pyenv build during in > pipelines. Hmm, I guess I didn't realize most of the projects were spending time on this, but it looks like a good thread to chase. I've been digging through devstack looking for opportunities to make things faster lately. 
We do a ton of pip invocations, most of which I think are not really necessary, and the ones that are could be batched into a single go at the front to save quite a bit of time. Just a pip install of a requirements file seems to take a while, even when there's nothing that needs installing. We do that in devstack a lot. We also rebuild the tempest venv several times for reasons that I don't understand. So yeah, these are the kinds of things I'd really like to see people spend some time on. It is an investment, but worth it because the multiplier is so large. The CI system is so awesome in that it's just a tool that is there and easy to build on. But just like anything that makes stuff easy initially, there are often gaps entombed around quick work that need to be revisited over time. So, thanks for bringing this up as a potential thread for improvement! --Dan From fungi at yuggoth.org Thu Feb 4 22:46:34 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Feb 2021 22:46:34 +0000 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: <20210204224633.fzjpifhayb6rx2he@yuggoth.org> On 2021-02-04 12:49:02 -0800 (-0800), Dan Smith wrote: [...] > If only some of the nodes are initially available, I believe zuul > will spin those workers up and then wait for more, which means you > are just burning node time not doing anything. [...] I can imagine some pathological situations where this might be the case occasionally, but for the most part they come up around the same time. At risk of diving into too much internal implementation detail, here's the typical process at work: 1. The Zuul scheduler determines that it needs to schedule a build of your job, checks the definition to determine how many of which sorts of nodes that will require, and then puts a node request into Zookeeper with those details. 2. A Nodepool launcher checks for pending requests in Zookeeper, sees the one for your queued build, and evaluates whether it has a provider with the right labels and sufficient available quota to satisfy this request (and if not, skips it in hopes another launcher can instead). 3. If that launcher decides to attempt to fulfil the request, it issues parallel server create calls in the provider it chose, then waits for them to become available and reachable over the Internet. 4. Once the booted nodes are reachable, the launcher returns the request in Zookeeper and the node records are locked for use in the assigned build until it completes. Even our smallest providers have dozens of instances worth of capacity, and most multi-node jobs use only two or maybe three nodes for a build (granted I've seen some using five); so with the constant churn in builds completing and releasing spent nodes for deletion, there shouldn't be a significant amount of time spent where quota is consumed by some already active instances awaiting their compatriots for the same node request to also reach a ready state (though if the provider has a high incidence of boot failures, this becomes increasingly likely because some server create calls will need to be reissued). Where this gets a little more complicated is with dependent jobs, as Zuul requires they all be satisfied from the same provider. Certainly a large set of interdependent multi-node jobs becomes harder to choose a provider for and needs to wait longer for enough capacity to be freed there. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From fungi at yuggoth.org Thu Feb 4 22:54:18 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Feb 2021 22:54:18 +0000 Subject: [all] Gate resources and performance In-Reply-To: <276821612469975@mail.yandex.ru> References: <276821612469975@mail.yandex.ru> Message-ID: <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> On 2021-02-04 22:29:35 +0200 (+0200), Dmitriy Rabotyagov wrote: > For OSA huge issue is how zuul clones required-projects. Just this > single action takes for us from 6 to 10 minutes. [...] I'd be curious to see some examples of this. Zuul doesn't clone required-projects, but it does push new commits from a cache on the executor to a cache on the node. The executor side caches are updated continually as new builds are scheduled, and the caches on the nodes are refreshed every time the images from which they're booted are assembled (typically daily unless something has temporarily broken our ability to rebuild a particular image). So on average, the executor is pushing only 12 hours worth of new commits for each required project. I don't recall if it performs those push operations in parallel, but I suppose that's something we could look into. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From ankit at aptira.com Thu Feb 4 17:18:43 2021 From: ankit at aptira.com (Ankit Goel) Date: Thu, 4 Feb 2021 17:18:43 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: <3cb755495b994352aaadf0d31ad295f3@elca.ch> References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> Message-ID: Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. 
(rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois Sent: 03 February 2021 19:38 To: Ankit Goel ; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From ankit at aptira.com Thu Feb 4 17:19:15 2021 From: ankit at aptira.com (Ankit Goel) Date: Thu, 4 Feb 2021 17:19:15 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: <1297683591.2146.1612361588263.JavaMail.zimbra@csc.fi> References: <1297683591.2146.1612361588263.JavaMail.zimbra@csc.fi> Message-ID: Thanks for the response kale. Regards, Ankit Goel -----Original Message----- From: Kalle Happonen Sent: 03 February 2021 19:43 To: Ankit Goel Cc: openstack-dev at lists.openstack.org Subject: Re: Rally - Unable to install rally - install_rally.sh is not available in repo Hi, I found this commit which says it was removed. https://opendev.org/openstack/rally/commit/9811aa9726c9a9befbe3acb6610c1c93c9924948 We use ansible to install Rally on CentOS7. The relevant roles are linked here. http://www.9bitwizard.eu/rally-and-tempest Although I think the roles are ours so they are not tested in all scenarios, so YMMV. Cheers, Kalle ----- Original Message ----- > From: "Ankit Goel" > To: openstack-dev at lists.openstack.org > Sent: Wednesday, 3 February, 2021 14:40:02 > Subject: Rally - Unable to install rally - install_rally.sh is not > available in repo > Hello Experts, > > I was trying to install Openstack rally on centos 7 VM but the link > provided in the Openstack doc to download the install_rally.sh is broken. > > Latest Rally Doc link - > > https://docs.openstack.org/rally/latest/install_and_upgrade/install.ht > ml#automated-installation > > Rally Install Script -> > https://raw.githubusercontent.com/openstack/rally/master/install_rally > .sh - > This is broken > > After searching on internet I could reach to the Openstack rally Repo > - > https://opendev.org/openstack/rally but here I am not seeing the > install_ rally.sh script and according to all the information > available on internet it says we need install_ rally.sh. > > Thus can you please let me know what's the latest procedure to install Rally. > > Awaiting for your response. > > Thanks, > Ankit Goel From smooney at redhat.com Fri Feb 5 01:21:44 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Feb 2021 01:21:44 +0000 Subject: [all] Gate resources and performance In-Reply-To: <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> References: <276821612469975@mail.yandex.ru> <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> Message-ID: On Thu, 2021-02-04 at 22:54 +0000, Jeremy Stanley wrote: > On 2021-02-04 22:29:35 +0200 (+0200), Dmitriy Rabotyagov wrote: > > For OSA huge issue is how zuul clones required-projects. Just this > > single action takes for us from 6 to 10 minutes. > [...] > > I'd be curious to see some examples of this. Zuul doesn't clone > required-projects, but it does push new commits from a cache on the > executor to a cache on the node. 
Right, originally that was done by https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/prepare-workspace-git
as part of the base job https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/base/pre.yaml#L33
which just syncs the deltas over git between the repos prepared on the zuul
executor and those pre-cached in the images. It will, however, only update the
repos that are needed by your job via required-projects. It ensures that you
are actually testing what you think you are testing and that you never need to
clone any of the repos under test, since they will be prepared for you. This
should be much faster than cloning or pulling the repos yourself in the job
and, more importantly, it avoids networking issues that can happen if you try
to clone from gerrit in the job. It is also what makes Depends-On work, which
is tricky if you don't let zuul prepare the repos for you.

> The executor side caches are updated continually as new builds are
> scheduled, and the caches on the nodes are refreshed every time the images
> from which they're booted are assembled (typically daily unless something
> has temporarily broken our ability to rebuild a particular image). So on
> average, the executor is pushing only 12 hours worth of new commits for
> each required project. I don't recall if it performs those push operations
> in parallel, but I suppose that's something we could look into.

It's not parallel if I'm reading this right, but typically you will not need
to pull a lot of repos:
https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/prepare-workspace-git/tasks/main.yaml#L11-L43
As in, it's maybe in the low 10s for a typical full tempest job. OSA is
slightly pathological in how it uses required-projects:
https://opendev.org/openstack/openstack-ansible/src/branch/master/zuul.d/jobs.yaml#L23-L132
It is pulling in a large proportion of the openstack repos, so it's not
surprising it is slower than we would typically expect, but it should still be
faster than actually cloning them without using the cache in the image and
executor.

Doing the clone in parallel would help in this case, but it might also make
sense to reassess how OSA structures its jobs. For example, OSA supports both
source and non-source installs, correct? The non-source installs don't need
the openstack project repos, just the OSA repos, since they will be using the
binary packages. So if you had a second intermediary job for the source
installs with the openstack component repos listed, you could skip updating 50
repos in your binary jobs (I'm assuming the _distro_ jobs are binary, by the
way). Currently it is updating 105 repos for every job that is based on
openstack-ansible-deploy-aio.

From laurentfdumont at gmail.com Fri Feb 5 03:08:01 2021
From: laurentfdumont at gmail.com (Laurent Dumont)
Date: Thu, 4 Feb 2021 22:08:01 -0500
Subject: [neutron][victoria] Front / Back Routers
In-Reply-To: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local>
References: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local>
Message-ID:

It's a bit hard to parse a network topology by email, but from a theoretical
point of view - you can statically route a /24 towards the external IP of an
OpenStack router from a device upstream.

I do believe there is a BGP component for OpenStack, but I'm not sure its role
is to dynamically advertise networks from OpenStack towards the wider network.
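As a rough illustration of that static-route approach (every name, network and
address below is made up for the example, and the exact CLI flags may vary by
client version): the upstream device points the routed /24 at the Neutron
router's external fixed IP, and SNAT is disabled on the router so the subnet
is reachable without floating IPs.

# external/provider network used as the router gateway
openstack network create --external --provider-network-type flat \
    --provider-physical-network physnet1 ext-net
openstack subnet create --network ext-net --subnet-range 172.16.255.0/24 \
    --gateway 172.16.255.1 ext-subnet

# router with SNAT disabled, plus the internal subnet that will be routed as-is
openstack router create front-router
openstack router set front-router --external-gateway ext-net --disable-snat
openstack network create tenant-net
openstack subnet create --network tenant-net --subnet-range 172.16.254.0/24 tenant-subnet
openstack router add subnet front-router tenant-subnet

# note the router's external fixed IP, then on the physical router add
# something like: ip route 172.16.254.0/24 <router-external-fixed-ip>
openstack port list --router front-router --long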
On Thu, Feb 4, 2021 at 11:45 AM wrote: > All; > > My team and I have been working through the tutorials on Server World ( > server-world.com/ean/), in order to learn and build an OpenStack > cluster. We've also been looking at the official documentation to attempt > to increase our knowledge of the subject. > > I have a question about Neutron though. All the examples that I remember > have Neutron setup with a single router. The router is part of a > "provider" network, and subnet on the outside, and one or more "tenant" > networks on the inside. Floating IPS, then appear to be IP addresses > belonging to the "provider" subnet, that are applied to the router, and > which the router then NATs. > > These setups look like this: > > Physrouter1 (physical router) subnet: 192.168.0.0/24, IP address: > 192.168.0.1 > | > Physnet1 (192.168.0.0/24)(ovs network definition) > | > Router1 (ovs router)(allocation pool: 192.168.0.100 - 192.168.0.254) <-- > Floating IPs are "owned" by this, and are in the range of the allocation > pool > | > Tenant network(s) > > This has the advantage of being easy, fast, secure, and simple to setup. > > What if you wanted something where you could route whole subnet into your > OpenStack cluster. > > Physrouter1 (physical router) subnet: 172.16.255.0/24, IP address: > 172.16.255.1 > | > Physnet1 (172.16.255.0/24)(ovs network definition) > | > Router1 (ovs router)(fixed IP addresses: 172.16.255.2 & 172.16.254.1/24 + > static / dynamic routing) > | > Network (17216.254.0/24) > | > Router2(ovs router)(allocation pool: 172.16.254.5 - 172.16.254.254) <-- > Floating IPs are "owned" by this, and are in the range of the allocation > pool > | > Tenant network(s) > > Is my understanding accurate? > Are there advantages of one over the other? > What commands are used to specify static IPs for ovs routers, and static > routing rules? > > Thank you, > > Dominic L. Hilsbos, MBA > Director - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Fri Feb 5 05:51:58 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Fri, 5 Feb 2021 10:51:58 +0500 Subject: Trove Multi-Tenancy In-Reply-To: References: Message-ID: Hello Kong, I am using latest victoria release and trove 14.0. Yes you are right, this is exactly happening. All the nova instances are in trove user service project. From my admin user i am only able to list database instances. Is it possible that all nova instances should also deploy in any tenant project i.e if i am deploying database instance from admin user having adminproject and default domain the nova instance should be in adminproject rather then trove service project. Ammad Sent from my iPhone > On Feb 5, 2021, at 1:49 AM, Lingxian Kong wrote: > >  > Hi Syed, > > What's the trove version you've deployed? > > From your configuration, once a trove instance is created, a nova server is created in the "service" project, as trove user, you can only show the trove instance. > > --- > Lingxian Kong > Senior Cloud Engineer (Catalyst Cloud) > Trove PTL (OpenStack) > OpenStack Cloud Provider Co-Lead (Kubernetes) > > >> On Fri, Feb 5, 2021 at 12:40 AM Ammad Syed wrote: >> Hi, >> >> I have deployed trove and database instance deployment is successful. 
But the problem is all the database servers are being created in service account i.e openstack instance list shows the database instances in admin user but when I check openstack server list the database instance won't show up here, its visible in trove service account. >> >> Can you please advise how the servers will be visible in admin account ? I want to enable multi-tenancy. >> >> Below is the configuration >> >> [DEFAULT] >> log_dir = /var/log/trove >> # RabbitMQ connection info >> transport_url = rabbit://openstack:password at controller >> control_exchange = trove >> trove_api_workers = 5 >> network_driver = trove.network.neutron.NeutronDriver >> taskmanager_manager = trove.taskmanager.manager.Manager >> default_datastore = mysql >> cinder_volume_type = database_storage >> reboot_time_out = 300 >> usage_timeout = 900 >> agent_call_high_timeout = 1200 >> >> nova_keypair = trove-key >> >> debug = true >> trace = true >> >> # MariaDB connection info >> [database] >> connection = mysql+pymysql://trove:password at mariadb01/trove >> >> [mariadb] >> tcp_ports = 3306,4444,4567,4568 >> >> [mysql] >> tcp_ports = 3306 >> >> [postgresql] >> tcp_ports = 5432 >> >> [redis] >> tcp_ports = 6379,16379 >> >> # Keystone auth info >> [keystone_authtoken] >> www_authenticate_uri = http://controller:5000 >> auth_url = http://controller:5000 >> memcached_servers = controller:11211 >> auth_type = password >> project_domain_name = default >> user_domain_name = default >> project_name = service >> username = trove >> password = servicepassword >> >> [service_credentials] >> auth_url = http://controller:5000 >> region_name = RegionOne >> project_domain_name = default >> user_domain_name = default >> project_name = service >> username = trove >> password = servicepassword >> >> -- >> Regards, >> >> >> Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Fri Feb 5 07:01:09 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Fri, 5 Feb 2021 07:01:09 +0000 Subject: [ops][largescale-sig] How many compute nodes in a single cluster ? In-Reply-To: References: <533ac947-27cb-5ee8-ae7b-9553ca74ad8a@openstack.org> <20210202173713.GA14971@sync> <2ce0c6b648cfd3c9730748b9a5cc4bb36396d638.camel@redhat.com> <20210203142405.GB14971@sync> Message-ID: <20210205070109.GC14971@sync> Thanks for your reply, a lot of useful info! We already identified that using separated rabbit cluster for neutron could improve the scalability. About the usage of NATS, I never tried this piece of software but definitely sounds a good fit for large cloud. On rabbitmq side they worked on a new kind of queue called "quorum" that are HA by design. The documentation is recommending to use quorum now instead of classic queues with HA. Does anyone know if there is a chance that oslo_messaging will manage such kind of queues? Beside the rabbit, we also monitor our database cluster (we are using mariadb with galera) very carefully. About it, we also think that splitting the cluster in multiple deployment could help improving, but while it's easy to say, it's time consuming to move an already running cloud to a new architecture :) Regards, -- Arnaud Morin On 03.02.21 - 14:55, Sean Mooney wrote: > On Wed, 2021-02-03 at 14:24 +0000, Arnaud Morin wrote: > > Yes, totally agree with that, on our side we are used to monitor the > > number of neutron ports (and espacially the number of ports in BUILD > > state). 
> > > > As usually an instance is having one port in our cloud, number of > > instances is closed to number of ports. > > > > About the cellsv2, we are mostly struggling on neutron side, so cells > > are not helping us. > > > ack, that makes sense. > there are some things you can do to help scale neutron. > one semi simple step is if you are usign ml2/ovs, ml2/linux-bridge or ml2/sriov-nic-agent is to move > neutron to its own rabbitmq instance. > neutron using the default ml2 drivers tends to be quite chatty so placing those on there own rabbit instance > can help. while its in conflict with ha requirements ensuring that clustering is not used and instead > loadblanicn with something like pace maker to a signel rabbitmq server can also help. > rabbmqs clustering ablity while improving Ha by removing a singel point of failure decreease the performance > of rabbit so if you have good monitoring and simpley restat or redeploy rabbit quickly using k8s or something > else like an active backup deplopment mediataed by pacemeaker can work much better then actully clutering. > > if you use ml2/ovn that allows you to remove the need for the dhcp agent and l3 agent as well as the l2 agent per > compute host. that signifcaltly reducece neutron rpc impact however ovn does have some partiy gaps and scaling issues > of its own. if it works for you and you can use as a new enough version that allows the ovn southd process on the compute > nodes to subscibe to a subset of noth/southdb update relevent to just that node i can help with scaling neutorn. > > im not sure about usage fo feature like dvr or routed provider networks impact this as i mostly work on nova now but > at least form a data plane point of view it can reduce contention on the networing nodes(where l3 agents ran) to do routing > and nat on behalf of all compute nodes. > > at some point it might make sense for neutorn to take a similar cells approch to its own architrue but given the ablity of it to > delegate some or all of the networkign to extrenal network contoler like ovn/odl its never been clear that an in tree sharding > mechium like cells was actully required. > > one thing that i hope some one will have time to investate at some point is can we replace rabbitmq in general with nats. > this general topic comes up with different technolgies form time to time. nats however look like it would actuly > be a good match in terms of feature and intended use while being much lighter weight then rabbitmq and actully improving > in performance the more nats server instance you cluster since that was a design constraint form the start. > > i dont actully think neutorn acritrues or nova for that matter is inherintly flawed but a more moderne messagaing buts might > help all distibuted services scale with fewer issues then they have today. > > > > From skaplons at redhat.com Fri Feb 5 07:31:34 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 5 Feb 2021 08:31:34 +0100 Subject: [neutron][victoria] Front / Back Routers In-Reply-To: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local> References: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local> Message-ID: <20210205073134.3w266wzeowe5qbcw@p1.localdomain> Hi, On Thu, Feb 04, 2021 at 04:39:23PM +0000, DHilsbos at performair.com wrote: > All; > > My team and I have been working through the tutorials on Server World (server-world.com/ean/), in order to learn and build an OpenStack cluster. 
We've also been looking at the official documentation to attempt to increase our knowledge of the subject. > > I have a question about Neutron though. All the examples that I remember have Neutron setup with a single router. The router is part of a "provider" network, and subnet on the outside, and one or more "tenant" networks on the inside. Floating IPS, then appear to be IP addresses belonging to the "provider" subnet, that are applied to the router, and which the router then NATs. > > These setups look like this: > > Physrouter1 (physical router) subnet: 192.168.0.0/24, IP address: 192.168.0.1 > | > Physnet1 (192.168.0.0/24)(ovs network definition) Can You explain more what "ovs network definition" means really? > | > Router1 (ovs router)(allocation pool: 192.168.0.100 - 192.168.0.254) <-- Floating IPs are "owned" by this, and are in the range of the allocation pool What is "ovs router" here? Can You explain? In Neutron router don't have allocation pool. Allocation pool is attribute of the subnet defined in the network. Subnet can be then plugged to the router. Or if it's "router:external" network, it can be used as a gateway for the router and then You can have Floating IPs from it. > | > Tenant network(s) > > This has the advantage of being easy, fast, secure, and simple to setup. > > What if you wanted something where you could route whole subnet into your OpenStack cluster. IIUC You can route /24 on Your physical router and then configure it as subnet in the provider network (vlan or flat). If You will set it as router:external, You will be able to use it as neutron router's gateway and have FIPs from it. But You can also plug vms directly to that network. Neutron will then allocation IP addresses from Your /24 subnet to the instances. > > Physrouter1 (physical router) subnet: 172.16.255.0/24, IP address: 172.16.255.1 > | > Physnet1 (172.16.255.0/24)(ovs network definition) > | > Router1 (ovs router)(fixed IP addresses: 172.16.255.2 & 172.16.254.1/24 + static / dynamic routing) > | > Network (17216.254.0/24) > | > Router2(ovs router)(allocation pool: 172.16.254.5 - 172.16.254.254) <-- Floating IPs are "owned" by this, and are in the range of the allocation pool > | > Tenant network(s) > > Is my understanding accurate? > Are there advantages of one over the other? > What commands are used to specify static IPs for ovs routers, and static routing rules? > > Thank you, > > Dominic L. Hilsbos, MBA > Director - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > > -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From skaplons at redhat.com Fri Feb 5 07:33:28 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 5 Feb 2021 08:33:28 +0100 Subject: [neutron][victoria] Front / Back Routers In-Reply-To: References: <0670B960225633449A24709C291A52524FAAB742@COM01.performair.local> Message-ID: <20210205073328.2rpwdrnv2e7s4ng4@p1.localdomain> Hi, On Thu, Feb 04, 2021 at 10:08:01PM -0500, Laurent Dumont wrote: > It's a bit hard to parse a network topology by email, but from a > theoretical point of view - you can statically route a /24 towards the > external IP of an openstack router from a device upstream. 
> > I do believe there is a BGP component for Openstack but I'm not sure it's > role is to dynamically advertised networks from Openstack towards the wider > network. It is neutron-dynamic-routing. Documentation is available at https://docs.openstack.org/neutron-dynamic-routing/victoria/ > > On Thu, Feb 4, 2021 at 11:45 AM wrote: > > > All; > > > > My team and I have been working through the tutorials on Server World ( > > server-world.com/ean/), in order to learn and build an OpenStack > > cluster. We've also been looking at the official documentation to attempt > > to increase our knowledge of the subject. > > > > I have a question about Neutron though. All the examples that I remember > > have Neutron setup with a single router. The router is part of a > > "provider" network, and subnet on the outside, and one or more "tenant" > > networks on the inside. Floating IPS, then appear to be IP addresses > > belonging to the "provider" subnet, that are applied to the router, and > > which the router then NATs. > > > > These setups look like this: > > > > Physrouter1 (physical router) subnet: 192.168.0.0/24, IP address: > > 192.168.0.1 > > | > > Physnet1 (192.168.0.0/24)(ovs network definition) > > | > > Router1 (ovs router)(allocation pool: 192.168.0.100 - 192.168.0.254) <-- > > Floating IPs are "owned" by this, and are in the range of the allocation > > pool > > | > > Tenant network(s) > > > > This has the advantage of being easy, fast, secure, and simple to setup. > > > > What if you wanted something where you could route whole subnet into your > > OpenStack cluster. > > > > Physrouter1 (physical router) subnet: 172.16.255.0/24, IP address: > > 172.16.255.1 > > | > > Physnet1 (172.16.255.0/24)(ovs network definition) > > | > > Router1 (ovs router)(fixed IP addresses: 172.16.255.2 & 172.16.254.1/24 + > > static / dynamic routing) > > | > > Network (17216.254.0/24) > > | > > Router2(ovs router)(allocation pool: 172.16.254.5 - 172.16.254.254) <-- > > Floating IPs are "owned" by this, and are in the range of the allocation > > pool > > | > > Tenant network(s) > > > > Is my understanding accurate? > > Are there advantages of one over the other? > > What commands are used to specify static IPs for ovs routers, and static > > routing rules? > > > > Thank you, > > > > Dominic L. Hilsbos, MBA > > Director - Information Technology > > Perform Air International Inc. > > DHilsbos at PerformAir.com > > www.PerformAir.com > > > > > > > > -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From jean-francois.taltavull at elca.ch Fri Feb 5 08:09:09 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Fri, 5 Feb 2021 08:09:09 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> Message-ID: <7fdbc97a688744f399f7358b1250bc30@elca.ch> Hello, 1/ Is “rally-openstack” python package correctly installed ? On my side I have: (venv) vagrant at rally: $ pip list | grep rally rally 3.2.0 rally-openstack 2.1.0 2/ Could you please show the json file used to create the deployment ? 
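For reference, a minimal spec for registering an existing cloud looks roughly
like the block below (all values are placeholders, and the exact schema should
be double-checked against the rally-openstack documentation for your version).
The top-level "openstack" key is the credentials entry the failing
"rally deployment show" is looking up, which is consistent with the KeyError
in the traceback above.

cat > existing.json <<'EOF'
{
    "openstack": {
        "auth_url": "http://controller:5000/v3",
        "region_name": "RegionOne",
        "admin": {
            "username": "admin",
            "password": "secret",
            "project_name": "admin",
            "user_domain_name": "Default",
            "project_domain_name": "Default"
        }
    }
}
EOF
rally deployment create --file existing.json --name existing
# or, with OS_* variables sourced from an openrc file:
# rally deployment create --fromenv --name existing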
From: Ankit Goel Sent: jeudi, 4 février 2021 18:19 To: Taltavull Jean-Francois ; openstack-dev at lists.openstack.org Cc: John Spillane Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. (rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 03 February 2021 19:38 To: Ankit Goel >; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Fri Feb 5 08:32:34 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Fri, 5 Feb 2021 08:32:34 +0000 Subject: [largescale-sig] OpenStack DB Archiver In-Reply-To: <639c7a77f7812ad8897656404cb06cc67cf51609.camel@redhat.com> References: <20200716133127.GA31915@sync> <045a2dea-02f0-26ca-96d6-46d8cdbe2d16@openstack.org> <639c7a77f7812ad8897656404cb06cc67cf51609.camel@redhat.com> Message-ID: <20210205083234.GD14971@sync> Hello, At first, we designed OSarchiver to be openstack agnostic. It can be executed on any kind of database, the only thing it needs is a column in the DB to read in order to perform the removal. We are using it on our cloud, executing it in a cron just like we would do nova-manage. We monitor the output and raise an alert if something is broken. If I remember correctly we used it first on our mistral DB, I dont know if mistral implement this "shadow" table mechanism that you describe for nova, so that's maybe why we initiate this project at first. Moreover, I dont know the details about how nova move data into shadow tables, but were more confident in re-using our OSarchiver on nova after it had prove a good work on mistral :) On our side, we would agree with something like a retention policy of data, but I dont believe all the projects can/would implement such functionality, so the approach to have an external tool doing it is IMHO a path we can follow. About the 3 options for OSarchiver that Thierry proposed, we dont have a strong opinion but options 2 or 3 OK for us. Option 3 would be the best, but we dont know what is the necesary work that should be done on the code itself before this option become viable. Finally, no problem to move under APACHE2 license. Cheers, -- Arnaud Morin On 29.01.21 - 14:22, Sean Mooney wrote: > On Fri, 2021-01-29 at 13:47 +0100, Thierry Carrez wrote: > > Arnaud Morin wrote: > > > [...] > > > We were wondering if some other users would be interested in using the > > > tool, and maybe move it under the opendev governance? > > > > Resurrecting this thread, as OSops has now been revived under the > > auspices of the OpenStack Operation Docs and Tooling SIG. > > > > There are basically 3 potential ways forward for OSarchiver: > > > > 1- Keep it as-is on GitHub, and reference it where we can in OpenStack docs > > > > 2- Relicense it under Apache-2 and move it in a subdirectory under > > openstack/osops > > > > 3- Move it under its own repository under opendev and propose it as a > > new official OpenStack project (relicensing under Apache-2 will be > > necessary if accepted) > > > > Options (1) and (3) have the benefit of keeping it under its own > > repository. Options (2) and (3) have the benefit of counting towards an > > official OpenStack contribution. 
Options (1) and (2) have the benefit of > > not requiring TC approval. > > > > All other things being equal, if the end goal is to increase > > discoverability, option 3 is probably the best. > > not to detract form the converation on where to host it, but now that i have discoverd this > via this thread i have one quetion. OSarchiver appears to be bypassing the shadow tabels > whcih the project maintian to allow you to archive rows in the the project db in a different table. > > instad OSarchiver chooese to archive it in an external DB or file > > we have talked about wheter or not we can remove shaddow tables in nova enteirly a few times in the past > but we did not want to break operators that actully use them but it appares OVH at least has developed there > own alrenitive presumably becasue the proejcts own archive and purge functionality was not meetingyour need. > > would the option to disable shadow tabels or define a retention policy for delete rows be useful to operators > or is this even a capablity that project coudl declare out of scope and delegate that to a new openstack porject > e.g. opeiton 3 above to do instead? > > im not sure how suportable OSarchiver would be in our downstream product right now but with testing it might be > somethign we could look at includign in the futrue. we currently rely on chron jobs to invoke nova-magne ecta > to achive similar functionality to OSarchiver but if that chron job breaks its hard to detect and the delete rows > can build up causing really slow db queries. as a seperate service with loggin i assuem this is simplere to monitor > and alarm on if it fails since it provices one central point to manage the archival and deletion of rows so i kind > of like this approch even if its direct db access right now would make it unsupportable in our product without veting > the code and productising the repo via ooo integration. > > > > > > > Regards, > > > > > From marios at redhat.com Fri Feb 5 08:33:40 2021 From: marios at redhat.com (Marios Andreou) Date: Fri, 5 Feb 2021 10:33:40 +0200 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: On Thu, Feb 4, 2021 at 7:30 PM Dan Smith wrote: > Hi all, > > I have become increasingly concerned with CI performance lately, and > have been raising those concerns with various people. Most specifically, > I'm worried about our turnaround time or "time to get a result", which > has been creeping up lately. Right after the beginning of the year, we > had a really bad week where the turnaround time was well over 24 > hours. That means if you submit a patch on Tuesday afternoon, you might > not get a test result until Thursday. That is, IMHO, a real problem and > massively hurts our ability to quickly merge priority fixes as well as > just general velocity and morale. If people won't review my code until > they see a +1 from Zuul, and that is two days after I submitted it, > that's bad. > > Things have gotten a little better since that week, due in part to > getting past a rush of new year submissions (we think) and also due to > some job trimming in various places (thanks Neutron!). However, things > are still not great. Being in almost the last timezone of the day, the > queue is usually so full when I wake up that it's quite often I don't > get to see a result before I stop working that day. > first thanks for bringing this topic - fully agreed that 24 hours before zuul reports back on a patch is unacceptable. 
The tripleo-ci team is *always* looking at improving CI efficiency, if nothing else for the very reason you started this thread i.e. we don't want so many jobs (or too many long jobs) that it takes 24 or more hours for zuul to report (ie this obviously affects us, too). We have been called out as a community on resource usage in the past so we are of course aware of, acknowledge and are trying to address the issue. > > I would like to ask that projects review their jobs for places where > they can cut out redundancy, as well as turn their eyes towards > optimizations that can be made. I've been looking at both Nova and > Glance jobs and have found some things I think we can do less of. I also > wanted to get an idea of who is "using too much" in the way of > resources, so I've been working on trying to characterize the weight of > the jobs we run for a project, based on the number of worker nodes > required to run all the jobs, as well as the wall clock time of how long > we tie those up. The results are interesting, I think, and may help us > to identify where we see some gains. > > The idea here is to figure out[1] how many "node hours" it takes to run > all the normal jobs on a Nova patch compared to, say, a Neutron one. If > just wanted to point out the 'node hours' comparison may not be fair because what is a typical nova patch or a typical tripleo patch? The number of jobs matched & executed by zuul on a given review will be different to another tripleo patch in the same repo depending on the files touched or branch (etc.) and will vary even more compared to other tripleo repos; I think this is the same for nova or any other project with multiple repos. > the jobs were totally serialized, this is the number of hours a single > computer (of the size of a CI worker) would take to do all that work. If > the number is 24 hours, that means a single computer could only check > *one* patch in a day, running around the clock. I chose the top five > projects in terms of usage[2] to report here, as they represent 70% of > the total amount of resources consumed. The next five only add up to > 13%, so the "top five" seems like a good target group. Here are the > results, in order of total consumption: > > Project % of total Node Hours Nodes > ------------------------------------------ > 1. TripleO 38% 31 hours 20 > 2. Neutron 13% 38 hours 32 > 3. Nova 9% 21 hours 25 > 4. Kolla 5% 12 hours 18 > 5. OSA 5% 22 hours 17 > > What that means is that a single computer (of the size of a CI worker) > couldn't even process the jobs required to run on a single patch for > Neutron or TripleO in a 24-hour period. Now, we have lots of workers in > the gate, of course, but there is also other potential overhead involved > in that parallelism, like waiting for nodes to be available for > dependent jobs. And of course, we'd like to be able to check more than > patch per day. Most projects have smaller gate job sets than check, but > assuming they are equivalent, a Neutron patch from submission to commit > would undergo 76 hours of testing, not including revisions and not > including rechecks. That's an enormous amount of time and resource for a > single patch! > > Now, obviously nobody wants to run fewer tests on patches before they > land, and I'm not really suggesting that we take that approach > necessarily. However, I think there are probably a lot of places that we > can cut down the amount of *work* we do. Some ways to do this are: > > 1. 
Evaluate whether or not you need to run all of tempest on two > configurations of a devstack on each patch. Maybe having a > stripped-down tempest (like just smoke) to run on unique configs, or > even specific tests. > 2. Revisit your "irrelevant_files" lists to see where you might be able > to avoid running heavy jobs on patches that only touch something > small. > 3. Consider moving some jobs to the experimental queue and run them > on-demand for patches that touch particular subsystems or affect > particular configurations. > 4. Consider some periodic testing for things that maybe don't need to > run on every single patch. > 5. Re-examine tests that take a long time to run to see if something can > be done to make them more efficient. > 6. Consider performance improvements in the actual server projects, > which also benefits the users. > ACK. We have recently completed some work (as I said, this is an ongoing issue/process for us) at [1][2] to remove some redundant jobs which should start to help. Mohamed (mnaser o/) has reached out about this and joined our most recent irc meeting [3]. We're already prioritized some more cleanup work for this sprint including checking file patterns (e.g. started at [4]), tempest tests and removing many/all of our non-voting jobs as a first pass. Hope that at least starts to address you concern, regards, marios [1] https://review.opendev.org/q/topic:reduce-content-providers [2] https://review.opendev.org/q/topic:tripleo-c7-update-upgrade-removal [3] http://eavesdrop.openstack.org/meetings/tripleo/2021/tripleo.2021-02-02-14.00.log.html#l-37 [4] https://review.opendev.org/c/openstack/tripleo-ci/+/773692 > If you're a project that is not in the top ten then your job > configuration probably doesn't matter that much, since your usage is > dwarfed by the heavy projects. If the heavy projects would consider > making changes to decrease their workload, even small gains have the > ability to multiply into noticeable improvement. The higher you are on > the above list, the more impact a small change will have on the overall > picture. > > Also, thanks to Neutron and TripleO, both of which have already > addressed this in some respect, and have other changes on the horizon. > > Thanks for listening! > > --Dan > > 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c > 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roshananvekar at gmail.com Fri Feb 5 09:23:52 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Fri, 5 Feb 2021 14:53:52 +0530 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty Message-ID: Hi all, Scenario: I have an installation of Openstack stein through kolla-ansible. The deployment went fine and all services look good. Although I am seeing that under Admin--> Compute --> Hypervisors panel in horizon, all the controller nodes are missing. It's a blank list. Also "Openstack hypervisor list" gives an empty list. I skimmed through the logs and found no error message other than in nova-scheduler that: *Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurence as compute nodes start up.* Subsequently I checked placement container logs and found no error message or anamoly. Not sure what the issue is. Any help in the above case would be appreciated. 
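A few commands are usually enough to narrow this kind of problem down (the
container names below assume a kolla-ansible deployment and are illustrative
only). The hypervisor list is populated from the nova-compute services running
on the compute nodes, so checking that those services are up and registered in
placement is the first step:

openstack compute service list --service nova-compute
openstack resource provider list        # requires the osc-placement plugin
openstack hypervisor list
docker exec -it nova_api nova-status upgrade check
docker logs --tail 100 nova_compute     # on a compute node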
Regards, Roshan -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Fri Feb 5 10:08:10 2021 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 5 Feb 2021 23:08:10 +1300 Subject: Trove Multi-Tenancy In-Reply-To: References: Message-ID: There are several config options you can change to support this model: [DEFAULT] remote_nova_client = trove.common.clients.nova_client remote_neutron_client = trove.common.clients.neutron_client remote_cinder_client = trove.common.clients.cinder_client remote_glance_client = trove.common.clients.glance_client *However, those configs are extremely not recommended and not maintained any more in Trove, *which means, function may broken in this case. The reasons are many folds. Apart from the security reason, one important thing is, Trove is a database as a service, what the cloud user is getting from Trove are the access to the database and some management APIs for database operations, rather than a purely Nova VM that has a database installed and can be accessed by the cloud user. If you prefer this model, why not just create Nova VM on your own and manually install database software so you have more control of that? --- Lingxian Kong Senior Cloud Engineer (Catalyst Cloud) Trove PTL (OpenStack) OpenStack Cloud Provider Co-Lead (Kubernetes) On Fri, Feb 5, 2021 at 6:52 PM Ammad Syed wrote: > Hello Kong, > > I am using latest victoria release and trove 14.0. > > Yes you are right, this is exactly happening. All the nova instances are > in trove user service project. From my admin user i am only able to list > database instances. > > Is it possible that all nova instances should also deploy in any tenant > project i.e if i am deploying database instance from admin user having > adminproject and default domain the nova instance should be in adminproject > rather then trove service project. > > Ammad > Sent from my iPhone > > On Feb 5, 2021, at 1:49 AM, Lingxian Kong wrote: > >  > Hi Syed, > > What's the trove version you've deployed? > > From your configuration, once a trove instance is created, a nova server > is created in the "service" project, as trove user, you can only show the > trove instance. > > --- > Lingxian Kong > Senior Cloud Engineer (Catalyst Cloud) > Trove PTL (OpenStack) > OpenStack Cloud Provider Co-Lead (Kubernetes) > > > On Fri, Feb 5, 2021 at 12:40 AM Ammad Syed wrote: > >> Hi, >> >> I have deployed trove and database instance deployment is successful. But >> the problem is all the database servers are being created in service >> account i.e openstack instance list shows the database instances in admin >> user but when I check openstack server list the database instance won't >> show up here, its visible in trove service account. >> >> Can you please advise how the servers will be visible in admin account ? >> I want to enable multi-tenancy. 
>> >> Below is the configuration >> >> [DEFAULT] >> log_dir = /var/log/trove >> # RabbitMQ connection info >> transport_url = rabbit://openstack:password at controller >> control_exchange = trove >> trove_api_workers = 5 >> network_driver = trove.network.neutron.NeutronDriver >> taskmanager_manager = trove.taskmanager.manager.Manager >> default_datastore = mysql >> cinder_volume_type = database_storage >> reboot_time_out = 300 >> usage_timeout = 900 >> agent_call_high_timeout = 1200 >> >> nova_keypair = trove-key >> >> debug = true >> trace = true >> >> # MariaDB connection info >> [database] >> connection = mysql+pymysql://trove:password at mariadb01/trove >> >> [mariadb] >> tcp_ports = 3306,4444,4567,4568 >> >> [mysql] >> tcp_ports = 3306 >> >> [postgresql] >> tcp_ports = 5432 >> >> [redis] >> tcp_ports = 6379,16379 >> >> # Keystone auth info >> [keystone_authtoken] >> www_authenticate_uri = http://controller:5000 >> auth_url = http://controller:5000 >> memcached_servers = controller:11211 >> auth_type = password >> project_domain_name = default >> user_domain_name = default >> project_name = service >> username = trove >> password = servicepassword >> >> [service_credentials] >> auth_url = http://controller:5000 >> region_name = RegionOne >> project_domain_name = default >> user_domain_name = default >> project_name = service >> username = trove >> password = servicepassword >> >> -- >> Regards, >> >> >> Syed Ammad Ali >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Fri Feb 5 11:12:19 2021 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 5 Feb 2021 12:12:19 +0100 Subject: [zaqar][stable]Proposing Hao Wang as a stable branches core reviewer In-Reply-To: <112818e5-9f84-5753-5cb9-d054805d4b3a@openstack.org> References: <112818e5-9f84-5753-5cb9-d054805d4b3a@openstack.org> Message-ID: <78daea07-ac95-ded3-3523-e870f24754a5@openstack.org> Thierry Carrez wrote: > hao wang wrote: >> I want to propose myself(wanghao) to be a new core reviewer of the >> Zaqar stable core team. >> I have been PTL in Zaqar for almost two years. I also want to help the >> stable branches better. > > Thanks for volunteering! Let's wait a couple of days for feedback from > other stable-maint-core members, and I'll add you in. > > In the meantime, please review the stable branch policy, and let us know > if you have any questions: > > https://docs.openstack.org/project-team-guide/stable-branches.html Since we heard no objections, I just added you. Thanks again! -- Thierry From ignaziocassano at gmail.com Fri Feb 5 11:16:17 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 5 Feb 2021 12:16:17 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello, thanks for your help. I am waiting my storage administrators have a window to help me because they must setup the snapmirror. Meanwhile I am trying the host assisted migration but it does not work. The share remains in migrating for ever. I am sure the replication-dr works because I tested it one year ago. I had an openstack on site A with a netapp storage I had another openstack on Site B with another netapp storage. The two openstack installation did not share anything. So I made a replication between two volumes (shares). I demoted the source share taking note about its export location list I managed the destination on openstack and it worked. 
The process for replication is not fully handled by openstack api, so I should call netapp api for creating snapmirror relationship or ansible modules or ask help to my storage administrators , right ? Instead, using share migration, I could use only openstack api: I understood that driver assisted cannot work in this case, but host assisted should work. Best Regards Ignazio Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha scritto: > Hi Rodrigo, > > Thanks for your help on this. We were helping Ignazio in #openstack-manila > channel. He wants to migrate a share across ONTAP clusters, which isn't > supported in the current implementation of the driver-assisted-migration > with NetApp driver. So, instead of using migration methods, we suggested > using share-replication to create a copy in the destination, which will use > the storage technologies to copy the data faster. Ignazio didn't try that > out yet, since it was late in his timezone. We should continue tomorrow or > in the next few days. > > Best regards, > > On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < > rodrigo.barbieri2010 at gmail.com> wrote: > >> Hello Ignazio, >> >> If you are attempting to migrate between 2 NetApp backends, then you >> shouldn't need to worry about correctly setting the data_node_access_ip. >> Your ideal migration scenario is a driver-assisted-migration, since it is >> between 2 NetApp backends. If that fails due to misconfiguration, it will >> fallback to a host-assisted migration, which will use the >> data_node_access_ip and the host will attempt to mount both shares. This is >> not what you want for this scenario, as this is useful for different >> backends, not your case. >> >> if you specify "manila migration-start --preserve-metadata True" it will >> prevent the fallback to host-assisted, so it is easier for you to narrow >> down the issue with the host-assisted migration out of the way. >> >> I used to be familiar with the NetApp driver set up to review your case, >> however that was a long time ago. I believe the current NetApp driver >> maintainers will be able to more accurately review your case and spot the >> problem. >> >> If you could share some info about your scenario such as: >> >> 1) the 2 backends config groups in manila.conf (sanitized, without >> passwords) >> 2) a "manila show" of the share you are trying to migrate (sanitized if >> needed) >> 3) the "manila migration-start" command you are using and its parameters. >> >> Regards, >> >> >> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano >> wrote: >> >>> Hello All, >>> I am trying to migrate a share between a netapp backend to another. >>> Both backends are configured in my manila.conf. >>> I am able to create share on both, but I am not able to migrate share >>> between them. >>> I am using DSSH=False. >>> I did not understand how host and driver assisted migration work and >>> what "data_node_access_ip" means. >>> The share I want to migrate is on a network (10.102.186.0/24) that I >>> can reach by my management controllers network (10.102.184.0/24). I Can >>> mount share from my controllers and I can mount also the netapp SVM where >>> the share is located. >>> So in the data_node_access_ip I wrote the list of my controllers >>> management ips. >>> During the migrate phase I checked if my controller where manila is >>> running mounts the share or the netapp SVM but It does not happen. >>> Please, what is my mistake ? 
>>> Thanks >>> Ignazio >>> >>> >>> >>> >> >> -- >> Rodrigo Barbieri >> MSc Computer Scientist >> OpenStack Manila Core Contributor >> Federal University of São Carlos >> >> > > -- > Douglas Salles Viroel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From derekh at redhat.com Fri Feb 5 11:25:33 2021 From: derekh at redhat.com (Derek Higgins) Date: Fri, 5 Feb 2021 11:25:33 +0000 Subject: [ironic] Node wont boot from virtual media Message-ID: Hi, I've been trying to get virtual media booting to work on a ProLiant DL380 Gen10 but not having any luck. Over redfish Ironic attaches the media, sets the Next boot option to 'cd' and then restarts the node. But it continues to boot from the HD as if the vmedia is being ignored. I'm wondering if anybody has seen anything similar, I'm thinking perhaps I don't have a bios setting configured that I need? I have the same problem if I set the One time boot on the iLo dashboard. System ROM U30 v2.40 (10/26/2020) iLO Firmware Version 2.33 Dec 09 2020 On another ProLiant DL380 Gen10 with the same ROM and iLo version where this works. Some of the hardware is different, in particular the one that works has a "HPE Smart Array P408i-a SR Gen10 " but the one that doesn't has a "HPE Smart Array E208i-a SR Gen10" could this be the relevant difference? any ideas would be great, thanks, Derek. From viroel at gmail.com Fri Feb 5 11:34:22 2021 From: viroel at gmail.com (Douglas) Date: Fri, 5 Feb 2021 08:34:22 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Ignazio, In order to use share replication between NetApp backends, you'll need that Clusters and SVMs be peered in advance, which can be done by the storage administrators once. You don't need to handle any SnapMirror operation in the storage since it is fully handled by Manila and the NetApp driver. You can find all operations needed here [1][2]. If you have CIFS shares that need to be replicated and promoted, you will hit a bug that is being backported [3] at the moment. NFS shares should work fine. If you want, we can assist you on creating replicas for your shares in #openstack-manila channel. Just reach us there. [1] https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html [2] https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas [3] https://bugs.launchpad.net/manila/+bug/1896949 On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano wrote: > Hello, thanks for your help. > I am waiting my storage administrators have a window to help me because > they must setup the snapmirror. > Meanwhile I am trying the host assisted migration but it does not work. > The share remains in migrating for ever. > I am sure the replication-dr works because I tested it one year ago. > I had an openstack on site A with a netapp storage > I had another openstack on Site B with another netapp storage. > The two openstack installation did not share anything. > So I made a replication between two volumes (shares). > I demoted the source share taking note about its export location list > I managed the destination on openstack and it worked. > > The process for replication is not fully handled by openstack api, so I > should call netapp api for creating snapmirror relationship or ansible > modules or ask help to my storage administrators , right ? 
> Instead, using share migration, I could use only openstack api: I > understood that driver assisted cannot work in this case, but host assisted > should work. > > Best Regards > Ignazio > > > > Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha > scritto: > >> Hi Rodrigo, >> >> Thanks for your help on this. We were helping Ignazio in >> #openstack-manila channel. He wants to migrate a share across ONTAP >> clusters, which isn't supported in the current implementation of the >> driver-assisted-migration with NetApp driver. So, instead of using >> migration methods, we suggested using share-replication to create a copy in >> the destination, which will use the storage technologies to copy the data >> faster. Ignazio didn't try that out yet, since it was late in his timezone. >> We should continue tomorrow or in the next few days. >> >> Best regards, >> >> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >> rodrigo.barbieri2010 at gmail.com> wrote: >> >>> Hello Ignazio, >>> >>> If you are attempting to migrate between 2 NetApp backends, then you >>> shouldn't need to worry about correctly setting the data_node_access_ip. >>> Your ideal migration scenario is a driver-assisted-migration, since it is >>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>> fallback to a host-assisted migration, which will use the >>> data_node_access_ip and the host will attempt to mount both shares. This is >>> not what you want for this scenario, as this is useful for different >>> backends, not your case. >>> >>> if you specify "manila migration-start --preserve-metadata True" it will >>> prevent the fallback to host-assisted, so it is easier for you to narrow >>> down the issue with the host-assisted migration out of the way. >>> >>> I used to be familiar with the NetApp driver set up to review your case, >>> however that was a long time ago. I believe the current NetApp driver >>> maintainers will be able to more accurately review your case and spot the >>> problem. >>> >>> If you could share some info about your scenario such as: >>> >>> 1) the 2 backends config groups in manila.conf (sanitized, without >>> passwords) >>> 2) a "manila show" of the share you are trying to migrate (sanitized if >>> needed) >>> 3) the "manila migration-start" command you are using and its parameters. >>> >>> Regards, >>> >>> >>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano >>> wrote: >>> >>>> Hello All, >>>> I am trying to migrate a share between a netapp backend to another. >>>> Both backends are configured in my manila.conf. >>>> I am able to create share on both, but I am not able to migrate share >>>> between them. >>>> I am using DSSH=False. >>>> I did not understand how host and driver assisted migration work and >>>> what "data_node_access_ip" means. >>>> The share I want to migrate is on a network (10.102.186.0/24) that I >>>> can reach by my management controllers network (10.102.184.0/24). I >>>> Can mount share from my controllers and I can mount also the netapp SVM >>>> where the share is located. >>>> So in the data_node_access_ip I wrote the list of my controllers >>>> management ips. >>>> During the migrate phase I checked if my controller where manila is >>>> running mounts the share or the netapp SVM but It does not happen. >>>> Please, what is my mistake ? 
>>>> Thanks >>>> Ignazio >>>> >>>> >>>> >>>> >>> >>> -- >>> Rodrigo Barbieri >>> MSc Computer Scientist >>> OpenStack Manila Core Contributor >>> Federal University of São Carlos >>> >>> >> >> -- >> Douglas Salles Viroel >> > -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Feb 5 12:00:27 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 5 Feb 2021 13:00:27 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Douglas, you are really kind. Let my to to recap and please correct if I am wrong: - manila share on netapp are under svm - storage administrator createx a peering between svm source and svm destination (or on single share volume ?) - I create a manila share with specs replication type (the share belongs to source svm) . In manila.conf source and destination must have the same replication domain - Creating the replication type it initializes the snapmirror Is it correct ? Ignazio Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha scritto: > Hi Ignazio, > > In order to use share replication between NetApp backends, you'll need > that Clusters and SVMs be peered in advance, which can be done by the > storage administrators once. You don't need to handle any SnapMirror > operation in the storage since it is fully handled by Manila and the NetApp > driver. You can find all operations needed here [1][2]. If you have CIFS > shares that need to be replicated and promoted, you will hit a bug that is > being backported [3] at the moment. NFS shares should work fine. > > If you want, we can assist you on creating replicas for your shares in > #openstack-manila channel. Just reach us there. > > [1] > https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html > [2] > https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas > [3] https://bugs.launchpad.net/manila/+bug/1896949 > > On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano > wrote: > >> Hello, thanks for your help. >> I am waiting my storage administrators have a window to help me because >> they must setup the snapmirror. >> Meanwhile I am trying the host assisted migration but it does not work. >> The share remains in migrating for ever. >> I am sure the replication-dr works because I tested it one year ago. >> I had an openstack on site A with a netapp storage >> I had another openstack on Site B with another netapp storage. >> The two openstack installation did not share anything. >> So I made a replication between two volumes (shares). >> I demoted the source share taking note about its export location list >> I managed the destination on openstack and it worked. >> >> The process for replication is not fully handled by openstack api, so I >> should call netapp api for creating snapmirror relationship or ansible >> modules or ask help to my storage administrators , right ? >> Instead, using share migration, I could use only openstack api: I >> understood that driver assisted cannot work in this case, but host assisted >> should work. >> >> Best Regards >> Ignazio >> >> >> >> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha >> scritto: >> >>> Hi Rodrigo, >>> >>> Thanks for your help on this. We were helping Ignazio in >>> #openstack-manila channel. 
He wants to migrate a share across ONTAP >>> clusters, which isn't supported in the current implementation of the >>> driver-assisted-migration with NetApp driver. So, instead of using >>> migration methods, we suggested using share-replication to create a copy in >>> the destination, which will use the storage technologies to copy the data >>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>> We should continue tomorrow or in the next few days. >>> >>> Best regards, >>> >>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>> rodrigo.barbieri2010 at gmail.com> wrote: >>> >>>> Hello Ignazio, >>>> >>>> If you are attempting to migrate between 2 NetApp backends, then you >>>> shouldn't need to worry about correctly setting the data_node_access_ip. >>>> Your ideal migration scenario is a driver-assisted-migration, since it is >>>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>>> fallback to a host-assisted migration, which will use the >>>> data_node_access_ip and the host will attempt to mount both shares. This is >>>> not what you want for this scenario, as this is useful for different >>>> backends, not your case. >>>> >>>> if you specify "manila migration-start --preserve-metadata True" it >>>> will prevent the fallback to host-assisted, so it is easier for you to >>>> narrow down the issue with the host-assisted migration out of the way. >>>> >>>> I used to be familiar with the NetApp driver set up to review your >>>> case, however that was a long time ago. I believe the current NetApp driver >>>> maintainers will be able to more accurately review your case and spot the >>>> problem. >>>> >>>> If you could share some info about your scenario such as: >>>> >>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>> passwords) >>>> 2) a "manila show" of the share you are trying to migrate (sanitized if >>>> needed) >>>> 3) the "manila migration-start" command you are using and its >>>> parameters. >>>> >>>> Regards, >>>> >>>> >>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello All, >>>>> I am trying to migrate a share between a netapp backend to another. >>>>> Both backends are configured in my manila.conf. >>>>> I am able to create share on both, but I am not able to migrate share >>>>> between them. >>>>> I am using DSSH=False. >>>>> I did not understand how host and driver assisted migration work and >>>>> what "data_node_access_ip" means. >>>>> The share I want to migrate is on a network (10.102.186.0/24) that I >>>>> can reach by my management controllers network (10.102.184.0/24). I >>>>> Can mount share from my controllers and I can mount also the netapp SVM >>>>> where the share is located. >>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>> management ips. >>>>> During the migrate phase I checked if my controller where manila is >>>>> running mounts the share or the netapp SVM but It does not happen. >>>>> Please, what is my mistake ? >>>>> Thanks >>>>> Ignazio >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Rodrigo Barbieri >>>> MSc Computer Scientist >>>> OpenStack Manila Core Contributor >>>> Federal University of São Carlos >>>> >>>> >>> >>> -- >>> Douglas Salles Viroel >>> >> > > -- > Douglas Salles Viroel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From viroel at gmail.com Fri Feb 5 12:30:05 2021 From: viroel at gmail.com (Douglas) Date: Fri, 5 Feb 2021 09:30:05 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi, Yes, looks correct to me. Since you are working on DHSS=False mode, the SVMs should be peered in advance by the storage administrator. In DHSS=True mode, starting at Train, this is also managed by Manila. Only the SnapMirror relationship is at volume level, which is fully handled by the NetApp driver, so you don't need to worry about it. By creating a replica of your share, the driver will create the snapmirror relationship and initialize it [1]. Your replica starts in 'out-of-sync' replica-state, meaning that the snapmirror is still 'transferring' data. Manila will periodically check if the replica is already 'in-sync' and will update the 'replica-state' as soon it gets synchronized with the source (snapmirrored). You can also request a 'resync' operation at any moment, through the 'share-replica-resync' operation[2]. Regards, [1] https://opendev.org/openstack/manila/src/branch/master/manila/share/drivers/netapp/dataontap/cluster_mode/data_motion.py#L177-L189 [2] https://docs.openstack.org/api-ref/shared-file-system/#resync-share-replica On Fri, Feb 5, 2021 at 9:00 AM Ignazio Cassano wrote: > Hi Douglas, you are really kind. > Let my to to recap and please correct if I am wrong: > > - manila share on netapp are under svm > - storage administrator createx a peering between svm source and svm > destination (or on single share volume ?) > - I create a manila share with specs replication type (the share belongs > to source svm) . In manila.conf source and destination must have the same > replication domain > - Creating the replication type it initializes the snapmirror > > Is it correct ? > Ignazio > > Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha > scritto: > >> Hi Ignazio, >> >> In order to use share replication between NetApp backends, you'll need >> that Clusters and SVMs be peered in advance, which can be done by the >> storage administrators once. You don't need to handle any SnapMirror >> operation in the storage since it is fully handled by Manila and the NetApp >> driver. You can find all operations needed here [1][2]. If you have CIFS >> shares that need to be replicated and promoted, you will hit a bug that is >> being backported [3] at the moment. NFS shares should work fine. >> >> If you want, we can assist you on creating replicas for your shares in >> #openstack-manila channel. Just reach us there. >> >> [1] >> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >> [2] >> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >> [3] https://bugs.launchpad.net/manila/+bug/1896949 >> >> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano >> wrote: >> >>> Hello, thanks for your help. >>> I am waiting my storage administrators have a window to help me because >>> they must setup the snapmirror. >>> Meanwhile I am trying the host assisted migration but it does not work. >>> The share remains in migrating for ever. >>> I am sure the replication-dr works because I tested it one year ago. >>> I had an openstack on site A with a netapp storage >>> I had another openstack on Site B with another netapp storage. >>> The two openstack installation did not share anything. >>> So I made a replication between two volumes (shares). 
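To make the replica workflow described above concrete, it normally reduces to a handful of CLI calls. The share name and replica ID below are placeholders, and the replica_state values (out_of_sync/in_sync) are the ones reported by the client, so treat this as an illustrative sketch rather than a copy-paste recipe:

manila share-replica-create myshare
manila share-replica-list --share-id myshare
manila share-replica-show <replica-id>
manila share-replica-resync <replica-id>

Creating the replica is what triggers the driver to build and initialize the SnapMirror relationship; the periodic check mentioned above is what eventually moves replica_state from out_of_sync to in_sync.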
>>> I demoted the source share taking note about its export location list >>> I managed the destination on openstack and it worked. >>> >>> The process for replication is not fully handled by openstack api, so I >>> should call netapp api for creating snapmirror relationship or ansible >>> modules or ask help to my storage administrators , right ? >>> Instead, using share migration, I could use only openstack api: I >>> understood that driver assisted cannot work in this case, but host assisted >>> should work. >>> >>> Best Regards >>> Ignazio >>> >>> >>> >>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha >>> scritto: >>> >>>> Hi Rodrigo, >>>> >>>> Thanks for your help on this. We were helping Ignazio in >>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>> clusters, which isn't supported in the current implementation of the >>>> driver-assisted-migration with NetApp driver. So, instead of using >>>> migration methods, we suggested using share-replication to create a copy in >>>> the destination, which will use the storage technologies to copy the data >>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>> We should continue tomorrow or in the next few days. >>>> >>>> Best regards, >>>> >>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>> >>>>> Hello Ignazio, >>>>> >>>>> If you are attempting to migrate between 2 NetApp backends, then you >>>>> shouldn't need to worry about correctly setting the data_node_access_ip. >>>>> Your ideal migration scenario is a driver-assisted-migration, since it is >>>>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>>>> fallback to a host-assisted migration, which will use the >>>>> data_node_access_ip and the host will attempt to mount both shares. This is >>>>> not what you want for this scenario, as this is useful for different >>>>> backends, not your case. >>>>> >>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>> narrow down the issue with the host-assisted migration out of the way. >>>>> >>>>> I used to be familiar with the NetApp driver set up to review your >>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>> maintainers will be able to more accurately review your case and spot the >>>>> problem. >>>>> >>>>> If you could share some info about your scenario such as: >>>>> >>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>> passwords) >>>>> 2) a "manila show" of the share you are trying to migrate (sanitized >>>>> if needed) >>>>> 3) the "manila migration-start" command you are using and its >>>>> parameters. >>>>> >>>>> Regards, >>>>> >>>>> >>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello All, >>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>> Both backends are configured in my manila.conf. >>>>>> I am able to create share on both, but I am not able to migrate share >>>>>> between them. >>>>>> I am using DSSH=False. >>>>>> I did not understand how host and driver assisted migration work and >>>>>> what "data_node_access_ip" means. >>>>>> The share I want to migrate is on a network (10.102.186.0/24) that I >>>>>> can reach by my management controllers network (10.102.184.0/24). 
I >>>>>> Can mount share from my controllers and I can mount also the netapp SVM >>>>>> where the share is located. >>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>> management ips. >>>>>> During the migrate phase I checked if my controller where manila is >>>>>> running mounts the share or the netapp SVM but It does not happen. >>>>>> Please, what is my mistake ? >>>>>> Thanks >>>>>> Ignazio >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Rodrigo Barbieri >>>>> MSc Computer Scientist >>>>> OpenStack Manila Core Contributor >>>>> Federal University of São Carlos >>>>> >>>>> >>>> >>>> -- >>>> Douglas Salles Viroel >>>> >>> >> >> -- >> Douglas Salles Viroel >> > -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Feb 5 12:37:17 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 5 Feb 2021 13:37:17 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello, I am sorry. I read the documentation. SMV must be peered once bye storage admimistrator or using ansible playbook. I must create a two backend in manila.conf with the same replication domain. I must assign to the source a type and set replication type dr. When I create a share if I want to enable snapmirror for it I must create on openstack a share replica for it. The share on destination is read only until I promote it. When I promote it, it become writable. Then I can manage it on target openstack. I hope the above is the correct procedure Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < ignaziocassano at gmail.com> ha scritto: > Hi Douglas, you are really kind. > Let my to to recap and please correct if I am wrong: > > - manila share on netapp are under svm > - storage administrator createx a peering between svm source and svm > destination (or on single share volume ?) > - I create a manila share with specs replication type (the share belongs > to source svm) . In manila.conf source and destination must have the same > replication domain > - Creating the replication type it initializes the snapmirror > > Is it correct ? > Ignazio > > Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha > scritto: > >> Hi Ignazio, >> >> In order to use share replication between NetApp backends, you'll need >> that Clusters and SVMs be peered in advance, which can be done by the >> storage administrators once. You don't need to handle any SnapMirror >> operation in the storage since it is fully handled by Manila and the NetApp >> driver. You can find all operations needed here [1][2]. If you have CIFS >> shares that need to be replicated and promoted, you will hit a bug that is >> being backported [3] at the moment. NFS shares should work fine. >> >> If you want, we can assist you on creating replicas for your shares in >> #openstack-manila channel. Just reach us there. >> >> [1] >> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >> [2] >> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >> [3] https://bugs.launchpad.net/manila/+bug/1896949 >> >> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano >> wrote: >> >>> Hello, thanks for your help. >>> I am waiting my storage administrators have a window to help me because >>> they must setup the snapmirror. 
>>> Meanwhile I am trying the host assisted migration but it does not work. >>> The share remains in migrating for ever. >>> I am sure the replication-dr works because I tested it one year ago. >>> I had an openstack on site A with a netapp storage >>> I had another openstack on Site B with another netapp storage. >>> The two openstack installation did not share anything. >>> So I made a replication between two volumes (shares). >>> I demoted the source share taking note about its export location list >>> I managed the destination on openstack and it worked. >>> >>> The process for replication is not fully handled by openstack api, so I >>> should call netapp api for creating snapmirror relationship or ansible >>> modules or ask help to my storage administrators , right ? >>> Instead, using share migration, I could use only openstack api: I >>> understood that driver assisted cannot work in this case, but host assisted >>> should work. >>> >>> Best Regards >>> Ignazio >>> >>> >>> >>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha >>> scritto: >>> >>>> Hi Rodrigo, >>>> >>>> Thanks for your help on this. We were helping Ignazio in >>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>> clusters, which isn't supported in the current implementation of the >>>> driver-assisted-migration with NetApp driver. So, instead of using >>>> migration methods, we suggested using share-replication to create a copy in >>>> the destination, which will use the storage technologies to copy the data >>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>> We should continue tomorrow or in the next few days. >>>> >>>> Best regards, >>>> >>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>> >>>>> Hello Ignazio, >>>>> >>>>> If you are attempting to migrate between 2 NetApp backends, then you >>>>> shouldn't need to worry about correctly setting the data_node_access_ip. >>>>> Your ideal migration scenario is a driver-assisted-migration, since it is >>>>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>>>> fallback to a host-assisted migration, which will use the >>>>> data_node_access_ip and the host will attempt to mount both shares. This is >>>>> not what you want for this scenario, as this is useful for different >>>>> backends, not your case. >>>>> >>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>> narrow down the issue with the host-assisted migration out of the way. >>>>> >>>>> I used to be familiar with the NetApp driver set up to review your >>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>> maintainers will be able to more accurately review your case and spot the >>>>> problem. >>>>> >>>>> If you could share some info about your scenario such as: >>>>> >>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>> passwords) >>>>> 2) a "manila show" of the share you are trying to migrate (sanitized >>>>> if needed) >>>>> 3) the "manila migration-start" command you are using and its >>>>> parameters. >>>>> >>>>> Regards, >>>>> >>>>> >>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello All, >>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>> Both backends are configured in my manila.conf. 
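For reference, a sketch of how two NetApp backends in the same replication domain are typically laid out in manila.conf. The option names follow the NetApp cDOT driver's documented settings but should be treated as assumptions to verify against the Stein configuration reference; all values are placeholders:

[DEFAULT]
enabled_share_backends = netappA,netappB

[netappA]
share_backend_name = netappA
share_driver = manila.share.drivers.netapp.common.NetAppDriver
driver_handles_share_servers = False
netapp_storage_family = ontap_cluster
netapp_server_hostname = clusterA.example.com
netapp_login = admin
netapp_password = secret
netapp_vserver = svm_source
replication_domain = replication_domain_1

[netappB]
share_backend_name = netappB
share_driver = manila.share.drivers.netapp.common.NetAppDriver
driver_handles_share_servers = False
netapp_storage_family = ontap_cluster
netapp_server_hostname = clusterB.example.com
netapp_login = admin
netapp_password = secret
netapp_vserver = svm_destination
replication_domain = replication_domain_1

The matching replication_domain value on both backends is what tells the scheduler that the two pools are allowed to replicate to each other.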
>>>>>> I am able to create share on both, but I am not able to migrate share >>>>>> between them. >>>>>> I am using DSSH=False. >>>>>> I did not understand how host and driver assisted migration work and >>>>>> what "data_node_access_ip" means. >>>>>> The share I want to migrate is on a network (10.102.186.0/24) that I >>>>>> can reach by my management controllers network (10.102.184.0/24). I >>>>>> Can mount share from my controllers and I can mount also the netapp SVM >>>>>> where the share is located. >>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>> management ips. >>>>>> During the migrate phase I checked if my controller where manila is >>>>>> running mounts the share or the netapp SVM but It does not happen. >>>>>> Please, what is my mistake ? >>>>>> Thanks >>>>>> Ignazio >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Rodrigo Barbieri >>>>> MSc Computer Scientist >>>>> OpenStack Manila Core Contributor >>>>> Federal University of São Carlos >>>>> >>>>> >>>> >>>> -- >>>> Douglas Salles Viroel >>>> >>> >> >> -- >> Douglas Salles Viroel >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From viroel at gmail.com Fri Feb 5 12:48:03 2021 From: viroel at gmail.com (Douglas) Date: Fri, 5 Feb 2021 09:48:03 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Yes, it is correct. This should work as an alternative for host-assisted-migration and will be faster since it uses storage technologies to synchronize data. If your share isn't associated with a share-type that has replication_type='dr' you can: 1) create a new share-type with replication_type extra-spec, 2) unmanage your share, 3) manage it again using the new share-type. On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano wrote: > Hello, I am sorry. > > I read the documentation. > > SMV must be peered once bye storage admimistrator or using ansible > playbook. > I must create a two backend in manila.conf with the same replication > domain. > I must assign to the source a type and set replication type dr. > When I create a share if I want to enable snapmirror for it I must create > on openstack a share replica for it. > The share on destination is read only until I promote it. > When I promote it, it become writable. > Then I can manage it on target openstack. > > I hope the above is the correct procedure > > Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >> Hi Douglas, you are really kind. >> Let my to to recap and please correct if I am wrong: >> >> - manila share on netapp are under svm >> - storage administrator createx a peering between svm source and svm >> destination (or on single share volume ?) >> - I create a manila share with specs replication type (the share belongs >> to source svm) . In manila.conf source and destination must have the same >> replication domain >> - Creating the replication type it initializes the snapmirror >> >> Is it correct ? >> Ignazio >> >> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha >> scritto: >> >>> Hi Ignazio, >>> >>> In order to use share replication between NetApp backends, you'll need >>> that Clusters and SVMs be peered in advance, which can be done by the >>> storage administrators once. You don't need to handle any SnapMirror >>> operation in the storage since it is fully handled by Manila and the NetApp >>> driver. You can find all operations needed here [1][2]. 
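The three-step share-type change described earlier in this message maps to CLI calls roughly like the following. The type name, share name, destination host and export path are all placeholders, and the exact arguments of `manila manage` are worth confirming with `manila help manage` before touching a production share:

manila type-create netapp_dr False
manila type-key netapp_dr set replication_type=dr
manila unmanage myshare
manila manage controller@netappA#aggr1 nfs 10.102.186.10:/myshare_volume --name myshare --share_type netapp_dr

Unmanage/manage only changes how Manila tracks the share; the data on the backing NetApp volume is left untouched.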
If you have CIFS >>> shares that need to be replicated and promoted, you will hit a bug that is >>> being backported [3] at the moment. NFS shares should work fine. >>> >>> If you want, we can assist you on creating replicas for your shares in >>> #openstack-manila channel. Just reach us there. >>> >>> [1] >>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>> [2] >>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>> >>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano >>> wrote: >>> >>>> Hello, thanks for your help. >>>> I am waiting my storage administrators have a window to help me because >>>> they must setup the snapmirror. >>>> Meanwhile I am trying the host assisted migration but it does not work. >>>> The share remains in migrating for ever. >>>> I am sure the replication-dr works because I tested it one year ago. >>>> I had an openstack on site A with a netapp storage >>>> I had another openstack on Site B with another netapp storage. >>>> The two openstack installation did not share anything. >>>> So I made a replication between two volumes (shares). >>>> I demoted the source share taking note about its export location list >>>> I managed the destination on openstack and it worked. >>>> >>>> The process for replication is not fully handled by openstack api, so >>>> I should call netapp api for creating snapmirror relationship or ansible >>>> modules or ask help to my storage administrators , right ? >>>> Instead, using share migration, I could use only openstack api: I >>>> understood that driver assisted cannot work in this case, but host assisted >>>> should work. >>>> >>>> Best Regards >>>> Ignazio >>>> >>>> >>>> >>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha >>>> scritto: >>>> >>>>> Hi Rodrigo, >>>>> >>>>> Thanks for your help on this. We were helping Ignazio in >>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>> clusters, which isn't supported in the current implementation of the >>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>> migration methods, we suggested using share-replication to create a copy in >>>>> the destination, which will use the storage technologies to copy the data >>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>> We should continue tomorrow or in the next few days. >>>>> >>>>> Best regards, >>>>> >>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>> >>>>>> Hello Ignazio, >>>>>> >>>>>> If you are attempting to migrate between 2 NetApp backends, then you >>>>>> shouldn't need to worry about correctly setting the data_node_access_ip. >>>>>> Your ideal migration scenario is a driver-assisted-migration, since it is >>>>>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>>>>> fallback to a host-assisted migration, which will use the >>>>>> data_node_access_ip and the host will attempt to mount both shares. This is >>>>>> not what you want for this scenario, as this is useful for different >>>>>> backends, not your case. >>>>>> >>>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>>> narrow down the issue with the host-assisted migration out of the way. 
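For completeness, the driver-assisted attempt referred to in the quoted advice would look something like the sketch below. The destination host syntax (host@backend#pool) and the exact set of required flags depend on the client and API microversion, so check `manila help migration-start` on the Stein client before relying on these option names:

manila pool-list
manila migration-start myshare controller@netappB#aggr1 --preserve-metadata True --preserve-snapshots False --writable False --nondisruptive False
manila migration-get-progress myshare
manila migration-complete myshare

Requiring capabilities such as --preserve-metadata True is what rules out the silent fallback to the host-assisted path.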
>>>>>> >>>>>> I used to be familiar with the NetApp driver set up to review your >>>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>>> maintainers will be able to more accurately review your case and spot the >>>>>> problem. >>>>>> >>>>>> If you could share some info about your scenario such as: >>>>>> >>>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>>> passwords) >>>>>> 2) a "manila show" of the share you are trying to migrate (sanitized >>>>>> if needed) >>>>>> 3) the "manila migration-start" command you are using and its >>>>>> parameters. >>>>>> >>>>>> Regards, >>>>>> >>>>>> >>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Hello All, >>>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>>> Both backends are configured in my manila.conf. >>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>> share between them. >>>>>>> I am using DSSH=False. >>>>>>> I did not understand how host and driver assisted migration work and >>>>>>> what "data_node_access_ip" means. >>>>>>> The share I want to migrate is on a network (10.102.186.0/24) that >>>>>>> I can reach by my management controllers network (10.102.184.0/24). >>>>>>> I Can mount share from my controllers and I can mount also the netapp SVM >>>>>>> where the share is located. >>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>> management ips. >>>>>>> During the migrate phase I checked if my controller where manila is >>>>>>> running mounts the share or the netapp SVM but It does not happen. >>>>>>> Please, what is my mistake ? >>>>>>> Thanks >>>>>>> Ignazio >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Rodrigo Barbieri >>>>>> MSc Computer Scientist >>>>>> OpenStack Manila Core Contributor >>>>>> Federal University of São Carlos >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Douglas Salles Viroel >>>>> >>>> >>> >>> -- >>> Douglas Salles Viroel >>> >> -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Feb 5 12:59:52 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 5 Feb 2021 13:59:52 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Thanks, Douglas. On another question: the manila share-replica-delete delete the snapmirror ? If yes, source and destination volume become both writable ? Ignazio Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha scritto: > Yes, it is correct. This should work as an alternative for > host-assisted-migration and will be faster since it uses storage > technologies to synchronize data. > If your share isn't associated with a share-type that has > replication_type='dr' you can: 1) create a new share-type with > replication_type extra-spec, 2) unmanage your share, 3) manage it again > using the new share-type. > > > > On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano > wrote: > >> Hello, I am sorry. >> >> I read the documentation. >> >> SMV must be peered once bye storage admimistrator or using ansible >> playbook. >> I must create a two backend in manila.conf with the same replication >> domain. >> I must assign to the source a type and set replication type dr. >> When I create a share if I want to enable snapmirror for it I must create >> on openstack a share replica for it. >> The share on destination is read only until I promote it. 
>> When I promote it, it become writable. >> Then I can manage it on target openstack. >> >> I hope the above is the correct procedure >> >> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >> ignaziocassano at gmail.com> ha scritto: >> >>> Hi Douglas, you are really kind. >>> Let my to to recap and please correct if I am wrong: >>> >>> - manila share on netapp are under svm >>> - storage administrator createx a peering between svm source and svm >>> destination (or on single share volume ?) >>> - I create a manila share with specs replication type (the share >>> belongs to source svm) . In manila.conf source and destination must have >>> the same replication domain >>> - Creating the replication type it initializes the snapmirror >>> >>> Is it correct ? >>> Ignazio >>> >>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha >>> scritto: >>> >>>> Hi Ignazio, >>>> >>>> In order to use share replication between NetApp backends, you'll need >>>> that Clusters and SVMs be peered in advance, which can be done by the >>>> storage administrators once. You don't need to handle any SnapMirror >>>> operation in the storage since it is fully handled by Manila and the NetApp >>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>> shares that need to be replicated and promoted, you will hit a bug that is >>>> being backported [3] at the moment. NFS shares should work fine. >>>> >>>> If you want, we can assist you on creating replicas for your shares in >>>> #openstack-manila channel. Just reach us there. >>>> >>>> [1] >>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>> [2] >>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>> >>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello, thanks for your help. >>>>> I am waiting my storage administrators have a window to help me >>>>> because they must setup the snapmirror. >>>>> Meanwhile I am trying the host assisted migration but it does not work. >>>>> The share remains in migrating for ever. >>>>> I am sure the replication-dr works because I tested it one year ago. >>>>> I had an openstack on site A with a netapp storage >>>>> I had another openstack on Site B with another netapp storage. >>>>> The two openstack installation did not share anything. >>>>> So I made a replication between two volumes (shares). >>>>> I demoted the source share taking note about its export location list >>>>> I managed the destination on openstack and it worked. >>>>> >>>>> The process for replication is not fully handled by openstack api, so >>>>> I should call netapp api for creating snapmirror relationship or ansible >>>>> modules or ask help to my storage administrators , right ? >>>>> Instead, using share migration, I could use only openstack api: I >>>>> understood that driver assisted cannot work in this case, but host assisted >>>>> should work. >>>>> >>>>> Best Regards >>>>> Ignazio >>>>> >>>>> >>>>> >>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas ha >>>>> scritto: >>>>> >>>>>> Hi Rodrigo, >>>>>> >>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>> clusters, which isn't supported in the current implementation of the >>>>>> driver-assisted-migration with NetApp driver. 
So, instead of using >>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>> the destination, which will use the storage technologies to copy the data >>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>> We should continue tomorrow or in the next few days. >>>>>> >>>>>> Best regards, >>>>>> >>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>> >>>>>>> Hello Ignazio, >>>>>>> >>>>>>> If you are attempting to migrate between 2 NetApp backends, then you >>>>>>> shouldn't need to worry about correctly setting the data_node_access_ip. >>>>>>> Your ideal migration scenario is a driver-assisted-migration, since it is >>>>>>> between 2 NetApp backends. If that fails due to misconfiguration, it will >>>>>>> fallback to a host-assisted migration, which will use the >>>>>>> data_node_access_ip and the host will attempt to mount both shares. This is >>>>>>> not what you want for this scenario, as this is useful for different >>>>>>> backends, not your case. >>>>>>> >>>>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>> >>>>>>> I used to be familiar with the NetApp driver set up to review your >>>>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>>>> maintainers will be able to more accurately review your case and spot the >>>>>>> problem. >>>>>>> >>>>>>> If you could share some info about your scenario such as: >>>>>>> >>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>>>> passwords) >>>>>>> 2) a "manila show" of the share you are trying to migrate (sanitized >>>>>>> if needed) >>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>> parameters. >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> >>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hello All, >>>>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>>>> Both backends are configured in my manila.conf. >>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>> share between them. >>>>>>>> I am using DSSH=False. >>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>> and what "data_node_access_ip" means. >>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) that >>>>>>>> I can reach by my management controllers network (10.102.184.0/24). >>>>>>>> I Can mount share from my controllers and I can mount also the netapp SVM >>>>>>>> where the share is located. >>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>> management ips. >>>>>>>> During the migrate phase I checked if my controller where manila is >>>>>>>> running mounts the share or the netapp SVM but It does not happen. >>>>>>>> Please, what is my mistake ? >>>>>>>> Thanks >>>>>>>> Ignazio >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Rodrigo Barbieri >>>>>>> MSc Computer Scientist >>>>>>> OpenStack Manila Core Contributor >>>>>>> Federal University of São Carlos >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Douglas Salles Viroel >>>>>> >>>>> >>>> >>>> -- >>>> Douglas Salles Viroel >>>> >>> > > -- > Douglas Salles Viroel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Fri Feb 5 13:56:12 2021 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 5 Feb 2021 14:56:12 +0100 Subject: [largescale-sig] Next meeting: February 10, 15utc Message-ID: <24e58c7c-982c-14d1-e0b8-8fe20f7da50f@openstack.org> Hi everyone, Our next Large Scale SIG meeting will be this Wednesday in #openstack-meeting-3 on IRC, at 15UTC. You can doublecheck how it translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20210210T15 Feel free to add other topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Talk to you all later, -- Thierry Carrez From viroel at gmail.com Fri Feb 5 13:59:09 2021 From: viroel at gmail.com (Douglas) Date: Fri, 5 Feb 2021 10:59:09 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Ignazio, By deleting the replica you will delete the replicated volume, and in order to do that, the driver will need to destroy the snapmirror relationship, before actually deleting the volume. You can only delete the non-active replicas, which means that your 'writable' volume won't be affected. By promoting a replica, the provided 'dr' volume will become 'writable', and the original active replica will turn to a 'dr' volume.There is no way of having two 'writable' instances of the same share when using NetApp backend in Manila. On Fri, Feb 5, 2021 at 10:00 AM Ignazio Cassano wrote: > Thanks, Douglas. > On another question: > the manila share-replica-delete delete the snapmirror ? > If yes, source and destination volume become both writable ? > > Ignazio > > Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha > scritto: > >> Yes, it is correct. This should work as an alternative for >> host-assisted-migration and will be faster since it uses storage >> technologies to synchronize data. >> If your share isn't associated with a share-type that has >> replication_type='dr' you can: 1) create a new share-type with >> replication_type extra-spec, 2) unmanage your share, 3) manage it again >> using the new share-type. >> >> >> >> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano >> wrote: >> >>> Hello, I am sorry. >>> >>> I read the documentation. >>> >>> SMV must be peered once bye storage admimistrator or using ansible >>> playbook. >>> I must create a two backend in manila.conf with the same replication >>> domain. >>> I must assign to the source a type and set replication type dr. >>> When I create a share if I want to enable snapmirror for it I must >>> create on openstack a share replica for it. >>> The share on destination is read only until I promote it. >>> When I promote it, it become writable. >>> Then I can manage it on target openstack. >>> >>> I hope the above is the correct procedure >>> >>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>> ignaziocassano at gmail.com> ha scritto: >>> >>>> Hi Douglas, you are really kind. >>>> Let my to to recap and please correct if I am wrong: >>>> >>>> - manila share on netapp are under svm >>>> - storage administrator createx a peering between svm source and svm >>>> destination (or on single share volume ?) >>>> - I create a manila share with specs replication type (the share >>>> belongs to source svm) . In manila.conf source and destination must have >>>> the same replication domain >>>> - Creating the replication type it initializes the snapmirror >>>> >>>> Is it correct ? 
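A short sketch of the promote-and-clean-up sequence implied by the answer above; the replica IDs are placeholders taken from the replica listing:

manila share-replica-list --share-id myshare
manila share-replica-promote <replica-id-at-destination>
manila share-replica-delete <old-replica-id>

Promotion swaps which instance is writable, and deleting the leftover 'dr' replica is what tears down the SnapMirror relationship and removes the replicated volume, as described above.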
>>>> Ignazio >>>> >>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha >>>> scritto: >>>> >>>>> Hi Ignazio, >>>>> >>>>> In order to use share replication between NetApp backends, you'll need >>>>> that Clusters and SVMs be peered in advance, which can be done by the >>>>> storage administrators once. You don't need to handle any SnapMirror >>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>> being backported [3] at the moment. NFS shares should work fine. >>>>> >>>>> If you want, we can assist you on creating replicas for your shares in >>>>> #openstack-manila channel. Just reach us there. >>>>> >>>>> [1] >>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>> [2] >>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>> >>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello, thanks for your help. >>>>>> I am waiting my storage administrators have a window to help me >>>>>> because they must setup the snapmirror. >>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>> work. >>>>>> The share remains in migrating for ever. >>>>>> I am sure the replication-dr works because I tested it one year ago. >>>>>> I had an openstack on site A with a netapp storage >>>>>> I had another openstack on Site B with another netapp storage. >>>>>> The two openstack installation did not share anything. >>>>>> So I made a replication between two volumes (shares). >>>>>> I demoted the source share taking note about its export location list >>>>>> I managed the destination on openstack and it worked. >>>>>> >>>>>> The process for replication is not fully handled by openstack api, >>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>> Instead, using share migration, I could use only openstack api: I >>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>> should work. >>>>>> >>>>>> Best Regards >>>>>> Ignazio >>>>>> >>>>>> >>>>>> >>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>> ha scritto: >>>>>> >>>>>>> Hi Rodrigo, >>>>>>> >>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>> We should continue tomorrow or in the next few days. >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>> >>>>>>>> Hello Ignazio, >>>>>>>> >>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>> data_node_access_ip. 
Your ideal migration scenario is a >>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>> is useful for different backends, not your case. >>>>>>>> >>>>>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>> >>>>>>>> I used to be familiar with the NetApp driver set up to review your >>>>>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>>>>> maintainers will be able to more accurately review your case and spot the >>>>>>>> problem. >>>>>>>> >>>>>>>> If you could share some info about your scenario such as: >>>>>>>> >>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>>>>> passwords) >>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>> (sanitized if needed) >>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>> parameters. >>>>>>>> >>>>>>>> Regards, >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello All, >>>>>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>> share between them. >>>>>>>>> I am using DSSH=False. >>>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>>> and what "data_node_access_ip" means. >>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I can >>>>>>>>> mount also the netapp SVM where the share is located. >>>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>>> management ips. >>>>>>>>> During the migrate phase I checked if my controller where manila >>>>>>>>> is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>> Please, what is my mistake ? >>>>>>>>> Thanks >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Rodrigo Barbieri >>>>>>>> MSc Computer Scientist >>>>>>>> OpenStack Manila Core Contributor >>>>>>>> Federal University of São Carlos >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Douglas Salles Viroel >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> Douglas Salles Viroel >>>>> >>>> >> >> -- >> Douglas Salles Viroel >> > -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Feb 5 14:13:27 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Feb 2021 14:13:27 +0000 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: Message-ID: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: > Hi all, > > Scenario: I have an installation of Openstack stein through kolla-ansible. > The deployment went fine and all services look good. 
> > Although I am seeing that under Admin--> Compute --> Hypervisors panel in > horizon, all the controller nodes are missing. It's a blank list. Did you actually deploy the nova compute agent service to them? That view shows the list of hosts that are running the nova compute service; typically that is not deployed to the controllers. Hosts in the control group of the kolla multinode inventory https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 are not used to run the compute agent by default; only nodes in the compute group are https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 The exception to that is ironic https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 which is deployed to the controllers. The nova compute agent used for libvirt is deployed specifically to the compute hosts via the nova-cell role, at least on master https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 This was done a little more simply before cell support was added, but the inventory side has not changed in many releases in this regard. > > Also "Openstack hypervisor list" gives an empty list. > > I skimmed through the logs and found no error message other than in > nova-scheduler that: > > *Got no allocation candidates from the Placement API. This could be due to > insufficient resources or a temporary occurrence as compute nodes start up.* > > Subsequently I checked placement container logs and found no error message > or anomaly. > > Not sure what the issue is. Any help in the above case would be appreciated. > > Regards, > Roshan From marios at redhat.com Fri Feb 5 14:40:23 2021 From: marios at redhat.com (Marios Andreou) Date: Fri, 5 Feb 2021 16:40:23 +0200 Subject: [TripleO] Moving stable/rocky for *-tripleo-* repos to End of Life OK? Message-ID: Hello all, it's been ~ 2 months now since my initial mail about $subject [1] and just under a month since my last bump on the thread [2], and I haven't heard any objections so far. So I think it's now appropriate to move forward with [3], which tags the latest commits to the stable/rocky branch of all tripleo-repos [4] as 'rocky-eol' (except archived things like instack/tripleo-ui). Once it merges we will no longer be able to land anything into stable/rocky for all tripleo repos and the stable/rocky branch will be deleted. So, last chance! If you object please go and -1 the patch at [3] and/or reply here. Thanks, marios [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-December/019338.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019860.html [3] https://review.opendev.org/c/openstack/releases/+/774244 [4] https://releases.openstack.org/teams/tripleo.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Fri Feb 5 14:48:05 2021 From: dms at danplanet.com (Dan Smith) Date: Fri, 05 Feb 2021 06:48:05 -0800 Subject: [all] Gate resources and performance In-Reply-To: (Marios Andreou's message of "Fri, 5 Feb 2021 10:33:40 +0200") References: Message-ID: > just wanted to point out the 'node hours' comparison may not be fair > because what is a typical nova patch or a typical tripleo patch? The > number of jobs matched & executed by zuul on a given review will be > different to another tripleo patch in the same repo depending on the > files touched or branch (etc.)
and will vary even more compared to > other tripleo repos; I think this is the same for nova or any other > project with multiple repos. It is indeed important to note that some projects may have wildly different numbers depending on what is touched in the patch. Speaking from experience with Nova, Glance, and QA, most job runs are going to be the same for anything that touches code. Nova will only run unit or functional tests if those are the only files you touched, or docs if so, but otherwise we're pretty much running everything all the time, AFAIK. That could be an area for improvement for us, although I think that determining the scope by the file changed is hard for us just because of how intertwined things are, so we probably need to figure out how to target our tests another way. And basically all of Nova is in a single repo. But yes, totally fair point. I picked a couple test runs at random to generate these numbers, based on looking like they were running most/all of what is configured. First time I did that I picked a stable Neutron patch from before they dropped some testing and got a sky-high number of 54h for a single patch run. So clearly it can vary :) > ACK. We have recently completed some work (as I said, this is an > ongoing issue/process for us) at [1][2] to remove some redundant jobs > which should start to help. Mohamed (mnaser o/) has reached out about > this and joined our most recent irc meeting [3]. We're already > prioritized some more cleanup work for this sprint including checking > file patterns (e.g. started at [4]), tempest tests and removing > many/all of our non-voting jobs as a first pass. Hope that at least > starts to address you concern, Yep, and thanks a lot for what you've done and continue to do. Obviously looking at the "tripleo is ~40%" report, I expected my script to show tripleo as having some insanely high test load. Looking at the actual numbers, it's clear that you're not only not the heaviest, but given what we know to be a super heavy process of deploying nodes like you do, seemingly relatively efficient. I'm sure there's still improvement that could be made on top of your current list, but I think the lesson in these numbers is that we definitely need to look elsewhere than the traditional openstack pastime of blaming tripleo ;) For my part so far, I've got a stack of patches proposed to make devstack run quite a bit faster for jobs that use it: https://review.opendev.org/q/topic:%2522async%2522+status:open+project:openstack/devstack and I've also proposed that nova stop running two grenades which almost 100% overlap (which strangely has to be a change in the tempest repo): https://review.opendev.org/c/openstack/tempest/+/771499 Both of these have barriers to approval at the moment, but both have big multipliers capable of making a difference. --Dan From juliaashleykreger at gmail.com Fri Feb 5 15:06:52 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 5 Feb 2021 07:06:52 -0800 Subject: [ironic] Review Jams Message-ID: In the Ironic team's recent mid-cycle call, we discussed the need to return to occasionally having review jams in order to help streamline the review process. In other words, get eyes on a change in parallel and be able to discuss the change. The goal is to help get people on the same page in terms of what and why. Be on hand to answer questions or back-fill context. This is to hopefully avoid the more iterative back and forth nature of code review, which can draw out a long chain of patches. 
As always, the goal is not perfection, but forward movement, especially for complex changes. We've established two time windows that will hopefully not be too hard for some contributors to make it to. It doesn't need to be everyone, but it would help for at least some people who actively review, want to actively participate in reviewing, or are even just interested in a feature, to join us for our meeting. I've added an entry to our wiki page to cover this, with the current agenda and anticipated review jam topic schedule. The tl;dr is we will use meetpad[1] and meet on Mondays at 2 PM UTC and Tuesdays at 6 PM UTC. The hope is to enable some overlap of reviewers. If people are interested in other times, please bring this up in the weekly meeting or on the mailing list. I'm not sending out calendar invites for this. Yet. :) See everyone next week! -Julia [0]: https://wiki.openstack.org/wiki/Meetings/Ironic#Review_Jams [1]: https://meetpad.opendev.org/ironic From andr.kurilin at gmail.com Fri Feb 5 15:38:54 2021 From: andr.kurilin at gmail.com (Andrey Kurilin) Date: Fri, 5 Feb 2021 17:38:54 +0200 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: <7fdbc97a688744f399f7358b1250bc30@elca.ch> References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hi, Ankit! Please accept my apologies for the outdated documentation. Updating the docs is my top priority for the Rally project, but unfortunately I do not have much time for doing that. As for the issue you are facing with the `rally deployment show` command, it looks like a bug. Switching from `rally deployment` to the `rally env` subset of commands should solve the problem (rally deployment will be deprecated at some point). пт, 5 февр. 2021 г. в 10:17, Taltavull Jean-Francois < jean-francois.taltavull at elca.ch>: > Hello, > > > > 1/ Is “rally-openstack” python package correctly installed ? On my side I > have: > > > > (venv) vagrant at rally: $ pip list | grep rally > > rally 3.2.0 > > rally-openstack 2.1.0 > > > > 2/ Could you please show the json file used to create the deployment ? > > > > > > *From:* Ankit Goel > *Sent:* jeudi, 4 février 2021 18:19 > *To:* Taltavull Jean-Francois ; > openstack-dev at lists.openstack.org > *Cc:* John Spillane > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Thanks for the response Jean. I could install rally with pip command. But > when I am running rally deployment show command then it is failing.
> > (rally) [root at rally ~]# rally deployment list > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | uuid | created_at | > name | status | active | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | > existing | deploy->finished | * | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > (rally) [root at rally ~]# > > (rally) [root at rally ~]# rally deployment show > 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a > > Command failed, please check log for more info > > 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: > KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call > last): > > 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in > > > 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main > > 2021-02-04 16:06:49.576 19306 ERROR rally return > cliutils.run(sys.argv, categories) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, > in run > > 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, > **fn_kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, > in default_from_global > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line > 59, in ensure_plugins_are_loaded > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", > line 205, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally creds = > deployment["credentials"]["openstack"][0] > > 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally > > (rally) [root at rally ~]# > > > > Can you please help me to resolve this issue. > > > > Regards, > > Ankit Goel > > > > *From:* Taltavull Jean-Francois > *Sent:* 03 February 2021 19:38 > *To:* Ankit Goel ; openstack-dev at lists.openstack.org > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Ankit, > > > > Installation part of Rally official doc is not up to date, actually. > > > > Just do “pip install rally-openstack” (in a virtualenv, of course 😊) > > This will also install “rally” python package. > > > > Enjoy ! > > > > Jean-Francois > > > > *From:* Ankit Goel > *Sent:* mercredi, 3 février 2021 13:40 > *To:* openstack-dev at lists.openstack.org > *Subject:* Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Experts, > > > > I was trying to install Openstack rally on centos 7 VM but the link > provided in the Openstack doc to download the install_rally.sh is broken. 
> > > > Latest Rally Doc link - > > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation > > > > Rally Install Script -> > https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh > - > This is broken > > > > After searching on internet I could reach to the Openstack rally Repo - > > https://opendev.org/openstack/rally but here I am not seeing the install_ > rally.sh script and according to all the information available on internet > it says we need install_ rally.sh. > > > > Thus can you please let me know what’s the latest procedure to install > Rally. > > > > Awaiting for your response. > > Thanks, > > Ankit Goel > > > > > > > > > > > > > > > -- Best regards, Andrey Kurilin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-francois.taltavull at elca.ch Fri Feb 5 15:42:45 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Fri, 5 Feb 2021 15:42:45 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hello, `rally deployment show` works fine for me. With rally v2.1.0 and rally-openstack v3.2.0 From: Andrey Kurilin Sent: vendredi, 5 février 2021 16:39 To: Taltavull Jean-Francois Cc: Ankit Goel ; openstack-dev at lists.openstack.org; John Spillane Subject: Re: Rally - Unable to install rally - install_rally.sh is not available in repo Hi, Ankit! Please accept my apologies for the outdated documentation. Updating docs is my top 1 priority for Rally project, but unfortunately, I do not have much time for doing that. As for the issue you are facing with `rally deployment show command`, it looks like a bug. Switching from `rally deployment` to `rally env` subset of command should solve the problem (rally deployment will be deprecated at some point). пт, 5 февр. 2021 г. в 10:17, Taltavull Jean-Francois >: Hello, 1/ Is “rally-openstack” python package correctly installed ? On my side I have: (venv) vagrant at rally: $ pip list | grep rally rally 3.2.0 rally-openstack 2.1.0 2/ Could you please show the json file used to create the deployment ? From: Ankit Goel > Sent: jeudi, 4 février 2021 18:19 To: Taltavull Jean-Francois >; openstack-dev at lists.openstack.org Cc: John Spillane > Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. 
(rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 03 February 2021 19:38 To: Ankit Goel >; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -- Best regards, Andrey Kurilin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From roshananvekar at gmail.com Fri Feb 5 15:49:06 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Fri, 5 Feb 2021 21:19:06 +0530 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: Thanks for the reply. Well, I have a multinode setup ( 3 controllers and multiple compute nodes) which was initially deployed with rocky and was working fine. I checked the globals.yml and site.yml files between rocky and stein and I could not see any significant changes. Also under Admin-Compute-Hypervisor, I see that all the compute nodes are showing up under Compute section. the hypervisor section is empty. I was wondering if controllers are placed under a different aggregate and not able to show up. I can see all 3 controllers listed in host-aggregates panel though and are in service up state. VM creation fails with no valid host found error. I am not able to point out issue since I don't see any errors in deployment too. Regards, Roshan On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: > On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: > > Hi all, > > > > Scenario: I have an installation of Openstack stein through > kolla-ansible. > > The deployment went fine and all services look good. > > > > Although I am seeing that under Admin--> Compute --> Hypervisors panel in > > horizon, all the controller nodes are missing. It's a blank list. > did you actully deploy the nova compute agent service to them? > > that view is showing the list of host that are running the nova compute > service > typically that is not deployed to the contolers. > > host in the contol group in the kolla multi node inventlry > > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 > are not use to run the compute agent by default > only nodes in the compute group are > > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 > the eception to that is ironic > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 > which is deployed to the contolers. > > the nova compute agent used for libvirt is deployed specificlly to the > compute hosts via the nova-cell role at least on master > > https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 > this was done a little simpler before adding cell support but the > inventory side has not changed in many release in this > regard. > > > > > > Also "Openstack hypervisor list" gives an empty list. 
> > > > I skimmed through the logs and found no error message other than in > > nova-scheduler that: > > > > *Got no allocation candidates from the Placement API. This could be due > to > > insufficient resources or a temporary occurence as compute nodes start > up.* > > > > Subsequently I checked placement container logs and found no error > message > > or anamoly. > > > > Not sure what the issue is. Any help in the above case would be > appreciated. > > > > Regards, > > Roshan > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Fri Feb 5 16:02:46 2021 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 5 Feb 2021 11:02:46 -0500 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: <7861fab8-cc8b-3679-998b-093731b96217@gmail.com> On 2/4/21 12:28 PM, Dan Smith wrote: > Hi all, > > I have become increasingly concerned with CI performance lately, and > have been raising those concerns with various people. Most specifically, > I'm worried about our turnaround time or "time to get a result", which > has been creeping up lately. Right after the beginning of the year, we > had a really bad week where the turnaround time was well over 24 > hours. That means if you submit a patch on Tuesday afternoon, you might > not get a test result until Thursday. That is, IMHO, a real problem and > massively hurts our ability to quickly merge priority fixes as well as > just general velocity and morale. If people won't review my code until > they see a +1 from Zuul, and that is two days after I submitted it, > that's bad. Thanks for raising the issue Dan, I've definitely been hit by this issue myself. > Now, obviously nobody wants to run fewer tests on patches before they > land, and I'm not really suggesting that we take that approach > necessarily. However, I think there are probably a lot of places that we > can cut down the amount of *work* we do. Some ways to do this are: > > 1. Evaluate whether or not you need to run all of tempest on two > configurations of a devstack on each patch. Maybe having a > stripped-down tempest (like just smoke) to run on unique configs, or > even specific tests. > 2. Revisit your "irrelevant_files" lists to see where you might be able > to avoid running heavy jobs on patches that only touch something > small. > 3. Consider moving some jobs to the experimental queue and run them > on-demand for patches that touch particular subsystems or affect > particular configurations. > 4. Consider some periodic testing for things that maybe don't need to > run on every single patch. > 5. Re-examine tests that take a long time to run to see if something can > be done to make them more efficient. > 6. Consider performance improvements in the actual server projects, > which also benefits the users. There's another little used feature of Zuul called "fail fast", it's something used in the Octavia* repos in our gate jobs: project: gate: fail-fast: true Description is: Zuul now supports :attr:`project..fail-fast` to immediately report and cancel builds on the first failure in a buildset. I feel it's useful for gate jobs since they've already gone through the check queue and typically shouldn't fail. For example, a mirror failure should stop things quickly, since the next action will most likely be a 'recheck' anyways. And thinking along those lines, I remember a discussion years ago about having a 'canary' job, [0] (credit to Gmann and Jeremy). 
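In Zuul terms that could look roughly like the sketch below, using job dependencies so the expensive runs only start once the cheap ones pass; the job names here are just examples and the exact layout would need checking against each project's configuration:

project:
  check:
    jobs:
      - openstack-tox-pep8
      - openstack-tox-py38
      - tempest-full:
          dependencies:
            - openstack-tox-pep8
            - openstack-tox-py38
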
Is having a multi-stage pipeline where the 'low impact' jobs are run first - pep8, unit, functional, docs, and only if they pass run things like Tempest, more palatable now? I realize there are some downsides, but it mostly penalizes those that have failed to run the simple checks locally before pushing out a review. Just wanted to throw it out there. -Brian [0] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000755.html From fungi at yuggoth.org Fri Feb 5 16:36:12 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Feb 2021 16:36:12 +0000 Subject: [all] Gate resources and performance In-Reply-To: <7861fab8-cc8b-3679-998b-093731b96217@gmail.com> References: <7861fab8-cc8b-3679-998b-093731b96217@gmail.com> Message-ID: <20210205163612.qubrh2woosa6nu7l@yuggoth.org> On 2021-02-05 11:02:46 -0500 (-0500), Brian Haley wrote: [...] > There's another little used feature of Zuul called "fail fast", it's > something used in the Octavia* repos in our gate jobs: > > project: > gate: > fail-fast: true > > Description is: > > Zuul now supports :attr:`project..fail-fast` to immediately > report and cancel builds on the first failure in a buildset. > > I feel it's useful for gate jobs since they've already gone through the > check queue and typically shouldn't fail. For example, a mirror failure > should stop things quickly, since the next action will most likely be a > 'recheck' anyways. > > And thinking along those lines, I remember a discussion years ago about > having a 'canary' job, [0] (credit to Gmann and Jeremy). Is having a > multi-stage pipeline where the 'low impact' jobs are run first - pep8, unit, > functional, docs, and only if they pass run things like Tempest, more > palatable now? I realize there are some downsides, but it mostly penalizes > those that have failed to run the simple checks locally before pushing out a > review. Just wanted to throw it out there. > > -Brian > > [0] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000755.html The fundamental downside to these sorts of defensive approaches is that they make it easier to avoid solving the underlying issues. We've designed Zuul to perform most efficiently when it's running tests which are deterministic and mostly free of "false negative" failures. Making your tests and the software being tested efficient and predictable maximizes CI throughput under such an optimistic model. Sinking engineering effort into workarounds for unstable tests and buggy software is time which could have been invested in improving things instead, but also to a great extent removes a lot of the incentive to bother. Sure it could be seen as a pragmatic approach, accepting that in a large software ecosystem such seemingly pathological problems are actually inevitable, but that strikes me as a bit defeatist. There will of course always be temporary problems resulting from outages/incidents in donated resources or regressions in external dependencies outside our control, but if our background failure rate was significantly reduced it would also be far easier to spot and mitigate an order of magnitude failure increase quickly, rather than trying to find the cause of a sudden 25% uptick in failures. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From andr.kurilin at gmail.com Fri Feb 5 16:38:37 2021 From: andr.kurilin at gmail.com (Andrey Kurilin) Date: Fri, 5 Feb 2021 18:38:37 +0200 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hi! I'm not saying `rally deployment show` is completely broken. It should work if you pass the "proper" environment spec to the create command. By "proper" I mean using "openstack" instead of "existing at openstack" plugin name. In other cases, including --from-env option, it looks broken. Anyway, even with "broken" `rally deployment show` command, `rally task` should work. пт, 5 февр. 2021 г. в 17:42, Taltavull Jean-Francois < jean-francois.taltavull at elca.ch>: > Hello, > > > > `rally deployment show` works fine for me. > > > > With rally v2.1.0 and rally-openstack v3.2.0 > > > > > > *From:* Andrey Kurilin > *Sent:* vendredi, 5 février 2021 16:39 > *To:* Taltavull Jean-Francois > *Cc:* Ankit Goel ; openstack-dev at lists.openstack.org; > John Spillane > *Subject:* Re: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hi, Ankit! > > > > Please accept my apologies for the outdated documentation. Updating docs > is my top 1 priority for Rally project, but unfortunately, I do not have > much time for doing that. > > > > As for the issue you are facing with `rally deployment show command`, it > looks like a bug. Switching from `rally deployment` to `rally env` subset > of command should solve the problem (rally deployment will be deprecated at > some point). > > > > > > пт, 5 февр. 2021 г. в 10:17, Taltavull Jean-Francois < > jean-francois.taltavull at elca.ch>: > > Hello, > > > > 1/ Is “rally-openstack” python package correctly installed ? On my side I > have: > > > > (venv) vagrant at rally: $ pip list | grep rally > > rally 3.2.0 > > rally-openstack 2.1.0 > > > > 2/ Could you please show the json file used to create the deployment ? > > > > > > *From:* Ankit Goel > *Sent:* jeudi, 4 février 2021 18:19 > *To:* Taltavull Jean-Francois ; > openstack-dev at lists.openstack.org > *Cc:* John Spillane > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Thanks for the response Jean. I could install rally with pip command. But > when I am running rally deployment show command then it is failing. 
> > (rally) [root at rally ~]# rally deployment list > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | uuid | created_at | > name | status | active | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | > existing | deploy->finished | * | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > (rally) [root at rally ~]# > > (rally) [root at rally ~]# rally deployment show > 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a > > Command failed, please check log for more info > > 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: > KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call > last): > > 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in > > > 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main > > 2021-02-04 16:06:49.576 19306 ERROR rally return > cliutils.run(sys.argv, categories) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, > in run > > 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, > **fn_kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, > in default_from_global > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line > 59, in ensure_plugins_are_loaded > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", > line 205, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally creds = > deployment["credentials"]["openstack"][0] > > 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally > > (rally) [root at rally ~]# > > > > Can you please help me to resolve this issue. > > > > Regards, > > Ankit Goel > > > > *From:* Taltavull Jean-Francois > *Sent:* 03 February 2021 19:38 > *To:* Ankit Goel ; openstack-dev at lists.openstack.org > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Ankit, > > > > Installation part of Rally official doc is not up to date, actually. > > > > Just do “pip install rally-openstack” (in a virtualenv, of course 😊) > > This will also install “rally” python package. > > > > Enjoy ! > > > > Jean-Francois > > > > *From:* Ankit Goel > *Sent:* mercredi, 3 février 2021 13:40 > *To:* openstack-dev at lists.openstack.org > *Subject:* Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Experts, > > > > I was trying to install Openstack rally on centos 7 VM but the link > provided in the Openstack doc to download the install_rally.sh is broken. 
> > > > Latest Rally Doc link - > > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation > > > > Rally Install Script -> > https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh > - > This is broken > > > > After searching on internet I could reach to the Openstack rally Repo - > > https://opendev.org/openstack/rally but here I am not seeing the install_ > rally.sh script and according to all the information available on internet > it says we need install_ rally.sh. > > > > Thus can you please let me know what’s the latest procedure to install > Rally. > > > > Awaiting for your response. > > Thanks, > > Ankit Goel > > > > > > > > > > > > > > > > > > -- > > Best regards, > Andrey Kurilin. > -- Best regards, Andrey Kurilin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Fri Feb 5 16:53:47 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 5 Feb 2021 08:53:47 -0800 Subject: [ironic] Node wont boot from virtual media In-Reply-To: References: Message-ID: Is this in UEFI or Bios boot mode? On Fri, Feb 5, 2021 at 3:34 AM Derek Higgins wrote: > > Hi, > > I've been trying to get virtual media booting to work on a ProLiant DL380 > Gen10 but not having any luck. Over redfish Ironic attaches the media, sets > the Next boot option to 'cd' and then restarts the node. But it continues > to boot from the HD as if the vmedia is being ignored. I'm wondering if > anybody has seen anything similar, I'm thinking perhaps I don't have a > bios setting configured that I need? > > I have the same problem if I set the One time boot on the iLo dashboard. > > System ROM U30 v2.40 (10/26/2020) > iLO Firmware Version 2.33 Dec 09 2020 > > On another ProLiant DL380 Gen10 with the same ROM and iLo version where > this works. Some of the hardware is different, in particular the one that works > has a "HPE Smart Array P408i-a SR Gen10 " but the one that doesn't has a > "HPE Smart Array E208i-a SR Gen10" could this be the relevant difference? > > any ideas would be great, > > thanks, > Derek. > > From dms at danplanet.com Fri Feb 5 17:04:09 2021 From: dms at danplanet.com (Dan Smith) Date: Fri, 05 Feb 2021 09:04:09 -0800 Subject: [all] Gate resources and performance In-Reply-To: <20210205163612.qubrh2woosa6nu7l@yuggoth.org> (Jeremy Stanley's message of "Fri, 5 Feb 2021 16:36:12 +0000") References: <7861fab8-cc8b-3679-998b-093731b96217@gmail.com> <20210205163612.qubrh2woosa6nu7l@yuggoth.org> Message-ID: > The fundamental downside to these sorts of defensive approaches is > that they make it easier to avoid solving the underlying issues. We certainly don't want to incentivize relying on aggregate throughput in place of actually making things faster and better. That's why I started this thread. However... > We've designed Zuul to perform most efficiently when it's running > tests which are deterministic and mostly free of "false negative" > failures. Making your tests and the software being tested efficient > and predictable maximizes CI throughput under such an optimistic > model. This is a nice ideal and definitely what we should strive for, no doubt. But I think it's pretty clear that what we're doing here is hard, with potential failures at all layers above and below a thing you're working on at any given point. Striving to get there and expecting we ever will are very different. 
I remember back when we moved from serialized tests to parallel ones, there was a lot of concern over being able to reproduce a test failure that only occasionally happens due to ordering. The benefit of running in parallel greatly outweighs the cost of not doing so. Still today, it is incredibly time consuming to reproduce, debug and fix issues that come from running in parallel. Our tests are more complicated (but better of course) because of it, and just yesterday I -1'd a patch because I could spot some non-reentrant behavior it was proposing to add. In terms of aggregate performance, we get far more done I'm sure with parallelized tests along with some increased spurious failure rate, over a very low failure rate and serialized tests. > Sinking engineering effort into workarounds for unstable tests and > buggy software is time which could have been invested in improving > things instead, but also to a great extent removes a lot of the > incentive to bother. Like everything, it's a tradeoff. If we didn't run in parallel, we'd waste a lot more gate resources in serial, but we would almost definitely have to recheck less, our tests could be a lot simpler and we could spend time (and be rewarded in test execution) by making the actual servers faster instead of debugging failures. You might even argue that such an arrangement would benefit the users more than making our tests capable of running in parallel ;) > Sure it could be seen as a pragmatic approach, accepting that in a > large software ecosystem such seemingly pathological problems are > actually inevitable, but that strikes me as a bit defeatist. There > will of course always be temporary problems resulting from > outages/incidents in donated resources or regressions in external > dependencies outside our control, but if our background failure rate > was significantly reduced it would also be far easier to spot and > mitigate an order of magnitude failure increase quickly, rather than > trying to find the cause of a sudden 25% uptick in failures. Looking back on the eight years I've been doing this, I really don't think that zero fails is realistic or even useful as a goal, unless it's your only goal. Thus, I expect we're always going to be ticking up or down over time. Debugging and fixing the non-trivial things that plague us is some of the harder work we do, more so in almost all cases than the work we did that introduced the problem in the first place. We definitely need to be constantly trying to increase stability, but let's be clear that it is likely the _most_ difficult think a stacker can do with their time. --Dan From adam.zheng at colorado.edu Fri Feb 5 17:53:32 2021 From: adam.zheng at colorado.edu (Adam Zheng) Date: Fri, 5 Feb 2021 17:53:32 +0000 Subject: [ops][cinder][kolla-ansible] cinder-backup fails if source disk not in nova az Message-ID: <9650D354-5141-47CA-B3C6-EB867CE4524F@colorado.edu> Hello, I’ve been trying to get availability zones defined for volumes. Everything works fine if I leave the zone at “nova”, all volume types work and backups/snapshots also work. 
ie: +------------------+----------------------------+------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+----------------------------+------+---------+-------+----------------------------+ | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:51.000000 | | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:50.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:55.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:57.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-backup | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-backup | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:53.000000 | | cinder-backup | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:58.000000 | +------------------+----------------------------+------+---------+-------+----------------------------+ However, if I apply the following changes: cinder-api.conf [DEFAULT] default_availability_zone = not-nova default_volume_type = ceph-gp2 allow_availability_zone_fallback=True cinder-volume.conf [rbd-ceph-gp2] <…> backend_availability_zone = not-nova <…> I’ll get the following +------------------+----------------------------+----------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+----------------------------+----------+---------+-------+----------------------------+ | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:51.000000 | | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:50.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:55.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:57.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-backup | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-backup | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:53.000000 | | cinder-backup | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:58.000000 | +------------------+----------------------------+----------+---------+-------+----------------------------+ At this point, creating new volumes still work and go into the expected ceph pools. However, backups no longer work for the cinder-volume that is not nova. 
In the above example, it still works fine for volumes that that were created with type “ceph-gp2” in az “nova”. Does not work for volumes that were created with type “ceph-st1” in az “not-nova”. It fails immediately and goes into error state with reason “Service not found for creating backup.” I suspect I need to try to get another set of “cinder-backup” services running in the Zone “not-nova”, but cannot seem to figure out how. I’ve scoured the docs on cinder.conf, and if I set default zones in cinder-backup (I’ve tried backend_availability_zone, default_availability_zone, and storage_availability_zone) I cannot seem to get backups working if the disk it’s backing up is not in az “nova”. The cinder-backup service in volume service list will always show “nova” no matter what I put there. Any advice would be appreciated. OpenStack Victoria deployed via kolla-ansible Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Fri Feb 5 18:02:42 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Fri, 5 Feb 2021 10:02:42 -0800 Subject: [All][StoryBoard] Angular.js Alternatives In-Reply-To: <78e2ff18-0321-0c32-4aa0-b8f77971c870@openstack.org> References: <0a0bf782-09eb-aeb6-fdd7-0d3e3d7b4f89@catalystcloud.nz> <4b35907b-f53e-4812-1140-ab8abb770af2@catalystcloud.nz> <78e2ff18-0321-0c32-4aa0-b8f77971c870@openstack.org> Message-ID: Hello! I definitely agree that it would be nice to unify, someday, but like Thierry said, we've tried to use the same framework in the past and it resulted in the team having to learn and start from scratch and resulted in almost no additional help. We talked about all of the pros and cons in our last meeting if you'd like to read it[1]. I was definitely interested in React because of Zuul's use of it, but I know they are a small team themselves and can't expect that they will come help us just because we picked a framework they have experience with. Since we already have a POC in Vue and some of our team members have experience with it, that's what we are going to go with. That said, if there were a handful of contributors that wanted to help us build it in React and volunteered right now, we might be willing to reopen discussion :) - Kendall (diablo_rojo) [1] http://eavesdrop.openstack.org/meetings/storyboard/2021/storyboard.2021-01-28-18.01.log.html#l-8 On Thu, Feb 4, 2021 at 1:13 AM Thierry Carrez wrote: > Adrian Turjak wrote: > > [...] > > I just think that if we stick to one frontend framework for most of > > OpenStack it can make it easier to share resources. :) > > I agree on principle... But unfortunately in the StoryBoard experience > adopting the same framework as Horizon did not magically make Horizon > people invest time improving StoryBoard. > > All other things being equal, I would indeed recommend alignment on the > same framework. But teams existing familiarity with the framework chosen > (or ease of learning said chosen framework, or desirable features for > the specific use case) probably rank higher in the list of criteria. > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From derekh at redhat.com Fri Feb 5 18:08:28 2021 From: derekh at redhat.com (Derek Higgins) Date: Fri, 5 Feb 2021 18:08:28 +0000 Subject: [ironic] Node wont boot from virtual media In-Reply-To: References: Message-ID: On Fri, 5 Feb 2021 at 16:54, Julia Kreger wrote: > > Is this in UEFI or Bios boot mode? Sorry I should have mentioned this, it's UEFI. 
> > On Fri, Feb 5, 2021 at 3:34 AM Derek Higgins wrote: > > > > Hi, > > > > I've been trying to get virtual media booting to work on a ProLiant DL380 > > Gen10 but not having any luck. Over redfish Ironic attaches the media, sets > > the Next boot option to 'cd' and then restarts the node. But it continues > > to boot from the HD as if the vmedia is being ignored. I'm wondering if > > anybody has seen anything similar, I'm thinking perhaps I don't have a > > bios setting configured that I need? > > > > I have the same problem if I set the One time boot on the iLo dashboard. > > > > System ROM U30 v2.40 (10/26/2020) > > iLO Firmware Version 2.33 Dec 09 2020 > > > > On another ProLiant DL380 Gen10 with the same ROM and iLo version where > > this works. Some of the hardware is different, in particular the one that works > > has a "HPE Smart Array P408i-a SR Gen10 " but the one that doesn't has a > > "HPE Smart Array E208i-a SR Gen10" could this be the relevant difference? > > > > any ideas would be great, > > > > thanks, > > Derek. > > > > > From kennelson11 at gmail.com Fri Feb 5 18:15:14 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Fri, 5 Feb 2021 10:15:14 -0800 Subject: [all][mentoring][outreachy][GSoC]Summer Interns/ Student Coders Anyone? In-Reply-To: References: Message-ID: I wanted to bring it to the top of people's inboxes again :) The due date for applications is February 19, 2021 at 11:00 (Pacific Standard Time)! Again, let me know if you'd like help getting an application together! -Kendall (diablo_rojo) On Wed, Jan 27, 2021 at 10:04 AM Kendall Nelson wrote: > Hello :) > > So, if you've been involved in Outreachy, the Google Summer of Code[1] is > pretty similar. It looks like the applications for projects open January > 29, 2021 at 11:00 (Pacific Standard Time) if anyone wants to apply. I > encourage you all to do so! It gets a lot of attention from university > students so it's a great opportunity to get some new contributors and > teach some more of the world about our community and open source! > > If you are interested in applying, let me know and I will do what I can to > help you with the application to get it done on time (deadline not listed > on the site yet, but they always come faster than we'd like)! > > -Kendall (diablo_rojo) > > [1] https://summerofcode.withgoogle.com/get-started/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andr.kurilin at gmail.com Fri Feb 5 18:53:39 2021 From: andr.kurilin at gmail.com (Andrey Kurilin) Date: Fri, 5 Feb 2021 20:53:39 +0200 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Does `rally env info` show information about your deployment? пт, 5 февр. 2021 г. в 19:49, Ankit Goel : > Hi Andrey, > > > > Yes I used the command which includes --fromenv. Below is the command I > used after sourcing my Openstack RC file. > > rally deployment create --fromenv --name=existing > > Below are the contents for Openstack RC file. 
> > > > export OS_PROJECT_DOMAIN_NAME=Default > > export OS_USER_DOMAIN_NAME=Default > > export OS_PROJECT_NAME=admin > > export OS_TENANT_NAME=admin > > export OS_USERNAME=admin > > export OS_PASSWORD=kMswJJGAeKhziXGWloLjYESvfOytK4DkCAAXcpA8 > > export OS_AUTH_URL=https://osp.example.com:5000/v3 > > export OS_INTERFACE=public > > export OS_IDENTITY_API_VERSION=3 > > export OS_REGION_NAME=RegionOne > > export OS_AUTH_PLUGIN=password > > export OS_CACERT=/root/certificates/haproxy-ca.crt > > > > Regards, > > Ankit Goel > > > > *From:* Andrey Kurilin > *Sent:* 05 February 2021 22:09 > *To:* Taltavull Jean-Francois > *Cc:* Ankit Goel ; openstack-dev at lists.openstack.org; > John Spillane > *Subject:* Re: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hi! > > I'm not saying `rally deployment show` is completely broken. > > It should work if you pass the "proper" environment spec to the create > command. By "proper" I mean using "openstack" instead of "existing at openstack" > plugin name. In other cases, including --from-env option, it looks broken. > > > > Anyway, even with "broken" `rally deployment show` command, `rally task` > should work. > > > > пт, 5 февр. 2021 г. в 17:42, Taltavull Jean-Francois < > jean-francois.taltavull at elca.ch>: > > Hello, > > > > `rally deployment show` works fine for me. > > > > With rally v2.1.0 and rally-openstack v3.2.0 > > > > > > *From:* Andrey Kurilin > *Sent:* vendredi, 5 février 2021 16:39 > *To:* Taltavull Jean-Francois > *Cc:* Ankit Goel ; openstack-dev at lists.openstack.org; > John Spillane > *Subject:* Re: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hi, Ankit! > > > > Please accept my apologies for the outdated documentation. Updating docs > is my top 1 priority for Rally project, but unfortunately, I do not have > much time for doing that. > > > > As for the issue you are facing with `rally deployment show command`, it > looks like a bug. Switching from `rally deployment` to `rally env` subset > of command should solve the problem (rally deployment will be deprecated at > some point). > > > > > > пт, 5 февр. 2021 г. в 10:17, Taltavull Jean-Francois < > jean-francois.taltavull at elca.ch>: > > Hello, > > > > 1/ Is “rally-openstack” python package correctly installed ? On my side I > have: > > > > (venv) vagrant at rally: $ pip list | grep rally > > rally 3.2.0 > > rally-openstack 2.1.0 > > > > 2/ Could you please show the json file used to create the deployment ? > > > > > > *From:* Ankit Goel > *Sent:* jeudi, 4 février 2021 18:19 > *To:* Taltavull Jean-Francois ; > openstack-dev at lists.openstack.org > *Cc:* John Spillane > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Thanks for the response Jean. I could install rally with pip command. But > when I am running rally deployment show command then it is failing. 
> > (rally) [root at rally ~]# rally deployment list > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | uuid | created_at | > name | status | active | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | > existing | deploy->finished | * | > > > +--------------------------------------+----------------------------+----------+------------------+--------+ > > (rally) [root at rally ~]# > > (rally) [root at rally ~]# rally deployment show > 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a > > Command failed, please check log for more info > > 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: > KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call > last): > > 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in > > > 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main > > 2021-02-04 16:06:49.576 19306 ERROR rally return > cliutils.run(sys.argv, categories) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, > in run > > 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, > **fn_kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, > in default_from_global > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File "", > line 2, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line > 59, in ensure_plugins_are_loaded > > 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) > > 2021-02-04 16:06:49.576 19306 ERROR rally File > "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", > line 205, in show > > 2021-02-04 16:06:49.576 19306 ERROR rally creds = > deployment["credentials"]["openstack"][0] > > 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' > > 2021-02-04 16:06:49.576 19306 ERROR rally > > (rally) [root at rally ~]# > > > > Can you please help me to resolve this issue. > > > > Regards, > > Ankit Goel > > > > *From:* Taltavull Jean-Francois > *Sent:* 03 February 2021 19:38 > *To:* Ankit Goel ; openstack-dev at lists.openstack.org > *Subject:* RE: Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Ankit, > > > > Installation part of Rally official doc is not up to date, actually. > > > > Just do “pip install rally-openstack” (in a virtualenv, of course 😊) > > This will also install “rally” python package. > > > > Enjoy ! > > > > Jean-Francois > > > > *From:* Ankit Goel > *Sent:* mercredi, 3 février 2021 13:40 > *To:* openstack-dev at lists.openstack.org > *Subject:* Rally - Unable to install rally - install_rally.sh is not > available in repo > > > > Hello Experts, > > > > I was trying to install Openstack rally on centos 7 VM but the link > provided in the Openstack doc to download the install_rally.sh is broken. 
> > > > Latest Rally Doc link - > > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation > > > > Rally Install Script -> > https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh > - > This is broken > > > > After searching on internet I could reach to the Openstack rally Repo - > > https://opendev.org/openstack/rally but here I am not seeing the install_ > rally.sh script and according to all the information available on internet > it says we need install_ rally.sh. > > > > Thus can you please let me know what’s the latest procedure to install > Rally. > > > > Awaiting for your response. > > Thanks, > > Ankit Goel > > > > > > > > > > > > > > > > > > -- > > Best regards, > Andrey Kurilin. > > > > -- > > Best regards, > Andrey Kurilin. > -- Best regards, Andrey Kurilin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abishop at redhat.com Fri Feb 5 20:48:45 2021 From: abishop at redhat.com (Alan Bishop) Date: Fri, 5 Feb 2021 12:48:45 -0800 Subject: [ops][cinder][kolla-ansible] cinder-backup fails if source disk not in nova az In-Reply-To: <9650D354-5141-47CA-B3C6-EB867CE4524F@colorado.edu> References: <9650D354-5141-47CA-B3C6-EB867CE4524F@colorado.edu> Message-ID: On Fri, Feb 5, 2021 at 10:00 AM Adam Zheng wrote: > Hello, > > > > I’ve been trying to get availability zones defined for volumes. > > Everything works fine if I leave the zone at “nova”, all volume types work > and backups/snapshots also work. > > > > ie: > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > | Binary | Host | Zone | Status | State | > Updated At | > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | > 2021-02-05T17:22:51.000000 | > > | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | > 2021-02-05T17:22:54.000000 | > > | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | > 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:54.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:50.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:55.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:57.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:54.000000 | > > | cinder-backup | cs-os-ctl-002 | nova | enabled | up | > 2021-02-05T17:22:56.000000 | > > | cinder-backup | cs-os-ctl-001 | nova | enabled | up | > 2021-02-05T17:22:53.000000 | > > | cinder-backup | cs-os-ctl-003 | nova | enabled | up | > 2021-02-05T17:22:58.000000 | > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > > > However, if I apply the following changes: > > > > cinder-api.conf > > [DEFAULT] > > default_availability_zone = not-nova > > default_volume_type = ceph-gp2 > > allow_availability_zone_fallback=True > > > > cinder-volume.conf > > [rbd-ceph-gp2] > > <…> > > backend_availability_zone = not-nova > > <…> > > > > I’ll get the following > > > 
+------------------+----------------------------+----------+---------+-------+----------------------------+ > > | Binary | Host | Zone | Status | > State | Updated At | > > > +------------------+----------------------------+----------+---------+-------+----------------------------+ > > | cinder-scheduler | cs-os-ctl-001 | nova | enabled | > up | 2021-02-05T17:22:51.000000 | > > | cinder-scheduler | cs-os-ctl-003 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-scheduler | cs-os-ctl-002 | nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | not-nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | not-nova | enabled | > up | 2021-02-05T17:22:50.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | not-nova | enabled | > up | 2021-02-05T17:22:55.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:57.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-backup | cs-os-ctl-002 | nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-backup | cs-os-ctl-001 | nova | enabled | > up | 2021-02-05T17:22:53.000000 | > > | cinder-backup | cs-os-ctl-003 | nova | enabled | > up | 2021-02-05T17:22:58.000000 | > > > +------------------+----------------------------+----------+---------+-------+----------------------------+ > > > > At this point, creating new volumes still work and go into the expected > ceph pools. > > However, backups no longer work for the cinder-volume that is not nova. > > In the above example, it still works fine for volumes that that were > created with type “ceph-gp2” in az “nova”. > > Does not work for volumes that were created with type “ceph-st1” in az > “not-nova”. It fails immediately and goes into error state with reason > “Service not found for creating backup.” > Hi Adam, Cinder's backup service has the ability to create backups of volumes in another AZ. The 'cinder' CLI supports this feature as of microversion 3.51. (bear in mind the 'openstack' client doesn't support microversions for the cinder (volume) service, so you'll need to use the 'cinder' command. Rather than repeat what I've written previously, I refer you to [1] for additional details. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1649845#c4 One other thing to note is the corresponding "cinder backup-restore" command currently does not support restoring to a volume in another AZ, but there is a workaround. You can pre-create a new volume in the destination AZ, and use the ability to restore a backup to a specific volume (which just happens to be in your desired AZ). There's also a patch [2] under review to enhance the cinder shell so that both backup and restore shell commands work the same way. [2] https://review.opendev.org/c/openstack/python-cinderclient/+/762020 Alan > I suspect I need to try to get another set of “cinder-backup” services > running in the Zone “not-nova”, but cannot seem to figure out how. > > > > I’ve scoured the docs on cinder.conf, and if I set default zones in > cinder-backup (I’ve tried backend_availability_zone, > default_availability_zone, and storage_availability_zone) I cannot seem to > get backups working if the disk it’s backing up is not in az “nova”. 
The > cinder-backup service in volume service list will always show “nova” no > matter what I put there. > > > > Any advice would be appreciated. > > OpenStack Victoria deployed via kolla-ansible > > > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Fri Feb 5 21:09:30 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Fri, 5 Feb 2021 13:09:30 -0800 Subject: [dev][cinder][keystone] Properly consuming system-scope in cinder In-Reply-To: <20210201125717.imyp5vyzhn5t44fj@localhost> References: <20210129172347.7wi3cv3gnneb46dj@localhost> <1774f7582e2.126a1dcb261735.4477287504407985916@ghanshyammann.com> <20210201125717.imyp5vyzhn5t44fj@localhost> Message-ID: On Mon, Feb 1, 2021 at 5:07 AM Gorka Eguileor wrote: > > On 29/01, Ghanshyam Mann wrote: > > ---- On Fri, 29 Jan 2021 11:23:47 -0600 Gorka Eguileor wrote ---- > > > On 28/01, Lance Bragstad wrote: > > > > Hey folks, > > > > > > > > As I'm sure some of the cinder folks are aware, I'm updating cinder > > > > policies to include support for some default personas keystone ships with. > > > > Some of those personas use system-scope (e.g., system-reader and > > > > system-admin) and I've already proposed a series of patches that describe > > > > what those changes look like from a policy perspective [0]. > > > > > > > > The question now is how we test those changes. To help guide that decision, > > > > I worked on three different testing approaches. The first was to continue > > > > testing policy using unit tests in cinder with mocked context objects. The > > > > second was to use DDT with keystonemiddleware mocked to remove a dependency > > > > on keystone. The third also used DDT, but included changes to update > > > > NoAuthMiddleware so that it wasn't as opinionated about authentication or > > > > authorization. I brought each approach in the cinder meeting this week > > > > where we discussed a fourth approach, doing everything in tempest. I > > > > summarized all of this in an etherpad [1] > > > > > > > > Up to yesterday morning, the only approach I hadn't tinkered with manually > > > > was tempest. I spent some time today figuring that out, resulting in a > > > > patch to cinderlib [2] to enable a protection test job, and > > > > cinder_tempest_plugin [3] that adds the plumbing and some example tests. > > > > > > > > In the process of implementing support for tempest testing, I noticed that > > > > service catalogs for system-scoped tokens don't contain cinder endpoints > > > > [4]. This is because the cinder endpoint contains endpoint templating in > > > > the URL [5], which keystone will substitute with the project ID of the > > > > token, if and only if the catalog is built for a project-scoped token. > > > > System and domain-scoped tokens do not have a reasonable project ID to use > > > > in this case, so the templating is skipped, resulting in a cinder service > > > > in the catalog without endpoints [6]. > > > > > > > > This cascades in the client, specifically tempest's volume client, because > > > > it can't find a suitable endpoint for request to the volume service [7]. > > > > > > > > Initially, my testing approaches were to provide examples for cinder > > > > developers to assess the viability of each approach before committing to a > > > > protection testing strategy. 
But, the tempest approach highlighted a larger > > > > issue for how we integrate system-scope support into cinder because of the > > > > assumption there will always be a project ID in the path (for the majority > > > > of the cinder API). I can think of two ways to approach the problem, but > > > > I'm hoping others have more. > > > > > > > > > > Hi Lance, > > > > > > Sorry to hear that the Cinder is giving you such trouble. > > > > > > > First, we remove project IDs from cinder's API path. > > > > > > > > This would be similar to how nova (and I assume other services) moved away > > > > from project-specific URLs (e.g., /v3/%{project_id}s/volumes would become > > > > /v3/volumes). This would obviously require refactoring to remove any > > > > assumptions cinder has about project IDs being supplied on the request > > > > path. But, this would force all authorization information to come from the > > > > context object. Once a deployer removes the endpoint URL templating, the > > > > endpoints will populate in the cinder entry of the service catalog. Brian's > > > > been helping me understand this and we're unsure if this is something we > > > > could even do with a microversion. I think nova did it moving from /v2/ to > > > > /v2.0/, which was technically classified as a major bump? This feels like a > > > > moon shot. > > > > > > > > > > In my opinion such a change should not be treated as a microversion and > > > would require us to go into v4, which is not something that is feasible > > > in the short term. > > > > We can do it by supporting both URL with and without project_id. Nova did the same way > > Hi, > > I was not doubting that this was technically possible, I was arguing > that a change that affects every single API endpoint in Cinder would not > be described as "micro" and doing so could be considered a bit of abuse > to the microversion infrastructure. > > This is just my opinion, not the Cinder official position. > > > > in Mitaka cycle and also bumped the microversion but just for > > notification. It was done in 2.18 microversion[1]. > > > > That way you can request compute API with or without project_id and later is recommended. > > I think the same approach Cinder can consider. > > > > Thanks for this information. It will definitely come in handy knowing > where we have code references if we decide to go with the microversion > route. > > Cheers, > Gorka. > > > > [1] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id16 > > > > -gmann > > > > > > > > > > > > Second, we update cinder's clients, including tempest, to put the project > > > > ID on the URL. > > > > > > > > After we update the clients to append the project ID for cinder endpoints, > > > > we should be able to remove the URL templating in keystone, allowing cinder > > > > endpoints to appear in system-scoped service catalogs (just like the first > > > > approach). Clients can use the base URL from the catalog and append the > > > > > > I'm not familiar with keystone catalog entries, so maybe I'm saying > > > something stupid, but couldn't we have multiple entries? A > > > project-specific URL and another one for the project and system scoped > > > requests? > > > > > > I know it sounds kind of hackish, but if we add them in the right order, > > > first the project one and then the new one, it would probably be > > > backward compatible, as older clients would get the first endpoint and > > > new clients would be able to select the right one. 
> > > > > > > admin project ID before putting the request on the wire. Even though the > > > > request has a project ID in the path, cinder would ignore it for > > > > system-specific APIs. This is already true for users with an admin role on > > > > a project because cinder will allow you to get volumes in one project if > > > > you have a token scoped to another with the admin role [8]. One potential > > > > side-effect is that cinder clients would need *a* project ID to build a > > > > request, potentially requiring another roundtrip to keystone. > > > > > > What would happen in this additional roundtrip? Would we be converting > > > provided project's name into its UUID? > > > > > > If that's the case then it wouldn't happen when UUIDs are being > > > provided, so for cases where this extra request means a performance > > > problem they could just provide the UUID. > > > > > > > > > > > Thoughts? > > > > > > Truth is that I would love to see the Cinder API move into URLs without > > > the project id as well as move out everything from contrib, but that > > > doesn't seem like a realistic piece of work we can bite right now. Hi, As I've discussed in private with you folks, manila has the exact same issue as cinder in this domain. As has been mentioned in this thread, having the project_id in the API routes was something that most projects forked out of nova had, since the beginning. The manila team weighed the options we had and we concluded that the nova approach [1] of advertising routes without project_id and raising the API microversion is a good step for the API consumers. This is a grey area for micro versioning, I agree - however, we're maintaining backwards compatibility by not breaking any consumer that would like to use project_ids in the v2 URLs. It would be a more painful pill to swallow if we wanted to make this change at another major API microversion given all the client automation, sdks and CLIs written against the micro-versioned API. You could also, in theory, make an argument that the "project_id" resolution/substitution is owned by Keystone and the service catalog mechanism - it is controlled by the cloud administrator who creates these endpoints/services, and the catalog endpoints can be updated at any time. These sort of changes are outside of the service API's control. For example, the catalog entry for manila today can be: https://192.168.10.1/share/v2/{project_id} and the administrator may decide to rename it to: https://shared-file-systems.fancycloud.org/v2/{project_id} We shouldn't expect this to break anyone if the application always used the service catalog as the source of truth, and hence, after upgrading to Wallaby, if administrators can change the catalog endpoint to remove the project_id, it shouldn't affect applications negatively - and no matter our choice of major or minor versioning of the API here, endpoint URL changes are really outside of our control. manila's changes to remove the "project_id" requirement from URLs are here: https://review.opendev.org/q/topic:%22bp/remove-project-id-from-urls%22 [1] https://review.opendev.org/233076 > > > > > > So I think your second proposal is the way to go. > > > > > > Thanks for all the work you are putting into this. > > > > > > Cheers, > > > Gorka. 
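For readers following along, the endpoint URL templating at the heart of this thread looks roughly like the following when the volume service is registered in keystone (URLs here are examples only, not taken from the thread):

    # Templated form: keystone substitutes $(project_id)s only when building
    # a catalog for a project-scoped token, so system-scoped catalogs end up
    # with an empty cinder entry
    openstack endpoint create --region RegionOne volumev3 public \
        'http://controller:8776/v3/$(project_id)s'

    # Untemplated form, which both proposals converge on: the endpoint shows
    # up in project- and system-scoped catalogs alike
    openstack endpoint create --region RegionOne volumev3 public \
        'http://controller:8776/v3'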
> > > > > > > > > > > > > > [0] https://review.opendev.org/q/project:openstack/cinder+topic:secure-rbac > > > > [1] https://etherpad.opendev.org/p/cinder-secure-rbac-protection-testing > > > > [2] https://review.opendev.org/c/openstack/cinderlib/+/772770 > > > > [3] https://review.opendev.org/c/openstack/cinder-tempest-plugin/+/772915 > > > > [4] http://paste.openstack.org/show/802117/ > > > > [5] http://paste.openstack.org/show/802097/ > > > > [6] > > > > https://opendev.org/openstack/keystone/src/commit/c239cc66615b41a0c09e031b3e268c82678bac12/keystone/catalog/backends/sql.py > > > > [7] http://paste.openstack.org/show/802092/ > > > > [8] http://paste.openstack.org/show/802118/ > > > > > > > > > > > > > From dtantsur at redhat.com Fri Feb 5 21:52:15 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 5 Feb 2021 22:52:15 +0100 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: On Thu, Feb 4, 2021 at 9:41 PM Mark Goddard wrote: > > > On Thu, 4 Feb 2021, 17:29 Dan Smith, wrote: > >> Hi all, >> >> I have become increasingly concerned with CI performance lately, and >> have been raising those concerns with various people. Most specifically, >> I'm worried about our turnaround time or "time to get a result", which >> has been creeping up lately. Right after the beginning of the year, we >> had a really bad week where the turnaround time was well over 24 >> hours. That means if you submit a patch on Tuesday afternoon, you might >> not get a test result until Thursday. That is, IMHO, a real problem and >> massively hurts our ability to quickly merge priority fixes as well as >> just general velocity and morale. If people won't review my code until >> they see a +1 from Zuul, and that is two days after I submitted it, >> that's bad. >> > Thanks for looking into this Dan, it's definitely an important issue and > can introduce a lot of friction into and already heavy development process. > >> >> Things have gotten a little better since that week, due in part to >> getting past a rush of new year submissions (we think) and also due to >> some job trimming in various places (thanks Neutron!). However, things >> are still not great. Being in almost the last timezone of the day, the >> queue is usually so full when I wake up that it's quite often I don't >> get to see a result before I stop working that day. >> >> I would like to ask that projects review their jobs for places where >> they can cut out redundancy, as well as turn their eyes towards >> optimizations that can be made. I've been looking at both Nova and >> Glance jobs and have found some things I think we can do less of. I also >> wanted to get an idea of who is "using too much" in the way of >> resources, so I've been working on trying to characterize the weight of >> the jobs we run for a project, based on the number of worker nodes >> required to run all the jobs, as well as the wall clock time of how long >> we tie those up. The results are interesting, I think, and may help us >> to identify where we see some gains. >> >> The idea here is to figure out[1] how many "node hours" it takes to run >> all the normal jobs on a Nova patch compared to, say, a Neutron one. If >> the jobs were totally serialized, this is the number of hours a single >> computer (of the size of a CI worker) would take to do all that work. If >> the number is 24 hours, that means a single computer could only check >> *one* patch in a day, running around the clock. 
I chose the top five >> projects in terms of usage[2] to report here, as they represent 70% of >> the total amount of resources consumed. The next five only add up to >> 13%, so the "top five" seems like a good target group. Here are the >> results, in order of total consumption: >> >> Project % of total Node Hours Nodes >> ------------------------------------------ >> 1. TripleO 38% 31 hours 20 >> 2. Neutron 13% 38 hours 32 >> 3. Nova 9% 21 hours 25 >> 4. Kolla 5% 12 hours 18 >> 5. OSA 5% 22 hours 17 >> > > Acknowledging Kolla is in the top 5. Deployment projects certainly tend to > consume resources. I'll raise this at our next meeting and see what we can > come up with. > > What that means is that a single computer (of the size of a CI worker) >> couldn't even process the jobs required to run on a single patch for >> Neutron or TripleO in a 24-hour period. Now, we have lots of workers in >> the gate, of course, but there is also other potential overhead involved >> in that parallelism, like waiting for nodes to be available for >> dependent jobs. And of course, we'd like to be able to check more than >> patch per day. Most projects have smaller gate job sets than check, but >> assuming they are equivalent, a Neutron patch from submission to commit >> would undergo 76 hours of testing, not including revisions and not >> including rechecks. That's an enormous amount of time and resource for a >> single patch! >> >> Now, obviously nobody wants to run fewer tests on patches before they >> land, and I'm not really suggesting that we take that approach >> necessarily. However, I think there are probably a lot of places that we >> can cut down the amount of *work* we do. Some ways to do this are: >> >> 1. Evaluate whether or not you need to run all of tempest on two >> configurations of a devstack on each patch. Maybe having a >> stripped-down tempest (like just smoke) to run on unique configs, or >> even specific tests. >> 2. Revisit your "irrelevant_files" lists to see where you might be able >> to avoid running heavy jobs on patches that only touch something >> small. >> 3. Consider moving some jobs to the experimental queue and run them >> on-demand for patches that touch particular subsystems or affect >> particular configurations. >> 4. Consider some periodic testing for things that maybe don't need to >> run on every single patch. >> 5. Re-examine tests that take a long time to run to see if something can >> be done to make them more efficient. >> 6. Consider performance improvements in the actual server projects, >> which also benefits the users. > > > 7. Improve the reliability of jobs. Especially voting and gating ones. > Rechecks increase resource usage and time to results/merge. I found > querying the zuul API for failed jobs in the gate pipeline is a good way to > find unexpected failures. > 7.1. Stop marking dependent patches with Verified-2 if their parent fails in the gate, keep them at Verified+1 (their previous state). This is a common source of unnecessary rechecks in the ironic land. > > 8. Reduce the node count in multi node jobs. > > If you're a project that is not in the top ten then your job >> configuration probably doesn't matter that much, since your usage is >> dwarfed by the heavy projects. If the heavy projects would consider >> making changes to decrease their workload, even small gains have the >> ability to multiply into noticeable improvement. The higher you are on >> the above list, the more impact a small change will have on the overall >> picture. 
>> >> Also, thanks to Neutron and TripleO, both of which have already >> addressed this in some respect, and have other changes on the horizon. >> >> Thanks for listening! >> >> --Dan >> >> 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c >> 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ >> >> -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Fri Feb 5 21:55:35 2021 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 5 Feb 2021 15:55:35 -0600 Subject: [oslo][keystone] Future of oslo.limit Message-ID: <346b670c-0773-d7e5-106f-5e1be3971e33@nemebean.com> Hi, Last week in the Oslo meeting it was noted that we have some rather old patches open against oslo.limit. Before we spend a bunch of time getting those reviewed and merged, we wanted to confirm that this was still something people wanted to pursue. Things have been pretty quiet on the limit front for a while now, and doing the Oslo side does us no good if no one is going to consume it. I realize this is something of a chicken-and-egg situation since the library was somewhat blocked on [0], but that's been ready to merge for months and so far has only one review. If we do want to continue, merging that would be the first step. Then we should look at [1] which was the patch that prompted this conversation. Thanks. -Ben 0: https://review.opendev.org/c/openstack/oslo.limit/+/733881 1: https://review.opendev.org/c/openstack/oslo.limit/+/695527 From Arkady.Kanevsky at dell.com Fri Feb 5 22:11:26 2021 From: Arkady.Kanevsky at dell.com (Kanevsky, Arkady) Date: Fri, 5 Feb 2021 22:11:26 +0000 Subject: [Interop, Users] listing interop results Message-ID: Dell Customer Communication - Confidential Team, We are reviewing our openstack interop program. We have 3 base "certifications" for interop that cloud vendors can submit and foundation list on marketplace https://www.openstack.org/marketplace/ : 1. Openstack Powered Storage 2. OpenStack Powered Compute 3. OpenStack Powered Platform (combines both compute and storage) These can be viewed as the core. And there are 3 add-on programs for specific projects of openstack: 1. DNS (Designate) 2. Orchestration (Heat) 3. Shared-File-System (Manila) We have 4 categories of clouds that use interop programs and list their offering with interoperability results and logo: 1. Remotely managed private clouds - https://www.openstack.org/marketplace/remotely-managed-private-clouds/ * The latest interop guidelines listed there is 2019.11 and most are from 2017-2018 * All used OpenStack Powered Platform * Vexxhost listed without listing what guidelines they used. (Jimmy can you pull what they had submitted and update their entry) 2. Distros and Appliances - https://www.openstack.org/marketplace/distros/ : * These seems to be updated most often. Several used the latest 2020.06 guidelines * There are usage of all 3 core logos. * No add-on logos listed 3. Hosted Private Clouds - https://www.openstack.org/marketplace/hosted-private-clouds/ : * There is one entry that uses 2020.06 guidelines but most are 2016-2018. * Most for Powered Platform and one for Powered Compute 4. Public Cloud - https://www.openstack.org/marketplace/public-clouds/ : * There is one entry that uses 2020.06 guidelines but most are 2016-2018. 
* Mixture of Powered Platform and Powered Compute I will skip the question on the value of Marketplace program as it will be discussed separately in newly formed temporary WG under foundation to review/define Open Infrastructure Branding and Trademark policy. We need to address the issue on how to list add-ons for current openstack marketplace program. The simplest way is to add to "TESTED" OpenStack Powered Platform/Compute/Storage also add-ons that vendors passed. In the detailed section we already listing what APIs and services supported and which version. So do not see any changes needed there. Comments and suggestions please? Thanks, Arkady Kanevsky, Ph.D. SP Chief Technologist & DE Dell EMC office of CTO Dell Inc. One Dell Way, MS PS2-91 Round Rock, TX 78682, USA Phone: 512 7204955 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Fri Feb 5 22:33:33 2021 From: dms at danplanet.com (Dan Smith) Date: Fri, 05 Feb 2021 14:33:33 -0800 Subject: [all] Gate resources and performance In-Reply-To: (Dmitry Tantsur's message of "Fri, 5 Feb 2021 22:52:15 +0100") References: Message-ID: > 7.1. Stop marking dependent patches with Verified-2 if their parent > fails in the gate, keep them at Verified+1 (their previous state). > This is a common source of unnecessary rechecks in the ironic land. Ooh, that's a good one. I'm guessing that may require more state in zuul. Although, maybe it could check to see if it has +1d that patchset before it -2s just for a parent fail. --Dan From fungi at yuggoth.org Fri Feb 5 23:07:10 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Feb 2021 23:07:10 +0000 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> On 2021-02-05 22:52:15 +0100 (+0100), Dmitry Tantsur wrote: [...] > 7.1. Stop marking dependent patches with Verified-2 if their > parent fails in the gate, keep them at Verified+1 (their previous > state). This is a common source of unnecessary rechecks in the > ironic land. [...] Zuul generally assumes that if a change fails tests, it's going to need to be revised. Gerrit will absolutely refuse to allow a change to merge if its parent has been revised and the child has not been rebased onto that new revision. Revising or rebasing a change clears the Verified label and will require new test results. Which one or more of these conditions should be considered faulty? I'm guessing you're going to say it's the first one, that we shouldn't assume just because a change fails tests that means it needs to be fixed. This takes us back to the other subthread, wherein we entertain the notion that if changes have failing jobs and the changes themselves aren't at fault, then we should accept this as commonplace and lower our expectations. Keep in mind that the primary source of pain here is one OpenStack has chosen. That is, the "clean check" requirement that a change get a +1 test result in the check pipeline before it can enter the gate pipeline. This is an arbitrary pipeline criterion, chosen to keep problematic changes from getting approved and making their way through the gate queue like a wrecking-ball, causing repeated test resets for the changes after them until they reach the front and Zuul is finally able to determine they're not just conflicting with other changes ahead. 
If a major pain for Ironic and other OpenStack projects is the need to revisit the check pipeline after a gate failure, that can be alleviated by dropping the clean check requirement. Without clean check, a change which got a -2 in the gate could simply be enqueued directly back to the gate again. This is how it works in our other Zuul tenants. But the reason OpenStack started enforcing it is that reviewers couldn't be bothered to confirm changes really were reasonable, had *recent* passing check results, and confirmed that observed job failures were truly unrelated to the changes themselves. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From gouthampravi at gmail.com Sat Feb 6 00:53:20 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Fri, 5 Feb 2021 16:53:20 -0800 Subject: [oslo][keystone] Future of oslo.limit In-Reply-To: <346b670c-0773-d7e5-106f-5e1be3971e33@nemebean.com> References: <346b670c-0773-d7e5-106f-5e1be3971e33@nemebean.com> Message-ID: On Fri, Feb 5, 2021 at 2:03 PM Ben Nemec wrote: > > Hi, > > Last week in the Oslo meeting it was noted that we have some rather old > patches open against oslo.limit. Before we spend a bunch of time getting > those reviewed and merged, we wanted to confirm that this was still > something people wanted to pursue. Things have been pretty quiet on the > limit front for a while now, and doing the Oslo side does us no good if > no one is going to consume it. Hi Ben! I've been following this loosely, and we thought it would be a good thing to prototype unified limits with manila. We've had an Outreachy intern (Paul Ali) join us this cycle to work on this, and he's getting familiar with the implementation in keystone and oslo.limit and is currently working on the glue code within manila to use the library. So, yes, the answer is we're definitely interested. As you mention, there was an effort to get this tested and used in nova: https://review.opendev.org/q/topic:%22bp%252Funified-limits-nova%22+(status:open%20OR%20status:merged) which ties into [1]. I'll take a look at [1] with Paul, and help with code review. > > I realize this is something of a chicken-and-egg situation since the > library was somewhat blocked on [0], but that's been ready to merge for > months and so far has only one review. If we do want to continue, > merging that would be the first step. Then we should look at [1] which > was the patch that prompted this conversation. > > Thanks. > > -Ben > > 0: https://review.opendev.org/c/openstack/oslo.limit/+/733881 > 1: https://review.opendev.org/c/openstack/oslo.limit/+/695527 > From fungi at yuggoth.org Sat Feb 6 02:17:08 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sat, 6 Feb 2021 02:17:08 +0000 Subject: [infra] bindep, pacman, and PyPI release In-Reply-To: References: Message-ID: <20210206021707.5nro5g2rdorn7e2f@yuggoth.org> On 2020-12-13 11:43:35 +0100 (+0100), Jakob Lykke Andersen wrote: > I'm trying to use bindep on Arch but hitting a problem with the > output handling. It seems that 'pacman -Q' may output warning > lines if you have a file with the same name as the package. [...] > After applying this patch, or whichever change you deem reasonable > to fix the issue, it would be great if you could make a new > release on PyPI. [...] The 2.9.0 release of bindep, uploaded to PyPI a few hours ago, incorporates your fix. Thanks again! 
See the release announcement here: http://lists.opendev.org/pipermail/service-announce/2021-February/000015.html -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From dtantsur at redhat.com Sat Feb 6 09:33:17 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Feb 2021 10:33:17 +0100 Subject: [all] Gate resources and performance In-Reply-To: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> Message-ID: On Sat, Feb 6, 2021 at 12:10 AM Jeremy Stanley wrote: > On 2021-02-05 22:52:15 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > 7.1. Stop marking dependent patches with Verified-2 if their > > parent fails in the gate, keep them at Verified+1 (their previous > > state). This is a common source of unnecessary rechecks in the > > ironic land. > [...] > > Zuul generally assumes that if a change fails tests, it's going to > need to be revised. Very unfortunately, it's far from being the case in the ironic world. > Gerrit will absolutely refuse to allow a change > to merge if its parent has been revised and the child has not been > rebased onto that new revision. Revising or rebasing a change clears > the Verified label and will require new test results. This is fair, I'm only referring to the case where the parent has to be rechecked because of a transient problem. > Which one or > more of these conditions should be considered faulty? I'm guessing > you're going to say it's the first one, that we shouldn't assume > just because a change fails tests that means it needs to be fixed. > Unfortunately, yes. A parallel proposal, that has been rejected numerous times, is to allow recheching only the failed jobs. Dmitry > This takes us back to the other subthread, wherein we entertain the > notion that if changes have failing jobs and the changes themselves > aren't at fault, then we should accept this as commonplace and lower > our expectations. > > Keep in mind that the primary source of pain here is one OpenStack > has chosen. That is, the "clean check" requirement that a change get > a +1 test result in the check pipeline before it can enter the gate > pipeline. This is an arbitrary pipeline criterion, chosen to keep > problematic changes from getting approved and making their way > through the gate queue like a wrecking-ball, causing repeated test > resets for the changes after them until they reach the front and > Zuul is finally able to determine they're not just conflicting with > other changes ahead. If a major pain for Ironic and other OpenStack > projects is the need to revisit the check pipeline after a gate > failure, that can be alleviated by dropping the clean check > requirement. > > Without clean check, a change which got a -2 in the gate could > simply be enqueued directly back to the gate again. This is how it > works in our other Zuul tenants. But the reason OpenStack started > enforcing it is that reviewers couldn't be bothered to confirm > changes really were reasonable, had *recent* passing check results, > and confirmed that observed job failures were truly unrelated to the > changes themselves. 
> -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From ankit at aptira.com Fri Feb 5 17:43:44 2021 From: ankit at aptira.com (Ankit Goel) Date: Fri, 5 Feb 2021 17:43:44 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: <7fdbc97a688744f399f7358b1250bc30@elca.ch> References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hi Jean, Please find below the output. (rally) [root at rally tasks]# pip list | grep rally WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip. Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue. To avoid this problem you can invoke Python with '-m pip' instead of running pip directly. rally 3.2.0 rally-openstack 2.1.0 (rally) [root at rally tasks]# I sourced my Openstack RC file and then used the below command to create the deployment. rally deployment create --fromenv --name=existing Below are the contents for Openstack-rc file. (rally) [root at rally ~]# cat admin-openstack.sh export OS_PROJECT_DOMAIN_NAME=Default export OS_USER_DOMAIN_NAME=Default export OS_PROJECT_NAME=admin export OS_TENANT_NAME=admin export OS_USERNAME=admin export OS_PASSWORD=kMswJJGAeKhziXGWloLjYESvfOytK4DkCAAXcpA8 export OS_AUTH_URL=https://osp.example.com:5000/v3 export OS_INTERFACE=public export OS_IDENTITY_API_VERSION=3 export OS_REGION_NAME=RegionOne export OS_AUTH_PLUGIN=password export OS_CACERT=/root/certificates/haproxy-ca.crt (rally) [root at rally ~]# Regards, Ankit Goel From: Taltavull Jean-Francois Sent: 05 February 2021 13:39 To: Ankit Goel ; openstack-dev at lists.openstack.org Cc: John Spillane Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello, 1/ Is “rally-openstack” python package correctly installed ? On my side I have: (venv) vagrant at rally: $ pip list | grep rally rally 3.2.0 rally-openstack 2.1.0 2/ Could you please show the json file used to create the deployment ? From: Ankit Goel > Sent: jeudi, 4 février 2021 18:19 To: Taltavull Jean-Francois >; openstack-dev at lists.openstack.org Cc: John Spillane > Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. 
(rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 03 February 2021 19:38 To: Ankit Goel >; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From ankit at aptira.com Fri Feb 5 17:49:39 2021 From: ankit at aptira.com (Ankit Goel) Date: Fri, 5 Feb 2021 17:49:39 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hi Andrey, Yes I used the command which includes --fromenv. Below is the command I used after sourcing my Openstack RC file. rally deployment create --fromenv --name=existing Below are the contents for Openstack RC file. export OS_PROJECT_DOMAIN_NAME=Default export OS_USER_DOMAIN_NAME=Default export OS_PROJECT_NAME=admin export OS_TENANT_NAME=admin export OS_USERNAME=admin export OS_PASSWORD=kMswJJGAeKhziXGWloLjYESvfOytK4DkCAAXcpA8 export OS_AUTH_URL=https://osp.example.com:5000/v3 export OS_INTERFACE=public export OS_IDENTITY_API_VERSION=3 export OS_REGION_NAME=RegionOne export OS_AUTH_PLUGIN=password export OS_CACERT=/root/certificates/haproxy-ca.crt Regards, Ankit Goel From: Andrey Kurilin Sent: 05 February 2021 22:09 To: Taltavull Jean-Francois Cc: Ankit Goel ; openstack-dev at lists.openstack.org; John Spillane Subject: Re: Rally - Unable to install rally - install_rally.sh is not available in repo Hi! I'm not saying `rally deployment show` is completely broken. It should work if you pass the "proper" environment spec to the create command. By "proper" I mean using "openstack" instead of "existing at openstack" plugin name. In other cases, including --from-env option, it looks broken. Anyway, even with "broken" `rally deployment show` command, `rally task` should work. пт, 5 февр. 2021 г. в 17:42, Taltavull Jean-Francois >: Hello, `rally deployment show` works fine for me. With rally v2.1.0 and rally-openstack v3.2.0 From: Andrey Kurilin > Sent: vendredi, 5 février 2021 16:39 To: Taltavull Jean-Francois > Cc: Ankit Goel >; openstack-dev at lists.openstack.org; John Spillane > Subject: Re: Rally - Unable to install rally - install_rally.sh is not available in repo Hi, Ankit! Please accept my apologies for the outdated documentation. Updating docs is my top 1 priority for Rally project, but unfortunately, I do not have much time for doing that. As for the issue you are facing with `rally deployment show command`, it looks like a bug. Switching from `rally deployment` to `rally env` subset of command should solve the problem (rally deployment will be deprecated at some point). пт, 5 февр. 2021 г. в 10:17, Taltavull Jean-Francois >: Hello, 1/ Is “rally-openstack” python package correctly installed ? On my side I have: (venv) vagrant at rally: $ pip list | grep rally rally 3.2.0 rally-openstack 2.1.0 2/ Could you please show the json file used to create the deployment ? 
From: Ankit Goel > Sent: jeudi, 4 février 2021 18:19 To: Taltavull Jean-Francois >; openstack-dev at lists.openstack.org Cc: John Spillane > Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. (rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 03 February 2021 19:38 To: Ankit Goel >; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -- Best regards, Andrey Kurilin. -- Best regards, Andrey Kurilin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sat Feb 6 19:51:38 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 06 Feb 2021 20:51:38 +0100 Subject: [all] Gate resources and performance In-Reply-To: References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> Message-ID: <4369261.TshihmYaz6@p1> Hi, Dnia sobota, 6 lutego 2021 10:33:17 CET Dmitry Tantsur pisze: > On Sat, Feb 6, 2021 at 12:10 AM Jeremy Stanley wrote: > > On 2021-02-05 22:52:15 +0100 (+0100), Dmitry Tantsur wrote: > > [...] > > > > > 7.1. Stop marking dependent patches with Verified-2 if their > > > parent fails in the gate, keep them at Verified+1 (their previous > > > state). This is a common source of unnecessary rechecks in the > > > ironic land. > > > > [...] > > > > Zuul generally assumes that if a change fails tests, it's going to > > need to be revised. > > Very unfortunately, it's far from being the case in the ironic world. > > > Gerrit will absolutely refuse to allow a change > > to merge if its parent has been revised and the child has not been > > rebased onto that new revision. Revising or rebasing a change clears > > the Verified label and will require new test results. > > This is fair, I'm only referring to the case where the parent has to be > rechecked because of a transient problem. > > > Which one or > > more of these conditions should be considered faulty? I'm guessing > > you're going to say it's the first one, that we shouldn't assume > > just because a change fails tests that means it needs to be fixed. > > Unfortunately, yes. > > A parallel proposal, that has been rejected numerous times, is to allow > recheching only the failed jobs. Even if I totally understand cons of that I would also be for such possibility. Maybe e.g. if only cores would have such possibility somehow would be good trade off? > > Dmitry > > > This takes us back to the other subthread, wherein we entertain the > > notion that if changes have failing jobs and the changes themselves > > aren't at fault, then we should accept this as commonplace and lower > > our expectations. > > > > Keep in mind that the primary source of pain here is one OpenStack > > has chosen. That is, the "clean check" requirement that a change get > > a +1 test result in the check pipeline before it can enter the gate > > pipeline. This is an arbitrary pipeline criterion, chosen to keep > > problematic changes from getting approved and making their way > > through the gate queue like a wrecking-ball, causing repeated test > > resets for the changes after them until they reach the front and > > Zuul is finally able to determine they're not just conflicting with > > other changes ahead. 
If a major pain for Ironic and other OpenStack > > projects is the need to revisit the check pipeline after a gate > > failure, that can be alleviated by dropping the clean check > > requirement. > > > > Without clean check, a change which got a -2 in the gate could > > simply be enqueued directly back to the gate again. This is how it > > works in our other Zuul tenants. But the reason OpenStack started > > enforcing it is that reviewers couldn't be bothered to confirm > > changes really were reasonable, had *recent* passing check results, > > and confirmed that observed job failures were truly unrelated to the > > changes themselves. > > -- > > Jeremy Stanley > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From i at liuyulong.me Sun Feb 7 06:03:04 2021 From: i at liuyulong.me (=?utf-8?B?TElVIFl1bG9uZw==?=) Date: Sun, 7 Feb 2021 14:03:04 +0800 Subject: [neutron] cancel the neutron L3 meeting next week 2021-02-10 Message-ID: Hi there, The Spring Festival holiday will start at next Wednesday, I will be 10 days offline. So let's cancel the L3 meeting next week. Next neutron L3 meeting will be scheduled for 2021-02-24. Happy lunar New Year! 春节快乐! Regards, LIU Yulong -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Sun Feb 7 07:28:35 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Sun, 07 Feb 2021 09:28:35 +0200 Subject: [all] Gate resources and performance In-Reply-To: <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> References: <276821612469975@mail.yandex.ru> <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> Message-ID: <1136221612681434@mail.yandex.ru> An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Sun Feb 7 08:33:18 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Sun, 07 Feb 2021 10:33:18 +0200 Subject: [all] Gate resources and performance In-Reply-To: References: <276821612469975@mail.yandex.ru> <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> Message-ID: <780171612683132@mail.yandex.ru> That is actually very good idea, thanks! Eventually distro jobs take the way less time then source ones anyway, so I wasn't thinking a lot how to optimize them, while it's also important. So pushed [1] to cover that. Unfortunatelly haven't found the way to properly flatten the list of projects in case of yaml anchors usage :( > for example osa support both source and non souce installs correct. the non souce installs dont need the openstack pojects > just the osa repos since it will be using the binary packages. > > so if you had a second intermeitady job of the souce install the the openstack compoenta repos listed you could skip > updating 50 repos in your binary jobs (im assumning the _distro_ jobs are binary by the way.) 
> currenlty its updating 105 for every job that is based on openstack-ansible-deploy-aio [1] https://review.opendev.org/c/openstack/openstack-ansible/+/774372/1/zuul.d/jobs.yaml --  Kind Regards, Dmitriy Rabotyagov From dtantsur at redhat.com Sun Feb 7 13:58:14 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sun, 7 Feb 2021 14:58:14 +0100 Subject: [all] Gate resources and performance In-Reply-To: <4369261.TshihmYaz6@p1> References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> <4369261.TshihmYaz6@p1> Message-ID: On Sat, Feb 6, 2021 at 8:52 PM Slawek Kaplonski wrote: > Hi, > > Dnia sobota, 6 lutego 2021 10:33:17 CET Dmitry Tantsur pisze: > > On Sat, Feb 6, 2021 at 12:10 AM Jeremy Stanley > wrote: > > > On 2021-02-05 22:52:15 +0100 (+0100), Dmitry Tantsur wrote: > > > [...] > > > > > > > 7.1. Stop marking dependent patches with Verified-2 if their > > > > parent fails in the gate, keep them at Verified+1 (their previous > > > > state). This is a common source of unnecessary rechecks in the > > > > ironic land. > > > > > > [...] > > > > > > Zuul generally assumes that if a change fails tests, it's going to > > > need to be revised. > > > > Very unfortunately, it's far from being the case in the ironic world. > > > > > Gerrit will absolutely refuse to allow a change > > > to merge if its parent has been revised and the child has not been > > > rebased onto that new revision. Revising or rebasing a change clears > > > the Verified label and will require new test results. > > > > This is fair, I'm only referring to the case where the parent has to be > > rechecked because of a transient problem. > > > > > Which one or > > > more of these conditions should be considered faulty? I'm guessing > > > you're going to say it's the first one, that we shouldn't assume > > > just because a change fails tests that means it needs to be fixed. > > > > Unfortunately, yes. > > > > A parallel proposal, that has been rejected numerous times, is to allow > > recheching only the failed jobs. > > Even if I totally understand cons of that I would also be for such > possibility. Maybe e.g. if only cores would have such possibility somehow > would be good trade off? > That would work for me. Although currently there is an unfortunately tendency between newcomers to blindly recheck their patches despite clearly not passing some checks. If they could recheck only some jobs, it would limit their negative impact on the whole CI (and maybe make them realize that it's always the same jobs that fail). Dmitry > > > > > Dmitry > > > > > This takes us back to the other subthread, wherein we entertain the > > > notion that if changes have failing jobs and the changes themselves > > > aren't at fault, then we should accept this as commonplace and lower > > > our expectations. > > > > > > Keep in mind that the primary source of pain here is one OpenStack > > > has chosen. That is, the "clean check" requirement that a change get > > > a +1 test result in the check pipeline before it can enter the gate > > > pipeline. This is an arbitrary pipeline criterion, chosen to keep > > > problematic changes from getting approved and making their way > > > through the gate queue like a wrecking-ball, causing repeated test > > > resets for the changes after them until they reach the front and > > > Zuul is finally able to determine they're not just conflicting with > > > other changes ahead. 
If a major pain for Ironic and other OpenStack > > > projects is the need to revisit the check pipeline after a gate > > > failure, that can be alleviated by dropping the clean check > > > requirement. > > > > > > Without clean check, a change which got a -2 in the gate could > > > simply be enqueued directly back to the gate again. This is how it > > > works in our other Zuul tenants. But the reason OpenStack started > > > enforcing it is that reviewers couldn't be bothered to confirm > > > changes really were reasonable, had *recent* passing check results, > > > and confirmed that observed job failures were truly unrelated to the > > > changes themselves. > > > -- > > > Jeremy Stanley > > > > -- > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > > O'Neill > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Sun Feb 7 14:06:04 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sun, 7 Feb 2021 14:06:04 +0000 Subject: [all] Gate resources and performance In-Reply-To: <1136221612681434@mail.yandex.ru> References: <276821612469975@mail.yandex.ru> <20210204225417.wp2sg3m5hoxmmvus@yuggoth.org> <1136221612681434@mail.yandex.ru> Message-ID: <20210207140603.krpv3y5r6td42zql@yuggoth.org> On 2021-02-07 09:28:35 +0200 (+0200), Dmitriy Rabotyagov wrote: > Once you said that, I looked through the actual code of the > prepare-workspace-git role more carefully and you're right - all > actions are made against already cached repos there. However since > it mostly uses commands, it would still be the way more efficient > to make up some module to replace all commands/shell to run things > in multiprocess way.   Regarding example, you can take any random > task from osa, ie [1] - it takes a bit more then 6 mins. When load > on providers is high (or their volume backend io is poor), time > increases [...] Okay, so that's these tasks: https://opendev.org/zuul/zuul-jobs/src/commit/8bdb2b538c79dd75bac14180b905a1e6d6465a81/roles/prepare-workspace-git/tasks/main.yaml https://opendev.org/zuul/zuul-jobs/src/commit/8bdb2b538c79dd75bac14180b905a1e6d6465a81/roles/mirror-workspace-git-repos/tasks/main.yaml It's doing a git clone from the cache on the node into the workspace (in theory from one path to another within the same filesystem, which should normally just result in git creating hardlinks to the original objects/packs), and that took 101 seconds to clone 106 repositories. After that, 83 seconds were spent fixing up configuration on each of those clones. The longest step does indeed seem to be the 128 seconds where it pushed updated refs from the cache on the executor over the network into the prepared workspace on the remote build node. I wonder if combining these into a single loop could help reduce the iteration overhead, or whether processing repositories in parallel would help (if they're limited by I/O bandwidth then I expect not)? Regardless, yeah, 5m12s does seem like a good chunk of time. 
On the other hand, it's worth keeping in mind that's just shy of 3 seconds per required-project so like you say, it's mainly impacting jobs with a massive number of required-projects. A different approach might be to revisit the list of required-projects for that job and check whether they're all actually used. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From fungi at yuggoth.org Sun Feb 7 14:10:39 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sun, 7 Feb 2021 14:10:39 +0000 Subject: [all] Gate resources and performance In-Reply-To: References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> <4369261.TshihmYaz6@p1> Message-ID: <20210207141038.tm7uyjqmh2b3ciwn@yuggoth.org> On 2021-02-07 14:58:14 +0100 (+0100), Dmitry Tantsur wrote: [...] > Although currently there is an unfortunately tendency between > newcomers to blindly recheck their patches despite clearly not > passing some checks. If they could recheck only some jobs, it > would limit their negative impact on the whole CI (and maybe make > them realize that it's always the same jobs that fail). [...] Put differently, there is a strong tendency for newcomers and long-timers alike to just beep blindly rechecking their buggy changed until they merge and introduce new nondeterministic behaviors into the software. If they only needed to recheck the specific jobs which failed on those bugs they're introducing one build at a time, it would become far easier for them to accomplish their apparent (judging from this habit) goal of making the software essentially unusable and untestable. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From dtantsur at redhat.com Sun Feb 7 16:25:29 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sun, 7 Feb 2021 17:25:29 +0100 Subject: [all] Gate resources and performance In-Reply-To: <20210207141038.tm7uyjqmh2b3ciwn@yuggoth.org> References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> <4369261.TshihmYaz6@p1> <20210207141038.tm7uyjqmh2b3ciwn@yuggoth.org> Message-ID: On Sun, Feb 7, 2021 at 3:12 PM Jeremy Stanley wrote: > On 2021-02-07 14:58:14 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > Although currently there is an unfortunately tendency between > > newcomers to blindly recheck their patches despite clearly not > > passing some checks. If they could recheck only some jobs, it > > would limit their negative impact on the whole CI (and maybe make > > them realize that it's always the same jobs that fail). > [...] > > Put differently, there is a strong tendency for newcomers and > long-timers alike to just beep blindly rechecking their buggy > changed until they merge and introduce new nondeterministic > behaviors into the software. If they only needed to recheck the > specific jobs which failed on those bugs they're introducing one > build at a time, it would become far easier for them to accomplish > their apparent (judging from this habit) goal of making the software > essentially unusable and untestable. > I cannot confirm your observation. In the cases I've seen it's a hard failure, completely deterministic, they just fail to recognize it. In any case, leaving this right to cores only more or less fixes this concern. 
Dmitry > -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Sun Feb 7 16:40:59 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sun, 7 Feb 2021 16:40:59 +0000 Subject: [all] Gate resources and performance In-Reply-To: References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> <4369261.TshihmYaz6@p1> <20210207141038.tm7uyjqmh2b3ciwn@yuggoth.org> Message-ID: <20210207164059.djd3kim5l5tzw5el@yuggoth.org> On 2021-02-07 17:25:29 +0100 (+0100), Dmitry Tantsur wrote: > On Sun, Feb 7, 2021 at 3:12 PM Jeremy Stanley wrote: > > On 2021-02-07 14:58:14 +0100 (+0100), Dmitry Tantsur wrote: > > [...] > > > Although currently there is an unfortunately tendency between > > > newcomers to blindly recheck their patches despite clearly not > > > passing some checks. If they could recheck only some jobs, it > > > would limit their negative impact on the whole CI (and maybe make > > > them realize that it's always the same jobs that fail). > > [...] > > > > Put differently, there is a strong tendency for newcomers and > > long-timers alike to just beep blindly rechecking their buggy > > changed until they merge and introduce new nondeterministic > > behaviors into the software. If they only needed to recheck the > > specific jobs which failed on those bugs they're introducing one > > build at a time, it would become far easier for them to accomplish > > their apparent (judging from this habit) goal of making the software > > essentially unusable and untestable. > > I cannot confirm your observation. In the cases I've seen it's a hard > failure, completely deterministic, they just fail to recognize it. > > In any case, leaving this right to cores only more or less fixes this > concern. Before we began enforcing a "clean check" rule with Zuul, there were many occasions where a ~50% failure condition was merged in some project, and upon digging into the origin it was discovered that the patch which introduced it actually failed at least once and was rechecked until it passed, then the prior rechecks were ignored by core reviewers who went on to approve the patch, and the author proceeded to recheck-spam it until it merged, repeatedly tripping tests on the same bug in the process. Once changes were required to pass their full battery of tests twice in a row to merge, situations like this were drastically reduced in frequency. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From syedammad83 at gmail.com Mon Feb 8 04:49:07 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Mon, 8 Feb 2021 09:49:07 +0500 Subject: Trove Multi-Tenancy In-Reply-To: References: Message-ID: Hi Lingxian, You are right, the user has access to the database instance and that is what a user expects from Database as a Service. I was thinking as a cloud operator keeping in view the billing perspective, we usually do billing in terms of nova instance. Here we need to change our approach. 
Ammad Ali On Fri, Feb 5, 2021 at 3:08 PM Lingxian Kong wrote: > There are several config options you can change to support this model: > > [DEFAULT] > remote_nova_client = trove.common.clients.nova_client > remote_neutron_client = trove.common.clients.neutron_client > remote_cinder_client = trove.common.clients.cinder_client > remote_glance_client = trove.common.clients.glance_client > > *However, those configs are extremely not recommended and not maintained > any more in Trove, *which means, function may broken in this case. > > The reasons are many folds. Apart from the security reason, one important > thing is, Trove is a database as a service, what the cloud user is getting > from Trove are the access to the database and some management APIs for > database operations, rather than a purely Nova VM that has a database > installed and can be accessed by the cloud user. If you prefer this model, > why not just create Nova VM on your own and manually install database > software so you have more control of that? > > --- > Lingxian Kong > Senior Cloud Engineer (Catalyst Cloud) > Trove PTL (OpenStack) > OpenStack Cloud Provider Co-Lead (Kubernetes) > > > On Fri, Feb 5, 2021 at 6:52 PM Ammad Syed wrote: > >> Hello Kong, >> >> I am using latest victoria release and trove 14.0. >> >> Yes you are right, this is exactly happening. All the nova instances are >> in trove user service project. From my admin user i am only able to list >> database instances. >> >> Is it possible that all nova instances should also deploy in any tenant >> project i.e if i am deploying database instance from admin user having >> adminproject and default domain the nova instance should be in adminproject >> rather then trove service project. >> >> Ammad >> Sent from my iPhone >> >> On Feb 5, 2021, at 1:49 AM, Lingxian Kong wrote: >> >>  >> Hi Syed, >> >> What's the trove version you've deployed? >> >> From your configuration, once a trove instance is created, a nova server >> is created in the "service" project, as trove user, you can only show the >> trove instance. >> >> --- >> Lingxian Kong >> Senior Cloud Engineer (Catalyst Cloud) >> Trove PTL (OpenStack) >> OpenStack Cloud Provider Co-Lead (Kubernetes) >> >> >> On Fri, Feb 5, 2021 at 12:40 AM Ammad Syed wrote: >> >>> Hi, >>> >>> I have deployed trove and database instance deployment is successful. >>> But the problem is all the database servers are being created in service >>> account i.e openstack instance list shows the database instances in admin >>> user but when I check openstack server list the database instance won't >>> show up here, its visible in trove service account. >>> >>> Can you please advise how the servers will be visible in admin account ? >>> I want to enable multi-tenancy. 
>>> >>> Below is the configuration >>> >>> [DEFAULT] >>> log_dir = /var/log/trove >>> # RabbitMQ connection info >>> transport_url = rabbit://openstack:password at controller >>> control_exchange = trove >>> trove_api_workers = 5 >>> network_driver = trove.network.neutron.NeutronDriver >>> taskmanager_manager = trove.taskmanager.manager.Manager >>> default_datastore = mysql >>> cinder_volume_type = database_storage >>> reboot_time_out = 300 >>> usage_timeout = 900 >>> agent_call_high_timeout = 1200 >>> >>> nova_keypair = trove-key >>> >>> debug = true >>> trace = true >>> >>> # MariaDB connection info >>> [database] >>> connection = mysql+pymysql://trove:password at mariadb01/trove >>> >>> [mariadb] >>> tcp_ports = 3306,4444,4567,4568 >>> >>> [mysql] >>> tcp_ports = 3306 >>> >>> [postgresql] >>> tcp_ports = 5432 >>> >>> [redis] >>> tcp_ports = 6379,16379 >>> >>> # Keystone auth info >>> [keystone_authtoken] >>> www_authenticate_uri = http://controller:5000 >>> auth_url = http://controller:5000 >>> memcached_servers = controller:11211 >>> auth_type = password >>> project_domain_name = default >>> user_domain_name = default >>> project_name = service >>> username = trove >>> password = servicepassword >>> >>> [service_credentials] >>> auth_url = http://controller:5000 >>> region_name = RegionOne >>> project_domain_name = default >>> user_domain_name = default >>> project_name = service >>> username = trove >>> password = servicepassword >>> >>> -- >>> Regards, >>> >>> >>> Syed Ammad Ali >>> >> -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Mon Feb 8 08:04:36 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Mon, 8 Feb 2021 09:04:36 +0100 Subject: [baremetal-sig][ironic] Tue Feb 9, 2021, 2pm UTC: Deploy Steps in Ironic Message-ID: <1bbe3d44-dc36-d168-fab4-1f81a34b32ac@cern.ch> Dear all, The Bare Metal SIG will meet tomorrow Tue Feb 9, 2021 at 2pm UTC on zoom. This time there will be a 10 minute "topic-of-the-day" presentation by Dmitry Tantsur (dtansur) on: 'An Introduction to Deploy Steps in Ironic' If you've never found the time to understand what deploy steps are and how they are useful, this is your chance to hear it from one of the experts! You can find all the details for this meeting on the SIG's etherpad: https://etherpad.opendev.org/p/bare-metal-sig Everyone is welcome, don't miss out! Cheers, Arne From mark at stackhpc.com Mon Feb 8 09:14:41 2021 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 8 Feb 2021 09:14:41 +0000 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: On Fri, 5 Feb 2021 at 15:49, roshan anvekar wrote: > > Thanks for the reply. > > Well, I have a multinode setup ( 3 controllers and multiple compute nodes) which was initially deployed with rocky and was working fine. > > I checked the globals.yml and site.yml files between rocky and stein and I could not see any significant changes. > > Also under Admin-Compute-Hypervisor, I see that all the compute nodes are showing up under Compute section. the hypervisor section is empty. > > I was wondering if controllers are placed under a different aggregate and not able to show up. I can see all 3 controllers listed in host-aggregates panel though and are in service up state. > > VM creation fails with no valid host found error. 
> > I am not able to point out issue since I don't see any errors in deployment too. > > Regards, > Roshan > > > > > > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: >> >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: >> > Hi all, >> > >> > Scenario: I have an installation of Openstack stein through kolla-ansible. >> > The deployment went fine and all services look good. >> > >> > Although I am seeing that under Admin--> Compute --> Hypervisors panel in >> > horizon, all the controller nodes are missing. It's a blank list. >> did you actully deploy the nova compute agent service to them? >> >> that view is showing the list of host that are running the nova compute service >> typically that is not deployed to the contolers. >> >> host in the contol group in the kolla multi node inventlry >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 >> are not use to run the compute agent by default >> only nodes in the compute group are >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 >> the eception to that is ironic https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 >> which is deployed to the contolers. >> >> the nova compute agent used for libvirt is deployed specificlly to the compute hosts via the nova-cell role at least on master >> https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 >> this was done a little simpler before adding cell support but the inventory side has not changed in many release in this >> regard. >> >> >> > >> > Also "Openstack hypervisor list" gives an empty list. >> > >> > I skimmed through the logs and found no error message other than in >> > nova-scheduler that: >> > >> > *Got no allocation candidates from the Placement API. This could be due to >> > insufficient resources or a temporary occurence as compute nodes start up.* >> > >> > Subsequently I checked placement container logs and found no error message >> > or anamoly. >> > >> > Not sure what the issue is. Any help in the above case would be appreciated. >> > >> > Regards, >> > Roshan Nova compute logs are probably the best place to go digging. >> >> >> From pierre at stackhpc.com Mon Feb 8 09:29:59 2021 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 8 Feb 2021 10:29:59 +0100 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: The hypervisor list in Horizon should show all hypervisors, indifferently of which aggregate they belong to. I am thinking your issue could be related to cell configuration. On one of your Nova controller, drop into a nova_api container and use nova-manage to inspect the cell configuration: $ docker exec -it nova_api bash (nova-api)[nova at controller0 /]$ nova-manage cell_v2 usage: nova-manage cell_v2 [-h] {create_cell,delete_cell,delete_host,discover_hosts,list_cells,list_hosts,map_cell0,map_cell_and_hosts,map_instances,simple_cell_setup,update_cell,verify_instance} ... nova-manage cell_v2: error: too few arguments (nova-api)[nova at controller0 /]$ nova-manage cell_v2 list_cells [...] (nova-api)[nova at controller0 /]$ nova-manage cell_v2 list_hosts [...] Does the last command show your hypervisors? On Fri, 5 Feb 2021 at 16:50, roshan anvekar wrote: > > Thanks for the reply. 
> > Well, I have a multinode setup ( 3 controllers and multiple compute nodes) which was initially deployed with rocky and was working fine. > > I checked the globals.yml and site.yml files between rocky and stein and I could not see any significant changes. > > Also under Admin-Compute-Hypervisor, I see that all the compute nodes are showing up under Compute section. the hypervisor section is empty. > > I was wondering if controllers are placed under a different aggregate and not able to show up. I can see all 3 controllers listed in host-aggregates panel though and are in service up state. > > VM creation fails with no valid host found error. > > I am not able to point out issue since I don't see any errors in deployment too. > > Regards, > Roshan > > > > > > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: >> >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: >> > Hi all, >> > >> > Scenario: I have an installation of Openstack stein through kolla-ansible. >> > The deployment went fine and all services look good. >> > >> > Although I am seeing that under Admin--> Compute --> Hypervisors panel in >> > horizon, all the controller nodes are missing. It's a blank list. >> did you actully deploy the nova compute agent service to them? >> >> that view is showing the list of host that are running the nova compute service >> typically that is not deployed to the contolers. >> >> host in the contol group in the kolla multi node inventlry >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 >> are not use to run the compute agent by default >> only nodes in the compute group are >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 >> the eception to that is ironic https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 >> which is deployed to the contolers. >> >> the nova compute agent used for libvirt is deployed specificlly to the compute hosts via the nova-cell role at least on master >> https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 >> this was done a little simpler before adding cell support but the inventory side has not changed in many release in this >> regard. >> >> >> > >> > Also "Openstack hypervisor list" gives an empty list. >> > >> > I skimmed through the logs and found no error message other than in >> > nova-scheduler that: >> > >> > *Got no allocation candidates from the Placement API. This could be due to >> > insufficient resources or a temporary occurence as compute nodes start up.* >> > >> > Subsequently I checked placement container logs and found no error message >> > or anamoly. >> > >> > Not sure what the issue is. Any help in the above case would be appreciated. >> > >> > Regards, >> > Roshan >> >> >> From lucasagomes at gmail.com Mon Feb 8 09:41:51 2021 From: lucasagomes at gmail.com (Lucas Alvares Gomes) Date: Mon, 8 Feb 2021 09:41:51 +0000 Subject: [neutron] Bug Deputy Report Feb 1-8 Message-ID: Hi, This is the Neutron bug report of the week of 2021-02-01. 
High: * https://bugs.launchpad.net/neutron/+bug/1914394 - "[OVN] RowNotFound exception while waiting for Chassis metadata networks" Assigned to: lucasagomes * https://bugs.launchpad.net/neutron/+bug/1914747 - " Trunk subport's port status and binding:host_id don't updated after live-migration" Assigned to: slaweq Medium: * https://bugs.launchpad.net/neutron/+bug/1914754 - " [OVN][FT] Use the correct SB chassis table in testing" Assigned to: ralonsoh Needs further triage: * https://bugs.launchpad.net/neutron/+bug/1914745 - "[OVN/OVS] security groups erroneously dropping IGMP/multicast traffic" Unassigned * https://bugs.launchpad.net/neutron/+bug/1914231 - "IPv6 subnet creation in segmented network partially fails" Unassigned * https://bugs.launchpad.net/neutron/+bug/1914522 - "migrate from iptables firewall to ovs firewall" Unassigned * https://bugs.launchpad.net/neutron/+bug/1914842 - "Create vlan network will get 500 error when enable network_segment_range service plugin" Unassigned * https://bugs.launchpad.net/neutron/+bug/1914857 - "AttributeError: 'NoneType' object has no attribute 'db_find_rows" Unassigned Wishlist: * https://bugs.launchpad.net/neutron/+bug/1914757 - "[ovn] add ovn driver for security-group-logging" Assigned to: flaviof From syedammad83 at gmail.com Mon Feb 8 09:41:08 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Mon, 8 Feb 2021 14:41:08 +0500 Subject: [victoria][trove] percona server deployment error Message-ID: Hi, I am trying to create a percona server datastore and datastore version. I have used the below commands. root at trove:/etc/trove# su -s /bin/bash trove -c "trove-manage datastore_update percona ''" Datastore 'percona' updated. root@ trove :/etc/trove# su -s /bin/sh -c "trove-manage datastore_version_update percona 5.7 percona cc2fbce7-4711-49bd-98ae-19e165c7a003 percona 1" trove Datastore version '5.7' updated. root@ trove :/etc/trove# su -s /bin/bash trove -c "trove-manage db_load_datastore_config_parameters percona 5.7 /usr/lib/python3/dist-packages/trove/templates/percona/validation-rules.json" Loading config parameters for datastore (percona) version (5.7) I have added percona section in trove.conf [percona] tcp_ports = 3306 When I try to deploy a database instance, it gives an error. Digging down further via ssh to the db instance, I have found below repetitive errors in agent logs. 2021-02-08 09:33:01.661 1288 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo groupadd --gid 1001 database execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 2021-02-08 09:33:01.689 1288 DEBUG oslo_concurrency.processutils [-] CMD "sudo groupadd --gid 1001 database" returned: 9 in 0.029s execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 2021-02-08 09:33:01.691 1288 DEBUG oslo_concurrency.processutils [-] 'sudo groupadd --gid 1001 database' failed. Not Retrying. 
execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 2021-02-08 09:33:01.692 1288 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo useradd --uid 1001 --gid 1001 -M database execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 2021-02-08 09:33:01.708 1288 DEBUG oslo_concurrency.processutils [-] CMD "sudo useradd --uid 1001 --gid 1001 -M database" returned: 9 in 0.016s execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 2021-02-08 09:33:01.709 1288 DEBUG oslo_concurrency.processutils [-] 'sudo useradd --uid 1001 --gid 1001 -M database' failed. Not Retrying. execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 2021-02-08 09:33:01.709 1288 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): grep '^/dev/vdb ' /etc/mtab execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 2021-02-08 09:33:01.716 1288 DEBUG oslo_concurrency.processutils [-] CMD "grep '^/dev/vdb ' /etc/mtab" returned: 0 in 0.007s execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 2021-02-08 09:33:01.877 1288 CRITICAL root [-] Unhandled error: ModuleNotFoundError: No module named 'trove.guestagent.datastore.experimental' 2021-02-08 09:33:01.877 1288 ERROR root Traceback (most recent call last): 2021-02-08 09:33:01.877 1288 ERROR root File "/home/ubuntu/trove/contrib/trove-guestagent", line 34, in 2021-02-08 09:33:01.877 1288 ERROR root sys.exit(main()) 2021-02-08 09:33:01.877 1288 ERROR root File "/home/ubuntu/trove/trove/cmd/guest.py", line 94, in main 2021-02-08 09:33:01.877 1288 ERROR root rpc_api_version=guest_api.API.API_LATEST_VERSION) 2021-02-08 09:33:01.877 1288 ERROR root File "/home/ubuntu/trove/trove/common/rpc/service.py", line 48, in __init__ 2021-02-08 09:33:01.877 1288 ERROR root _manager = importutils.import_object(manager) 2021-02-08 09:33:01.877 1288 ERROR root File "/opt/guest-agent-venv/lib/python3.6/site-packages/oslo_utils/importutils.py", line 44, in import_object 2021-02-08 09:33:01.877 1288 ERROR root return import_class(import_str)(*args, **kwargs) 2021-02-08 09:33:01.877 1288 ERROR root File "/opt/guest-agent-venv/lib/python3.6/site-packages/oslo_utils/importutils.py", line 30, in import_class 2021-02-08 09:33:01.877 1288 ERROR root __import__(mod_str) 2021-02-08 09:33:01.877 1288 ERROR root ModuleNotFoundError: No module named 'trove.guestagent.datastore.experimental' 2021-02-08 09:33:01.877 1288 ERROR root 2021-02-08 09:33:08.287 1394 INFO trove.cmd.guest [-] Creating user and group for database service 2021-02-08 09:33:08.288 1394 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo groupadd --gid 1001 database execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 2021-02-08 09:33:08.314 1394 DEBUG oslo_concurrency.processutils [-] CMD "sudo groupadd --gid 1001 database" returned: 9 in 0.026s execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 2021-02-08 09:33:08.315 1394 DEBUG oslo_concurrency.processutils [-] 'sudo groupadd --gid 1001 database' failed. Not Retrying. execute /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 -- Regards, Ammad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From anlin.kong at gmail.com Mon Feb 8 09:59:53 2021 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 8 Feb 2021 22:59:53 +1300 Subject: Trove Multi-Tenancy In-Reply-To: References: Message-ID: In Trove's case, I would suggest to consider the flavor/volume of the instance and floating IP if instance is public. --- Lingxian Kong Senior Cloud Engineer (Catalyst Cloud) Trove PTL (OpenStack) OpenStack Cloud Provider Co-Lead (Kubernetes) On Mon, Feb 8, 2021 at 5:51 PM Ammad Syed wrote: > Hi Lingxian, > > You are right, the user has access to the database instance and that is > what a user expects from Database as a Service. I was thinking as a cloud > operator keeping in view the billing perspective, we usually do billing in > terms of nova instance. Here we need to change our approach. > > Ammad Ali > > On Fri, Feb 5, 2021 at 3:08 PM Lingxian Kong wrote: > >> There are several config options you can change to support this model: >> >> [DEFAULT] >> remote_nova_client = trove.common.clients.nova_client >> remote_neutron_client = trove.common.clients.neutron_client >> remote_cinder_client = trove.common.clients.cinder_client >> remote_glance_client = trove.common.clients.glance_client >> >> *However, those configs are extremely not recommended and not maintained >> any more in Trove, *which means, function may broken in this case. >> >> The reasons are many folds. Apart from the security reason, one important >> thing is, Trove is a database as a service, what the cloud user is getting >> from Trove are the access to the database and some management APIs for >> database operations, rather than a purely Nova VM that has a database >> installed and can be accessed by the cloud user. If you prefer this model, >> why not just create Nova VM on your own and manually install database >> software so you have more control of that? >> >> --- >> Lingxian Kong >> Senior Cloud Engineer (Catalyst Cloud) >> Trove PTL (OpenStack) >> OpenStack Cloud Provider Co-Lead (Kubernetes) >> >> >> On Fri, Feb 5, 2021 at 6:52 PM Ammad Syed wrote: >> >>> Hello Kong, >>> >>> I am using latest victoria release and trove 14.0. >>> >>> Yes you are right, this is exactly happening. All the nova instances are >>> in trove user service project. From my admin user i am only able to list >>> database instances. >>> >>> Is it possible that all nova instances should also deploy in any tenant >>> project i.e if i am deploying database instance from admin user having >>> adminproject and default domain the nova instance should be in adminproject >>> rather then trove service project. >>> >>> Ammad >>> Sent from my iPhone >>> >>> On Feb 5, 2021, at 1:49 AM, Lingxian Kong wrote: >>> >>>  >>> Hi Syed, >>> >>> What's the trove version you've deployed? >>> >>> From your configuration, once a trove instance is created, a nova server >>> is created in the "service" project, as trove user, you can only show the >>> trove instance. >>> >>> --- >>> Lingxian Kong >>> Senior Cloud Engineer (Catalyst Cloud) >>> Trove PTL (OpenStack) >>> OpenStack Cloud Provider Co-Lead (Kubernetes) >>> >>> >>> On Fri, Feb 5, 2021 at 12:40 AM Ammad Syed >>> wrote: >>> >>>> Hi, >>>> >>>> I have deployed trove and database instance deployment is successful. >>>> But the problem is all the database servers are being created in service >>>> account i.e openstack instance list shows the database instances in admin >>>> user but when I check openstack server list the database instance won't >>>> show up here, its visible in trove service account. 
>>>> >>>> Can you please advise how the servers will be visible in admin account >>>> ? I want to enable multi-tenancy. >>>> >>>> Below is the configuration >>>> >>>> [DEFAULT] >>>> log_dir = /var/log/trove >>>> # RabbitMQ connection info >>>> transport_url = rabbit://openstack:password at controller >>>> control_exchange = trove >>>> trove_api_workers = 5 >>>> network_driver = trove.network.neutron.NeutronDriver >>>> taskmanager_manager = trove.taskmanager.manager.Manager >>>> default_datastore = mysql >>>> cinder_volume_type = database_storage >>>> reboot_time_out = 300 >>>> usage_timeout = 900 >>>> agent_call_high_timeout = 1200 >>>> >>>> nova_keypair = trove-key >>>> >>>> debug = true >>>> trace = true >>>> >>>> # MariaDB connection info >>>> [database] >>>> connection = mysql+pymysql://trove:password at mariadb01/trove >>>> >>>> [mariadb] >>>> tcp_ports = 3306,4444,4567,4568 >>>> >>>> [mysql] >>>> tcp_ports = 3306 >>>> >>>> [postgresql] >>>> tcp_ports = 5432 >>>> >>>> [redis] >>>> tcp_ports = 6379,16379 >>>> >>>> # Keystone auth info >>>> [keystone_authtoken] >>>> www_authenticate_uri = http://controller:5000 >>>> auth_url = http://controller:5000 >>>> memcached_servers = controller:11211 >>>> auth_type = password >>>> project_domain_name = default >>>> user_domain_name = default >>>> project_name = service >>>> username = trove >>>> password = servicepassword >>>> >>>> [service_credentials] >>>> auth_url = http://controller:5000 >>>> region_name = RegionOne >>>> project_domain_name = default >>>> user_domain_name = default >>>> project_name = service >>>> username = trove >>>> password = servicepassword >>>> >>>> -- >>>> Regards, >>>> >>>> >>>> Syed Ammad Ali >>>> >>> > > -- > Regards, > > > Syed Ammad Ali > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Mon Feb 8 10:04:42 2021 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 8 Feb 2021 23:04:42 +1300 Subject: [victoria][trove] percona server deployment error In-Reply-To: References: Message-ID: I'm afraid only MySQL 5.7.x and MariaDB 10.4.x are fully supported in Victoria, PostgreSQL 12.4 is partially supported, all other drivers have no maintainers any more, unfortunately. --- Lingxian Kong Senior Cloud Engineer (Catalyst Cloud) Trove PTL (OpenStack) OpenStack Cloud Provider Co-Lead (Kubernetes) On Mon, Feb 8, 2021 at 10:49 PM Ammad Syed wrote: > Hi, > > I am trying to create a percona server datastore and datastore version. I > have used the below commands. > > root at trove:/etc/trove# su -s /bin/bash trove -c "trove-manage > datastore_update percona ''" > Datastore 'percona' updated. > root@ trove :/etc/trove# su -s /bin/sh -c "trove-manage > datastore_version_update percona 5.7 percona > cc2fbce7-4711-49bd-98ae-19e165c7a003 percona 1" trove > Datastore version '5.7' updated. > root@ trove :/etc/trove# su -s /bin/bash trove -c "trove-manage > db_load_datastore_config_parameters percona 5.7 > /usr/lib/python3/dist-packages/trove/templates/percona/validation-rules.json" > Loading config parameters for datastore (percona) version (5.7) > > I have added percona section in trove.conf > > [percona] > tcp_ports = 3306 > > When I try to deploy a database instance, it gives an error. Digging down > further via ssh to the db instance, I have found below repetitive errors in > agent logs. 
> > 2021-02-08 09:33:01.661 1288 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): sudo groupadd --gid 1001 database execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 > 2021-02-08 09:33:01.689 1288 DEBUG oslo_concurrency.processutils [-] CMD > "sudo groupadd --gid 1001 database" returned: 9 in 0.029s execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 > 2021-02-08 09:33:01.691 1288 DEBUG oslo_concurrency.processutils [-] 'sudo > groupadd --gid 1001 database' failed. Not Retrying. execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 > 2021-02-08 09:33:01.692 1288 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): sudo useradd --uid 1001 --gid 1001 -M database > execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 > 2021-02-08 09:33:01.708 1288 DEBUG oslo_concurrency.processutils [-] CMD > "sudo useradd --uid 1001 --gid 1001 -M database" returned: 9 in 0.016s > execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 > 2021-02-08 09:33:01.709 1288 DEBUG oslo_concurrency.processutils [-] 'sudo > useradd --uid 1001 --gid 1001 -M database' failed. Not Retrying. execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 > 2021-02-08 09:33:01.709 1288 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): grep '^/dev/vdb ' /etc/mtab execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 > 2021-02-08 09:33:01.716 1288 DEBUG oslo_concurrency.processutils [-] CMD > "grep '^/dev/vdb ' /etc/mtab" returned: 0 in 0.007s execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 > 2021-02-08 09:33:01.877 1288 CRITICAL root [-] Unhandled error: > ModuleNotFoundError: No module named > 'trove.guestagent.datastore.experimental' > 2021-02-08 09:33:01.877 1288 ERROR root Traceback (most recent call last): > 2021-02-08 09:33:01.877 1288 ERROR root File > "/home/ubuntu/trove/contrib/trove-guestagent", line 34, in > 2021-02-08 09:33:01.877 1288 ERROR root sys.exit(main()) > 2021-02-08 09:33:01.877 1288 ERROR root File > "/home/ubuntu/trove/trove/cmd/guest.py", line 94, in main > 2021-02-08 09:33:01.877 1288 ERROR root > rpc_api_version=guest_api.API.API_LATEST_VERSION) > 2021-02-08 09:33:01.877 1288 ERROR root File > "/home/ubuntu/trove/trove/common/rpc/service.py", line 48, in __init__ > 2021-02-08 09:33:01.877 1288 ERROR root _manager = > importutils.import_object(manager) > 2021-02-08 09:33:01.877 1288 ERROR root File > "/opt/guest-agent-venv/lib/python3.6/site-packages/oslo_utils/importutils.py", > line 44, in import_object > 2021-02-08 09:33:01.877 1288 ERROR root return > import_class(import_str)(*args, **kwargs) > 2021-02-08 09:33:01.877 1288 ERROR root File > "/opt/guest-agent-venv/lib/python3.6/site-packages/oslo_utils/importutils.py", > line 30, in import_class > 2021-02-08 09:33:01.877 1288 ERROR root __import__(mod_str) > 2021-02-08 09:33:01.877 1288 ERROR root ModuleNotFoundError: No module > named 'trove.guestagent.datastore.experimental' > 2021-02-08 09:33:01.877 1288 ERROR root > 2021-02-08 09:33:08.287 1394 INFO trove.cmd.guest [-] Creating user and > group for database service > 2021-02-08 09:33:08.288 1394 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): sudo groupadd --gid 1001 database execute 
> /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:384 > 2021-02-08 09:33:08.314 1394 DEBUG oslo_concurrency.processutils [-] CMD > "sudo groupadd --gid 1001 database" returned: 9 in 0.026s execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:423 > 2021-02-08 09:33:08.315 1394 DEBUG oslo_concurrency.processutils [-] 'sudo > groupadd --gid 1001 database' failed. Not Retrying. execute > /opt/guest-agent-venv/lib/python3.6/site-packages/oslo_concurrency/processutils.py:474 > > -- > Regards, > > > Ammad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roshananvekar at gmail.com Mon Feb 8 10:14:19 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Mon, 8 Feb 2021 15:44:19 +0530 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: Hello, Yes nova-manage cell_v2 list_cells shows me 2 entries within a common cell0. List_hosts shows all the compute nodes but not controllers. Regards, Roshan On Mon, Feb 8, 2021, 3:00 PM Pierre Riteau wrote: > The hypervisor list in Horizon should show all hypervisors, > indifferently of which aggregate they belong to. > > I am thinking your issue could be related to cell configuration. On > one of your Nova controller, drop into a nova_api container and use > nova-manage to inspect the cell configuration: > > $ docker exec -it nova_api bash > (nova-api)[nova at controller0 /]$ nova-manage cell_v2 > usage: nova-manage cell_v2 [-h] > > > > {create_cell,delete_cell,delete_host,discover_hosts,list_cells,list_hosts,map_cell0,map_cell_and_hosts,map_instances,simple_cell_setup,update_cell,verify_instance} > ... > nova-manage cell_v2: error: too few arguments > (nova-api)[nova at controller0 /]$ nova-manage cell_v2 list_cells > [...] > (nova-api)[nova at controller0 /]$ nova-manage cell_v2 list_hosts > [...] > > Does the last command show your hypervisors? > > On Fri, 5 Feb 2021 at 16:50, roshan anvekar > wrote: > > > > Thanks for the reply. > > > > Well, I have a multinode setup ( 3 controllers and multiple compute > nodes) which was initially deployed with rocky and was working fine. > > > > I checked the globals.yml and site.yml files between rocky and stein and > I could not see any significant changes. > > > > Also under Admin-Compute-Hypervisor, I see that all the compute nodes > are showing up under Compute section. the hypervisor section is empty. > > > > I was wondering if controllers are placed under a different aggregate > and not able to show up. I can see all 3 controllers listed in > host-aggregates panel though and are in service up state. > > > > VM creation fails with no valid host found error. > > > > I am not able to point out issue since I don't see any errors in > deployment too. > > > > Regards, > > Roshan > > > > > > > > > > > > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: > >> > >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: > >> > Hi all, > >> > > >> > Scenario: I have an installation of Openstack stein through > kolla-ansible. > >> > The deployment went fine and all services look good. > >> > > >> > Although I am seeing that under Admin--> Compute --> Hypervisors > panel in > >> > horizon, all the controller nodes are missing. It's a blank list. > >> did you actully deploy the nova compute agent service to them? 
> >> > >> that view is showing the list of host that are running the nova compute > service > >> typically that is not deployed to the contolers. > >> > >> host in the contol group in the kolla multi node inventlry > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 > >> are not use to run the compute agent by default > >> only nodes in the compute group are > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 > >> the eception to that is ironic > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 > >> which is deployed to the contolers. > >> > >> the nova compute agent used for libvirt is deployed specificlly to the > compute hosts via the nova-cell role at least on master > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 > >> this was done a little simpler before adding cell support but the > inventory side has not changed in many release in this > >> regard. > >> > >> > >> > > >> > Also "Openstack hypervisor list" gives an empty list. > >> > > >> > I skimmed through the logs and found no error message other than in > >> > nova-scheduler that: > >> > > >> > *Got no allocation candidates from the Placement API. This could be > due to > >> > insufficient resources or a temporary occurence as compute nodes > start up.* > >> > > >> > Subsequently I checked placement container logs and found no error > message > >> > or anamoly. > >> > > >> > Not sure what the issue is. Any help in the above case would be > appreciated. > >> > > >> > Regards, > >> > Roshan > >> > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roshananvekar at gmail.com Mon Feb 8 10:15:37 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Mon, 8 Feb 2021 15:45:37 +0530 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: VM creation is not getting scheduled itself showing no valid host found. The logs show no error in scheduler logs too On Mon, Feb 8, 2021, 2:44 PM Mark Goddard wrote: > On Fri, 5 Feb 2021 at 15:49, roshan anvekar > wrote: > > > > Thanks for the reply. > > > > Well, I have a multinode setup ( 3 controllers and multiple compute > nodes) which was initially deployed with rocky and was working fine. > > > > I checked the globals.yml and site.yml files between rocky and stein and > I could not see any significant changes. > > > > Also under Admin-Compute-Hypervisor, I see that all the compute nodes > are showing up under Compute section. the hypervisor section is empty. > > > > I was wondering if controllers are placed under a different aggregate > and not able to show up. I can see all 3 controllers listed in > host-aggregates panel though and are in service up state. > > > > VM creation fails with no valid host found error. > > > > I am not able to point out issue since I don't see any errors in > deployment too. > > > > Regards, > > Roshan > > > > > > > > > > > > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: > >> > >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: > >> > Hi all, > >> > > >> > Scenario: I have an installation of Openstack stein through > kolla-ansible. > >> > The deployment went fine and all services look good. 
> >> > > >> > Although I am seeing that under Admin--> Compute --> Hypervisors > panel in > >> > horizon, all the controller nodes are missing. It's a blank list. > >> did you actully deploy the nova compute agent service to them? > >> > >> that view is showing the list of host that are running the nova compute > service > >> typically that is not deployed to the contolers. > >> > >> host in the contol group in the kolla multi node inventlry > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 > >> are not use to run the compute agent by default > >> only nodes in the compute group are > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 > >> the eception to that is ironic > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 > >> which is deployed to the contolers. > >> > >> the nova compute agent used for libvirt is deployed specificlly to the > compute hosts via the nova-cell role at least on master > >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 > >> this was done a little simpler before adding cell support but the > inventory side has not changed in many release in this > >> regard. > >> > >> > >> > > >> > Also "Openstack hypervisor list" gives an empty list. > >> > > >> > I skimmed through the logs and found no error message other than in > >> > nova-scheduler that: > >> > > >> > *Got no allocation candidates from the Placement API. This could be > due to > >> > insufficient resources or a temporary occurence as compute nodes > start up.* > >> > > >> > Subsequently I checked placement container logs and found no error > message > >> > or anamoly. > >> > > >> > Not sure what the issue is. Any help in the above case would be > appreciated. > >> > > >> > Regards, > >> > Roshan > Nova compute logs are probably the best place to go digging. > >> > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam.zheng at colorado.edu Mon Feb 8 10:26:52 2021 From: adam.zheng at colorado.edu (Adam Zheng) Date: Mon, 8 Feb 2021 10:26:52 +0000 Subject: [ops][cinder][kolla-ansible] cinder-backup fails if source disk not in nova az In-Reply-To: References: <9650D354-5141-47CA-B3C6-EB867CE4524F@colorado.edu> Message-ID: <4667D680-E1FD-4FA3-811D-065BE735EFD1@colorado.edu> Hello Alan, Thank you for the clarification and the pointers. I did also previously find that only the cinder client had the az option, which appeared to work on backing up a volume from another az. However, is there a way to get this working from horizon? While I can certainly make pages for my users that they will need to use the cli to do backups I feel it is not very friendly on my part to do that. For now if there is not a way, I may leave the az for cinder on nova so backups/restores work in properly in horizon. I can control where the data is in ceph; was mainly just hoping to set this in openstack for aesthetic/clarity (ie which datacenter they are saving their volumes) for users utilizing the horizon volumes interface. Thanks, -- Adam From: Alan Bishop Date: Friday, February 5, 2021 at 1:49 PM To: Adam Zheng Cc: "openstack-discuss at lists.openstack.org" Subject: Re: [ops][cinder][kolla-ansible] cinder-backup fails if source disk not in nova az On Fri, Feb 5, 2021 at 10:00 AM Adam Zheng > wrote: Hello, I’ve been trying to get availability zones defined for volumes. 
Everything works fine if I leave the zone at “nova”, all volume types work and backups/snapshots also work. ie: +------------------+----------------------------+------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+----------------------------+------+---------+-------+----------------------------+ | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:51.000000 | | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:50.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | nova | enabled | up | 2021-02-05T17:22:55.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:57.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-backup | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-backup | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:53.000000 | | cinder-backup | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:58.000000 | +------------------+----------------------------+------+---------+-------+----------------------------+ However, if I apply the following changes: cinder-api.conf [DEFAULT] default_availability_zone = not-nova default_volume_type = ceph-gp2 allow_availability_zone_fallback=True cinder-volume.conf [rbd-ceph-gp2] <…> backend_availability_zone = not-nova <…> I’ll get the following +------------------+----------------------------+----------+---------+-------+----------------------------+ | Binary | Host | Zone | Status | State | Updated At | +------------------+----------------------------+----------+---------+-------+----------------------------+ | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:51.000000 | | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:50.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | not-nova | enabled | up | 2021-02-05T17:22:55.000000 | | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:57.000000 | | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up | 2021-02-05T17:22:54.000000 | | cinder-backup | cs-os-ctl-002 | nova | enabled | up | 2021-02-05T17:22:56.000000 | | cinder-backup | cs-os-ctl-001 | nova | enabled | up | 2021-02-05T17:22:53.000000 | | cinder-backup | cs-os-ctl-003 | nova | enabled | up | 2021-02-05T17:22:58.000000 | +------------------+----------------------------+----------+---------+-------+----------------------------+ At this point, creating new volumes still work and go into the expected ceph pools. However, backups no longer work for the cinder-volume that is not nova. 
In the above example, it still works fine for volumes that that were created with type “ceph-gp2” in az “nova”. Does not work for volumes that were created with type “ceph-st1” in az “not-nova”. It fails immediately and goes into error state with reason “Service not found for creating backup.” Hi Adam, Cinder's backup service has the ability to create backups of volumes in another AZ. The 'cinder' CLI supports this feature as of microversion 3.51. (bear in mind the 'openstack' client doesn't support microversions for the cinder (volume) service, so you'll need to use the 'cinder' command. Rather than repeat what I've written previously, I refer you to [1] for additional details. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1649845#c4 One other thing to note is the corresponding "cinder backup-restore" command currently does not support restoring to a volume in another AZ, but there is a workaround. You can pre-create a new volume in the destination AZ, and use the ability to restore a backup to a specific volume (which just happens to be in your desired AZ). There's also a patch [2] under review to enhance the cinder shell so that both backup and restore shell commands work the same way. [2] https://review.opendev.org/c/openstack/python-cinderclient/+/762020 Alan I suspect I need to try to get another set of “cinder-backup” services running in the Zone “not-nova”, but cannot seem to figure out how. I’ve scoured the docs on cinder.conf, and if I set default zones in cinder-backup (I’ve tried backend_availability_zone, default_availability_zone, and storage_availability_zone) I cannot seem to get backups working if the disk it’s backing up is not in az “nova”. The cinder-backup service in volume service list will always show “nova” no matter what I put there. Any advice would be appreciated. OpenStack Victoria deployed via kolla-ansible Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Mon Feb 8 10:32:34 2021 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 8 Feb 2021 11:32:34 +0100 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: You can also use the osc-placement CLI to check what the placement data looks like. Do you have resource providers and matching inventories for your hypervisors? On Mon, 8 Feb 2021 at 11:16, roshan anvekar wrote: > > VM creation is not getting scheduled itself showing no valid host found. > > The logs show no error in scheduler logs too > > On Mon, Feb 8, 2021, 2:44 PM Mark Goddard wrote: >> >> On Fri, 5 Feb 2021 at 15:49, roshan anvekar wrote: >> > >> > Thanks for the reply. >> > >> > Well, I have a multinode setup ( 3 controllers and multiple compute nodes) which was initially deployed with rocky and was working fine. >> > >> > I checked the globals.yml and site.yml files between rocky and stein and I could not see any significant changes. >> > >> > Also under Admin-Compute-Hypervisor, I see that all the compute nodes are showing up under Compute section. the hypervisor section is empty. >> > >> > I was wondering if controllers are placed under a different aggregate and not able to show up. I can see all 3 controllers listed in host-aggregates panel though and are in service up state. >> > >> > VM creation fails with no valid host found error. >> > >> > I am not able to point out issue since I don't see any errors in deployment too. 
>> > >> > Regards, >> > Roshan >> > >> > >> > >> > >> > >> > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: >> >> >> >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: >> >> > Hi all, >> >> > >> >> > Scenario: I have an installation of Openstack stein through kolla-ansible. >> >> > The deployment went fine and all services look good. >> >> > >> >> > Although I am seeing that under Admin--> Compute --> Hypervisors panel in >> >> > horizon, all the controller nodes are missing. It's a blank list. >> >> did you actully deploy the nova compute agent service to them? >> >> >> >> that view is showing the list of host that are running the nova compute service >> >> typically that is not deployed to the contolers. >> >> >> >> host in the contol group in the kolla multi node inventlry >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 >> >> are not use to run the compute agent by default >> >> only nodes in the compute group are >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 >> >> the eception to that is ironic https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 >> >> which is deployed to the contolers. >> >> >> >> the nova compute agent used for libvirt is deployed specificlly to the compute hosts via the nova-cell role at least on master >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 >> >> this was done a little simpler before adding cell support but the inventory side has not changed in many release in this >> >> regard. >> >> >> >> >> >> > >> >> > Also "Openstack hypervisor list" gives an empty list. >> >> > >> >> > I skimmed through the logs and found no error message other than in >> >> > nova-scheduler that: >> >> > >> >> > *Got no allocation candidates from the Placement API. This could be due to >> >> > insufficient resources or a temporary occurence as compute nodes start up.* >> >> > >> >> > Subsequently I checked placement container logs and found no error message >> >> > or anamoly. >> >> > >> >> > Not sure what the issue is. Any help in the above case would be appreciated. >> >> > >> >> > Regards, >> >> > Roshan >> Nova compute logs are probably the best place to go digging. >> >> >> >> >> >> From roshananvekar at gmail.com Mon Feb 8 11:03:27 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Mon, 8 Feb 2021 16:33:27 +0530 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: No, I don't have these tools enabled to check the placement allocation and resources. I can try to. Meanwhile is there some configuration to be set in stein release as opposed to rocky towards cells or placement ? On the same nodes rocky was installed and it worked fine straight out of box. Stein is showing issues with controllers and says "Failed to compute_task_build_instances: No valid host was found." under nova-conductor.log. Also under nova-scheduler.log I could see a single info entry: " Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurence as compute nodes start up." But I do see that all compute nodes are enabled and is up. On Mon, Feb 8, 2021, 4:03 PM Pierre Riteau wrote: > You can also use the osc-placement CLI to check what the placement > data looks like. 
Do you have resource providers and matching > inventories for your hypervisors? > > On Mon, 8 Feb 2021 at 11:16, roshan anvekar > wrote: > > > > VM creation is not getting scheduled itself showing no valid host found. > > > > The logs show no error in scheduler logs too > > > > On Mon, Feb 8, 2021, 2:44 PM Mark Goddard wrote: > >> > >> On Fri, 5 Feb 2021 at 15:49, roshan anvekar > wrote: > >> > > >> > Thanks for the reply. > >> > > >> > Well, I have a multinode setup ( 3 controllers and multiple compute > nodes) which was initially deployed with rocky and was working fine. > >> > > >> > I checked the globals.yml and site.yml files between rocky and stein > and I could not see any significant changes. > >> > > >> > Also under Admin-Compute-Hypervisor, I see that all the compute nodes > are showing up under Compute section. the hypervisor section is empty. > >> > > >> > I was wondering if controllers are placed under a different aggregate > and not able to show up. I can see all 3 controllers listed in > host-aggregates panel though and are in service up state. > >> > > >> > VM creation fails with no valid host found error. > >> > > >> > I am not able to point out issue since I don't see any errors in > deployment too. > >> > > >> > Regards, > >> > Roshan > >> > > >> > > >> > > >> > > >> > > >> > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: > >> >> > >> >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: > >> >> > Hi all, > >> >> > > >> >> > Scenario: I have an installation of Openstack stein through > kolla-ansible. > >> >> > The deployment went fine and all services look good. > >> >> > > >> >> > Although I am seeing that under Admin--> Compute --> Hypervisors > panel in > >> >> > horizon, all the controller nodes are missing. It's a blank list. > >> >> did you actully deploy the nova compute agent service to them? > >> >> > >> >> that view is showing the list of host that are running the nova > compute service > >> >> typically that is not deployed to the contolers. > >> >> > >> >> host in the contol group in the kolla multi node inventlry > >> >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 > >> >> are not use to run the compute agent by default > >> >> only nodes in the compute group are > >> >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 > >> >> the eception to that is ironic > https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 > >> >> which is deployed to the contolers. > >> >> > >> >> the nova compute agent used for libvirt is deployed specificlly to > the compute hosts via the nova-cell role at least on master > >> >> > https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 > >> >> this was done a little simpler before adding cell support but the > inventory side has not changed in many release in this > >> >> regard. > >> >> > >> >> > >> >> > > >> >> > Also "Openstack hypervisor list" gives an empty list. > >> >> > > >> >> > I skimmed through the logs and found no error message other than in > >> >> > nova-scheduler that: > >> >> > > >> >> > *Got no allocation candidates from the Placement API. This could > be due to > >> >> > insufficient resources or a temporary occurence as compute nodes > start up.* > >> >> > > >> >> > Subsequently I checked placement container logs and found no error > message > >> >> > or anamoly. > >> >> > > >> >> > Not sure what the issue is. Any help in the above case would be > appreciated. 
> >> >> > > >> >> > Regards, > >> >> > Roshan > >> Nova compute logs are probably the best place to go digging. > >> >> > >> >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Mon Feb 8 11:33:50 2021 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 8 Feb 2021 12:33:50 +0100 Subject: [stein][hypervisor] Post successful stein deployment Openstack hypervisor list is empty In-Reply-To: References: <51f5c09f245a3f251dc0b72a993ad7391e0ead35.camel@redhat.com> Message-ID: This is just an OpenStackClient plugin which you can install with `pip install osc-placement`. On Mon, 8 Feb 2021 at 12:03, roshan anvekar wrote: > > No, I don't have these tools enabled to check the placement allocation and resources. I can try to. > > Meanwhile is there some configuration to be set in stein release as opposed to rocky towards cells or placement ? > > On the same nodes rocky was installed and it worked fine straight out of box. Stein is showing issues with controllers and says "Failed to compute_task_build_instances: No valid host was found." under nova-conductor.log. > > Also under nova-scheduler.log I could see a single info entry: " Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurence as compute nodes start up." > > But I do see that all compute nodes are enabled and is up. > > > > On Mon, Feb 8, 2021, 4:03 PM Pierre Riteau wrote: >> >> You can also use the osc-placement CLI to check what the placement >> data looks like. Do you have resource providers and matching >> inventories for your hypervisors? >> >> On Mon, 8 Feb 2021 at 11:16, roshan anvekar wrote: >> > >> > VM creation is not getting scheduled itself showing no valid host found. >> > >> > The logs show no error in scheduler logs too >> > >> > On Mon, Feb 8, 2021, 2:44 PM Mark Goddard wrote: >> >> >> >> On Fri, 5 Feb 2021 at 15:49, roshan anvekar wrote: >> >> > >> >> > Thanks for the reply. >> >> > >> >> > Well, I have a multinode setup ( 3 controllers and multiple compute nodes) which was initially deployed with rocky and was working fine. >> >> > >> >> > I checked the globals.yml and site.yml files between rocky and stein and I could not see any significant changes. >> >> > >> >> > Also under Admin-Compute-Hypervisor, I see that all the compute nodes are showing up under Compute section. the hypervisor section is empty. >> >> > >> >> > I was wondering if controllers are placed under a different aggregate and not able to show up. I can see all 3 controllers listed in host-aggregates panel though and are in service up state. >> >> > >> >> > VM creation fails with no valid host found error. >> >> > >> >> > I am not able to point out issue since I don't see any errors in deployment too. >> >> > >> >> > Regards, >> >> > Roshan >> >> > >> >> > >> >> > >> >> > >> >> > >> >> > On Fri, Feb 5, 2021, 7:47 PM Sean Mooney wrote: >> >> >> >> >> >> On Fri, 2021-02-05 at 14:53 +0530, roshan anvekar wrote: >> >> >> > Hi all, >> >> >> > >> >> >> > Scenario: I have an installation of Openstack stein through kolla-ansible. >> >> >> > The deployment went fine and all services look good. >> >> >> > >> >> >> > Although I am seeing that under Admin--> Compute --> Hypervisors panel in >> >> >> > horizon, all the controller nodes are missing. It's a blank list. >> >> >> did you actully deploy the nova compute agent service to them? 
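If it helps, what is actually registered can be checked with roughly the following (nova-manage would be run wherever the nova tooling lives, e.g. inside the nova_api container on a kolla-ansible deployment; this is only a sketch):

  openstack compute service list --service nova-compute
  openstack hypervisor list
  nova-manage cell_v2 list_hosts
  nova-manage cell_v2 discover_hosts --verbose

If the nova-compute services are up but the hosts are not reported by list_hosts, running discover_hosts (or enabling periodic discovery) is one thing worth ruling out.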
>> >> >> >> >> >> that view is showing the list of host that are running the nova compute service >> >> >> typically that is not deployed to the contolers. >> >> >> >> >> >> host in the contol group in the kolla multi node inventlry >> >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L3 >> >> >> are not use to run the compute agent by default >> >> >> only nodes in the compute group are >> >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L18 >> >> >> the eception to that is ironic https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L301-L302 >> >> >> which is deployed to the contolers. >> >> >> >> >> >> the nova compute agent used for libvirt is deployed specificlly to the compute hosts via the nova-cell role at least on master >> >> >> https://github.com/openstack/kolla-ansible/blob/master/ansible/nova.yml#L118 >> >> >> this was done a little simpler before adding cell support but the inventory side has not changed in many release in this >> >> >> regard. >> >> >> >> >> >> >> >> >> > >> >> >> > Also "Openstack hypervisor list" gives an empty list. >> >> >> > >> >> >> > I skimmed through the logs and found no error message other than in >> >> >> > nova-scheduler that: >> >> >> > >> >> >> > *Got no allocation candidates from the Placement API. This could be due to >> >> >> > insufficient resources or a temporary occurence as compute nodes start up.* >> >> >> > >> >> >> > Subsequently I checked placement container logs and found no error message >> >> >> > or anamoly. >> >> >> > >> >> >> > Not sure what the issue is. Any help in the above case would be appreciated. >> >> >> > >> >> >> > Regards, >> >> >> > Roshan >> >> Nova compute logs are probably the best place to go digging. >> >> >> >> >> >> >> >> >> From smooney at redhat.com Mon Feb 8 11:50:40 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 08 Feb 2021 11:50:40 +0000 Subject: [all] Gate resources and performance In-Reply-To: <4369261.TshihmYaz6@p1> References: <20210205230710.nn3zuwuikwocqdcf@yuggoth.org> <4369261.TshihmYaz6@p1> Message-ID: <937f5e50d32f1d7cb460a84eefc3b7f4ec54527f.camel@redhat.com> On Sat, 2021-02-06 at 20:51 +0100, Slawek Kaplonski wrote: > Hi, > > Dnia sobota, 6 lutego 2021 10:33:17 CET Dmitry Tantsur pisze: > > On Sat, Feb 6, 2021 at 12:10 AM Jeremy Stanley wrote: > > > On 2021-02-05 22:52:15 +0100 (+0100), Dmitry Tantsur wrote: > > > [...] > > > > > > > 7.1. Stop marking dependent patches with Verified-2 if their > > > > parent fails in the gate, keep them at Verified+1 (their previous > > > > state). This is a common source of unnecessary rechecks in the > > > > ironic land. > > > > > > [...] > > > > > > Zuul generally assumes that if a change fails tests, it's going to > > > need to be revised. > > > > Very unfortunately, it's far from being the case in the ironic world. > > > > > Gerrit will absolutely refuse to allow a change > > > to merge if its parent has been revised and the child has not been > > > rebased onto that new revision. Revising or rebasing a change clears > > > the Verified label and will require new test results. > > > > This is fair, I'm only referring to the case where the parent has to be > > rechecked because of a transient problem. > > > > > Which one or > > > more of these conditions should be considered faulty? I'm guessing > > > you're going to say it's the first one, that we shouldn't assume > > > just because a change fails tests that means it needs to be fixed. 
> > > > Unfortunately, yes. > > > > A parallel proposal, that has been rejected numerous times, is to allow > > recheching only the failed jobs. > > Even if I totally understand cons of that I would also be for such > possibility. Maybe e.g. if only cores would have such possibility somehow > would be good trade off? it would require zuul to fundemtally be altered. currently triggers are defined at teh pipeline level we would have to instead define them per job. and im not sure restcting it to core would really help. it might but unless we force the same commit hashes to be reused so that all jobs used the same exact version fo the code i dont think it safe. > > > > Dmitry > > > > > This takes us back to the other subthread, wherein we entertain the > > > notion that if changes have failing jobs and the changes themselves > > > aren't at fault, then we should accept this as commonplace and lower > > > our expectations. > > > > > > Keep in mind that the primary source of pain here is one OpenStack > > > has chosen. That is, the "clean check" requirement that a change get > > > a +1 test result in the check pipeline before it can enter the gate > > > pipeline. This is an arbitrary pipeline criterion, chosen to keep > > > problematic changes from getting approved and making their way > > > through the gate queue like a wrecking-ball, causing repeated test > > > resets for the changes after them until they reach the front and > > > Zuul is finally able to determine they're not just conflicting with > > > other changes ahead. If a major pain for Ironic and other OpenStack > > > projects is the need to revisit the check pipeline after a gate > > > failure, that can be alleviated by dropping the clean check > > > requirement. > > > > > > Without clean check, a change which got a -2 in the gate could > > > simply be enqueued directly back to the gate again. This is how it > > > works in our other Zuul tenants. But the reason OpenStack started > > > enforcing it is that reviewers couldn't be bothered to confirm > > > changes really were reasonable, had *recent* passing check results, > > > and confirmed that observed job failures were truly unrelated to the > > > changes themselves. > > > -- > > > Jeremy Stanley > > > > -- > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > > O'Neill > > From ignaziocassano at gmail.com Mon Feb 8 11:52:24 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 8 Feb 2021 12:52:24 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello All, I am able to replicate shares between netapp storages but I have some questions. Reading netapp documentation, seems the replication svm must not be enabled in manila.conf. 
The following is what netapp suggests: enabled_share_backends = svm-tst-nfs-565 [svm-tst-nfs-565] share_backend_name = svm-tst-nfs-565 driver_handles_share_servers = false share_driver = manila.share.drivers.netapp.common.NetAppDriver netapp_storage_family = ontap_cluster netapp_server_hostname = fas8040.csi.it netapp_server_port = 80 netapp_login = admin netapp_password = ****** netapp_transport_type = http netapp_vserver = svm-tst-nfs-565 netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ replication_domain = replication_domain_1 [netapp-nfs-566] share_backend_name = netapp-nfs-566 driver_handles_share_servers = False share_driver = manila.share.drivers.netapp.common.NetAppDriver netapp_storage_family = ontap_cluster netapp_server_hostname = fas.csi.it netapp_server_port = 80 netapp_login = admin netapp_password = ***** netapp_transport_type = http netapp_vserver = manila-nfs-566 netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ replication_domain = replication_domain_1 As you can see above, the target of replica netapp-nfs-566 is not included in enabled backends. When I try to create a replica in this situation, the manila schedule reports "no valid host found". It works if I enable in manila.conf the target like this: enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 Please, any suggestion ? Thanks Ignazio Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < ignaziocassano at gmail.com> ha scritto: > Thanks, Douglas. > On another question: > the manila share-replica-delete delete the snapmirror ? > If yes, source and destination volume become both writable ? > > Ignazio > > Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha > scritto: > >> Yes, it is correct. This should work as an alternative for >> host-assisted-migration and will be faster since it uses storage >> technologies to synchronize data. >> If your share isn't associated with a share-type that has >> replication_type='dr' you can: 1) create a new share-type with >> replication_type extra-spec, 2) unmanage your share, 3) manage it again >> using the new share-type. >> >> >> >> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano >> wrote: >> >>> Hello, I am sorry. >>> >>> I read the documentation. >>> >>> SMV must be peered once bye storage admimistrator or using ansible >>> playbook. >>> I must create a two backend in manila.conf with the same replication >>> domain. >>> I must assign to the source a type and set replication type dr. >>> When I create a share if I want to enable snapmirror for it I must >>> create on openstack a share replica for it. >>> The share on destination is read only until I promote it. >>> When I promote it, it become writable. >>> Then I can manage it on target openstack. >>> >>> I hope the above is the correct procedure >>> >>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>> ignaziocassano at gmail.com> ha scritto: >>> >>>> Hi Douglas, you are really kind. >>>> Let my to to recap and please correct if I am wrong: >>>> >>>> - manila share on netapp are under svm >>>> - storage administrator createx a peering between svm source and svm >>>> destination (or on single share volume ?) >>>> - I create a manila share with specs replication type (the share >>>> belongs to source svm) . In manila.conf source and destination must have >>>> the same replication domain >>>> - Creating the replication type it initializes the snapmirror >>>> >>>> Is it correct ? 
>>>> Ignazio >>>> >>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha >>>> scritto: >>>> >>>>> Hi Ignazio, >>>>> >>>>> In order to use share replication between NetApp backends, you'll need >>>>> that Clusters and SVMs be peered in advance, which can be done by the >>>>> storage administrators once. You don't need to handle any SnapMirror >>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>> being backported [3] at the moment. NFS shares should work fine. >>>>> >>>>> If you want, we can assist you on creating replicas for your shares in >>>>> #openstack-manila channel. Just reach us there. >>>>> >>>>> [1] >>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>> [2] >>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>> >>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello, thanks for your help. >>>>>> I am waiting my storage administrators have a window to help me >>>>>> because they must setup the snapmirror. >>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>> work. >>>>>> The share remains in migrating for ever. >>>>>> I am sure the replication-dr works because I tested it one year ago. >>>>>> I had an openstack on site A with a netapp storage >>>>>> I had another openstack on Site B with another netapp storage. >>>>>> The two openstack installation did not share anything. >>>>>> So I made a replication between two volumes (shares). >>>>>> I demoted the source share taking note about its export location list >>>>>> I managed the destination on openstack and it worked. >>>>>> >>>>>> The process for replication is not fully handled by openstack api, >>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>> Instead, using share migration, I could use only openstack api: I >>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>> should work. >>>>>> >>>>>> Best Regards >>>>>> Ignazio >>>>>> >>>>>> >>>>>> >>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>> ha scritto: >>>>>> >>>>>>> Hi Rodrigo, >>>>>>> >>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>> We should continue tomorrow or in the next few days. >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>> >>>>>>>> Hello Ignazio, >>>>>>>> >>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>> data_node_access_ip. 
Your ideal migration scenario is a >>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>> is useful for different backends, not your case. >>>>>>>> >>>>>>>> if you specify "manila migration-start --preserve-metadata True" it >>>>>>>> will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>> >>>>>>>> I used to be familiar with the NetApp driver set up to review your >>>>>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>>>>> maintainers will be able to more accurately review your case and spot the >>>>>>>> problem. >>>>>>>> >>>>>>>> If you could share some info about your scenario such as: >>>>>>>> >>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>>>>> passwords) >>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>> (sanitized if needed) >>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>> parameters. >>>>>>>> >>>>>>>> Regards, >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello All, >>>>>>>>> I am trying to migrate a share between a netapp backend to another. >>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>> share between them. >>>>>>>>> I am using DSSH=False. >>>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>>> and what "data_node_access_ip" means. >>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I can >>>>>>>>> mount also the netapp SVM where the share is located. >>>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>>> management ips. >>>>>>>>> During the migrate phase I checked if my controller where manila >>>>>>>>> is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>> Please, what is my mistake ? >>>>>>>>> Thanks >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Rodrigo Barbieri >>>>>>>> MSc Computer Scientist >>>>>>>> OpenStack Manila Core Contributor >>>>>>>> Federal University of São Carlos >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Douglas Salles Viroel >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> Douglas Salles Viroel >>>>> >>>> >> >> -- >> Douglas Salles Viroel >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rodrigo.barbieri2010 at gmail.com Mon Feb 8 11:56:46 2021 From: rodrigo.barbieri2010 at gmail.com (Rodrigo Barbieri) Date: Mon, 8 Feb 2021 08:56:46 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Ignazio, The way you set it up is correct with "enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the feature to work. Regards, Rodrigo On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano wrote: > Hello All, > I am able to replicate shares between netapp storages but I have some > questions. 
> Reading netapp documentation, seems the replication svm must not be > enabled in manila.conf. > The following is what netapp suggests: > > enabled_share_backends = svm-tst-nfs-565 > > [svm-tst-nfs-565] > share_backend_name = svm-tst-nfs-565 > driver_handles_share_servers = false > share_driver = manila.share.drivers.netapp.common.NetAppDriver > netapp_storage_family = ontap_cluster > netapp_server_hostname = fas8040.csi.it > netapp_server_port = 80 > netapp_login = admin > netapp_password = ****** > netapp_transport_type = http > netapp_vserver = svm-tst-nfs-565 > netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ > replication_domain = replication_domain_1 > > > [netapp-nfs-566] > share_backend_name = netapp-nfs-566 > driver_handles_share_servers = False > share_driver = manila.share.drivers.netapp.common.NetAppDriver > netapp_storage_family = ontap_cluster > netapp_server_hostname = fas.csi.it > netapp_server_port = 80 > netapp_login = admin > netapp_password = ***** > netapp_transport_type = http > netapp_vserver = manila-nfs-566 > netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ > replication_domain = replication_domain_1 > > As you can see above, the target of replica netapp-nfs-566 is not included > in enabled backends. > > When I try to create a replica in this situation, the manila schedule > reports "no valid host found". > > It works if I enable in manila.conf the target like this: > enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 > > Please, any suggestion ? > > Thanks > Ignazio > > > Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >> Thanks, Douglas. >> On another question: >> the manila share-replica-delete delete the snapmirror ? >> If yes, source and destination volume become both writable ? >> >> Ignazio >> >> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha >> scritto: >> >>> Yes, it is correct. This should work as an alternative for >>> host-assisted-migration and will be faster since it uses storage >>> technologies to synchronize data. >>> If your share isn't associated with a share-type that has >>> replication_type='dr' you can: 1) create a new share-type with >>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>> using the new share-type. >>> >>> >>> >>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano >>> wrote: >>> >>>> Hello, I am sorry. >>>> >>>> I read the documentation. >>>> >>>> SMV must be peered once bye storage admimistrator or using ansible >>>> playbook. >>>> I must create a two backend in manila.conf with the same replication >>>> domain. >>>> I must assign to the source a type and set replication type dr. >>>> When I create a share if I want to enable snapmirror for it I must >>>> create on openstack a share replica for it. >>>> The share on destination is read only until I promote it. >>>> When I promote it, it become writable. >>>> Then I can manage it on target openstack. >>>> >>>> I hope the above is the correct procedure >>>> >>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>> ignaziocassano at gmail.com> ha scritto: >>>> >>>>> Hi Douglas, you are really kind. >>>>> Let my to to recap and please correct if I am wrong: >>>>> >>>>> - manila share on netapp are under svm >>>>> - storage administrator createx a peering between svm source and svm >>>>> destination (or on single share volume ?) >>>>> - I create a manila share with specs replication type (the share >>>>> belongs to source svm) . 
In manila.conf source and destination must have >>>>> the same replication domain >>>>> - Creating the replication type it initializes the snapmirror >>>>> >>>>> Is it correct ? >>>>> Ignazio >>>>> >>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas ha >>>>> scritto: >>>>> >>>>>> Hi Ignazio, >>>>>> >>>>>> In order to use share replication between NetApp backends, you'll >>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>> being backported [3] at the moment. NFS shares should work fine. >>>>>> >>>>>> If you want, we can assist you on creating replicas for your shares >>>>>> in #openstack-manila channel. Just reach us there. >>>>>> >>>>>> [1] >>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>> [2] >>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>> >>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Hello, thanks for your help. >>>>>>> I am waiting my storage administrators have a window to help me >>>>>>> because they must setup the snapmirror. >>>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>>> work. >>>>>>> The share remains in migrating for ever. >>>>>>> I am sure the replication-dr works because I tested it one year ago. >>>>>>> I had an openstack on site A with a netapp storage >>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>> The two openstack installation did not share anything. >>>>>>> So I made a replication between two volumes (shares). >>>>>>> I demoted the source share taking note about its export location list >>>>>>> I managed the destination on openstack and it worked. >>>>>>> >>>>>>> The process for replication is not fully handled by openstack api, >>>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>> Instead, using share migration, I could use only openstack api: I >>>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>>> should work. >>>>>>> >>>>>>> Best Regards >>>>>>> Ignazio >>>>>>> >>>>>>> >>>>>>> >>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>>> ha scritto: >>>>>>> >>>>>>>> Hi Rodrigo, >>>>>>>> >>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>> We should continue tomorrow or in the next few days. 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> >>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello Ignazio, >>>>>>>>> >>>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>> is useful for different backends, not your case. >>>>>>>>> >>>>>>>>> if you specify "manila migration-start --preserve-metadata True" >>>>>>>>> it will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>>> >>>>>>>>> I used to be familiar with the NetApp driver set up to review your >>>>>>>>> case, however that was a long time ago. I believe the current NetApp driver >>>>>>>>> maintainers will be able to more accurately review your case and spot the >>>>>>>>> problem. >>>>>>>>> >>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>> >>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, without >>>>>>>>> passwords) >>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>> (sanitized if needed) >>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>> parameters. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hello All, >>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>> another. >>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>>> share between them. >>>>>>>>>> I am using DSSH=False. >>>>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>>>> and what "data_node_access_ip" means. >>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I >>>>>>>>>> can mount also the netapp SVM where the share is located. >>>>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>>>> management ips. >>>>>>>>>> During the migrate phase I checked if my controller where manila >>>>>>>>>> is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>> Please, what is my mistake ? >>>>>>>>>> Thanks >>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Rodrigo Barbieri >>>>>>>>> MSc Computer Scientist >>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>> Federal University of São Carlos >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Douglas Salles Viroel >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Douglas Salles Viroel >>>>>> >>>>> >>> >>> -- >>> Douglas Salles Viroel >>> >> -- Rodrigo Barbieri MSc Computer Scientist OpenStack Manila Core Contributor Federal University of São Carlos -------------- next part -------------- An HTML attachment was scrubbed... 
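As a rough sanity check that the scheduler really sees both backends (and that they report the same replication_domain) before creating the replica:

  manila service-list
  manila pool-list --detail

Both manila-share backends should show up as enabled/up services, and the pool capabilities should include the replication_domain; otherwise the scheduler keeps answering "no valid host found".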
URL: From ignaziocassano at gmail.com Mon Feb 8 12:06:09 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 8 Feb 2021 13:06:09 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Many Thanks, Rodrigo. Il giorno lun 8 feb 2021 alle ore 12:56 Rodrigo Barbieri < rodrigo.barbieri2010 at gmail.com> ha scritto: > Hi Ignazio, > > The way you set it up is correct with "enabled_share_backends = > svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the > feature to work. > > Regards, > > Rodrigo > > On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano > wrote: > >> Hello All, >> I am able to replicate shares between netapp storages but I have some >> questions. >> Reading netapp documentation, seems the replication svm must not be >> enabled in manila.conf. >> The following is what netapp suggests: >> >> enabled_share_backends = svm-tst-nfs-565 >> >> [svm-tst-nfs-565] >> share_backend_name = svm-tst-nfs-565 >> driver_handles_share_servers = false >> share_driver = manila.share.drivers.netapp.common.NetAppDriver >> netapp_storage_family = ontap_cluster >> netapp_server_hostname = fas8040.csi.it >> netapp_server_port = 80 >> netapp_login = admin >> netapp_password = ****** >> netapp_transport_type = http >> netapp_vserver = svm-tst-nfs-565 >> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >> replication_domain = replication_domain_1 >> >> >> [netapp-nfs-566] >> share_backend_name = netapp-nfs-566 >> driver_handles_share_servers = False >> share_driver = manila.share.drivers.netapp.common.NetAppDriver >> netapp_storage_family = ontap_cluster >> netapp_server_hostname = fas.csi.it >> netapp_server_port = 80 >> netapp_login = admin >> netapp_password = ***** >> netapp_transport_type = http >> netapp_vserver = manila-nfs-566 >> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >> replication_domain = replication_domain_1 >> >> As you can see above, the target of replica netapp-nfs-566 is not >> included in enabled backends. >> >> When I try to create a replica in this situation, the manila schedule >> reports "no valid host found". >> >> It works if I enable in manila.conf the target like this: >> enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 >> >> Please, any suggestion ? >> >> Thanks >> Ignazio >> >> >> Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < >> ignaziocassano at gmail.com> ha scritto: >> >>> Thanks, Douglas. >>> On another question: >>> the manila share-replica-delete delete the snapmirror ? >>> If yes, source and destination volume become both writable ? >>> >>> Ignazio >>> >>> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha >>> scritto: >>> >>>> Yes, it is correct. This should work as an alternative for >>>> host-assisted-migration and will be faster since it uses storage >>>> technologies to synchronize data. >>>> If your share isn't associated with a share-type that has >>>> replication_type='dr' you can: 1) create a new share-type with >>>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>>> using the new share-type. >>>> >>>> >>>> >>>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello, I am sorry. >>>>> >>>>> I read the documentation. >>>>> >>>>> SMV must be peered once bye storage admimistrator or using ansible >>>>> playbook. >>>>> I must create a two backend in manila.conf with the same replication >>>>> domain. >>>>> I must assign to the source a type and set replication type dr. 
>>>>> When I create a share if I want to enable snapmirror for it I must >>>>> create on openstack a share replica for it. >>>>> The share on destination is read only until I promote it. >>>>> When I promote it, it become writable. >>>>> Then I can manage it on target openstack. >>>>> >>>>> I hope the above is the correct procedure >>>>> >>>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>>> ignaziocassano at gmail.com> ha scritto: >>>>> >>>>>> Hi Douglas, you are really kind. >>>>>> Let my to to recap and please correct if I am wrong: >>>>>> >>>>>> - manila share on netapp are under svm >>>>>> - storage administrator createx a peering between svm source and svm >>>>>> destination (or on single share volume ?) >>>>>> - I create a manila share with specs replication type (the share >>>>>> belongs to source svm) . In manila.conf source and destination must have >>>>>> the same replication domain >>>>>> - Creating the replication type it initializes the snapmirror >>>>>> >>>>>> Is it correct ? >>>>>> Ignazio >>>>>> >>>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas >>>>>> ha scritto: >>>>>> >>>>>>> Hi Ignazio, >>>>>>> >>>>>>> In order to use share replication between NetApp backends, you'll >>>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>>> being backported [3] at the moment. NFS shares should work fine. >>>>>>> >>>>>>> If you want, we can assist you on creating replicas for your shares >>>>>>> in #openstack-manila channel. Just reach us there. >>>>>>> >>>>>>> [1] >>>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>>> [2] >>>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>>> >>>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, thanks for your help. >>>>>>>> I am waiting my storage administrators have a window to help me >>>>>>>> because they must setup the snapmirror. >>>>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>>>> work. >>>>>>>> The share remains in migrating for ever. >>>>>>>> I am sure the replication-dr works because I tested it one year ago. >>>>>>>> I had an openstack on site A with a netapp storage >>>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>>> The two openstack installation did not share anything. >>>>>>>> So I made a replication between two volumes (shares). >>>>>>>> I demoted the source share taking note about its export location >>>>>>>> list >>>>>>>> I managed the destination on openstack and it worked. >>>>>>>> >>>>>>>> The process for replication is not fully handled by openstack api, >>>>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>>> Instead, using share migration, I could use only openstack api: I >>>>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>>>> should work. 
>>>>>>>> >>>>>>>> Best Regards >>>>>>>> Ignazio >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>>>> ha scritto: >>>>>>>> >>>>>>>>> Hi Rodrigo, >>>>>>>>> >>>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>>> We should continue tomorrow or in the next few days. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hello Ignazio, >>>>>>>>>> >>>>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>>> is useful for different backends, not your case. >>>>>>>>>> >>>>>>>>>> if you specify "manila migration-start --preserve-metadata True" >>>>>>>>>> it will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>>>> >>>>>>>>>> I used to be familiar with the NetApp driver set up to review >>>>>>>>>> your case, however that was a long time ago. I believe the current NetApp >>>>>>>>>> driver maintainers will be able to more accurately review your case and >>>>>>>>>> spot the problem. >>>>>>>>>> >>>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>>> >>>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, >>>>>>>>>> without passwords) >>>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>>> (sanitized if needed) >>>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>>> parameters. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hello All, >>>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>>> another. >>>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>>>> share between them. >>>>>>>>>>> I am using DSSH=False. >>>>>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>>>>> and what "data_node_access_ip" means. >>>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I >>>>>>>>>>> can mount also the netapp SVM where the share is located. >>>>>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>>>>> management ips. 
>>>>>>>>>>> During the migrate phase I checked if my controller where manila >>>>>>>>>>> is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>>> Please, what is my mistake ? >>>>>>>>>>> Thanks >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Rodrigo Barbieri >>>>>>>>>> MSc Computer Scientist >>>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>>> Federal University of São Carlos >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Douglas Salles Viroel >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Douglas Salles Viroel >>>>>>> >>>>>> >>>> >>>> -- >>>> Douglas Salles Viroel >>>> >>> > > -- > Rodrigo Barbieri > MSc Computer Scientist > OpenStack Manila Core Contributor > Federal University of São Carlos > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Mon Feb 8 12:57:00 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 8 Feb 2021 13:57:00 +0100 Subject: [ironic] A new project for useful deploy steps? Message-ID: Hi all, We have finally implemented in-band deploy steps (w00t!), and people started coming up with ideas. I have two currently: 1) configure arbitrary kernel command line arguments via grub 2) write NetworkManager configuration (for those not using cloud-init) I'm not sure how I feel about putting these in IPA proper, seems like we may go down a rabbit hole here. But what about a new project (ironic-python-agent-extras?) with a hardware manager providing a collection of potentially useful deploy steps? Or should we nonetheless just put them in IPA with priority=0? Opinions welcome. Dmitry -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From ildiko.vancsa at gmail.com Mon Feb 8 13:19:32 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Mon, 8 Feb 2021 14:19:32 +0100 Subject: [cyborg] Participating in Anuket hardware acceleration discussions Message-ID: <9C198953-F80F-4487-99DD-469CD758496D@gmail.com> Hi Cyborg Team, I’m reaching out to you about a recent discussion[1] in Anuket (project created by the merger of CNTT and OPNFV) about hardware acceleration. In the Reference Model discussions the interested people are planning to create a sub-group to identify the hardware acceleration needs of a telecom cloud infrastructure and identify and define interfaces to manage and integrate hardware accelerator devices. Currently the discussions are taking place as part of the Reference Model discussions[1] until a sub-group is formed. It would be great to have people from the Cyborg team to participate to update the Anuket contributors about the project’s current features and roadmap as well as gain more requirements that can be used in the project onwards. Would someone from the Cyborg team be available to participate? Please let me know if you have questions. 
Thanks, Ildikó [1] https://wiki.lfnetworking.org/display/LN/2021-02-02+-+RM%3A+Hardware+Acceleration+Abstraction [2] https://wiki.anuket.io/display/HOME/RM From jean-francois.taltavull at elca.ch Mon Feb 8 13:44:26 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Mon, 8 Feb 2021 13:44:26 +0000 Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using keycloak as IDP In-Reply-To: <0917ee6e-f918-8a2e-7abb-6de42724ba5c@rd.bbc.co.uk> References: <4b328f90066149db85d0a006fb7ea01b@elca.ch> <0917ee6e-f918-8a2e-7abb-6de42724ba5c@rd.bbc.co.uk> Message-ID: Hi Jonathan, I cherry-picked the patch on the os_keystone role installed by OSA 21.2.2 and it works. Thanks ! Jean-Francois > -----Original Message----- > From: Jonathan Rosser > Sent: mercredi, 3 février 2021 19:27 > To: openstack-discuss at lists.openstack.org > Subject: Re: [KEYSTONE][FEDERATION] Groups mapping problem when using > keycloak as IDP > > Hi Jean-Francois, > > I made a patch to the openstack-ansible keystone role which will hopefully > address this. It would be really helpful if you are able to test the patch and > provide some feedback. > > https://review.opendev.org/c/openstack/openstack-ansible- > os_keystone/+/773978 > > Regards, > Jonathan. > > On 03/02/2021 10:03, Taltavull Jean-Francois wrote: > > Hello, > > > > Actually, the solution is to add this line to Apache configuration: > > OIDCClaimDelimiter ";" > > > > The problem is that this configuration variable does not exist in OSA keystone > role and its apache configuration template > (https://opendev.org/openstack/openstack-ansible- > os_keystone/src/branch/master/templates/keystone-httpd.conf.j2). > > > > > > Jean-Francois > > > >> -----Original Message----- > >> From: Taltavull Jean-Francois > >> Sent: lundi, 1 février 2021 14:44 > >> To: openstack-discuss at lists.openstack.org > >> Subject: [KEYSTONE][FEDERATION] Groups mapping problem when using > >> keycloak as IDP > >> > >> Hello, > >> > >> In order to implement identity federation, I've deployed (with OSA) > >> keystone > >> (Ussuri) as Service Provider and Keycloak as IDP. > >> > >> As one can read at [1], "groups" can have multiple values and each > >> value must be separated by a ";" > >> > >> But, in the OpenID token sent by keycloak, groups are represented > >> with a JSON list and keystone fails to parse it well (only the first group of the > list is mapped). > >> > >> Have any of you already faced this problem ? > >> > >> Thanks ! > >> > >> Jean-François > >> > >> [1] > >> https://docs.openstack.org/keystone/ussuri/admin/federation/mapping_c > >> ombi > >> nations.html > > From abishop at redhat.com Mon Feb 8 14:07:26 2021 From: abishop at redhat.com (Alan Bishop) Date: Mon, 8 Feb 2021 06:07:26 -0800 Subject: [ops][cinder][horizon][kolla-ansible] cinder-backup fails if source disk not in nova az In-Reply-To: <4667D680-E1FD-4FA3-811D-065BE735EFD1@colorado.edu> References: <9650D354-5141-47CA-B3C6-EB867CE4524F@colorado.edu> <4667D680-E1FD-4FA3-811D-065BE735EFD1@colorado.edu> Message-ID: On Mon, Feb 8, 2021 at 2:27 AM Adam Zheng wrote: > Hello Alan, > > > > Thank you for the clarification and the pointers. > > I did also previously find that only the cinder client had the az option, > which appeared to work on backing up a volume from another az. > > > > However, is there a way to get this working from horizon? While I can > certainly make pages for my users that they will need to use the cli to do > backups I feel it is not very friendly on my part to do that. 
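For the CLI route being referred to, the cross-AZ backup and the restore workaround look roughly like the following sketch (this assumes a python-cinderclient new enough for microversion 3.51; the names, IDs and availability zone are illustrative):

  cinder --os-volume-api-version 3.51 backup-create --name vol1-backup --availability-zone nova <volume-id>
  cinder backup-restore --volume <precreated-volume-in-target-az> <backup-id>

As far as I understand it, the availability zone passed to backup-create is the zone of the cinder-backup service that should take the backup, which is what lets a backup service in "nova" back up a volume living in "not-nova".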
> Adding [horizon] tag to involve that community. > For now if there is not a way, I may leave the az for cinder on nova so > backups/restores work in properly in horizon. I can control where the data > is in ceph; was mainly just hoping to set this in openstack for > aesthetic/clarity (ie which datacenter they are saving their volumes) for > users utilizing the horizon volumes interface. > > > > Thanks, > > > > -- > > Adam > > > > *From: *Alan Bishop > *Date: *Friday, February 5, 2021 at 1:49 PM > *To: *Adam Zheng > *Cc: *"openstack-discuss at lists.openstack.org" < > openstack-discuss at lists.openstack.org> > *Subject: *Re: [ops][cinder][kolla-ansible] cinder-backup fails if source > disk not in nova az > > > > > > > > On Fri, Feb 5, 2021 at 10:00 AM Adam Zheng > wrote: > > Hello, > > > > I’ve been trying to get availability zones defined for volumes. > > Everything works fine if I leave the zone at “nova”, all volume types work > and backups/snapshots also work. > > > > ie: > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > | Binary | Host | Zone | Status | State | > Updated At | > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > | cinder-scheduler | cs-os-ctl-001 | nova | enabled | up | > 2021-02-05T17:22:51.000000 | > > | cinder-scheduler | cs-os-ctl-003 | nova | enabled | up | > 2021-02-05T17:22:54.000000 | > > | cinder-scheduler | cs-os-ctl-002 | nova | enabled | up | > 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:54.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:50.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | nova | enabled | up > | 2021-02-05T17:22:55.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:57.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | up > | 2021-02-05T17:22:54.000000 | > > | cinder-backup | cs-os-ctl-002 | nova | enabled | up | > 2021-02-05T17:22:56.000000 | > > | cinder-backup | cs-os-ctl-001 | nova | enabled | up | > 2021-02-05T17:22:53.000000 | > > | cinder-backup | cs-os-ctl-003 | nova | enabled | up | > 2021-02-05T17:22:58.000000 | > > > +------------------+----------------------------+------+---------+-------+----------------------------+ > > > > However, if I apply the following changes: > > > > cinder-api.conf > > [DEFAULT] > > default_availability_zone = not-nova > > default_volume_type = ceph-gp2 > > allow_availability_zone_fallback=True > > > > cinder-volume.conf > > [rbd-ceph-gp2] > > <…> > > backend_availability_zone = not-nova > > <…> > > > > I’ll get the following > > > +------------------+----------------------------+----------+---------+-------+----------------------------+ > > | Binary | Host | Zone | Status | > State | Updated At | > > > +------------------+----------------------------+----------+---------+-------+----------------------------+ > > | cinder-scheduler | cs-os-ctl-001 | nova | enabled | > up | 2021-02-05T17:22:51.000000 | > > | cinder-scheduler | cs-os-ctl-003 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-scheduler | cs-os-ctl-002 | nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-gp2 | 
not-nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-volume | cs-os-ctl-001 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-gp2 | not-nova | enabled | > up | 2021-02-05T17:22:50.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-gp2 | not-nova | enabled | > up | 2021-02-05T17:22:55.000000 | > > | cinder-volume | cs-os-ctl-002 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:57.000000 | > > | cinder-volume | cs-os-ctl-003 at rbd-ceph-st1 | nova | enabled | > up | 2021-02-05T17:22:54.000000 | > > | cinder-backup | cs-os-ctl-002 | nova | enabled | > up | 2021-02-05T17:22:56.000000 | > > | cinder-backup | cs-os-ctl-001 | nova | enabled | > up | 2021-02-05T17:22:53.000000 | > > | cinder-backup | cs-os-ctl-003 | nova | enabled | > up | 2021-02-05T17:22:58.000000 | > > > +------------------+----------------------------+----------+---------+-------+----------------------------+ > > > > At this point, creating new volumes still work and go into the expected > ceph pools. > > However, backups no longer work for the cinder-volume that is not nova. > > In the above example, it still works fine for volumes that that were > created with type “ceph-gp2” in az “nova”. > > Does not work for volumes that were created with type “ceph-st1” in az > “not-nova”. It fails immediately and goes into error state with reason > “Service not found for creating backup.” > > Hi Adam, > > > > Cinder's backup service has the ability to create backups of volumes in > another AZ. The 'cinder' CLI supports this feature as of microversion 3.51. > (bear in mind the 'openstack' client doesn't support microversions for the > cinder (volume) service, so you'll need to use the 'cinder' command. > > > > Rather than repeat what I've written previously, I refer you to [1] for > additional details. > > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1649845#c4 > > > > One other thing to note is the corresponding "cinder backup-restore" > command currently does not support restoring to a volume in another AZ, but > there is a workaround. You can pre-create a new volume in the destination > AZ, and use the ability to restore a backup to a specific volume (which > just happens to be in your desired AZ). > > > > There's also a patch [2] under review to enhance the cinder shell so that > both backup and restore shell commands work the same way. > > > > [2] https://review.opendev.org/c/openstack/python-cinderclient/+/762020 > > > > > Alan > > > > I suspect I need to try to get another set of “cinder-backup” services > running in the Zone “not-nova”, but cannot seem to figure out how. > > > > I’ve scoured the docs on cinder.conf, and if I set default zones in > cinder-backup (I’ve tried backend_availability_zone, > default_availability_zone, and storage_availability_zone) I cannot seem to > get backups working if the disk it’s backing up is not in az “nova”. The > cinder-backup service in volume service list will always show “nova” no > matter what I put there. > > > > Any advice would be appreciated. > > OpenStack Victoria deployed via kolla-ansible > > > > Thanks! > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Mon Feb 8 14:45:54 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Mon, 8 Feb 2021 11:45:54 -0300 Subject: [cinder] bug deputy report for week of 2021-02-01 Message-ID: This is a bug report from 2021-01-27 to 2021-01-05. 
Most of this bugs were discussed at the Cinder meeting last Wednesday 2021-02-03 Critical: - High: - Medium: - https://bugs.launchpad.net/cinder/+bug/1914319: "DellEMC VNX driver: volume is not created on array but status is 'available' in openstack". Assigned to Yong Huan. - https://bugs.launchpad.net/cinder/+bug/1913578: "PowerFlex documentation contains invalid paths". Unassigned. - https://bugs.launchpad.net/python-cinderclient/+bug/1913474: "Cinderclient needs to support mv 3.63". Unassigned. Low: - https://bugs.launchpad.net/cinder/+bug/1913705: "OsProfiler does not work". Unassigned. Incomplete: - https://bugs.launchpad.net/cinder/+bug/1913363: "Multiple calls to lshost during attach operation". Assigned to Girish Chilukuri. Undecided/Unconfirmed: - https://bugs.launchpad.net/cinder/+bug/1914639: "[NetApp] Issue while trying to attach a volume using CHAP authentication". Unassigned. Not a bug: - https://bugs.launchpad.net/cinder/+bug/1914273: "After attachment-delete nova still sees volume as attached". Feel free to reply/reach me if I missed something. Regards Sofi -- L. Sofía Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Feb 8 15:15:50 2021 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 8 Feb 2021 16:15:50 +0100 Subject: =?UTF-8?Q?=5Brelease=5D_Proposing_El=C5=91d_Ill=C3=A9s_=28elod=29_for_Release_?= =?UTF-8?Q?Management_Core?= Message-ID: Hi, Előd has been working on Release management for quite some time now and in that time has shown tremendous growth in his understanding of our processes and on how deliverables work on Openstack. I think he would make a good addition to the core team. Existing team members, please respond with +1/-1. If there are no objections we'll add him to the ACL soon. :-) Thanks. -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From opensrloo at gmail.com Mon Feb 8 15:35:02 2021 From: opensrloo at gmail.com (Ruby Loo) Date: Mon, 8 Feb 2021 10:35:02 -0500 Subject: [ironic] A new project for useful deploy steps? In-Reply-To: References: Message-ID: Hi Dmitry, Thanks for bringing this up! We discussed this in our weekly ironic meeting [1]. The consensus there seems to be to keep the ideas in IPA (with priority=0). The additional code will be 'negligible' in size so ramdisk won't be bloated due to this. Also, it keeps things simple. Having a separate package means more maintenance overhead and confusion for our users. 
Would be good to hear from others, if they don't think this is a good idea. Otherwise, I'm looking forward to Dmitry's RFEs on this :) --ruby [1] http://eavesdrop.openstack.org/irclogs/%23openstack-ironic/%23openstack-ironic.2021-02-08.log.html#t2021-02-08T15:23:02 On Mon, Feb 8, 2021 at 8:02 AM Dmitry Tantsur wrote: > Hi all, > > We have finally implemented in-band deploy steps (w00t!), and people > started coming up with ideas. I have two currently: > 1) configure arbitrary kernel command line arguments via grub > 2) write NetworkManager configuration (for those not using cloud-init) > > I'm not sure how I feel about putting these in IPA proper, seems like we > may go down a rabbit hole here. But what about a new project > (ironic-python-agent-extras?) with a hardware manager providing a > collection of potentially useful deploy steps? > > Or should we nonetheless just put them in IPA with priority=0? > > Opinions welcome. > > Dmitry > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Feb 8 16:09:34 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 8 Feb 2021 08:09:34 -0800 Subject: [ironic] A new project for useful deploy steps? In-Reply-To: References: Message-ID: I concur. The only thing with RFE's and patch approvals is I think we should remember that we want it to be easy. So processes like RFEs may not be helpful to a "oh, this tiny little thing makes a lot of sense" sort of things, since it quickly becomes a situation where you spend more time on the RFE than the patch itself. On Mon, Feb 8, 2021 at 7:43 AM Ruby Loo wrote: > > Hi Dmitry, > > Thanks for bringing this up! We discussed this in our weekly ironic meeting [1]. The consensus there seems to be to keep the ideas in IPA (with priority=0). The additional code will be 'negligible' in size so ramdisk won't be bloated due to this. Also, it keeps things simple. Having a separate package means more maintenance overhead and confusion for our users. > > Would be good to hear from others, if they don't think this is a good idea. Otherwise, I'm looking forward to Dmitry's RFEs on this :) > > --ruby > > [1] http://eavesdrop.openstack.org/irclogs/%23openstack-ironic/%23openstack-ironic.2021-02-08.log.html#t2021-02-08T15:23:02 > > On Mon, Feb 8, 2021 at 8:02 AM Dmitry Tantsur wrote: >> >> Hi all, >> >> We have finally implemented in-band deploy steps (w00t!), and people started coming up with ideas. I have two currently: >> 1) configure arbitrary kernel command line arguments via grub >> 2) write NetworkManager configuration (for those not using cloud-init) >> >> I'm not sure how I feel about putting these in IPA proper, seems like we may go down a rabbit hole here. But what about a new project (ironic-python-agent-extras?) with a hardware manager providing a collection of potentially useful deploy steps? >> >> Or should we nonetheless just put them in IPA with priority=0? >> >> Opinions welcome. 
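For illustration, a generic deploy step like the ones proposed could be carried by a hardware manager roughly as sketched below. This assumes the standard ironic-python-agent HardwareManager interface; the step body is only a placeholder, and priority 0 keeps it opt-in so it never runs unless an operator asks for it in their deploy steps.

from ironic_python_agent import hardware


class ExtraStepsHardwareManager(hardware.HardwareManager):
    # Sketch of a manager carrying optional, generic deploy steps.

    HARDWARE_MANAGER_NAME = 'ExtraStepsHardwareManager'
    HARDWARE_MANAGER_VERSION = '1.0'

    def evaluate_hardware_support(self):
        # Generic steps that apply to any hardware.
        return hardware.HardwareSupport.GENERIC

    def get_deploy_steps(self, node, ports):
        return [{
            'step': 'configure_kernel_cmdline',
            'priority': 0,  # opt-in only, never runs by default
            'interface': 'deploy',
            'reboot_requested': False,
            'argsinfo': {
                'kernel_args': {
                    'description': 'Arguments appended to the kernel '
                                   'command line of the deployed image',
                    'required': True,
                },
            },
        }]

    def configure_kernel_cmdline(self, node, ports, kernel_args=None):
        # Placeholder: mount the deployed root filesystem, update the
        # GRUB defaults and regenerate the bootloader configuration.
        pass
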
>> >> Dmitry >> >> -- >> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> Commercial register: Amtsgericht Muenchen, HRB 153243, >> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill From ildiko.vancsa at gmail.com Mon Feb 8 17:44:24 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Mon, 8 Feb 2021 18:44:24 +0100 Subject: [edge][all] Features and updates relevant to edge computing use cases Message-ID: <71140411-61B6-4D80-B6F4-E3EF74D36D2B@gmail.com> Hi, I’m reaching out as I need a little guidance. I was asked to give a presentation about the latest features and improvements in OpenStack that are relevant for edge computing. I wanted to ask for the community’s help, as the project teams are the experts, to give any pointers that you think would be great to include in my presentation. I appreciate all your guidance in advance. Thanks, Ildikó From ignaziocassano at gmail.com Mon Feb 8 18:01:22 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 8 Feb 2021 19:01:22 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello All, I have another question about replication sync time, please. I did not specify any option in manila.conf and seems netapp set it one time every hour. I did not understand if replica_state_update_interval is the replication sync time frequency or only checks the replica state. Is there any parameter I can use to setup the replica sync time ? Thanks Ignazio Il giorno lun 8 feb 2021 alle ore 12:56 Rodrigo Barbieri < rodrigo.barbieri2010 at gmail.com> ha scritto: > Hi Ignazio, > > The way you set it up is correct with "enabled_share_backends = > svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the > feature to work. > > Regards, > > Rodrigo > > On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano > wrote: > >> Hello All, >> I am able to replicate shares between netapp storages but I have some >> questions. >> Reading netapp documentation, seems the replication svm must not be >> enabled in manila.conf. >> The following is what netapp suggests: >> >> enabled_share_backends = svm-tst-nfs-565 >> >> [svm-tst-nfs-565] >> share_backend_name = svm-tst-nfs-565 >> driver_handles_share_servers = false >> share_driver = manila.share.drivers.netapp.common.NetAppDriver >> netapp_storage_family = ontap_cluster >> netapp_server_hostname = fas8040.csi.it >> netapp_server_port = 80 >> netapp_login = admin >> netapp_password = ****** >> netapp_transport_type = http >> netapp_vserver = svm-tst-nfs-565 >> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >> replication_domain = replication_domain_1 >> >> >> [netapp-nfs-566] >> share_backend_name = netapp-nfs-566 >> driver_handles_share_servers = False >> share_driver = manila.share.drivers.netapp.common.NetAppDriver >> netapp_storage_family = ontap_cluster >> netapp_server_hostname = fas.csi.it >> netapp_server_port = 80 >> netapp_login = admin >> netapp_password = ***** >> netapp_transport_type = http >> netapp_vserver = manila-nfs-566 >> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >> replication_domain = replication_domain_1 >> >> As you can see above, the target of replica netapp-nfs-566 is not >> included in enabled backends. >> >> When I try to create a replica in this situation, the manila schedule >> reports "no valid host found". >> >> It works if I enable in manila.conf the target like this: >> enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 >> >> Please, any suggestion ? 
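For reference, the replica workflow discussed in this thread maps to roughly the following commands (share, type and replica names are illustrative; the share type carries the replication_type extra-spec):

# share type whose extra-spec enables DR replication
manila type-create replicated-nfs false
manila type-key replicated-nfs set replication_type=dr

# create the share on the source backend, then add a replica on the target
manila create NFS 10 --name share1 --share-type replicated-nfs
manila share-replica-create share1

# watch replica_state move to in_sync, optionally forcing a refresh
manila share-replica-list --share-id share1
manila share-replica-resync <replica-id>

# when needed, make the copy on the target writable
manila share-replica-promote <replica-id>
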
>> >> Thanks >> Ignazio >> >> >> Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < >> ignaziocassano at gmail.com> ha scritto: >> >>> Thanks, Douglas. >>> On another question: >>> the manila share-replica-delete delete the snapmirror ? >>> If yes, source and destination volume become both writable ? >>> >>> Ignazio >>> >>> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha >>> scritto: >>> >>>> Yes, it is correct. This should work as an alternative for >>>> host-assisted-migration and will be faster since it uses storage >>>> technologies to synchronize data. >>>> If your share isn't associated with a share-type that has >>>> replication_type='dr' you can: 1) create a new share-type with >>>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>>> using the new share-type. >>>> >>>> >>>> >>>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello, I am sorry. >>>>> >>>>> I read the documentation. >>>>> >>>>> SMV must be peered once bye storage admimistrator or using ansible >>>>> playbook. >>>>> I must create a two backend in manila.conf with the same replication >>>>> domain. >>>>> I must assign to the source a type and set replication type dr. >>>>> When I create a share if I want to enable snapmirror for it I must >>>>> create on openstack a share replica for it. >>>>> The share on destination is read only until I promote it. >>>>> When I promote it, it become writable. >>>>> Then I can manage it on target openstack. >>>>> >>>>> I hope the above is the correct procedure >>>>> >>>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>>> ignaziocassano at gmail.com> ha scritto: >>>>> >>>>>> Hi Douglas, you are really kind. >>>>>> Let my to to recap and please correct if I am wrong: >>>>>> >>>>>> - manila share on netapp are under svm >>>>>> - storage administrator createx a peering between svm source and svm >>>>>> destination (or on single share volume ?) >>>>>> - I create a manila share with specs replication type (the share >>>>>> belongs to source svm) . In manila.conf source and destination must have >>>>>> the same replication domain >>>>>> - Creating the replication type it initializes the snapmirror >>>>>> >>>>>> Is it correct ? >>>>>> Ignazio >>>>>> >>>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas >>>>>> ha scritto: >>>>>> >>>>>>> Hi Ignazio, >>>>>>> >>>>>>> In order to use share replication between NetApp backends, you'll >>>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>>> being backported [3] at the moment. NFS shares should work fine. >>>>>>> >>>>>>> If you want, we can assist you on creating replicas for your shares >>>>>>> in #openstack-manila channel. Just reach us there. 
>>>>>>> >>>>>>> [1] >>>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>>> [2] >>>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>>> >>>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, thanks for your help. >>>>>>>> I am waiting my storage administrators have a window to help me >>>>>>>> because they must setup the snapmirror. >>>>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>>>> work. >>>>>>>> The share remains in migrating for ever. >>>>>>>> I am sure the replication-dr works because I tested it one year ago. >>>>>>>> I had an openstack on site A with a netapp storage >>>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>>> The two openstack installation did not share anything. >>>>>>>> So I made a replication between two volumes (shares). >>>>>>>> I demoted the source share taking note about its export location >>>>>>>> list >>>>>>>> I managed the destination on openstack and it worked. >>>>>>>> >>>>>>>> The process for replication is not fully handled by openstack api, >>>>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>>> Instead, using share migration, I could use only openstack api: I >>>>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>>>> should work. >>>>>>>> >>>>>>>> Best Regards >>>>>>>> Ignazio >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>>>> ha scritto: >>>>>>>> >>>>>>>>> Hi Rodrigo, >>>>>>>>> >>>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>>> We should continue tomorrow or in the next few days. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hello Ignazio, >>>>>>>>>> >>>>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>>> is useful for different backends, not your case. 
>>>>>>>>>> >>>>>>>>>> if you specify "manila migration-start --preserve-metadata True" >>>>>>>>>> it will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>>>> >>>>>>>>>> I used to be familiar with the NetApp driver set up to review >>>>>>>>>> your case, however that was a long time ago. I believe the current NetApp >>>>>>>>>> driver maintainers will be able to more accurately review your case and >>>>>>>>>> spot the problem. >>>>>>>>>> >>>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>>> >>>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, >>>>>>>>>> without passwords) >>>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>>> (sanitized if needed) >>>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>>> parameters. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hello All, >>>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>>> another. >>>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>>>> share between them. >>>>>>>>>>> I am using DSSH=False. >>>>>>>>>>> I did not understand how host and driver assisted migration work >>>>>>>>>>> and what "data_node_access_ip" means. >>>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I >>>>>>>>>>> can mount also the netapp SVM where the share is located. >>>>>>>>>>> So in the data_node_access_ip I wrote the list of my controllers >>>>>>>>>>> management ips. >>>>>>>>>>> During the migrate phase I checked if my controller where manila >>>>>>>>>>> is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>>> Please, what is my mistake ? >>>>>>>>>>> Thanks >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Rodrigo Barbieri >>>>>>>>>> MSc Computer Scientist >>>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>>> Federal University of São Carlos >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Douglas Salles Viroel >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Douglas Salles Viroel >>>>>>> >>>>>> >>>> >>>> -- >>>> Douglas Salles Viroel >>>> >>> > > -- > Rodrigo Barbieri > MSc Computer Scientist > OpenStack Manila Core Contributor > Federal University of São Carlos > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-francois.taltavull at elca.ch Mon Feb 8 18:14:33 2021 From: jean-francois.taltavull at elca.ch (Taltavull Jean-Francois) Date: Mon, 8 Feb 2021 18:14:33 +0000 Subject: Rally - Unable to install rally - install_rally.sh is not available in repo In-Reply-To: References: <3cb755495b994352aaadf0d31ad295f3@elca.ch> <7fdbc97a688744f399f7358b1250bc30@elca.ch> Message-ID: Hello Ankit, All that looks good to me. 
“rally deployment show” works fine for me, with the same version of rally… From: Ankit Goel Sent: vendredi, 5 février 2021 18:44 To: Taltavull Jean-Francois ; openstack-dev at lists.openstack.org Cc: John Spillane Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hi Jean, Please find below the output. (rally) [root at rally tasks]# pip list | grep rally WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip. Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue. To avoid this problem you can invoke Python with '-m pip' instead of running pip directly. rally 3.2.0 rally-openstack 2.1.0 (rally) [root at rally tasks]# I sourced my Openstack RC file and then used the below command to create the deployment. rally deployment create --fromenv --name=existing Below are the contents for Openstack-rc file. (rally) [root at rally ~]# cat admin-openstack.sh export OS_PROJECT_DOMAIN_NAME=Default export OS_USER_DOMAIN_NAME=Default export OS_PROJECT_NAME=admin export OS_TENANT_NAME=admin export OS_USERNAME=admin export OS_PASSWORD=kMswJJGAeKhziXGWloLjYESvfOytK4DkCAAXcpA8 export OS_AUTH_URL=https://osp.example.com:5000/v3 export OS_INTERFACE=public export OS_IDENTITY_API_VERSION=3 export OS_REGION_NAME=RegionOne export OS_AUTH_PLUGIN=password export OS_CACERT=/root/certificates/haproxy-ca.crt (rally) [root at rally ~]# Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 05 February 2021 13:39 To: Ankit Goel >; openstack-dev at lists.openstack.org Cc: John Spillane > Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello, 1/ Is “rally-openstack” python package correctly installed ? On my side I have: (venv) vagrant at rally: $ pip list | grep rally rally 3.2.0 rally-openstack 2.1.0 2/ Could you please show the json file used to create the deployment ? From: Ankit Goel > Sent: jeudi, 4 février 2021 18:19 To: Taltavull Jean-Francois >; openstack-dev at lists.openstack.org Cc: John Spillane > Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Thanks for the response Jean. I could install rally with pip command. But when I am running rally deployment show command then it is failing. 
(rally) [root at rally ~]# rally deployment list +--------------------------------------+----------------------------+----------+------------------+--------+ | uuid | created_at | name | status | active | +--------------------------------------+----------------------------+----------+------------------+--------+ | 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a | 2021-02-04T14:57:02.638753 | existing | deploy->finished | * | +--------------------------------------+----------------------------+----------+------------------+--------+ (rally) [root at rally ~]# (rally) [root at rally ~]# rally deployment show 43f3d3d9-ea22-4e75-9c47-daa83cc51c2a Command failed, please check log for more info 2021-02-04 16:06:49.576 19306 CRITICAL rally [-] Unhandled error: KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally Traceback (most recent call last): 2021-02-04 16:06:49.576 19306 ERROR rally File "/bin/rally", line 11, in 2021-02-04 16:06:49.576 19306 ERROR rally sys.exit(main()) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/main.py", line 40, in main 2021-02-04 16:06:49.576 19306 ERROR rally return cliutils.run(sys.argv, categories) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/cliutils.py", line 669, in run 2021-02-04 16:06:49.576 19306 ERROR rally ret = fn(*fn_args, **fn_kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/envutils.py", line 135, in default_from_global 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "", line 2, in show 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/plugins/__init__.py", line 59, in ensure_plugins_are_loaded 2021-02-04 16:06:49.576 19306 ERROR rally return f(*args, **kwargs) 2021-02-04 16:06:49.576 19306 ERROR rally File "/usr/local/lib/python3.6/site-packages/rally/cli/commands/deployment.py", line 205, in show 2021-02-04 16:06:49.576 19306 ERROR rally creds = deployment["credentials"]["openstack"][0] 2021-02-04 16:06:49.576 19306 ERROR rally KeyError: 'openstack' 2021-02-04 16:06:49.576 19306 ERROR rally (rally) [root at rally ~]# Can you please help me to resolve this issue. Regards, Ankit Goel From: Taltavull Jean-Francois > Sent: 03 February 2021 19:38 To: Ankit Goel >; openstack-dev at lists.openstack.org Subject: RE: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Ankit, Installation part of Rally official doc is not up to date, actually. Just do “pip install rally-openstack” (in a virtualenv, of course 😊) This will also install “rally” python package. Enjoy ! Jean-Francois From: Ankit Goel > Sent: mercredi, 3 février 2021 13:40 To: openstack-dev at lists.openstack.org Subject: Rally - Unable to install rally - install_rally.sh is not available in repo Hello Experts, I was trying to install Openstack rally on centos 7 VM but the link provided in the Openstack doc to download the install_rally.sh is broken. 
Latest Rally Doc link - > https://docs.openstack.org/rally/latest/install_and_upgrade/install.html#automated-installation Rally Install Script -> https://raw.githubusercontent.com/openstack/rally/master/install_rally.sh - > This is broken After searching on internet I could reach to the Openstack rally Repo - > https://opendev.org/openstack/rally but here I am not seeing the install_ rally.sh script and according to all the information available on internet it says we need install_ rally.sh. Thus can you please let me know what’s the latest procedure to install Rally. Awaiting for your response. Thanks, Ankit Goel -------------- next part -------------- An HTML attachment was scrubbed... URL: From viroel at gmail.com Mon Feb 8 18:39:47 2021 From: viroel at gmail.com (Douglas) Date: Mon, 8 Feb 2021 15:39:47 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hi Ignazio, The 'replica_state_update_interval' is the time interval that Manila waits before requesting a status update to the storage system. It will request this information for every replica that isn't 'in-sync' status yet. The default value of this config option is actually 300 seconds, not 1 hour. You can also manually request this update by issuing 'share-replica-resync' operation[1]. I believe that you might be mixing with the 'snapmirror' schedule concept. Indeed, snapmirror relationships are created with 'schedule' set to be hourly. This 'schedule' is used to update the destination replica, incrementally, after the snapmirror becames in-sync (snapmirrored), since we use Asynchronous SnapMirror[2]. [1] https://docs.openstack.org/api-ref/shared-file-system/?expanded=resync-share-replica-detail#resync-share-replica [2] https://docs.netapp.com/ontap-9/topic/com.netapp.doc.pow-dap/GUID-18263F03-486B-434C-A190-C05D3AFC05DD.html On Mon, Feb 8, 2021 at 3:01 PM Ignazio Cassano wrote: > Hello All, > I have another question about replication sync time, please. > I did not specify any option in manila.conf and seems netapp set it one > time every hour. > I did not understand if replica_state_update_interval is the replication > sync time frequency or only checks the replica state. > Is there any parameter I can use to setup the replica sync time ? > Thanks > Ignazio > > > > Il giorno lun 8 feb 2021 alle ore 12:56 Rodrigo Barbieri < > rodrigo.barbieri2010 at gmail.com> ha scritto: > >> Hi Ignazio, >> >> The way you set it up is correct with "enabled_share_backends = >> svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the >> feature to work. >> >> Regards, >> >> Rodrigo >> >> On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano >> wrote: >> >>> Hello All, >>> I am able to replicate shares between netapp storages but I have some >>> questions. >>> Reading netapp documentation, seems the replication svm must not be >>> enabled in manila.conf. 
>>> The following is what netapp suggests: >>> >>> enabled_share_backends = svm-tst-nfs-565 >>> >>> [svm-tst-nfs-565] >>> share_backend_name = svm-tst-nfs-565 >>> driver_handles_share_servers = false >>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>> netapp_storage_family = ontap_cluster >>> netapp_server_hostname = fas8040.csi.it >>> netapp_server_port = 80 >>> netapp_login = admin >>> netapp_password = ****** >>> netapp_transport_type = http >>> netapp_vserver = svm-tst-nfs-565 >>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>> replication_domain = replication_domain_1 >>> >>> >>> [netapp-nfs-566] >>> share_backend_name = netapp-nfs-566 >>> driver_handles_share_servers = False >>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>> netapp_storage_family = ontap_cluster >>> netapp_server_hostname = fas.csi.it >>> netapp_server_port = 80 >>> netapp_login = admin >>> netapp_password = ***** >>> netapp_transport_type = http >>> netapp_vserver = manila-nfs-566 >>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>> replication_domain = replication_domain_1 >>> >>> As you can see above, the target of replica netapp-nfs-566 is not >>> included in enabled backends. >>> >>> When I try to create a replica in this situation, the manila schedule >>> reports "no valid host found". >>> >>> It works if I enable in manila.conf the target like this: >>> enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 >>> >>> Please, any suggestion ? >>> >>> Thanks >>> Ignazio >>> >>> >>> Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < >>> ignaziocassano at gmail.com> ha scritto: >>> >>>> Thanks, Douglas. >>>> On another question: >>>> the manila share-replica-delete delete the snapmirror ? >>>> If yes, source and destination volume become both writable ? >>>> >>>> Ignazio >>>> >>>> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha >>>> scritto: >>>> >>>>> Yes, it is correct. This should work as an alternative for >>>>> host-assisted-migration and will be faster since it uses storage >>>>> technologies to synchronize data. >>>>> If your share isn't associated with a share-type that has >>>>> replication_type='dr' you can: 1) create a new share-type with >>>>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>>>> using the new share-type. >>>>> >>>>> >>>>> >>>>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello, I am sorry. >>>>>> >>>>>> I read the documentation. >>>>>> >>>>>> SMV must be peered once bye storage admimistrator or using ansible >>>>>> playbook. >>>>>> I must create a two backend in manila.conf with the same replication >>>>>> domain. >>>>>> I must assign to the source a type and set replication type dr. >>>>>> When I create a share if I want to enable snapmirror for it I must >>>>>> create on openstack a share replica for it. >>>>>> The share on destination is read only until I promote it. >>>>>> When I promote it, it become writable. >>>>>> Then I can manage it on target openstack. >>>>>> >>>>>> I hope the above is the correct procedure >>>>>> >>>>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Douglas, you are really kind. >>>>>>> Let my to to recap and please correct if I am wrong: >>>>>>> >>>>>>> - manila share on netapp are under svm >>>>>>> - storage administrator createx a peering between svm source and svm >>>>>>> destination (or on single share volume ?) 
>>>>>>> - I create a manila share with specs replication type (the share >>>>>>> belongs to source svm) . In manila.conf source and destination must have >>>>>>> the same replication domain >>>>>>> - Creating the replication type it initializes the snapmirror >>>>>>> >>>>>>> Is it correct ? >>>>>>> Ignazio >>>>>>> >>>>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas >>>>>>> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio, >>>>>>>> >>>>>>>> In order to use share replication between NetApp backends, you'll >>>>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>>>> being backported [3] at the moment. NFS shares should work fine. >>>>>>>> >>>>>>>> If you want, we can assist you on creating replicas for your shares >>>>>>>> in #openstack-manila channel. Just reach us there. >>>>>>>> >>>>>>>> [1] >>>>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>>>> [2] >>>>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>>>> >>>>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello, thanks for your help. >>>>>>>>> I am waiting my storage administrators have a window to help me >>>>>>>>> because they must setup the snapmirror. >>>>>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>>>>> work. >>>>>>>>> The share remains in migrating for ever. >>>>>>>>> I am sure the replication-dr works because I tested it one year >>>>>>>>> ago. >>>>>>>>> I had an openstack on site A with a netapp storage >>>>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>>>> The two openstack installation did not share anything. >>>>>>>>> So I made a replication between two volumes (shares). >>>>>>>>> I demoted the source share taking note about its export location >>>>>>>>> list >>>>>>>>> I managed the destination on openstack and it worked. >>>>>>>>> >>>>>>>>> The process for replication is not fully handled by openstack api, >>>>>>>>> so I should call netapp api for creating snapmirror relationship or >>>>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>>>> Instead, using share migration, I could use only openstack api: I >>>>>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>>>>> should work. >>>>>>>>> >>>>>>>>> Best Regards >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>>>>> ha scritto: >>>>>>>>> >>>>>>>>>> Hi Rodrigo, >>>>>>>>>> >>>>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>>>> driver-assisted-migration with NetApp driver. 
So, instead of using >>>>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>>>> We should continue tomorrow or in the next few days. >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> >>>>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hello Ignazio, >>>>>>>>>>> >>>>>>>>>>> If you are attempting to migrate between 2 NetApp backends, then >>>>>>>>>>> you shouldn't need to worry about correctly setting the >>>>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>>>> is useful for different backends, not your case. >>>>>>>>>>> >>>>>>>>>>> if you specify "manila migration-start --preserve-metadata True" >>>>>>>>>>> it will prevent the fallback to host-assisted, so it is easier for you to >>>>>>>>>>> narrow down the issue with the host-assisted migration out of the way. >>>>>>>>>>> >>>>>>>>>>> I used to be familiar with the NetApp driver set up to review >>>>>>>>>>> your case, however that was a long time ago. I believe the current NetApp >>>>>>>>>>> driver maintainers will be able to more accurately review your case and >>>>>>>>>>> spot the problem. >>>>>>>>>>> >>>>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>>>> >>>>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, >>>>>>>>>>> without passwords) >>>>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>>>> (sanitized if needed) >>>>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>>>> parameters. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello All, >>>>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>>>> another. >>>>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>>>> I am able to create share on both, but I am not able to migrate >>>>>>>>>>>> share between them. >>>>>>>>>>>> I am using DSSH=False. >>>>>>>>>>>> I did not understand how host and driver assisted migration >>>>>>>>>>>> work and what "data_node_access_ip" means. >>>>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and I >>>>>>>>>>>> can mount also the netapp SVM where the share is located. >>>>>>>>>>>> So in the data_node_access_ip I wrote the list of my >>>>>>>>>>>> controllers management ips. >>>>>>>>>>>> During the migrate phase I checked if my controller where >>>>>>>>>>>> manila is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>>>> Please, what is my mistake ? 
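As a side note, forcing the driver-assisted path described above looks roughly like the following (destination host string and flag values are illustrative; the host must match an entry reported by manila pool-list, and exact flags may vary by client version):

manila migration-start share1 controller@netapp-nfs-566#aggr1 \
    --preserve-metadata True --preserve-snapshots True \
    --writable True --nondisruptive False
manila migration-get-progress share1
manila migration-complete share1
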
>>>>>>>>>>>> Thanks >>>>>>>>>>>> Ignazio >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Rodrigo Barbieri >>>>>>>>>>> MSc Computer Scientist >>>>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>>>> Federal University of São Carlos >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Douglas Salles Viroel >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Douglas Salles Viroel >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> Douglas Salles Viroel >>>>> >>>> >> >> -- >> Rodrigo Barbieri >> MSc Computer Scientist >> OpenStack Manila Core Contributor >> Federal University of São Carlos >> >> -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Mon Feb 8 19:10:32 2021 From: whayutin at redhat.com (Wesley Hayutin) Date: Mon, 8 Feb 2021 12:10:32 -0700 Subject: [tripleo] migrating master from CentOS-8 to CentOS-8-Stream Message-ID: Greetings, Just a heads up over the course of the next few weeks upstream TripleO should see a transparent migration from CentOS-8 to CentOS-8-Stream. We do have a few options with regards to how the transition will take place. First and foremost we're going to migrate the master branch only at this time. *Question 1: Artifacts* Option1: New namespaces for artifacts: Containers: https://hub.docker.com/r/tripleomasterstream/ or some combination Images: http://images.rdoproject.org/centosstream8 or other combination Option2: Master content and namespaces is overwritten with centos-8-stream containers and images and will retain the paths and namespaces. Containers: https://hub.docker.com/u/tripleomaster Images: http://images.rdoproject.org/centos8/master/ *Question 2: job names* We can update the master jobs to include "stream" in the name and be explicit for the distro name OR: We can leave the job names as is and just communicate that "centos-8" is now really centos-8-stream Now's your chance to weigh in prior to the upcoming changes. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Mon Feb 8 19:20:21 2021 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 8 Feb 2021 12:20:21 -0700 Subject: [tripleo] migrating master from CentOS-8 to CentOS-8-Stream In-Reply-To: References: Message-ID: On Mon, Feb 8, 2021 at 12:17 PM Wesley Hayutin wrote: > Greetings, > > Just a heads up over the course of the next few weeks upstream TripleO > should see a transparent migration from CentOS-8 to CentOS-8-Stream. > > We do have a few options with regards to how the transition will take > place. First and foremost we're going to migrate the master branch only at > this time. > > *Question 1: Artifacts* > Option1: New namespaces for artifacts: > > Containers: https://hub.docker.com/r/tripleomasterstream/ or some > combination > Images: http://images.rdoproject.org/centosstream8 or other combination > > Option2: > Master content and namespaces is overwritten with centos-8-stream > containers and images and will retain the paths and namespaces. > > Containers: https://hub.docker.com/u/tripleomaster > Images: http://images.rdoproject.org/centos8/master/ > > I vote for option2 rather than creating yet another set of names. > *Question 2: job names* > > We can update the master jobs to include "stream" in the name and be > explicit for the distro name > > OR: > > We can leave the job names as is and just communicate that "centos-8" is > now really centos-8-stream > > Also option2 for the same reason. 
Stream is the replacement for us so we can test early, so let's just swap it going forward. > > Now's your chance to weigh in prior to the upcoming changes. > Thanks! > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Mon Feb 8 19:45:13 2021 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Mon, 8 Feb 2021 19:45:13 +0000 Subject: [tripleo] migrating master from CentOS-8 to CentOS-8-Stream In-Reply-To: References: Message-ID: My vote is the same as Alex, for the same reasons: minimal effort (we can use the saved time to address other issues). On Mon, 8 Feb 2021 at 19:27, Alex Schultz wrote: > > > On Mon, Feb 8, 2021 at 12:17 PM Wesley Hayutin > wrote: > >> Greetings, >> >> Just a heads up over the course of the next few weeks upstream TripleO >> should see a transparent migration from CentOS-8 to CentOS-8-Stream. >> >> We do have a few options with regards to how the transition will take >> place. First and foremost we're going to migrate the master branch only at >> this time. >> >> *Question 1: Artifacts* >> Option1: New namespaces for artifacts: >> >> Containers: https://hub.docker.com/r/tripleomasterstream/ or some >> combination >> Images: http://images.rdoproject.org/centosstream8 or other combination >> >> Option2: >> Master content and namespaces is overwritten with centos-8-stream >> containers and images and will retain the paths and namespaces. >> >> Containers: https://hub.docker.com/u/tripleomaster >> Images: http://images.rdoproject.org/centos8/master/ >> >> > I vote for option2 rather than creating yet another set of names. > > >> *Question 2: job names* >> >> We can update the master jobs to include "stream" in the name and be >> explicit for the distro name >> >> OR: >> >> We can leave the job names as is and just communicate that "centos-8" is >> now really centos-8-stream >> >> > Also option2 for the same reason. Stream is the replacement for us so we > can test early, so let's just swap it going forward. > > >> >> Now's your chance to weigh in prior to the upcoming changes. >> Thanks! >> >> >> >> >> -- -- /sorin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgolovat at redhat.com Mon Feb 8 20:50:37 2021 From: sgolovat at redhat.com (Sergii Golovatiuk) Date: Mon, 8 Feb 2021 21:50:37 +0100 Subject: [tripleo] migrating master from CentOS-8 to CentOS-8-Stream In-Reply-To: References: Message-ID: Option 2 seems reasonable. пн, 8 февр. 2021 г. в 20:47, Sorin Sbarnea : > My vote is the same as Alex, for the same reasons: minimal effort (we can > use the saved time to address other issues). > > On Mon, 8 Feb 2021 at 19:27, Alex Schultz wrote: > >> >> >> On Mon, Feb 8, 2021 at 12:17 PM Wesley Hayutin >> wrote: >> >>> Greetings, >>> >>> Just a heads up over the course of the next few weeks upstream TripleO >>> should see a transparent migration from CentOS-8 to CentOS-8-Stream. >>> >>> We do have a few options with regards to how the transition will take >>> place. First and foremost we're going to migrate the master branch only at >>> this time. >>> >>> *Question 1: Artifacts* >>> Option1: New namespaces for artifacts: >>> >>> Containers: https://hub.docker.com/r/tripleomasterstream/ or some >>> combination >>> Images: http://images.rdoproject.org/centosstream8 or other combination >>> >>> Option2: >>> Master content and namespaces is overwritten with centos-8-stream >>> containers and images and will retain the paths and namespaces. 
>>> >>> Containers: https://hub.docker.com/u/tripleomaster >>> Images: http://images.rdoproject.org/centos8/master/ >>> >>> >> I vote for option2 rather than creating yet another set of names. >> >> >>> *Question 2: job names* >>> >>> We can update the master jobs to include "stream" in the name and be >>> explicit for the distro name >>> >>> OR: >>> >>> We can leave the job names as is and just communicate that "centos-8" is >>> now really centos-8-stream >>> >>> >> Also option2 for the same reason. Stream is the replacement for us so we >> can test early, so let's just swap it going forward. >> >> >>> >>> Now's your chance to weigh in prior to the upcoming changes. >>> Thanks! >>> >>> >>> >>> >>> -- > -- > /sorin > -- Sergii Golovatiuk Senior Software Developer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Mon Feb 8 21:46:15 2021 From: emiller at genesishosting.com (Eric K. Miller) Date: Mon, 8 Feb 2021 15:46:15 -0600 Subject: [kolla] Stein base container build fails Message-ID: <046E9C0290DD9149B106B72FC9156BEA048DFD25@gmsxchsvr01.thecreation.com> Hi, I'm trying to update a few things in a Stein cluster and while building some kolla containers, I get the following error when building the base container: INFO:kolla.common.utils.openstack-base:Step 4/8 : RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py && rm get-pip.py INFO:kolla.common.utils.openstack-base: ---> Running in f5c8bb3fe6f0 INFO:kolla.common.utils.openstack-base:curl (https://bootstrap.pypa.io/get-pip.py): response: 200, time: 0.423, size: 1929903 INFO:kolla.common.utils.openstack-base:Traceback (most recent call last): INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 24244, in INFO:kolla.common.utils.openstack-base: INFO:kolla.common.utils.openstack-base: main() INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 199, in main INFO:kolla.common.utils.openstack-base: bootstrap(tmpdir=tmpdir) INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 82, in bootstrap INFO:kolla.common.utils.openstack-base: from pip._internal.cli.main import main as pip_entry_point INFO:kolla.common.utils.openstack-base: File "/tmp/tmpDXRruf/pip.zip/pip/_internal/cli/main.py", line 60 INFO:kolla.common.utils.openstack-base: sys.stderr.write(f"ERROR: {exc}") INFO:kolla.common.utils.openstack-base: INFO:kolla.common.utils.openstack-base: ^ INFO:kolla.common.utils.openstack-base:SyntaxError: invalid syntax INFO:kolla.common.utils.openstack-base: INFO:kolla.common.utils.openstack-base:Removing intermediate container f5c8bb3fe6f0 ERROR:kolla.common.utils.openstack-base:Error'd with the following message ERROR:kolla.common.utils.openstack-base:The command '/bin/sh -c curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py && rm get-pip.py' returned a non-zero code: 1 Any ideas as to where the issue is? Is this a pip installer bug? Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Mon Feb 8 21:52:31 2021 From: emiller at genesishosting.com (Eric K. 
Miller) Date: Mon, 8 Feb 2021 15:52:31 -0600 Subject: [kolla] Stein base container build fails Message-ID: <046E9C0290DD9149B106B72FC9156BEA048DFD26@gmsxchsvr01.thecreation.com> Looks like a Python3 error as described here: https://stackoverflow.com/questions/65896334/python-pip-broken-wiith-sys -stderr-writeferror-exc Looks like we need some way to choose to use the older version of pip using: curl https://bootstrap.pypa.io/2.7/get-pip.py -o get-pip.py instead of: curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From nhicher at redhat.com Mon Feb 8 22:36:45 2021 From: nhicher at redhat.com (Nicolas Hicher) Date: Mon, 8 Feb 2021 17:36:45 -0500 Subject: [TripleO] Fwd: Planned outage of review.rdoproject.org: 2021-02-11 from 14:00 to 18:00 UTC In-Reply-To: <891d4c95-0f55-3452-d0b5-54a54aab0941@redhat.com> References: <891d4c95-0f55-3452-d0b5-54a54aab0941@redhat.com> Message-ID: -------- Forwarded Message -------- From: Nicolas Hicher Subject: Planned outage of review.rdoproject.org: 2021-02-11 from 14:00 to 18:00 UTC To: dev at lists.rdoproject.org Message-ID: <891d4c95-0f55-3452-d0b5-54a54aab0941 at redhat.com> Date: Mon, 8 Feb 2021 17:31:36 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Hello folks, Our cloud provider plans to do maintainance operation on 2021-02-11 from 14:00 to 18:00 UTC. Service interruption is expected, including: - Zuul CI not running jobs for gerrit, github or opendev. - RDO Trunk not building new packages. - DLRN API. - review.rdoproject.org and softwarefactory-project.io gerrit service. - www.rdoproject.org and lists.rdoproject.org. We expect that the interruption of services will be less than 4 hours. Regards, Nicolas , on behalf of the Software Factory Operation Team -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Mon Feb 8 23:18:44 2021 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 8 Feb 2021 17:18:44 -0600 Subject: [release] Proposing =?utf-8?B?RWzFkWQg?= =?utf-8?B?SWxsw6lz?= (elod) for Release Management Core In-Reply-To: References: Message-ID: <20210208231844.GA3861532@sm-workstation> On Mon, Feb 08, 2021 at 04:15:50PM +0100, Herve Beraud wrote: > Hi, > > Előd has been working on Release management for quite some time now and in > that time has > shown tremendous growth in his understanding of our processes and on how > deliverables work on Openstack. I think he would make a good addition to > the core team. > > Existing team members, please respond with +1/-1. > If there are no objections we'll add him to the ACL soon. :-) > +1 for sure. He has been very willing to help and has learned a lot about the release process. His focus on stable is also a great thing to have on the team. Sean From juliaashleykreger at gmail.com Tue Feb 9 00:53:24 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 8 Feb 2021 16:53:24 -0800 Subject: [ironic] On redfish-virtual-media vs idrac-redfish-virtual-media In-Reply-To: References: Message-ID: So I had an interesting discussion with an operator last week which relates quite a bit. The tl;dr is they want to hide all the vendor complexities for basic operational usage patterns, meaning deployment of machines. 
The thing Ironic was originally intended to try to achieve and mostly does with the community drivers. Ultimately the thing redfish was intended to try and aid as well. There was recognition that advanced actions, tasks, require specific drivers and highly specific settings, but they want to make that the exception and less of the current operational rule. They expressed a need to pivot on demand to meet their business needs, and that firmware changes are a factor in their decision making, planning processes, and a frequent headache. Hearing this, I think we have three distinct problems to solve. 1) We should enable operators of our API to have an easy path for basic usage. If I had to condense this into some sort of user story, it would be "As a infrastructure operator, I only want to provide a username, password, and base address to at least get started to perform a deployment. I don't want to have to identify vendor specifics out of the gate." 2) We should enable operators to pivot to a vendor driver easily, which is in hand with #1. In a sense, we already have part of this, although different username/password settings don't help this situation. I would boil this down to a user story of: "As an infrastructure operator, I need to be able to pivot hardware due to supply chain issues or due to immovable operational/security requirements." To put this into more precise language: "The business requirements and plan can not be put on hold." 3) Operators need less BMC interaction breakages in terms of features. They can no longer rely on the common denominator (PXE) if it doesn't meet their operational/security/networking models, especially the further outside of their traditional data center walls as they can no longer have any sort of implicit trust of the hardware's networking and physical state. "As an infrastructure operator, I need some assurance of capability stability in the BMCs and that the established API + hardware behavior contracts are not broken." I think this comes down to a mix of things like redfish profiles/testing *and* third party CI which also tests the latest firmware or even pre-release firmware. Perhaps not a public CI system, but something that can be stood up to provide feedback which also draws on upstream state so firmware changes can be detected and prevented them from breaking or starting to derail a tight operational timeline. I guess this is part of the nature of automating infrastructure. We've gone from processes that took months and months to facilitate the roll-out of an entire data center to more and more rapid solutions. I think we're at the point where expectations and planning all center around the union of stability, capability, and time-to-finished system deployment. Technical organizations make commitments to the business. Businesses make further commitments from there. It is just the nature of the industry. And with that, I guess I'd like to resume discussion on the larger topic of how we enable operators to always just start with "redfish_address", "redfish_username", "redfish_password", and ultimately reach a virtual media driven deployment regardless of the vendor. -Julia On Thu, Jan 28, 2021 at 8:07 AM Dmitry Tantsur wrote: > > Hi, > > This is how the patch would look like: https://review.opendev.org/c/openstack/ironic/+/772899 > > It even passes redfish and idrac unit tests unmodified (modulo incorrect mocking in test_boot). 
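For context, the minimal enrollment operators are asking for above — nothing vendor-specific beyond the BMC address and credentials — would look something like this (values are placeholders, and redfish_system_id may still be needed where it cannot be auto-detected):

openstack baremetal node create --driver redfish \
    --driver-info redfish_address=https://bmc.example.com \
    --driver-info redfish_username=admin \
    --driver-info redfish_password=secret \
    --boot-interface redfish-virtual-media
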
> > Dmitry > > On Mon, Jan 25, 2021 at 3:52 PM Dmitry Tantsur wrote: >> >> Hi, >> >> On Mon, Jan 25, 2021 at 7:04 AM Pioso, Richard wrote: >>> >>> On Wed, Jan 20, 2021 at 9:56 AM Dmitry Tantsur >>> wrote: >>> > >>> > Hi all, >>> > >>> > Now that we've gained some experience with using Redfish virtual >>> > media I'd like to reopen the discussion about $subj. For the context, the >>> > idrac-redfish-virtual-media boot interface appeared because Dell >>> > machines need an additional action [1] to boot from virtual media. The >>> > initial position on hardware interfaces was that anything requiring OEM >>> > actions must go into a vendor hardware interface. I would like to propose >>> > relaxing this (likely unwritten) rule. >>> > >>> > You see, this distinction causes a lot of confusion. Ironic supports >>> > Redfish, ironic supports iDRAC, iDRAC supports Redfish, ironic supports >>> > virtual media, Redfish supports virtual media, iDRAC supports virtual >>> > media. BUT! You cannot use redfish-virtual-media with iDRAC. Just today >>> > I had to explain the cause of it to a few people. It required diving into >>> > how exactly Redfish works and how exactly ironic uses it, which is >>> > something we want to protect our users from. >>> >>> Wow! Now I’m confused, too. AFAIU, the people you had to help decided to use the redfish driver, instead of the idrac driver. It is puzzling that they decided to do that considering the ironic driver composition reform [1] of a couple years ago. Recall that reform allows “having one vendor driver with options configurable per node instead of many drivers for every vendor” and had the following goals. >> >> >> When discussing the user's confusion we should not operate in terms of ironic (especially since the problem happened in metal3 land, which abstracts away ironic). As a user, when I see Redfish and virtual media, and I know that Dell supports them, I can expect redfish-virtual-media to work. The fact that it does not may cause serious perception problems. The one I'm particularly afraid of is end users thinking "iDRAC is not Redfish compliant". >> >>> >>> >>> “- Make vendors in charge of defining a set of supported interface implementations in priority order. >>> - Allow vendors to guarantee that unsupported interface implementations will not be used with hardware types they define. This is done by having a hardware type list all interfaces it supports.” >>> >>> The idrac driver is Dell Technologies’ vendor driver for systems with an iDRAC. It offers a one-stop shop for using ironic to manage its systems. Users can select among the hardware interfaces it supports. Each interface uses a single management protocol -- Redfish, WS-Man, and soon IPMI [2] -- to communicate with the BMC. While it supports the idrac-redfish-virtual-media boot interface, it does not support redfish-virtual-media. One cannot configure a node with the idrac driver to use redfish-virtual-media. >> >> >> I know, the problem is explaining to users why they can use the redfish hardware type with Dell machines, but only partly. >> >>> >>> >>> > >>> > We already have a precedent [2] of adding vendor-specific handling to >>> > a generic driver. >>> >>> That change was introduced about a month ago in the community’s vendor-independent ipmi driver. That was very understandable, since IPMI is a very mature management protocol and was introduced over 22 years ago. 
I cannot remember what I was doing back then :) As one would expect, the ipmi driver has experienced very little change over the past two-plus years. I count roughly two (2) substantive changes over that period. By contrast, the Redfish protocol is just over five (5) years old. Its vendor-independent driver, redfish, has been fertile ground for adding new, advanced features, such as BIOS settings configuration, firmware update, and RAID configuration, and fixing bugs. It fosters lots of change, too many for me to count. >>> >>> > I have proposed a patch [3] to block using redfish- >>> > virtual-media for Dell hardware, but I grew to dislike this approach. It >>> > does not have precedents in the ironic code base and it won't scale well >>> > if we have to handle vendor differences for vendors that don't have >>> > ironic drivers. >>> >>> Dell understands and is on board with ironic’s desire that vendors support the full functionality offered by the vendor-independent redfish driver. If the iDRAC is broken with regards to redfish-virtual-media, then we have a vested interest in fixing it. >>> While that is worked, an alternative approach could be for our community to strengthen its promotion of the goals of the driver composition reform. That would leverage ironic’s long-standing ability to ensure people only use hardware interfaces which the vendor and its driver support. >> >> >> Yep. I don't necessarily disagree with that, but it poses issues for layered products like metal3, where on each abstraction level a small nuance is lost, and the end result is confusion and frustration. >> >>> >>> >>> > >>> > Based on all this I suggest relaxing the rule to the following: if a feature >>> > supported by a generic hardware interface requires additional actions or >>> > has a minor deviation from the standard, allow handling it in the generic >>> > hardware interface. Meaning, redfish-virtual-media starts handling the >>> > Dell case by checking the System manufacturer (via the recently added >>> > detect_vendor call) and loading the OEM code if it matches "Dell". After >>> > this idrac-redfish-virtual-media will stay empty (for future enhancements >>> > and to make the patch backportable). >>> >>> That would cause the vendor-independent redfish driver to become dependent on sushy-oem-idrac, which is not under ironic governance. >> >> >> This itself is not a problem, most of the projects we depend on are not under ironic governance. >> >> Also it won't be a hard dependency, only if we detect 'Dell' in system.manufacturer. >> >>> >>> >>> It is worth pointing out the sushy-oem-idrac library is necessary to get virtual media to work with Dell systems. It was first created for that purpose. It is not a workaround like those in sushy, which accommodate common, minor standards interpretation and implementation differences across vendors by sprinkling a bit of code here and there within the library, unbeknownst to ironic proper. >>> >>> We at Dell Technologies are concerned that the proposed rule change would result in a greater code review load on the ironic community. Since vendor-specific code would be in the generic hardware interface, much more care, eyes, and integration testing against physical hardware would be needed to ensure it does not break others. And our community is already concerned about its limited available review bandwidth [3]. Generally speaking, the vendor third-party CIs do not cover all drivers. Rather, each vendor only tests its own driver, and, in some cases, sushy. 
Therefore, changes to the vendor-independent redfish driver may introduce regressions in what has been working with various hardware and not be detected by automated testing before being merged. >> >> >> The change will, in fact, be tested by your 3rd party CI because it was used by both the generic redfish hardware type and the idrac one. >> >> I guess a source of confusion may be this: I don't suggest the idrac hardware type goes away, nor do I suggest we start copying its Dell-specific features to redfish. >> >>> >>> >>> Can we afford this additional review load, prospective slowing down of innovation with Redfish, and likely undetected regressions? Would that be best for our users when we could fix the problem in other ways, such as the one suggested above? >>> >>> Also consider that feedback to the DMTF to drive vendor consistency is critical, but the DMTF needs feedback on what is broken in order to push others to address a problem. Remember the one-time boot debacle when three vendors broke at the same time? Once folks went screaming to the DMTF about the issue, it quickly explained it to member companies, clarified the standard, and created a test case for that condition. Changing the driver model to accommodate everyone's variations will reduce that communication back to the DMTF, meaning the standard stalls and interoperability does not gain traction. >> >> >> I would welcome somebody raising to DMTF the issue that causes iDRAC to need another action to boot from virtual media, I suspect other vendors may have similar issues. That being said, our users are way too far away from DMTF, and even we (Julia and myself, for example) don't have a direct way of influencing it, only through you and other folks who help (thank you!). >> >>> >>> >>> > >>> > Thoughts? >>> > >>> >>> TL;DR, we strongly recommend ironic not make this rule change. Clearly communicating users should use the vendor driver should simplify their experience and eliminate the confusion. >>> >>> The code as-is is factored well as a result of the 21st century approach the community has taken to date. Vendors can implement the driver OEM changes they need to accommodate their unique hardware and BMC requirements, with reduced concern about the risk of breaking other drivers or ironic itself. Ironic’s driver composition reform, sushy, and sushy’s OEM extension mechanism support that modern approach. Our goal is to continue to improve the iDRAC Redfish service’s compliance with the standard and eliminate the kind of OEM code Dmitry identified. >>> >>> Beware of unintended consequences, including >>> >>> - reduced quality, >>> - slowed feature and bug fix velocity, >> >> >> I don't see how this happens, given that the code is merely copied from one place to the other (with the 1st place inheriting it from its base class). >> >>> >>> - stalled DMTF Redfish standard, >>> - lost Redfish interoperability traction, and >> >> >> I'm afraid we're actually hurting Redfish adoption when we start complicating its usage. Think, with IPMI everything "Just Works" (except when it does not, but that happens much later), while for Redfish the users need to be aware of... flavors of Redfish? Something that we (and DMTF) don't even have a name for. >> >> Dmitry >> >>> >>> - increased code review load. 
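To make the proposal on the table concrete: the relaxed rule boils down to the generic virtual-media flow checking the manufacturer reported by the BMC (via detect_vendor) and only then pulling in the OEM helper, so the vendor library never becomes a hard dependency of the generic driver. Below is a rough, purely illustrative Python sketch of that shape; none of the names are taken from ironic or sushy, they are assumptions made up for the example.

    # Illustrative only. FakeRedfishSystem stands in for the Redfish System
    # resource a real driver would get from sushy.
    class FakeRedfishSystem:
        def __init__(self, manufacturer):
            self.manufacturer = manufacturer

        def insert_virtual_media(self, image_url):
            print("attach %s as a virtual CD" % image_url)

        def set_boot_device(self, device, persistent=False):
            print("set next boot to %s (persistent=%s)" % (device, persistent))

        def dell_oem_boot_from_vmedia(self):
            # In real code this would live in an OEM extension such as
            # sushy-oem-idrac and be loaded only when the vendor matches.
            print("issue the Dell OEM call to boot from the attached media")


    def boot_from_virtual_media(system, image_url):
        system.insert_virtual_media(image_url)
        if 'dell' in (system.manufacturer or '').lower():
            # the extra step iDRAC needs on top of the standard flow
            system.dell_oem_boot_from_vmedia()
        else:
            system.set_boot_device('cd')


    if __name__ == '__main__':
        boot_from_virtual_media(FakeRedfishSystem('Dell Inc.'), 'http://host/deploy.iso')
        boot_from_virtual_media(FakeRedfishSystem('Acme BMC'), 'http://host/deploy.iso')

The point of the lazy vendor check is that non-Dell systems never touch the OEM code path, which is what keeps the extra dependency optional.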
>>> >>> > Dmitry >>> > >>> > [1] >>> > https://opendev.org/openstack/ironic/src/commit/6ea73bdfbb53486cf9 >>> > 905d21024d16cbf5829b2c/ironic/drivers/modules/drac/boot.py#L130 >>> > [2] https://review.opendev.org/c/openstack/ironic/+/757198/ >>> > [3] https://review.opendev.org/c/openstack/ironic/+/771619 >>> > >>> > -- >>> > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >>> > Commercial register: Amtsgericht Muenchen, HRB 153243, >>> > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, >>> > Michael O'Neill >>> >>> I welcome your feedback. >>> >>> Rick >>> >>> [1] https://opendev.org/openstack/ironic-specs/src/branch/master/specs/approved/driver-composition-reform.rst >>> [2] https://storyboard.openstack.org/#!/story/2008528 >>> [3] https://etherpad.opendev.org/p/ironic-wallaby-midcycle >>> >> >> >> -- >> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> Commercial register: Amtsgericht Muenchen, HRB 153243, >> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill > > > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill From ignaziocassano at gmail.com Tue Feb 9 06:58:04 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 9 Feb 2021 07:58:04 +0100 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Hello Douglas , so if I want to change the snapmirror schedule I cannot do it using openstack api but I must use netapp api/ansible or manually, right ? Thanks Ignazio Il Lun 8 Feb 2021, 19:39 Douglas ha scritto: > Hi Ignazio, > > The 'replica_state_update_interval' is the time interval that Manila waits > before requesting a status update to the storage system. It will request > this information for every replica that isn't 'in-sync' status yet. The > default value of this config option is actually 300 seconds, not 1 hour. > You can also manually request this update by issuing 'share-replica-resync' > operation[1]. > I believe that you might be mixing with the 'snapmirror' schedule concept. > Indeed, snapmirror relationships are created with 'schedule' set to be > hourly. This 'schedule' is used to update the destination replica, > incrementally, after the snapmirror becames in-sync (snapmirrored), since > we use Asynchronous SnapMirror[2]. > > [1] > https://docs.openstack.org/api-ref/shared-file-system/?expanded=resync-share-replica-detail#resync-share-replica > [2] > https://docs.netapp.com/ontap-9/topic/com.netapp.doc.pow-dap/GUID-18263F03-486B-434C-A190-C05D3AFC05DD.html > > On Mon, Feb 8, 2021 at 3:01 PM Ignazio Cassano > wrote: > >> Hello All, >> I have another question about replication sync time, please. >> I did not specify any option in manila.conf and seems netapp set it one >> time every hour. >> I did not understand if replica_state_update_interval is the replication >> sync time frequency or only checks the replica state. >> Is there any parameter I can use to setup the replica sync time ? >> Thanks >> Ignazio >> >> >> >> Il giorno lun 8 feb 2021 alle ore 12:56 Rodrigo Barbieri < >> rodrigo.barbieri2010 at gmail.com> ha scritto: >> >>> Hi Ignazio, >>> >>> The way you set it up is correct with "enabled_share_backends = >>> svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the >>> feature to work. 
>>> >>> Regards, >>> >>> Rodrigo >>> >>> On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano >>> wrote: >>> >>>> Hello All, >>>> I am able to replicate shares between netapp storages but I have some >>>> questions. >>>> Reading netapp documentation, seems the replication svm must not be >>>> enabled in manila.conf. >>>> The following is what netapp suggests: >>>> >>>> enabled_share_backends = svm-tst-nfs-565 >>>> >>>> [svm-tst-nfs-565] >>>> share_backend_name = svm-tst-nfs-565 >>>> driver_handles_share_servers = false >>>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>>> netapp_storage_family = ontap_cluster >>>> netapp_server_hostname = fas8040.csi.it >>>> netapp_server_port = 80 >>>> netapp_login = admin >>>> netapp_password = ****** >>>> netapp_transport_type = http >>>> netapp_vserver = svm-tst-nfs-565 >>>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>>> replication_domain = replication_domain_1 >>>> >>>> >>>> [netapp-nfs-566] >>>> share_backend_name = netapp-nfs-566 >>>> driver_handles_share_servers = False >>>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>>> netapp_storage_family = ontap_cluster >>>> netapp_server_hostname = fas.csi.it >>>> netapp_server_port = 80 >>>> netapp_login = admin >>>> netapp_password = ***** >>>> netapp_transport_type = http >>>> netapp_vserver = manila-nfs-566 >>>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>>> replication_domain = replication_domain_1 >>>> >>>> As you can see above, the target of replica netapp-nfs-566 is not >>>> included in enabled backends. >>>> >>>> When I try to create a replica in this situation, the manila schedule >>>> reports "no valid host found". >>>> >>>> It works if I enable in manila.conf the target like this: >>>> enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 >>>> >>>> Please, any suggestion ? >>>> >>>> Thanks >>>> Ignazio >>>> >>>> >>>> Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < >>>> ignaziocassano at gmail.com> ha scritto: >>>> >>>>> Thanks, Douglas. >>>>> On another question: >>>>> the manila share-replica-delete delete the snapmirror ? >>>>> If yes, source and destination volume become both writable ? >>>>> >>>>> Ignazio >>>>> >>>>> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas ha >>>>> scritto: >>>>> >>>>>> Yes, it is correct. This should work as an alternative for >>>>>> host-assisted-migration and will be faster since it uses storage >>>>>> technologies to synchronize data. >>>>>> If your share isn't associated with a share-type that has >>>>>> replication_type='dr' you can: 1) create a new share-type with >>>>>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>>>>> using the new share-type. >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Hello, I am sorry. >>>>>>> >>>>>>> I read the documentation. >>>>>>> >>>>>>> SMV must be peered once bye storage admimistrator or using ansible >>>>>>> playbook. >>>>>>> I must create a two backend in manila.conf with the same replication >>>>>>> domain. >>>>>>> I must assign to the source a type and set replication type dr. >>>>>>> When I create a share if I want to enable snapmirror for it I must >>>>>>> create on openstack a share replica for it. >>>>>>> The share on destination is read only until I promote it. >>>>>>> When I promote it, it become writable. >>>>>>> Then I can manage it on target openstack. 
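For reference, the recap above maps onto a handful of CLI calls, roughly as follows. This is an illustrative sketch only: the type and share names, size and protocol are placeholders, and the exact syntax can vary between releases.

    # one-off: a share type that requests 'dr' replication (the second
    # argument is driver_handles_share_servers)
    manila type-create netapp-dr false
    manila type-key netapp-dr set replication_type=dr

    # create the share on the source backend with that type (or unmanage and
    # re-manage an existing share into it, as suggested earlier in the thread)
    manila create NFS 100 --name protected-share --share-type netapp-dr

    # add a replica; the scheduler places it on the peered backend that is in
    # the same replication_domain
    manila share-replica-create <share-id>

    # wait for replica_state to reach in_sync, then promote to fail over
    manila share-replica-list --share-id <share-id>
    manila share-replica-promote <replica-id>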
>>>>>>> >>>>>>> I hope the above is the correct procedure >>>>>>> >>>>>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Douglas, you are really kind. >>>>>>>> Let my to to recap and please correct if I am wrong: >>>>>>>> >>>>>>>> - manila share on netapp are under svm >>>>>>>> - storage administrator createx a peering between svm source and >>>>>>>> svm destination (or on single share volume ?) >>>>>>>> - I create a manila share with specs replication type (the share >>>>>>>> belongs to source svm) . In manila.conf source and destination must have >>>>>>>> the same replication domain >>>>>>>> - Creating the replication type it initializes the snapmirror >>>>>>>> >>>>>>>> Is it correct ? >>>>>>>> Ignazio >>>>>>>> >>>>>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas >>>>>>>> ha scritto: >>>>>>>> >>>>>>>>> Hi Ignazio, >>>>>>>>> >>>>>>>>> In order to use share replication between NetApp backends, you'll >>>>>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>>>>> being backported [3] at the moment. NFS shares should work fine. >>>>>>>>> >>>>>>>>> If you want, we can assist you on creating replicas for your >>>>>>>>> shares in #openstack-manila channel. Just reach us there. >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>>>>> [2] >>>>>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>>>>> >>>>>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hello, thanks for your help. >>>>>>>>>> I am waiting my storage administrators have a window to help me >>>>>>>>>> because they must setup the snapmirror. >>>>>>>>>> Meanwhile I am trying the host assisted migration but it does not >>>>>>>>>> work. >>>>>>>>>> The share remains in migrating for ever. >>>>>>>>>> I am sure the replication-dr works because I tested it one year >>>>>>>>>> ago. >>>>>>>>>> I had an openstack on site A with a netapp storage >>>>>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>>>>> The two openstack installation did not share anything. >>>>>>>>>> So I made a replication between two volumes (shares). >>>>>>>>>> I demoted the source share taking note about its export location >>>>>>>>>> list >>>>>>>>>> I managed the destination on openstack and it worked. >>>>>>>>>> >>>>>>>>>> The process for replication is not fully handled by openstack >>>>>>>>>> api, so I should call netapp api for creating snapmirror relationship or >>>>>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>>>>> Instead, using share migration, I could use only openstack api: I >>>>>>>>>> understood that driver assisted cannot work in this case, but host assisted >>>>>>>>>> should work. 
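On the host-assisted path that keeps coming up in this thread, a typical invocation looks roughly like the sketch below. It is illustrative only: the destination host@backend#pool value and the address are placeholders, and which boolean flags are mandatory depends on the API microversion in use.

    # manila.conf on the node running manila-data, so the data service can
    # reach and mount both shares during the copy:
    # [DEFAULT]
    # data_node_access_ip = <IP of that node reachable from the share network>

    manila migration-start protected-share svm-target@netapp-nfs-566#aggr1 \
        --writable False --nondisruptive False \
        --preserve-metadata False --preserve-snapshots False
    manila migration-get-progress protected-share
    manila migration-complete protected-share

Leaving --preserve-metadata at False keeps the host-assisted fallback available; as Rodrigo notes further down, setting it to True restricts the request to driver-assisted migration.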
>>>>>>>>>> >>>>>>>>>> Best Regards >>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas >>>>>>>>>> ha scritto: >>>>>>>>>> >>>>>>>>>>> Hi Rodrigo, >>>>>>>>>>> >>>>>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>>>>> We should continue tomorrow or in the next few days. >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> >>>>>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello Ignazio, >>>>>>>>>>>> >>>>>>>>>>>> If you are attempting to migrate between 2 NetApp backends, >>>>>>>>>>>> then you shouldn't need to worry about correctly setting the >>>>>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. If that >>>>>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>>>>> is useful for different backends, not your case. >>>>>>>>>>>> >>>>>>>>>>>> if you specify "manila migration-start --preserve-metadata >>>>>>>>>>>> True" it will prevent the fallback to host-assisted, so it is easier for >>>>>>>>>>>> you to narrow down the issue with the host-assisted migration out of the >>>>>>>>>>>> way. >>>>>>>>>>>> >>>>>>>>>>>> I used to be familiar with the NetApp driver set up to review >>>>>>>>>>>> your case, however that was a long time ago. I believe the current NetApp >>>>>>>>>>>> driver maintainers will be able to more accurately review your case and >>>>>>>>>>>> spot the problem. >>>>>>>>>>>> >>>>>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>>>>> >>>>>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, >>>>>>>>>>>> without passwords) >>>>>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>>>>> (sanitized if needed) >>>>>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>>>>> parameters. >>>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hello All, >>>>>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>>>>> another. >>>>>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>>>>> I am able to create share on both, but I am not able to >>>>>>>>>>>>> migrate share between them. >>>>>>>>>>>>> I am using DSSH=False. >>>>>>>>>>>>> I did not understand how host and driver assisted migration >>>>>>>>>>>>> work and what "data_node_access_ip" means. >>>>>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>>>>> 10.102.184.0/24). 
I Can mount share from my controllers and I >>>>>>>>>>>>> can mount also the netapp SVM where the share is located. >>>>>>>>>>>>> So in the data_node_access_ip I wrote the list of my >>>>>>>>>>>>> controllers management ips. >>>>>>>>>>>>> During the migrate phase I checked if my controller where >>>>>>>>>>>>> manila is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>>>>> Please, what is my mistake ? >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Rodrigo Barbieri >>>>>>>>>>>> MSc Computer Scientist >>>>>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>>>>> Federal University of São Carlos >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Douglas Salles Viroel >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Douglas Salles Viroel >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> Douglas Salles Viroel >>>>>> >>>>> >>> >>> -- >>> Rodrigo Barbieri >>> MSc Computer Scientist >>> OpenStack Manila Core Contributor >>> Federal University of São Carlos >>> >>> > > -- > Douglas Salles Viroel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Feb 9 07:17:46 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 9 Feb 2021 08:17:46 +0100 Subject: [kolla] Stein base container build fails In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA048DFD25@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA048DFD25@gmsxchsvr01.thecreation.com> Message-ID: On Mon, Feb 8, 2021 at 10:46 PM Eric K. Miller wrote: > > Hi, > Hi Eric, > > I'm trying to update a few things in a Stein cluster and while building some kolla containers, I get the following error when building the base container: > > > > INFO:kolla.common.utils.openstack-base:Step 4/8 : RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py && rm get-pip.py > > INFO:kolla.common.utils.openstack-base: ---> Running in f5c8bb3fe6f0 > > INFO:kolla.common.utils.openstack-base:curl (https://bootstrap.pypa.io/get-pip.py): response: 200, time: 0.423, size: 1929903 > > INFO:kolla.common.utils.openstack-base:Traceback (most recent call last): > > INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 24244, in > > INFO:kolla.common.utils.openstack-base: > > INFO:kolla.common.utils.openstack-base: main() > > INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 199, in main > > INFO:kolla.common.utils.openstack-base: bootstrap(tmpdir=tmpdir) > > INFO:kolla.common.utils.openstack-base: File "get-pip.py", line 82, in bootstrap > > INFO:kolla.common.utils.openstack-base: from pip._internal.cli.main import main as pip_entry_point > > INFO:kolla.common.utils.openstack-base: File "/tmp/tmpDXRruf/pip.zip/pip/_internal/cli/main.py", line 60 > > INFO:kolla.common.utils.openstack-base: sys.stderr.write(f"ERROR: {exc}") > > INFO:kolla.common.utils.openstack-base: > > INFO:kolla.common.utils.openstack-base: ^ > > INFO:kolla.common.utils.openstack-base:SyntaxError: invalid syntax > > INFO:kolla.common.utils.openstack-base: > > INFO:kolla.common.utils.openstack-base:Removing intermediate container f5c8bb3fe6f0 > > ERROR:kolla.common.utils.openstack-base:Error'd with the following message > > ERROR:kolla.common.utils.openstack-base:The command '/bin/sh -c curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python get-pip.py && rm get-pip.py' returned a non-zero code: 1 > > > > 
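The SyntaxError in the traceback above is the f-string inside the pip bundled by get-pip.py: the unversioned script at https://bootstrap.pypa.io/get-pip.py now pulls in a pip release that has dropped Python 2.7 and 3.5, so the older interpreters in the Stein images cannot even parse it. If a local workaround is needed rather than the repo fix referenced below, fetching the interpreter-specific copy is the usual approach. A sketch only, not the actual kolla patch:

    # pick the copy matching the python used inside the image being built
    curl -sSL https://bootstrap.pypa.io/pip/2.7/get-pip.py -o get-pip.py
    # or, for a python3-based image:
    # curl -sSL https://bootstrap.pypa.io/pip/3.6/get-pip.py -o get-pip.py
    python get-pip.py
    rm get-pip.py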
Any ideas as to where the issue is? Is this a pip installer bug? This, among other things, has already been fixed in the repo. [1] Stein is in EM (Extended Maintenance) phase and does not get new releases so you are advised to run from the stable/stein branch. [2] [1] https://review.opendev.org/c/openstack/kolla/+/772490 [2] https://opendev.org/openstack/kolla/src/branch/stable/stein -yoctozepto From balazs.gibizer at est.tech Tue Feb 9 08:04:57 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 09 Feb 2021 09:04:57 +0100 Subject: [nova][placement] adding nova-core to placement-core in gerrit In-Reply-To: <0O5YNQ.4J6G712XMY0K@est.tech> References: <0O5YNQ.4J6G712XMY0K@est.tech> Message-ID: <9459OQ.7MWRESZOT3R8@est.tech> On Wed, Feb 3, 2021 at 10:43, Balazs Gibizer wrote: > > > On Tue, Feb 2, 2021 at 09:06, Balazs Gibizer > wrote: >> Hi, >> >> I've now added nova-core to the placement-core group[1] > > It turned out that there is a separate placement-stable-maint group > which does not have the nova-stable-maint included. Is there any > objection to add nova-stable-maint to placement-stable-maint? With Dan's help the nova-stable-maint is now part of placement-stable-maint. Cheers, gibi > > Cheers, > gibi > >> >> [1] >> https://review.opendev.org/admin/groups/93c2b262ebfe0b3270c0b7ad60de887b02aaba9d,members >> >> On Wed, Jan 27, 2021 at 09:49, Stephen Finucane >>  wrote: >>> On Tue, 2021-01-26 at 17:14 +0100, Balazs Gibizer wrote: >>>> Hi, >>>> >>>> Placement got back under nova governance but so far we haven't >>>> consolidated the core teams yet. Stephen pointed out to me that >>>> given >>>> the ongoing RBAC works it would be beneficial if more nova cores, >>>> with >>>> API and RBAC experience, could approve such patches. So I'm >>>> proposing >>>> to add nova-core group to the placement-core group in gerrit. This >>>> means Ghanshyam, John, Lee, and Melanie would get core rights in >>>> the >>>> placement related repositories. >>>> >>>> @placement-core, @nova-core members: Please let me know if you >>>> have any >>>> objection to such change until end of this week. >>> >>> I brought it up and obviously think it's a sensible idea, so it's >>> an easy +1 >>> from me. >>> >>> Stephen >>> >>>> cheers, >>>> gibi >>> >>> >>> >> >> >> > > > From ykarel at redhat.com Tue Feb 9 08:22:27 2021 From: ykarel at redhat.com (Yatin Karel) Date: Tue, 9 Feb 2021 13:52:27 +0530 Subject: [tripleo] migrating master from CentOS-8 to CentOS-8-Stream In-Reply-To: References: Message-ID: On Tue, Feb 9, 2021 at 12:46 AM Wesley Hayutin wrote: > > Greetings, > > Just a heads up over the course of the next few weeks upstream TripleO should see a transparent migration from CentOS-8 to CentOS-8-Stream. > > We do have a few options with regards to how the transition will take place. First and foremost we're going to migrate the master branch only at this time. > > Question 1: Artifacts > Option1: New namespaces for artifacts: > > Containers: https://hub.docker.com/r/tripleomasterstream/ or some combination > Images: http://images.rdoproject.org/centosstream8 or other combination > > Option2: > Master content and namespaces is overwritten with centos-8-stream containers and images and will retain the paths and namespaces. 
> > Containers: https://hub.docker.com/u/tripleomaster > Images: http://images.rdoproject.org/centos8/master/ > > Question 2: job names > > We can update the master jobs to include "stream" in the name and be explicit for the distro name > > OR: > > We can leave the job names as is and just communicate that "centos-8" is now really centos-8-stream > > > Now's your chance to weigh in prior to the upcoming changes. > Thanks! > > Option 2 LGTM too, name changes shouldn't be needed unless and until both centos8 and 8-stream jobs are planned to be running, which i don't think is under plan as per above. > > Thanks and Regards Yatin Karel From mark at stackhpc.com Tue Feb 9 09:21:39 2021 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 9 Feb 2021 09:21:39 +0000 Subject: [ironic] A new project for useful deploy steps? In-Reply-To: References: Message-ID: On Mon, 8 Feb 2021 at 15:36, Ruby Loo wrote: > > Hi Dmitry, > > Thanks for bringing this up! We discussed this in our weekly ironic meeting [1]. The consensus there seems to be to keep the ideas in IPA (with priority=0). The additional code will be 'negligible' in size so ramdisk won't be bloated due to this. Also, it keeps things simple. Having a separate package means more maintenance overhead and confusion for our users. > > Would be good to hear from others, if they don't think this is a good idea. Otherwise, I'm looking forward to Dmitry's RFEs on this :) > > --ruby > > [1] http://eavesdrop.openstack.org/irclogs/%23openstack-ironic/%23openstack-ironic.2021-02-08.log.html#t2021-02-08T15:23:02 > > On Mon, Feb 8, 2021 at 8:02 AM Dmitry Tantsur wrote: >> >> Hi all, >> >> We have finally implemented in-band deploy steps (w00t!), and people started coming up with ideas. I have two currently: >> 1) configure arbitrary kernel command line arguments via grub >> 2) write NetworkManager configuration (for those not using cloud-init) Seems like a good idea. I'd love to see deploy steps/templates getting more traction. >> >> I'm not sure how I feel about putting these in IPA proper, seems like we may go down a rabbit hole here. But what about a new project (ironic-python-agent-extras?) with a hardware manager providing a collection of potentially useful deploy steps? >> >> Or should we nonetheless just put them in IPA with priority=0? Seems reasonable to me, if they are generic enough. >> >> Opinions welcome. >> >> Dmitry >> >> -- >> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> Commercial register: Amtsgericht Muenchen, HRB 153243, >> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill From viroel at gmail.com Tue Feb 9 11:04:16 2021 From: viroel at gmail.com (Douglas) Date: Tue, 9 Feb 2021 08:04:16 -0300 Subject: [stein][manila] share migration misconfiguration ? In-Reply-To: References: Message-ID: Yes, you are correct. We didn't provide this configuration and you probably don't need to worry about it when promoting a share. If you call 'replica-promote' api, for your 'dr' replica, the NetApp driver will automatically trigger a snapmirror sync to get the last minute updates, before actually breaking the relationship. On Tue, Feb 9, 2021 at 3:58 AM Ignazio Cassano wrote: > Hello Douglas , so if I want to change the snapmirror schedule I cannot do > it using openstack api but I must use netapp api/ansible or manually, right > ? 
> Thanks > Ignazio > > Il Lun 8 Feb 2021, 19:39 Douglas ha scritto: > >> Hi Ignazio, >> >> The 'replica_state_update_interval' is the time interval that Manila >> waits before requesting a status update to the storage system. It will >> request this information for every replica that isn't 'in-sync' status yet. >> The default value of this config option is actually 300 seconds, not 1 >> hour. You can also manually request this update by issuing >> 'share-replica-resync' operation[1]. >> I believe that you might be mixing with the 'snapmirror' schedule >> concept. Indeed, snapmirror relationships are created with 'schedule' set >> to be hourly. This 'schedule' is used to update the destination replica, >> incrementally, after the snapmirror becames in-sync (snapmirrored), since >> we use Asynchronous SnapMirror[2]. >> >> [1] >> https://docs.openstack.org/api-ref/shared-file-system/?expanded=resync-share-replica-detail#resync-share-replica >> [2] >> https://docs.netapp.com/ontap-9/topic/com.netapp.doc.pow-dap/GUID-18263F03-486B-434C-A190-C05D3AFC05DD.html >> >> On Mon, Feb 8, 2021 at 3:01 PM Ignazio Cassano >> wrote: >> >>> Hello All, >>> I have another question about replication sync time, please. >>> I did not specify any option in manila.conf and seems netapp set it one >>> time every hour. >>> I did not understand if replica_state_update_interval is the replication >>> sync time frequency or only checks the replica state. >>> Is there any parameter I can use to setup the replica sync time ? >>> Thanks >>> Ignazio >>> >>> >>> >>> Il giorno lun 8 feb 2021 alle ore 12:56 Rodrigo Barbieri < >>> rodrigo.barbieri2010 at gmail.com> ha scritto: >>> >>>> Hi Ignazio, >>>> >>>> The way you set it up is correct with "enabled_share_backends = >>>> svm-tst-nfs-565,netapp-nfs-566". You need both backends enabled for the >>>> feature to work. >>>> >>>> Regards, >>>> >>>> Rodrigo >>>> >>>> On Mon, Feb 8, 2021 at 8:52 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello All, >>>>> I am able to replicate shares between netapp storages but I have some >>>>> questions. >>>>> Reading netapp documentation, seems the replication svm must not be >>>>> enabled in manila.conf. >>>>> The following is what netapp suggests: >>>>> >>>>> enabled_share_backends = svm-tst-nfs-565 >>>>> >>>>> [svm-tst-nfs-565] >>>>> share_backend_name = svm-tst-nfs-565 >>>>> driver_handles_share_servers = false >>>>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>>>> netapp_storage_family = ontap_cluster >>>>> netapp_server_hostname = fas8040.csi.it >>>>> netapp_server_port = 80 >>>>> netapp_login = admin >>>>> netapp_password = ****** >>>>> netapp_transport_type = http >>>>> netapp_vserver = svm-tst-nfs-565 >>>>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>>>> replication_domain = replication_domain_1 >>>>> >>>>> >>>>> [netapp-nfs-566] >>>>> share_backend_name = netapp-nfs-566 >>>>> driver_handles_share_servers = False >>>>> share_driver = manila.share.drivers.netapp.common.NetAppDriver >>>>> netapp_storage_family = ontap_cluster >>>>> netapp_server_hostname = fas.csi.it >>>>> netapp_server_port = 80 >>>>> netapp_login = admin >>>>> netapp_password = ***** >>>>> netapp_transport_type = http >>>>> netapp_vserver = manila-nfs-566 >>>>> netapp_aggregate_name_search_pattern = ^((?!aggr0).)*$ >>>>> replication_domain = replication_domain_1 >>>>> >>>>> As you can see above, the target of replica netapp-nfs-566 is not >>>>> included in enabled backends. 
>>>>> >>>>> When I try to create a replica in this situation, the manila schedule >>>>> reports "no valid host found". >>>>> >>>>> It works if I enable in manila.conf the target like this: >>>>> enabled_share_backends = svm-tst-nfs-565,netapp-nfs-566 >>>>> >>>>> Please, any suggestion ? >>>>> >>>>> Thanks >>>>> Ignazio >>>>> >>>>> >>>>> Il giorno ven 5 feb 2021 alle ore 13:59 Ignazio Cassano < >>>>> ignaziocassano at gmail.com> ha scritto: >>>>> >>>>>> Thanks, Douglas. >>>>>> On another question: >>>>>> the manila share-replica-delete delete the snapmirror ? >>>>>> If yes, source and destination volume become both writable ? >>>>>> >>>>>> Ignazio >>>>>> >>>>>> Il giorno ven 5 feb 2021 alle ore 13:48 Douglas >>>>>> ha scritto: >>>>>> >>>>>>> Yes, it is correct. This should work as an alternative for >>>>>>> host-assisted-migration and will be faster since it uses storage >>>>>>> technologies to synchronize data. >>>>>>> If your share isn't associated with a share-type that has >>>>>>> replication_type='dr' you can: 1) create a new share-type with >>>>>>> replication_type extra-spec, 2) unmanage your share, 3) manage it again >>>>>>> using the new share-type. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Feb 5, 2021 at 9:37 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, I am sorry. >>>>>>>> >>>>>>>> I read the documentation. >>>>>>>> >>>>>>>> SMV must be peered once bye storage admimistrator or using ansible >>>>>>>> playbook. >>>>>>>> I must create a two backend in manila.conf with the same >>>>>>>> replication domain. >>>>>>>> I must assign to the source a type and set replication type dr. >>>>>>>> When I create a share if I want to enable snapmirror for it I must >>>>>>>> create on openstack a share replica for it. >>>>>>>> The share on destination is read only until I promote it. >>>>>>>> When I promote it, it become writable. >>>>>>>> Then I can manage it on target openstack. >>>>>>>> >>>>>>>> I hope the above is the correct procedure >>>>>>>> >>>>>>>> Il giorno ven 5 feb 2021 alle ore 13:00 Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Douglas, you are really kind. >>>>>>>>> Let my to to recap and please correct if I am wrong: >>>>>>>>> >>>>>>>>> - manila share on netapp are under svm >>>>>>>>> - storage administrator createx a peering between svm source and >>>>>>>>> svm destination (or on single share volume ?) >>>>>>>>> - I create a manila share with specs replication type (the share >>>>>>>>> belongs to source svm) . In manila.conf source and destination must have >>>>>>>>> the same replication domain >>>>>>>>> - Creating the replication type it initializes the snapmirror >>>>>>>>> >>>>>>>>> Is it correct ? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> Il giorno ven 5 feb 2021 alle ore 12:34 Douglas >>>>>>>>> ha scritto: >>>>>>>>> >>>>>>>>>> Hi Ignazio, >>>>>>>>>> >>>>>>>>>> In order to use share replication between NetApp backends, you'll >>>>>>>>>> need that Clusters and SVMs be peered in advance, which can be done by the >>>>>>>>>> storage administrators once. You don't need to handle any SnapMirror >>>>>>>>>> operation in the storage since it is fully handled by Manila and the NetApp >>>>>>>>>> driver. You can find all operations needed here [1][2]. If you have CIFS >>>>>>>>>> shares that need to be replicated and promoted, you will hit a bug that is >>>>>>>>>> being backported [3] at the moment. NFS shares should work fine. 
>>>>>>>>>> >>>>>>>>>> If you want, we can assist you on creating replicas for your >>>>>>>>>> shares in #openstack-manila channel. Just reach us there. >>>>>>>>>> >>>>>>>>>> [1] >>>>>>>>>> https://docs.openstack.org/manila/latest/admin/shared-file-systems-share-replication.html >>>>>>>>>> [2] >>>>>>>>>> https://netapp-openstack-dev.github.io/openstack-docs/victoria/manila/examples/openstack_command_line/section_manila-cli.html#creating-manila-share-replicas >>>>>>>>>> [3] https://bugs.launchpad.net/manila/+bug/1896949 >>>>>>>>>> >>>>>>>>>> On Fri, Feb 5, 2021 at 8:16 AM Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hello, thanks for your help. >>>>>>>>>>> I am waiting my storage administrators have a window to help me >>>>>>>>>>> because they must setup the snapmirror. >>>>>>>>>>> Meanwhile I am trying the host assisted migration but it does >>>>>>>>>>> not work. >>>>>>>>>>> The share remains in migrating for ever. >>>>>>>>>>> I am sure the replication-dr works because I tested it one year >>>>>>>>>>> ago. >>>>>>>>>>> I had an openstack on site A with a netapp storage >>>>>>>>>>> I had another openstack on Site B with another netapp storage. >>>>>>>>>>> The two openstack installation did not share anything. >>>>>>>>>>> So I made a replication between two volumes (shares). >>>>>>>>>>> I demoted the source share taking note about its export location >>>>>>>>>>> list >>>>>>>>>>> I managed the destination on openstack and it worked. >>>>>>>>>>> >>>>>>>>>>> The process for replication is not fully handled by openstack >>>>>>>>>>> api, so I should call netapp api for creating snapmirror relationship or >>>>>>>>>>> ansible modules or ask help to my storage administrators , right ? >>>>>>>>>>> Instead, using share migration, I could use only openstack api: >>>>>>>>>>> I understood that driver assisted cannot work in this case, but host >>>>>>>>>>> assisted should work. >>>>>>>>>>> >>>>>>>>>>> Best Regards >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Il giorno gio 4 feb 2021 alle ore 21:39 Douglas < >>>>>>>>>>> viroel at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Hi Rodrigo, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your help on this. We were helping Ignazio in >>>>>>>>>>>> #openstack-manila channel. He wants to migrate a share across ONTAP >>>>>>>>>>>> clusters, which isn't supported in the current implementation of the >>>>>>>>>>>> driver-assisted-migration with NetApp driver. So, instead of using >>>>>>>>>>>> migration methods, we suggested using share-replication to create a copy in >>>>>>>>>>>> the destination, which will use the storage technologies to copy the data >>>>>>>>>>>> faster. Ignazio didn't try that out yet, since it was late in his timezone. >>>>>>>>>>>> We should continue tomorrow or in the next few days. >>>>>>>>>>>> >>>>>>>>>>>> Best regards, >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Feb 4, 2021 at 5:14 PM Rodrigo Barbieri < >>>>>>>>>>>> rodrigo.barbieri2010 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hello Ignazio, >>>>>>>>>>>>> >>>>>>>>>>>>> If you are attempting to migrate between 2 NetApp backends, >>>>>>>>>>>>> then you shouldn't need to worry about correctly setting the >>>>>>>>>>>>> data_node_access_ip. Your ideal migration scenario is a >>>>>>>>>>>>> driver-assisted-migration, since it is between 2 NetApp backends. 
If that >>>>>>>>>>>>> fails due to misconfiguration, it will fallback to a host-assisted >>>>>>>>>>>>> migration, which will use the data_node_access_ip and the host will attempt >>>>>>>>>>>>> to mount both shares. This is not what you want for this scenario, as this >>>>>>>>>>>>> is useful for different backends, not your case. >>>>>>>>>>>>> >>>>>>>>>>>>> if you specify "manila migration-start --preserve-metadata >>>>>>>>>>>>> True" it will prevent the fallback to host-assisted, so it is easier for >>>>>>>>>>>>> you to narrow down the issue with the host-assisted migration out of the >>>>>>>>>>>>> way. >>>>>>>>>>>>> >>>>>>>>>>>>> I used to be familiar with the NetApp driver set up to review >>>>>>>>>>>>> your case, however that was a long time ago. I believe the current NetApp >>>>>>>>>>>>> driver maintainers will be able to more accurately review your case and >>>>>>>>>>>>> spot the problem. >>>>>>>>>>>>> >>>>>>>>>>>>> If you could share some info about your scenario such as: >>>>>>>>>>>>> >>>>>>>>>>>>> 1) the 2 backends config groups in manila.conf (sanitized, >>>>>>>>>>>>> without passwords) >>>>>>>>>>>>> 2) a "manila show" of the share you are trying to migrate >>>>>>>>>>>>> (sanitized if needed) >>>>>>>>>>>>> 3) the "manila migration-start" command you are using and its >>>>>>>>>>>>> parameters. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Feb 4, 2021 at 2:06 PM Ignazio Cassano < >>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello All, >>>>>>>>>>>>>> I am trying to migrate a share between a netapp backend to >>>>>>>>>>>>>> another. >>>>>>>>>>>>>> Both backends are configured in my manila.conf. >>>>>>>>>>>>>> I am able to create share on both, but I am not able to >>>>>>>>>>>>>> migrate share between them. >>>>>>>>>>>>>> I am using DSSH=False. >>>>>>>>>>>>>> I did not understand how host and driver assisted migration >>>>>>>>>>>>>> work and what "data_node_access_ip" means. >>>>>>>>>>>>>> The share I want to migrate is on a network (10.102.186.0/24) >>>>>>>>>>>>>> that I can reach by my management controllers network ( >>>>>>>>>>>>>> 10.102.184.0/24). I Can mount share from my controllers and >>>>>>>>>>>>>> I can mount also the netapp SVM where the share is located. >>>>>>>>>>>>>> So in the data_node_access_ip I wrote the list of my >>>>>>>>>>>>>> controllers management ips. >>>>>>>>>>>>>> During the migrate phase I checked if my controller where >>>>>>>>>>>>>> manila is running mounts the share or the netapp SVM but It does not happen. >>>>>>>>>>>>>> Please, what is my mistake ? >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Rodrigo Barbieri >>>>>>>>>>>>> MSc Computer Scientist >>>>>>>>>>>>> OpenStack Manila Core Contributor >>>>>>>>>>>>> Federal University of São Carlos >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Douglas Salles Viroel >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Douglas Salles Viroel >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Douglas Salles Viroel >>>>>>> >>>>>> >>>> >>>> -- >>>> Rodrigo Barbieri >>>> MSc Computer Scientist >>>> OpenStack Manila Core Contributor >>>> Federal University of São Carlos >>>> >>>> >> >> -- >> Douglas Salles Viroel >> > -- Douglas Salles Viroel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnaud.morin at gmail.com Tue Feb 9 11:21:10 2021
From: arnaud.morin at gmail.com (Arnaud Morin)
Date: Tue, 9 Feb 2021 11:21:10 +0000
Subject: [nova] Rescue booting on wrong disk
Message-ID: <20210209112110.GG14971@sync>

Hey all,

From time to time we are facing an issue when putting an instance in rescue
with the same image as the one the instance was booted from.

E.G.
I booted an instance using Debian 10, disks are:

debian at testarnaud:~$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  486K  0 rom
vda    254:0    0   10G  0 disk
└─vda1 254:1    0   10G  0 part /
debian at testarnaud:~$ cat /etc/fstab
# /etc/fstab: static file system information.
UUID=5605171d-d590-46d5-85e2-60096b533a18 / ext4
errors=remount-ro 0 1

I rescued the instance:
$ openstack server rescue --image bc73a901-6366-4a69-8ddc-00479b4d647f testarnaud

Then, back in the instance:

debian at testarnaud:~$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  486K  0 rom
vda    254:0    0    2G  0 disk
└─vda1 254:1    0    2G  0 part
vdb    254:16   0   10G  0 disk
└─vdb1 254:17   0   10G  0 part /

The instance booted on /dev/vdb1 instead of /dev/vda1.

Is there anything we can configure on nova side to avoid this
situation?

Thanks

--
Arnaud Morin

From smooney at redhat.com Tue Feb 9 12:23:58 2021
From: smooney at redhat.com (Sean Mooney)
Date: Tue, 09 Feb 2021 12:23:58 +0000
Subject: [nova] Rescue booting on wrong disk
In-Reply-To: <20210209112110.GG14971@sync>
References: <20210209112110.GG14971@sync>
Message-ID:

On Tue, 2021-02-09 at 11:21 +0000, Arnaud Morin wrote:
> Hey all,
>
> From time to time we are facing an issue when putting an instance in rescue
> with the same image as the one the instance was booted from.
>
> E.G.
> I booted an instance using Debian 10, disks are:
>
> debian at testarnaud:~$ lsblk
> NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> sr0     11:0    1  486K  0 rom
> vda    254:0    0   10G  0 disk
> └─vda1 254:1    0   10G  0 part /
> debian at testarnaud:~$ cat /etc/fstab
> # /etc/fstab: static file system information.
> UUID=5605171d-d590-46d5-85e2-60096b533a18 / ext4
> errors=remount-ro 0 1
>
>
>
> I rescued the instance:
> $ openstack server rescue --image bc73a901-6366-4a69-8ddc-00479b4d647f testarnaud
>
>
> Then, back in the instance:
>
> debian at testarnaud:~$ lsblk
> NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> sr0     11:0    1  486K  0 rom
> vda    254:0    0    2G  0 disk
> └─vda1 254:1    0    2G  0 part
> vdb    254:16   0   10G  0 disk
> └─vdb1 254:17   0   10G  0 part /
>
>
>
> The instance booted on /dev/vdb1 instead of /dev/vda1.
>
> Is there anything we can configure on nova side to avoid this
> situation?

in ussuri lee yarwood added
https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/virt-rescue-stable-disk-devices.html
to nova which i believe will resolve the issue.

from ocata+ i think you can add hw_rescue_bus=usb to get a similar
effect, but from ussuri we change the layout so that the rescue disk
is always used. lee is that right?

>
>
> Thanks
>
>

From roshananvekar at gmail.com Tue Feb 9 12:24:43 2021
From: roshananvekar at gmail.com (roshan anvekar)
Date: Tue, 9 Feb 2021 17:54:43 +0530
Subject: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface
Message-ID:

Hello,

Below is my scenario.

I am trying to use a single physical bonded interface ( example: bond1) both for external TLS access traffic and for provider network too ( These are 2 different vlan networks with separate vlan ids) . Bond0 is used as api_interface for management traffic. Bond2 is used for storage traffic.
For this I created 2 virtual interfaces on this bond1 and used it accordingly while deploying through kolla-ansible. Before deployment the gateways for both vlan networks were accessible. Post deployment I see that qdhcp-id router is created on one of the controllers. Post creation of this DHCP agent, the gateway is inaccessible. Also the vms created are not getting IPs through DHCP agent so failing to be accessible. I had configured br-ex on virtual interface of provider vlan network and not on physical interface directly. Please let me know if I am going wrong in my network configurations. Regards, Roshan -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Feb 9 13:35:02 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 9 Feb 2021 08:35:02 -0500 Subject: =?UTF-8?B?UmU6IFtyZWxlYXNlXSBQcm9wb3NpbmcgRWzFkWQgSWxsw6lzIChlbG9k?= =?UTF-8?Q?=29_for_Release_Management_Core?= In-Reply-To: References: Message-ID: <3daa6e40-d51f-a057-726a-9e7edec581ce@gmail.com> I'm not on the release management team, but fwiw I've seen a lot of Előd's reviews on cinder project release patches, and you can tell from his comments that he's a very careful reviewer, so +1 from me. cheers, brian On 2/8/21 10:15 AM, Herve Beraud wrote: > Hi, > > Előd has been working on Release management for quite some time now and > in that time  has > shown tremendous growth in his understanding of our processes and on how > deliverables work on Openstack. I think he would make a good addition to > the core team. > > Existing team members, please respond with +1/-1. > If there are no objections we'll add him to the ACL soon. :-) > > Thanks. > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From rosmaita.fossdev at gmail.com Tue Feb 9 13:50:38 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 9 Feb 2021 08:50:38 -0500 Subject: [cinder] reminder: wallaby R-9 virtual mid-cycle tomorrow (wednesday) Message-ID: Reminder: no cinder meeting tomorrow, instead we're having a virtual mid-cycle meeting: DATE: Wednesday, 10 February 2021 TIME: 1400-1600 UTC (2 hours) LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Topics are on the planning etherpad. (For people who have been procrastinating, there's probably room for one or two more topics.) 
https://etherpad.opendev.org/p/cinder-wallaby-mid-cycles cheers, brian From mnaser at vexxhost.com Tue Feb 9 15:10:26 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 9 Feb 2021 10:10:26 -0500 Subject: [tc] weekly update Message-ID: Hi everyone, Here's an update on what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. # Patches ## Open Reviews - Retire tempest-horizon https://review.opendev.org/c/openstack/governance/+/774379 - Cool-down cycle goal https://review.opendev.org/c/openstack/governance/+/770616 - monasca-log-api & monasca-ceilometer does not make releases https://review.opendev.org/c/openstack/governance/+/771785 - Add assert:supports-api-interoperability tag to neutron https://review.opendev.org/c/openstack/governance/+/773090 - Add assert:supports-api-interoperability to cinder https://review.opendev.org/c/openstack/governance/+/773684 ## Project Updates - Remove Karbor project team https://review.opendev.org/c/openstack/governance/+/767056 - Create ansible-role-pki repo https://review.opendev.org/c/openstack/governance/+/773383 - Add rbd-iscsi-client to cinder project https://review.opendev.org/c/openstack/governance/+/772597 ## General Changes - Define Xena release testing runtime https://review.opendev.org/c/openstack/governance/+/770860 - [manila] add assert:supports-api-interoperability https://review.opendev.org/c/openstack/governance/+/770859 - Define 2021 upstream investment opportunities https://review.opendev.org/c/openstack/governance/+/771707 # Other Reminders - Our next [TC] Weekly meeting is scheduled for February 11th at 1500 UTC. If you would like to add topics for discussion, please go to https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting and fill out your suggestions by Wednesday, February 10th, 2100 UTC. Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From rosmaita.fossdev at gmail.com Tue Feb 9 15:33:21 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 9 Feb 2021 10:33:21 -0500 Subject: [cinder] wallaby new driver - kioxia kumoscale Message-ID: I'm proposing that we extend the wallaby new driver merge deadline for the kioxia kumoscale driver to Friday 12 February 2021 to give the reviewers who committed to reviewing the patches at the cinder meeting last week [0] time to complete their reviews of the driver [1] and os-brick [2] changes. The Kioxia 3rd party CI is running and responding to patches, and the dev team has been very responsive to reviews and posted revisions quickly. This extension past the standing Milestone-2 deadline for new drivers is due to limited review bandwidth this cycle which I am attributing to stress caused by the ongoing pandemic, so this should not be considered a precedent applicable to new drivers proposed for future releases. If you have comments or concerns about this proposal, please respond to this email so we can discuss at the virtual mid-cycle meeting tomorrow. And if you have committed to reviewing one of the patches and have not done it yet, please review as soon as possible! 
cheers, brian [0] http://eavesdrop.openstack.org/meetings/cinder/2021/cinder.2021-02-03-14.00.log.html [1] https://review.opendev.org/c/openstack/cinder/+/768574 [2] https://review.opendev.org/c/openstack/os-brick/+/768575 From roshananvekar at gmail.com Tue Feb 9 15:35:15 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Tue, 9 Feb 2021 21:05:15 +0530 Subject: Fwd: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface In-Reply-To: References: Message-ID: Is it a hard requirement for br-ex to be configured on physical interface itself?? Or is it okay to configure it on a sub-interface too?? Thanks in advance. ---------- Forwarded message --------- From: roshan anvekar Date: Tue, Feb 9, 2021, 5:54 PM Subject: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface To: OpenStack Discuss Hello, Below is my scenario. I am trying to use a single physical bonded interface ( example: bond1) both for external TLS access traffic and for provider network too ( These are 2 different vlan networks with separate vlan ids) . Bond0 is used as api_interface for management traffic. Bond2 is used for storage traffic. For this I created 2 virtual interfaces on this bond1 and used it accordingly while deploying through kolla-ansible. Before deployment the gateways for both vlan networks were accessible. Post deployment I see that qdhcp-id router is created on one of the controllers. Post creation of this DHCP agent, the gateway is inaccessible. Also the vms created are not getting IPs through DHCP agent so failing to be accessible. I had configured br-ex on virtual interface of provider vlan network and not on physical interface directly. Please let me know if I am going wrong in my network configurations. Regards, Roshan -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Feb 9 16:07:58 2021 From: smooney at redhat.com (Sean Mooney) Date: Tue, 09 Feb 2021 16:07:58 +0000 Subject: Fwd: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface In-Reply-To: References: Message-ID: <5850226e61bdb05ca7f6743b58387726a8f51eed.camel@redhat.com> On Tue, 2021-02-09 at 21:05 +0530, roshan anvekar wrote: > Is it a hard requirement for br-ex to be configured on physical interface > itself?? Or is it okay to configure it on a sub-interface too?? you can put it on a bound putting it on a vlan or vxlan interface probaly does not make a lot of sense due to multiple encapsulation but it technialy does not need to be directly on the physical interface. that is just the recommended thing to do. generally if you are using ovs and want a bound you will use an ovs bound rahter then a linux bond but both work. > > Thanks in advance. > > ---------- Forwarded message --------- > From: roshan anvekar > Date: Tue, Feb 9, 2021, 5:54 PM > Subject: [stein][neutron][vlan provider network] How to configure br-ex on > virtual interface > To: OpenStack Discuss > > > Hello, > > Below is my scenario. > > I am trying to use a single physical bonded interface ( example: bond1) > both for external TLS access traffic and for provider network too ( These > are 2 different vlan networks with separate vlan ids) . Bond0 is used as > api_interface for management traffic. Bond2 is used for storage traffic. > > For this I created 2 virtual interfaces on this bond1 and used it > accordingly while deploying through kolla-ansible. > > Before deployment the gateways for both vlan networks were accessible. 
> > Post deployment I see that qdhcp-id router is created on one of the > controllers. Post creation of this DHCP agent, the gateway is inaccessible. > > Also the vms created are not getting IPs through DHCP agent so failing to > be accessible. > > I had configured br-ex on virtual interface of provider vlan network and > not on physical interface directly. > > Please let me know if I am going wrong in my network configurations. > > Regards, > Roshan From whayutin at redhat.com Tue Feb 9 16:29:12 2021 From: whayutin at redhat.com (Wesley Hayutin) Date: Tue, 9 Feb 2021 09:29:12 -0700 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) Message-ID: Greetings, Expect to see several of these emails as we propose additional items to reduce the upstream resource footprint of TripleO. The intent of the email is to broadcast our general intent regarding specific ci jobs with some details regarding the how and when things will happen. Expect jobs to be removed in a staged pragmatic manner. Historical Context [1] Please feel free to respond to this thread with your opinions. Summary: First stage of TripleO's job reduction will be to remove most TripleO upgrade jobs on master and remove all scenario010 octavia jobs from upstream. *Upgrade jobs in master:* All the upgrade jobs in master are non-voting based on policy and to give the upgrade developers time to compensate for new developments and features. The feedback provided by CI can still occur in our periodic jobs in RDO's software factory zuul. Upgrade jobs will remain running in periodic but removed from upstream. Specifically: ( master ) tripleo-ci-centos-8-standalone-upgrade tripleo-ci-centos-8-undercloud-upgrade tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades I'll note there is interest in keeping the undercloud-upgrade, however we are able to remove all the upgrade jobs for a branch in the upstream we can also remove a content-provider job. I would encourage folks to consider undercloud-upgrade for periodic only. *Scenario010 - Octavia:* Scenario010 octavia has been non-voting and not passing at a high enough rate [2] to justify the use of upstream resources. I would propose we only run these jobs in the periodic component and integration lines in RDO softwarefactory. Specifically: ( all branches ) tripleo-ci-centos-8-scenario010-ovn-provider-standalone tripleo-ci-centos-8-scenario010-standalone Please review and comment. The CI team will start taking action on these two items in one week. Thank you! [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020235.html [2] http://dashboard-ci.tripleo.org/d/iEDLIiOMz/non-voting-jobs?orgId=1&from=now-30d&to=now -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesse at odyssey4.me Tue Feb 9 16:45:01 2021 From: jesse at odyssey4.me (Jesse Pretorius) Date: Tue, 9 Feb 2021 16:45:01 +0000 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: References: Message-ID: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> On 9 Feb 2021, at 16:29, Wesley Hayutin > wrote: Upgrade jobs in master: All the upgrade jobs in master are non-voting based on policy and to give the upgrade developers time to compensate for new developments and features. The feedback provided by CI can still occur in our periodic jobs in RDO's software factory zuul. Upgrade jobs will remain running in periodic but removed from upstream. 
Specifically: ( master ) tripleo-ci-centos-8-standalone-upgrade tripleo-ci-centos-8-undercloud-upgrade tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades I'll note there is interest in keeping the undercloud-upgrade, however we are able to remove all the upgrade jobs for a branch in the upstream we can also remove a content-provider job. I would encourage folks to consider undercloud-upgrade for periodic only. +1 I think this makes sense. These jobs are long running and consume quite a few VM over that lengthy period of time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Tue Feb 9 16:50:23 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Tue, 9 Feb 2021 16:50:23 +0000 Subject: [nova] Rescue booting on wrong disk In-Reply-To: References: <20210209112110.GG14971@sync> Message-ID: <20210209165023.routbszw3hnggrwj@lyarwood-laptop.usersys.redhat.com> On 09-02-21 12:23:58, Sean Mooney wrote: > On Tue, 2021-02-09 at 11:21 +0000, Arnaud Morin wrote: > > Hey all, > > > > From time to time we are facing an issue when puting instance in rescue > > with the same image as the one the instance was booted. > > > > E.G. > > I booted an instance using Debian 10, disk are: > > > > debian at testarnaud:~$ lsblk > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT > > sr0 11:0 1 486K 0 rom > > vda 254:0 0 10G 0 disk > > └─vda1 254:1 0 10G 0 part / > > debian at testarnaud:~$ cat /etc/fstab > > # /etc/fstab: static file system information. > > UUID=5605171d-d590-46d5-85e2-60096b533a18 / ext4 > > errors=remount-ro 0 1 > > > > > > > > I rescued the instance: > > $ openstack server rescue --image bc73a901-6366-4a69-8ddc-00479b4d647f testarnaud > > > > > > Then, back in the instance: > > > > debian at testarnaud:~$ lsblk > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT > > sr0 11:0 1 486K 0 rom > > vda 254:0 0 2G 0 disk > > └─vda1 254:1 0 2G 0 part > > vdb 254:16 0 10G 0 disk > > └─vdb1 254:17 0 10G 0 part / > > > > > > > > Instance booted on /dev/vdb1 instead of /dev/vda1 > > > > Is there anything we can configure on nova side to avoid this > > situation? > > in ussuri lee yarwood added > https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/virt-rescue-stable-disk-devices.html > to nova which i belive will resolve the issue > > from ocata+ i think you can add hw_rescue_bus=usb to get a similar > effect but from ussuri we cahgne the layout so thtat the rescue disk > is always used. lee is that right? Yeah that's correct, from Ussuri (21.0.0) with the libvirt driver the instance should continue to show all disks connected as normal with the rescue disk appended last and always used as the boot device. In your case, switching the rescue bus should change the default boot device type n-cpu tells libvirt to use working around your problem. Another obvious thing you can try is just using a different image so the two MBRs don't conflict? Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
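For anyone who wants to try the hw_rescue_bus suggestion from Lee's reply above, a minimal sketch of the commands is below; the image UUID and server name are placeholders:

    # tag the rescue image so the rescue disk is attached on a different bus
    openstack image set --property hw_rescue_bus=usb <rescue-image-uuid>
    # rescue the instance using that image
    openstack server rescue --image <rescue-image-uuid> <server>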
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From marios at redhat.com Tue Feb 9 17:03:58 2021 From: marios at redhat.com (Marios Andreou) Date: Tue, 9 Feb 2021 19:03:58 +0200 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> References: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> Message-ID: On Tue, Feb 9, 2021 at 6:46 PM Jesse Pretorius wrote: > > > On 9 Feb 2021, at 16:29, Wesley Hayutin wrote: > > *Upgrade jobs in master:* > All the upgrade jobs in master are non-voting based on policy and to give > the upgrade developers time to compensate for new developments and > features. The feedback provided by CI can still occur in our periodic > jobs in RDO's software factory zuul. Upgrade jobs will remain running in > periodic but removed from upstream. > > Specifically: ( master ) > tripleo-ci-centos-8-standalone-upgrade > tripleo-ci-centos-8-undercloud-upgrade > tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades > > I'll note there is interest in keeping the undercloud-upgrade, however we > are able to remove all the upgrade jobs for a branch in the upstream we can > also remove a content-provider job. I would encourage folks to consider > undercloud-upgrade for periodic only. > > > +1 I think this makes sense. These jobs are long running and consume quite > a few VM over that lengthy period of time. > > just to be clear however, note that this means that there will be no stable/wallaby version of these upgrades jobs, or at least it will be a lot harder to add them if we aren't running these jobs against master. In the 'normal' case, the non-voting master jobs become the voting stable/latest once we have created the new latest branch. If we remove master we may have a challenge to add voting stable/wallaby versions when the time comes. Based on our recent discussions, we don't *want* stable/wallaby versions since we want to eventually remove all the upstream upgrade jobs once d/stream replacements have been put in place. The point then is that ideally we need to be in a position to do that (i.e. replacement jobs in place) before stable/wallaby is branched, otherwise we may have a tough time adding stable/wallaby versions of the (removed) master upgrade jobs. regards, marios -------------- next part -------------- An HTML attachment was scrubbed... URL: From sathlang at redhat.com Tue Feb 9 17:13:17 2021 From: sathlang at redhat.com (Sofer Athlan-Guyot) Date: Tue, 09 Feb 2021 18:13:17 +0100 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> References: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> Message-ID: <87pn19p3ki.fsf@redhat.com> Hi, Jesse Pretorius writes: > On 9 Feb 2021, at 16:29, Wesley Hayutin wrote: > > Upgrade jobs in master: > All the upgrade jobs in master are non-voting based on policy and to give the upgrade developers time to compensate for new developments and > features. The feedback provided by CI can still occur in our periodic jobs in RDO's software factory zuul. Upgrade jobs will remain running in > periodic but removed from upstream. 
> > Specifically: ( master ) > tripleo-ci-centos-8-standalone-upgrade > tripleo-ci-centos-8-undercloud-upgrade > tripleo-ci-centos-8-scenario000-multinode-oooq-container-upgrades > > I'll note there is interest in keeping the undercloud-upgrade, however we are able to remove all the upgrade jobs for a branch in the upstream we can > also remove a content-provider job. I would encourage folks to consider undercloud-upgrade for periodic only. > > +1 I think this makes sense. These jobs are long running and consume quite a few VM over that lengthy period of time. My personal feeling about undercloud-upgrade is that I haven't used it in a while, so this would be a +1 for removing this one as well (beside all previous discussion) as it would remove an extra content provider (cost/gain ...). Jesse is that also ok for you ? Thanks, -- Sofer Athlan-Guyot chem on #irc at rhos-upgrades DFG:Upgrades Squad:Update From fungi at yuggoth.org Tue Feb 9 17:16:53 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 9 Feb 2021 17:16:53 +0000 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: References: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> Message-ID: <20210209171653.bjbhfvx6wjtmzapw@yuggoth.org> On 2021-02-09 19:03:58 +0200 (+0200), Marios Andreou wrote: [...] > just to be clear however, note that this means that there will be no > stable/wallaby version of these upgrades jobs, or at least it will be a lot > harder to add them if we aren't running these jobs against master. In the > 'normal' case, the non-voting master jobs become the voting stable/latest > once we have created the new latest branch. If we remove master we may have > a challenge to add voting stable/wallaby versions when the time comes. > > Based on our recent discussions, we don't *want* stable/wallaby versions > since we want to eventually remove all the upstream upgrade jobs once > d/stream replacements have been put in place. > > The point then is that ideally we need to be in a position to do that (i.e. > replacement jobs in place) before stable/wallaby is branched, otherwise we > may have a tough time adding stable/wallaby versions of the (removed) > master upgrade jobs. If for some reason you aren't able to get that far in Wallaby, running this once a day in the periodic pipeline in OpenDev's Zuul would still be *far* less resource-intensive than the current state of running multiple times for most proposed changes, and so could serve as a reasonable compromise. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From mkopec at redhat.com Tue Feb 9 17:52:31 2021 From: mkopec at redhat.com (Martin Kopec) Date: Tue, 9 Feb 2021 18:52:31 +0100 Subject: [QA] Migrate from testr to stestr Message-ID: Hi everyone, testr unit test runner (testrepository package [1]) hasn't been updated for years, therefore during Shanghai PTG [2] we came up with an initiative to migrate from testr to stestr (testr's successor) [3] unit test runner. Here is an etherpad which tracks the effort [4]. However as there is still quite a number of the projects which haven't migrated, we would like to kindly ask you for your help. If you are a maintainer of a project which is mentioned in the etherpad [4] and it's not crossed out yet, please migrate to stestr. 
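For maintainers who have not made the switch before, the change is usually small; a sketch of the typical edits is below (the test path is a placeholder -- adjust it for your project, and add stestr to test-requirements.txt):

    # .stestr.conf (replaces the old .testr.conf)
    [DEFAULT]
    test_path=./yourproject/tests
    top_dir=./

    # tox.ini: run the suite with stestr instead of testr
    [testenv]
    commands = stestr run {posargs}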
[1] https://pypi.org/project/testrepository/ [2] https://etherpad.opendev.org/p/shanghai-ptg-qa [3] https://pypi.org/project/stestr/ [4] https://etherpad.opendev.org/p/enkM4eeDHObSloTjPGAu Have a nice day, -- Martin Kopec Software Quality Engineer Red Hat EMEA -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.buhman at verizonmedia.com Tue Feb 9 20:47:33 2021 From: zachary.buhman at verizonmedia.com (Zachary Buhman) Date: Tue, 9 Feb 2021 12:47:33 -0800 Subject: [E] [ironic] Review Jams In-Reply-To: References: Message-ID: I thought the 09 Feb 2021 review jam was highly valuable. Without the discussions we had, I think the "Secure RBAC" patch set would be unapproachable for me. For example, having knowledge of the (new) oslo-policy features that the patches make use of seems to be a requirement for deeply understanding the changes. As a direct result of the review jam [0], I feel that I have enough understanding and comfortability to make valuable review feedback on these patches. [0] and also having read/reviewed the secure-rbac spec previously, to be fair On Fri, Feb 5, 2021 at 7:10 AM Julia Kreger wrote: > In the Ironic team's recent mid-cycle call, we discussed the need to > return to occasionally having review jams in order to help streamline > the review process. In other words, get eyes on a change in parallel > and be able to discuss the change. The goal is to help get people on > the same page in terms of what and why. Be on hand to answer questions > or back-fill context. This is to hopefully avoid the more iterative > back and forth nature of code review, which can draw out a long chain > of patches. As always, the goal is not perfection, but forward > movement especially for complex changes. > > We've established two time windows that will hopefully not to be too > hard for some contributors to make it to. It doesn't need to be > everyone, but it would help for at least some people whom actively > review or want to actively participate in reviewing, or whom are even > interested in a feature to join us for our meeting. > > I've added an entry on to our wiki page to cover this, with the > current agenda and anticipated review jam topic schedule. The tl;dr is > we will use meetpad[1] and meet on Mondays at 2 PM UTC and Tuesdays at > 6 PM UTC. The hope is to to enable some overlap of reviewers. If > people are interested in other times, please bring this up in the > weekly meeting or on the mailing list. > > I'm not sending out calendar invites for this. Yet. :) > > See everyone next week! > > -Julia > > [0]: > https://urldefense.proofpoint.com/v2/url?u=https-3A__wiki.openstack.org_wiki_Meetings_Ironic-23Review-5FJams&d=DwIBaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=OsbscIvhVDRWHpDZtO7nXdqGCfPHirpVEemMwL8l5tw&m=S4p8gD_wQlpR_rvzdqGkdq574-DkUsgBRet9-k3RpVg&s=gVApbMsmNPVlfYreqkQe4yKFxC66U6D8nFc_TwjW-FE&e= > [1]: > https://urldefense.proofpoint.com/v2/url?u=https-3A__meetpad.opendev.org_ironic&d=DwIBaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=OsbscIvhVDRWHpDZtO7nXdqGCfPHirpVEemMwL8l5tw&m=S4p8gD_wQlpR_rvzdqGkdq574-DkUsgBRet9-k3RpVg&s=iHBy7h99FQZ6Xb_fN2Hv3HZXIANl6BzR867jblUJvsk&e= > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From finarffin at gmail.com Tue Feb 9 11:48:38 2021 From: finarffin at gmail.com (Jan Wasilewski) Date: Tue, 9 Feb 2021 12:48:38 +0100 Subject: [cinder/barbican] LUKS encryption for mounted disk - how to decrypt cinder volume Message-ID: Hi All, I have a question about the possible decryption of LUKS volume. I'm testing currently barbican+cinder, but I'm just wondering if there is a way, to somehow decrypt my LUKS volume with payload generated by a barbican. Is there any procedure for that? I was doing it by myself, but somehow it doesn't work and I got an error: [TEST]root at barbican-01:/usr/lib/python3/dist-packages# barbican secret get --payload --payload_content_type application/octet-stream http://controller.test:9311/v1/secrets/76631940-9ab6-4b8c-9481-e54c3ffdbbfe +---------+--------------------------------------------------------------------------------------------------------+ | Field | Value | +---------+--------------------------------------------------------------------------------------------------------+ | Payload | b'\xbf!i\x97\xf4\x0c\x12\xa4\xfe4\xf3\x16C\xe8@\xdc\x0f\x9d+:\x0c7\xa9\xab[\x8d\xf2\xf1\xae\r\x89\xdc' | +---------+--------------------------------------------------------------------------------------------------------+ cryptsetup luksOpen /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f my-volume Enter passphrase for /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f: ** No key available with this passphrase. I thought that above issue can be related to encoding, so I took payload value directly from vault and use it as a key-file, but problem is exactly the same(my encrypted volume is the last volume list by domblklist option): vault kv get secret/data/e5baa518207e4f9db4810988d22087ce | grep value | awk -F'value:' '{print $2}' 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d] [TEST]root at comp-02:~# cat bbb 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d [TEST]root at comp-02:~# cat bbb | base64 -d > pass2 [TEST]root at comp-02:~# cat pass2 ▒▒߻▒▒▒▒▒^<▒N▒▒▒▒~پ5▒▒▒▒▒▒▒z߾▒▒▒▒~▒▒▒▒▒n▒▒▒▒▒]▒[TEST]root at comp-02:~# [TEST]root at comp-02:~# virsh domblklist instance-00000da8 Target Source ------------------------------------------------ vda /dev/dm-17 vdb /dev/disk/by-id/wwn-0x6e00084100ee7e7e74623bd3000036bc vdc /dev/dm-16 vde /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f vdf /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 [TEST]root at comp-02:~# udisksctl unlock -b /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 --key-file pass2 Error unlocking /dev/dm-21: GDBus.Error:org.freedesktop.UDisks2.Error.Failed: Error unlocking /dev/dm-21: Failed to activate device: Operation not permitted [TEST]root at comp-02:~# cryptsetup luksOpen /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 my-volume --master-key-file=pass2 Volume key does not match the volume. I see that nova/cinder and barbican are doing this stuff somehow so I strongly believe there is a way to decrypt this manually. Maybe I’m doing something wrong in my testing-steps. Thanks in advance for any help here! Unfortunately, I haven’t found any materials on how to do this. Best regards, Jan -------------- next part -------------- An HTML attachment was scrubbed... 
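As the reply from Lee Yarwood below explains, the raw secret bytes are not used directly as the LUKS passphrase: os-brick and nova hex-encode them first. A minimal sketch of the manual steps, assuming payload.bin holds the raw payload bytes already retrieved from Barbican or Vault, and using the device path from the example above:

    # derive the passphrase the way os-brick does: hex-encode the secret bytes
    python3 -c "import binascii, sys; sys.stdout.write(binascii.hexlify(open('payload.bin', 'rb').read()).decode())" > passphrase.txt
    # the resulting hex string (no trailing newline) is the LUKS passphrase
    cryptsetup luksOpen --key-file passphrase.txt /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f my-volume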
URL: From lyarwood at redhat.com Tue Feb 9 22:12:30 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Tue, 9 Feb 2021 22:12:30 +0000 Subject: [cinder/barbican] LUKS encryption for mounted disk - how to decrypt cinder volume In-Reply-To: References: Message-ID: <20210209221230.xifstekiw6aucr7l@lyarwood-laptop.usersys.redhat.com> On 09-02-21 12:48:38, Jan Wasilewski wrote: > Hi All, > > I have a question about the possible decryption of LUKS volume. I'm testing > currently barbican+cinder, but I'm just wondering if there is a way, to > somehow decrypt my LUKS volume with payload generated by a barbican. Is > there any procedure for that? I was doing it by myself, but somehow it > doesn't work and I got an error: > > [TEST]root at barbican-01:/usr/lib/python3/dist-packages# barbican secret get > --payload --payload_content_type application/octet-stream > http://controller.test:9311/v1/secrets/76631940-9ab6-4b8c-9481-e54c3ffdbbfe > +---------+--------------------------------------------------------------------------------------------------------+ > | Field | Value > | > +---------+--------------------------------------------------------------------------------------------------------+ > | Payload | b'\xbf!i\x97\xf4\x0c\x12\xa4\xfe4\xf3\x16C\xe8@\xdc\x0f\x9d+:\x0c7\xa9\xab[\x8d\xf2\xf1\xae\r\x89\xdc' > | > +---------+--------------------------------------------------------------------------------------------------------+ > > cryptsetup luksOpen /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > my-volume > Enter passphrase for > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f: * payload>* > No key available with this passphrase. > > I thought that above issue can be related to encoding, so I took payload > value directly from vault and use it as a key-file, but problem is exactly > the same(my encrypted volume is the last volume list by domblklist option): > > vault kv get secret/data/e5baa518207e4f9db4810988d22087ce | grep value | > awk -F'value:' '{print $2}' > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d] > > [TEST]root at comp-02:~# cat bbb > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d > [TEST]root at comp-02:~# cat bbb | base64 -d > pass2 > [TEST]root at comp-02:~# cat pass2 > ▒▒߻▒▒▒▒▒^<▒N▒▒▒▒~پ5▒▒▒▒▒▒▒z߾▒▒▒▒~▒▒▒▒▒n▒▒▒▒▒]▒[TEST]root at comp-02:~# > [TEST]root at comp-02:~# virsh domblklist instance-00000da8 > Target Source > ------------------------------------------------ > vda /dev/dm-17 > vdb /dev/disk/by-id/wwn-0x6e00084100ee7e7e74623bd3000036bc > vdc /dev/dm-16 > vde /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > vdf /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 > [TEST]root at comp-02:~# udisksctl unlock -b > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 --key-file pass2 > Error unlocking /dev/dm-21: > GDBus.Error:org.freedesktop.UDisks2.Error.Failed: Error unlocking > /dev/dm-21: Failed to activate device: Operation not permitted > [TEST]root at comp-02:~# cryptsetup luksOpen > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 my-volume > --master-key-file=pass2 > Volume key does not match the volume. > > > I see that nova/cinder and barbican are doing this stuff somehow so I > strongly believe there is a way to decrypt this manually. Maybe I’m doing > something wrong in my testing-steps. > Thanks in advance for any help here! Unfortunately, I haven’t found any > materials on how to do this. 
Yeah this is thanks to a long standing peice of technical debt that I've wanted to remove for years but I've never had to the change to. The tl;dr is that os-brick and n-cpu both turn the associated symmetric key secret into a passphrase using the following logic, ultimately calling binascii.hexlify: https://github.com/openstack/nova/blob/944443a7b053957f0b17a5edaa1d0ef14ae48f30/nova/virt/libvirt/driver.py#L1463-L1466 https://github.com/openstack/os-brick/blob/ec70b4092f649d933322820e3003269560df7af9/os_brick/encryptors/cryptsetup.py#L101-L103 I'm sure I've written up the steps to manually decrypt a cinder volume using these steps before but I can't seem to find them at the moment. I'll try to find some time to write these up again later in the week. Obviously it goes without saying that c-vol/c-api should be creating a passphrase secret for LUKS encrypted volumes to avoid this madness. Cinder creating and associating symmetric keys with encrypted volumes when used with Barbican https://bugs.launchpad.net/cinder/+bug/1693840 -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From openstack at nemebean.com Wed Feb 10 00:08:29 2021 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 9 Feb 2021 18:08:29 -0600 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: This seemed like a good time to finally revisit https://review.opendev.org/c/openstack/devstack/+/676016 (the OSC as a service patch). Turns out it wasn't as much work to reimplement as I had expected, but hopefully this version addresses the concerns with the old one. In my local env it takes about 3:45 off my devstack run. Not a huge amount by itself, but multiplied by thousands of jobs it could be significant. On 2/4/21 11:28 AM, Dan Smith wrote: > Hi all, > > I have become increasingly concerned with CI performance lately, and > have been raising those concerns with various people. Most specifically, > I'm worried about our turnaround time or "time to get a result", which > has been creeping up lately. Right after the beginning of the year, we > had a really bad week where the turnaround time was well over 24 > hours. That means if you submit a patch on Tuesday afternoon, you might > not get a test result until Thursday. That is, IMHO, a real problem and > massively hurts our ability to quickly merge priority fixes as well as > just general velocity and morale. If people won't review my code until > they see a +1 from Zuul, and that is two days after I submitted it, > that's bad. > > Things have gotten a little better since that week, due in part to > getting past a rush of new year submissions (we think) and also due to > some job trimming in various places (thanks Neutron!). However, things > are still not great. Being in almost the last timezone of the day, the > queue is usually so full when I wake up that it's quite often I don't > get to see a result before I stop working that day. > > I would like to ask that projects review their jobs for places where > they can cut out redundancy, as well as turn their eyes towards > optimizations that can be made. I've been looking at both Nova and > Glance jobs and have found some things I think we can do less of. 
I also > wanted to get an idea of who is "using too much" in the way of > resources, so I've been working on trying to characterize the weight of > the jobs we run for a project, based on the number of worker nodes > required to run all the jobs, as well as the wall clock time of how long > we tie those up. The results are interesting, I think, and may help us > to identify where we see some gains. > > The idea here is to figure out[1] how many "node hours" it takes to run > all the normal jobs on a Nova patch compared to, say, a Neutron one. If > the jobs were totally serialized, this is the number of hours a single > computer (of the size of a CI worker) would take to do all that work. If > the number is 24 hours, that means a single computer could only check > *one* patch in a day, running around the clock. I chose the top five > projects in terms of usage[2] to report here, as they represent 70% of > the total amount of resources consumed. The next five only add up to > 13%, so the "top five" seems like a good target group. Here are the > results, in order of total consumption: > > Project % of total Node Hours Nodes > ------------------------------------------ > 1. TripleO 38% 31 hours 20 > 2. Neutron 13% 38 hours 32 > 3. Nova 9% 21 hours 25 > 4. Kolla 5% 12 hours 18 > 5. OSA 5% 22 hours 17 > > What that means is that a single computer (of the size of a CI worker) > couldn't even process the jobs required to run on a single patch for > Neutron or TripleO in a 24-hour period. Now, we have lots of workers in > the gate, of course, but there is also other potential overhead involved > in that parallelism, like waiting for nodes to be available for > dependent jobs. And of course, we'd like to be able to check more than > patch per day. Most projects have smaller gate job sets than check, but > assuming they are equivalent, a Neutron patch from submission to commit > would undergo 76 hours of testing, not including revisions and not > including rechecks. That's an enormous amount of time and resource for a > single patch! > > Now, obviously nobody wants to run fewer tests on patches before they > land, and I'm not really suggesting that we take that approach > necessarily. However, I think there are probably a lot of places that we > can cut down the amount of *work* we do. Some ways to do this are: > > 1. Evaluate whether or not you need to run all of tempest on two > configurations of a devstack on each patch. Maybe having a > stripped-down tempest (like just smoke) to run on unique configs, or > even specific tests. > 2. Revisit your "irrelevant_files" lists to see where you might be able > to avoid running heavy jobs on patches that only touch something > small. > 3. Consider moving some jobs to the experimental queue and run them > on-demand for patches that touch particular subsystems or affect > particular configurations. > 4. Consider some periodic testing for things that maybe don't need to > run on every single patch. > 5. Re-examine tests that take a long time to run to see if something can > be done to make them more efficient. > 6. Consider performance improvements in the actual server projects, > which also benefits the users. > > If you're a project that is not in the top ten then your job > configuration probably doesn't matter that much, since your usage is > dwarfed by the heavy projects. If the heavy projects would consider > making changes to decrease their workload, even small gains have the > ability to multiply into noticeable improvement. 
The higher you are on > the above list, the more impact a small change will have on the overall > picture. > > Also, thanks to Neutron and TripleO, both of which have already > addressed this in some respect, and have other changes on the horizon. > > Thanks for listening! > > --Dan > > 1: https://gist.github.com/kk7ds/5edbfacb2a341bb18df8f8f32d01b37c > 2; http://paste.openstack.org/show/C4pwUpdgwUDrpW6V6vnC/ > From dms at danplanet.com Wed Feb 10 00:59:18 2021 From: dms at danplanet.com (Dan Smith) Date: Tue, 09 Feb 2021 16:59:18 -0800 Subject: [all] Gate resources and performance In-Reply-To: (Ben Nemec's message of "Tue, 9 Feb 2021 18:08:29 -0600") References: Message-ID: > This seemed like a good time to finally revisit > https://review.opendev.org/c/openstack/devstack/+/676016 (the OSC as a > service patch). Turns out it wasn't as much work to reimplement as I > had expected, but hopefully this version addresses the concerns with > the old one. > > In my local env it takes about 3:45 off my devstack run. Not a huge > amount by itself, but multiplied by thousands of jobs it could be > significant. I messed with doing this myself, I wish I had seen yours first. I never really got it to be stable enough to consider it usable because of how many places in devstack we use the return code of an osc command. I could get it to trivially work, but re-stacks and other behaviors weren't quite right. Looks like maybe your version does that properly? Anyway, I moved on to a full parallelization of devstack, which largely lets me run all the non-dependent osc commands in parallel, in addition to all kinds of other stuff (like db syncs and various project setup). So far, that effort is giving me about a 60% performance improvement over baseline, and I can do a minimal stack on my local machine in about five minutes: https://review.opendev.org/c/openstack/devstack/+/771505/ I think we've largely got agreement to get that merged at this point, which as you say, will definitely make some significant improvements purely because of how many times we do that in a day. If your OaaS can support parallel requests, I'd definitely be interested in pursuing that on top, although I think I've largely squeezed out the startup delay we see when we run like eight osc instances in parallel during keystone setup :) --Dan From marios at redhat.com Wed Feb 10 06:39:42 2021 From: marios at redhat.com (Marios Andreou) Date: Wed, 10 Feb 2021 08:39:42 +0200 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: <20210209171653.bjbhfvx6wjtmzapw@yuggoth.org> References: <08269201-6C59-4ECE-8B65-16A724E3589B@odyssey4.me> <20210209171653.bjbhfvx6wjtmzapw@yuggoth.org> Message-ID: On Tue, Feb 9, 2021 at 7:19 PM Jeremy Stanley wrote: > On 2021-02-09 19:03:58 +0200 (+0200), Marios Andreou wrote: > [...] > > just to be clear however, note that this means that there will be no > > stable/wallaby version of these upgrades jobs, or at least it will be a > lot > > harder to add them if we aren't running these jobs against master. In the > > 'normal' case, the non-voting master jobs become the voting stable/latest > > once we have created the new latest branch. If we remove master we may > have > > a challenge to add voting stable/wallaby versions when the time comes. > > > > Based on our recent discussions, we don't *want* stable/wallaby versions > > since we want to eventually remove all the upstream upgrade jobs once > > d/stream replacements have been put in place. 
> > > > The point then is that ideally we need to be in a position to do that > (i.e. > > replacement jobs in place) before stable/wallaby is branched, otherwise > we > > may have a tough time adding stable/wallaby versions of the (removed) > > master upgrade jobs. > > If for some reason you aren't able to get that far in Wallaby, > running this once a day in the periodic pipeline in OpenDev's Zuul > would still be *far* less resource-intensive than the current state > of running multiple times for most proposed changes, and so could > serve as a reasonable compromise. > ack thanks and yeah we are definitely keeping our 3rd party periodics on these but good point we could also consider the upstream periodic i almost forgot we have one of those defined too ;) https://opendev.org/openstack/tripleo-ci/src/commit/8b23749692c03f2acfb6721821e09d14bd2b3928/zuul.d/periodic.yaml#L2 regards, marios > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Wed Feb 10 08:20:02 2021 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Wed, 10 Feb 2021 08:20:02 +0000 Subject: [QA] Migrate from testr to stestr In-Reply-To: References: Message-ID: While switch from testr to stestr is a no brainer short term move, I want to mention the maintenance risks. I personally see the stestr test dependency a liability because the project is not actively maintained and mainly depends on a single person. It is not unmaintained either. Due to such risks I preferred to rely on pytest for running tests, as I prefer to depend on an ecosystem that has a *big* pool of maintainers. Do not take my remark as a proposal to switch to pytest, is only about risk assessment. I am fully aware of how easy is to write impure unittests with pytest, but so far I did not regret going this route. I know that OpenStack historically loved to redo everything in house and minimise involvement with other open source python libraries. There are pros and cons on each approach but I personally prefer to bet on projects that are thriving and that are unlikely to need me to fix framework problems myself. Cheers, Sorin On Tue, 9 Feb 2021 at 17:59, Martin Kopec wrote: > Hi everyone, > > testr unit test runner (testrepository package [1]) hasn't been updated > for years, therefore during Shanghai PTG [2] we came up with an initiative > to migrate from testr to stestr (testr's successor) [3] unit test runner. > Here is an etherpad which tracks the effort [4]. However as there is still > quite a number of the projects which haven't migrated, we would like to > kindly ask you for your help. If you are a maintainer of a project which is > mentioned in the etherpad [4] and it's not crossed out yet, please migrate > to stestr. > > [1] https://pypi.org/project/testrepository/ > [2] https://etherpad.opendev.org/p/shanghai-ptg-qa > [3] https://pypi.org/project/stestr/ > [4] https://etherpad.opendev.org/p/enkM4eeDHObSloTjPGAu > > Have a nice day, > > -- > > Martin Kopec > > Software Quality Engineer > > Red Hat EMEA > > > -- -- /sorin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sco1984 at gmail.com Wed Feb 10 10:09:37 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Wed, 10 Feb 2021 15:39:37 +0530 Subject: How to configure interfaces in CentOS 8.3 for OSA installation? 
Message-ID: Hello, I am trying to install OpenStack by referring following guides = 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html In link number 2, there is a step to configure interfaces. In CentOS 8 we don't have a /etc/network/interfaces file. Which interface file should I configure? Or do I need to create all virtual int's manually using nmcli? Pls suggest thanks. Regards, Amey. From sco1984 at gmail.com Wed Feb 10 10:26:32 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Wed, 10 Feb 2021 15:56:32 +0530 Subject: bootstrap errors in OSA installation in CentOS 8.3 Message-ID: Hello, I am trying to configure deployment host by referring following guide =https://docs.openstack.org/project-deploy-guide/openstack-ansible/victoria/deploymenthost.html # dnf install https://repos.fedorapeople.org/repos/openstack/openstack-victoria/rdo-release-victoria.el8.rpm # dnf install git chrony openssh-server python3-devel sudo # dnf group install "Development Tools" # git clone -b 22.0.0.0rc1 https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible # scripts/bootstrap-ansible.sh [ after executing this command, I am getting errors. Attaching sample error] ------------------------------------------------- hs will be added later\\n --pathspec-from-file \\n read pathspec from file\\n --pathspec-file-nul with --pathspec-from-file, pathspec elements are separated with NUL character\\n'\"], [\"Role {'name': 'os_ironic', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_ironic', 'version': '67733c8f0cb13c467eb10f256a93d878c56d4743', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_ironic'} failed after 2 retries\\n\"], [\"Role {'name': 'os_cinder', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_cinder', 'version': '00a38c6584c09168faad135f10d265ad9c86efba', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_cinder'} failed after 2 retries\\n\"], [\"Role {'name': 'lxc_hosts', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-lxc_hosts', 'version': 'b3bff3289ac2e9510be81f562f6d35500ac47723', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/lxc_hosts'} failed after 2 retries\\n\"], [\"Role {'name': 'rsyslog_client', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-rsyslog_client', 'version': 'd616af7883bd6e7a208be3a4f56d129bc2aa92ca', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/rsyslog_client'} failed after 2 retries\\n\"], [\"Role {'name': 'openstack_openrc', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-openstack_openrc', 'version': '242772e99978fe9cd3c50b5b40d5637833f38beb', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/openstack_openrc'} failed after 2 retries\\n\"], [\"Role {'name': 'os_ceilometer', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_ceilometer', 'version': '6df1fb0fc5610c2f317b5085188b0a60346e7111', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_ceilometer'} failed 
after 2 retries\\n\"], [\"Failed to reset /etc/ansible/roles/os_keystone\\nCmd('git') failed due to: exit code(129)\\n cmdline: git reset --force --hard f1625b38a820dd51cffa66141053a836211695dc\\n stderr: 'error: unknown option `force'\\nusage: git reset [--mixed | --soft | --hard | --merge | --keep] [-q] []\\n or: git reset [-q] [] [--] ...\\n or: git reset [-q] [--pathspec-from-file [--pathspec-file-nul]] []\\n or: git reset --patch [] [--] [...]\\n\\n -q, --quiet be quiet, only report errors\\n --mixed reset HEAD and index\\n --soft reset only HEAD\\n --hard reset HEAD, index and working tree\\n --merge reset HEAD, index and working tree\\n --keep reset HEAD but keep local changes\\n --recurse-submodules[=]\\n control recursive updating of submodules\\n -p, --patch select hunks interactively\\n -N, --intent-to-add record only the fact that removed paths will be added later\\n --pathspec-from-file \\n read pathspec from file\\n --pathspec-file-nul with --pathspec-from-file, pathspec elements are separated with NUL character\\n'\"], [\"Role {'name': 'openstack_hosts', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-openstack_hosts', 'version': '8ab8503a15ad5cfe504777647922be301ddac911', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/openstack_hosts'} failed after 2 retries\\n\"], [\"Role {'name': 'os_sahara', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_sahara', 'version': '0f9e76292461a5532f6d43927b71ab9fff692dd5', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_sahara'} failed after 2 retries\\n\"], [\"Role {'name': 'os_tempest', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_tempest', 'version': '49004fb05fa491275848a07020a36d4264ab6d49', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_tempest'} failed after 2 retries\\n\"], [\"Role {'name': 'ceph-ansible', 'scm': 'git', 'src': 'https://github.com/ceph/ceph-ansible', 'version': '7d088320df1c4a6ed458866c61616a21fddccfe8', 'trackbranch': 'stable-5.0', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/ceph-ansible'} failed after 2 retries\\n\"], [\"Role {'name': 'galera_server', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-galera_server', 'version': '0b853b1da7802f6fac8da309c88886186f6a15a6', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/galera_server'} failed after 2 retries\\n\"], [\"Role {'name': 'os_nova', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_nova', 'version': '43b1b62f22b47e3148adcc4cd2396a4e29522e9b', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_nova'} failed after 2 retries\\n\"], [\"Role {'name': 'os_keystone', 'scm': 'git', 'src': 'https://opendev.org/openstack/openstack-ansible-os_keystone', 'version': 'f1625b38a820dd51cffa66141053a836211695dc', 'trackbranch': 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': 10, 'dest': '/etc/ansible/roles/os_keystone'} failed after 2 retries\\n\"]]\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1} PLAY RECAP 
****************************************************************************************************************************************************************************************** localhost : ok=4 changed=0 unreachable=0 failed=1 skipped=8 rescued=0 ignored=0 ++ exit_fail 405 0 ++ set +x Last metadata expiration check: 2:12:54 ago on Wed 10 Feb 2021 01:34:28 PM IST. Package iproute-5.3.0-5.el8.x86_64 is already installed. Dependencies resolved. Nothing to do. Complete! Failed to get DNSSEC supported state: Unit dbus-org.freedesktop.resolve1.service not found. which: no lxc-ls in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) which: no lxc-checkconfig in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) which: no networkctl in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) [WARNING]: Skipping callback plugin '/dev/null', unable to load which: no btrfs in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) which: no btrfs in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) which: no zfs in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/ usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) ++ info_block 'Error Info - 405' 0 ++ echo ---------------------------------------------------------------------- ---------------------------------------------------------------------- ++ print_info 'Error Info - 405' 0 ++ PROC_NAME='- [ Error Info - 405 0 ] -' ++ printf '\n%s%s\n' '- [ Error Info - 405 0 ] -' ------------------------------ -------------- - [ Error Info - 405 0 ] --------------------------------------------- ++ echo ---------------------------------------------------------------------- ---------------------------------------------------------------------- ++ exit_state 1 ++ set +x ---------------------------------------------------------------------- - [ Run Time = 163 seconds || 2 minutes ] ---------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- - [ Status: Failure ] ------------------------------------------------ ---------------------------------------------------------------------- Pls suggest how to fix these issues thanks. Regards, Amey. From moreira.belmiro.email.lists at gmail.com Wed Feb 10 11:37:03 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Wed, 10 Feb 2021 12:37:03 +0100 Subject: [largescale-sig] OpenStack DB Archiver In-Reply-To: <639c7a77f7812ad8897656404cb06cc67cf51609.camel@redhat.com> References: <20200716133127.GA31915@sync> <045a2dea-02f0-26ca-96d6-46d8cdbe2d16@openstack.org> <639c7a77f7812ad8897656404cb06cc67cf51609.camel@redhat.com> Message-ID: Hi, at CERN, Nova shadow tables are used to keep the deleted entries for 30 days. We have a cronjob, that runs every day, to move the deleted rows into the shadow tables and purge them using nova-manage (deleted rows are kept in the shadow tables for 30 days). For other projects that keep deleted rows but don't have shadow tables, for example Glance, we purge the tables everyday with glance-manage (again, keeping the deleted rows for 30 days). 
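For anyone wanting to reproduce that kind of retention setup, a sketch of the daily commands such a cron job would run is below; the 30-day window follows the description above, and the exact flags should be checked against nova-manage / glance-manage --help for the release in use:

    # move soft-deleted rows into the nova shadow tables
    nova-manage db archive_deleted_rows --until-complete
    # drop shadow-table entries older than 30 days (the date is an example)
    nova-manage db purge --before "2021-01-11"
    # glance has no shadow tables, so purge soft-deleted rows directly
    glance-manage db purge --age_in_days 30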
Belmiro On Fri, Jan 29, 2021 at 3:28 PM Sean Mooney wrote: > On Fri, 2021-01-29 at 13:47 +0100, Thierry Carrez wrote: > > Arnaud Morin wrote: > > > [...] > > > We were wondering if some other users would be interested in using the > > > tool, and maybe move it under the opendev governance? > > > > Resurrecting this thread, as OSops has now been revived under the > > auspices of the OpenStack Operation Docs and Tooling SIG. > > > > There are basically 3 potential ways forward for OSarchiver: > > > > 1- Keep it as-is on GitHub, and reference it where we can in OpenStack > docs > > > > 2- Relicense it under Apache-2 and move it in a subdirectory under > > openstack/osops > > > > 3- Move it under its own repository under opendev and propose it as a > > new official OpenStack project (relicensing under Apache-2 will be > > necessary if accepted) > > > > Options (1) and (3) have the benefit of keeping it under its own > > repository. Options (2) and (3) have the benefit of counting towards an > > official OpenStack contribution. Options (1) and (2) have the benefit of > > not requiring TC approval. > > > > All other things being equal, if the end goal is to increase > > discoverability, option 3 is probably the best. > > not to detract form the converation on where to host it, but now that i > have discoverd this > via this thread i have one quetion. OSarchiver appears to be bypassing the > shadow tabels > whcih the project maintian to allow you to archive rows in the the project > db in a different table. > > instad OSarchiver chooese to archive it in an external DB or file > > we have talked about wheter or not we can remove shaddow tables in nova > enteirly a few times in the past > but we did not want to break operators that actully use them but it > appares OVH at least has developed there > own alrenitive presumably becasue the proejcts own archive and purge > functionality was not meetingyour need. > > would the option to disable shadow tabels or define a retention policy for > delete rows be useful to operators > or is this even a capablity that project coudl declare out of scope and > delegate that to a new openstack porject > e.g. opeiton 3 above to do instead? > > im not sure how suportable OSarchiver would be in our downstream product > right now but with testing it might be > somethign we could look at includign in the futrue. we currently rely on > chron jobs to invoke nova-magne ecta > to achive similar functionality to OSarchiver but if that chron job breaks > its hard to detect and the delete rows > can build up causing really slow db queries. as a seperate service with > loggin i assuem this is simplere to monitor > and alarm on if it fails since it provices one central point to manage the > archival and deletion of rows so i kind > of like this approch even if its direct db access right now would make it > unsupportable in our product without veting > the code and productising the repo via ooo integration. > > > > > > > Regards, > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Wed Feb 10 12:00:22 2021 From: stephenfin at redhat.com (Stephen Finucane) Date: Wed, 10 Feb 2021 12:00:22 +0000 Subject: [QA] Migrate from testr to stestr In-Reply-To: References: Message-ID: <48a66936450ccb8e4ddc5b27021288b656a763e0.camel@redhat.com> On Wed, 2021-02-10 at 08:20 +0000, Sorin Sbarnea wrote: > While switch from testr to stestr is a no brainer short term move, I want to > mention the maintenance risks. 
> > I personally see the stestr test dependency a liability because the project is > not actively maintained and mainly depends on a single person. It is not > unmaintained either. > > Due to such risks I preferred to rely on pytest for running tests, as I prefer > to depend on an ecosystem that has a *big* pool of maintainers.     > > Do not take my remark as a proposal to switch to pytest, is only about risk > assessment. I am fully aware of how easy is to write impure unittests with > pytest, but so far I did not regret going this route. At the risk of starting a tool comparison thread (which isn't my intention), it's worth noting that much of the reluctance to embrace pytest in the past has been due to its coupling of both a test runner and a test framework/library in the same tool. If I recall correctly, this pattern has existed before, in the form of 'nose', and the fate of that project has cast a long, dark shadow over similar projects. stestr may be "lightly staffed", but our consistent use of the unittest framework from stdlib across virtually all projects means we can easily switch it out for any other test runner than supports this protocol, including pytest, if we feel the need to in the future. Were pytest to be embraced, there is a significant concern that its use won't be restricted to merely the test runner aspect, which leaves us tightly coupled to that tool. You've touched on this above, but I feel it worth stating again for anyone reading this. Cheers, Stephen > I know that OpenStack historically loved to redo everything in house and > minimise involvement with other open source python libraries. There are pros > and cons on each approach but I personally prefer to bet on projects that are > thriving and that are unlikely to need me to fix framework problems myself. > > Cheers, > Sorin > > On Tue, 9 Feb 2021 at 17:59, Martin Kopec wrote: > > Hi everyone, > > > > testr unit test runner (testrepository package [1]) hasn't been updated for > > years, therefore during Shanghai PTG [2] we came up with an initiative to > > migrate from testr to stestr (testr's successor) [3] unit test runner. > > Here is an etherpad which tracks the effort [4]. However as there is still > > quite a number of the projects which haven't migrated, we would like to > > kindly ask you for your help. If you are a maintainer of a project which is > > mentioned in the etherpad [4] and it's not crossed out yet, please migrate > > to stestr. > > > > [1] https://pypi.org/project/testrepository/ > > [2] https://etherpad.opendev.org/p/shanghai-ptg-qa > > [3] https://pypi.org/project/stestr/ > > [4] https://etherpad.opendev.org/p/enkM4eeDHObSloTjPGAu > > > > Have a nice day, > > > > -- > > Martin Kopec > > Software Quality Engineer > > Red Hat EMEA > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Feb 10 12:29:04 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 Feb 2021 12:29:04 +0000 Subject: [largescale-sig] OpenStack DB Archiver In-Reply-To: References: <20200716133127.GA31915@sync> <045a2dea-02f0-26ca-96d6-46d8cdbe2d16@openstack.org> <639c7a77f7812ad8897656404cb06cc67cf51609.camel@redhat.com> Message-ID: On Wed, 2021-02-10 at 12:37 +0100, Belmiro Moreira wrote: > Hi, > at CERN, Nova shadow tables are used to keep the deleted entries for 30 > days. 
> We have a cronjob, that runs every day, to move the deleted rows into the > shadow tables and purge them using nova-manage (deleted rows are kept in > the shadow tables for 30 days). > > For other projects that keep deleted rows but don't have shadow tables, for > example Glance, we purge the tables everyday with glance-manage (again, > keeping the deleted rows for 30 days). thanks that is useful. that is what i was expecting to be a good defualt setup keep for 30 days but run daily with no row limit. that way you are archiving/deleteing 1 days worth of entries every day but keeping them for 30 days before they are removed. > > Belmiro > > > On Fri, Jan 29, 2021 at 3:28 PM Sean Mooney wrote: > > > On Fri, 2021-01-29 at 13:47 +0100, Thierry Carrez wrote: > > > Arnaud Morin wrote: > > > > [...] > > > > We were wondering if some other users would be interested in using the > > > > tool, and maybe move it under the opendev governance? > > > > > > Resurrecting this thread, as OSops has now been revived under the > > > auspices of the OpenStack Operation Docs and Tooling SIG. > > > > > > There are basically 3 potential ways forward for OSarchiver: > > > > > > 1- Keep it as-is on GitHub, and reference it where we can in OpenStack > > docs > > > > > > 2- Relicense it under Apache-2 and move it in a subdirectory under > > > openstack/osops > > > > > > 3- Move it under its own repository under opendev and propose it as a > > > new official OpenStack project (relicensing under Apache-2 will be > > > necessary if accepted) > > > > > > Options (1) and (3) have the benefit of keeping it under its own > > > repository. Options (2) and (3) have the benefit of counting towards an > > > official OpenStack contribution. Options (1) and (2) have the benefit of > > > not requiring TC approval. > > > > > > All other things being equal, if the end goal is to increase > > > discoverability, option 3 is probably the best. > > > > not to detract form the converation on where to host it, but now that i > > have discoverd this > > via this thread i have one quetion. OSarchiver appears to be bypassing the > > shadow tabels > > whcih the project maintian to allow you to archive rows in the the project > > db in a different table. > > > > instad OSarchiver chooese to archive it in an external DB or file > > > > we have talked about wheter or not we can remove shaddow tables in nova > > enteirly a few times in the past > > but we did not want to break operators that actully use them but it > > appares OVH at least has developed there > > own alrenitive presumably becasue the proejcts own archive and purge > > functionality was not meetingyour need. > > > > would the option to disable shadow tabels or define a retention policy for > > delete rows be useful to operators > > or is this even a capablity that project coudl declare out of scope and > > delegate that to a new openstack porject > > e.g. opeiton 3 above to do instead? > > > > im not sure how suportable OSarchiver would be in our downstream product > > right now but with testing it might be > > somethign we could look at includign in the futrue. we currently rely on > > chron jobs to invoke nova-magne ecta > > to achive similar functionality to OSarchiver but if that chron job breaks > > its hard to detect and the delete rows > > can build up causing really slow db queries. 
as a seperate service with > > loggin i assuem this is simplere to monitor > > and alarm on if it fails since it provices one central point to manage the > > archival and deletion of rows so i kind > > of like this approch even if its direct db access right now would make it > > unsupportable in our product without veting > > the code and productising the repo via ooo integration. > > > > > > > > > > > > Regards, > > > > > > > > > > > From smooney at redhat.com Wed Feb 10 12:58:06 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 Feb 2021 12:58:06 +0000 Subject: [QA] Migrate from testr to stestr In-Reply-To: <48a66936450ccb8e4ddc5b27021288b656a763e0.camel@redhat.com> References: <48a66936450ccb8e4ddc5b27021288b656a763e0.camel@redhat.com> Message-ID: On Wed, 2021-02-10 at 12:00 +0000, Stephen Finucane wrote: > On Wed, 2021-02-10 at 08:20 +0000, Sorin Sbarnea wrote: > > While switch from testr to stestr is a no brainer short term move, I want to > > mention the maintenance risks. > > > > I personally see the stestr test dependency a liability because the project is > > not actively maintained and mainly depends on a single person. It is not > > unmaintained either. > > > > Due to such risks I preferred to rely on pytest for running tests, as I prefer > > to depend on an ecosystem that has a *big* pool of maintainers.     > > > > Do not take my remark as a proposal to switch to pytest, is only about risk > > assessment. I am fully aware of how easy is to write impure unittests with > > pytest, but so far I did not regret going this route. > > At the risk of starting a tool comparison thread (which isn't my intention), > it's worth noting that much of the reluctance to embrace pytest in the past has > been due to its coupling of both a test runner and a test framework/library in > the same tool. If I recall correctly, this pattern has existed before, in the > form of 'nose', and the fate of that project has cast a long, dark shadow over > similar projects. stestr may be "lightly staffed", but our consistent use of the > unittest framework from stdlib across virtually all projects means we can easily > switch it out for any other test runner than supports this protocol, including > pytest, if we feel the need to in the future. Were pytest to be embraced, there > is a significant concern that its use won't be restricted to merely the test > runner aspect, which leaves us tightly coupled to that tool. You've touched on > this above, but I feel it worth stating again for anyone reading this. yep by the way you can use pytest as a runner today locally. i do use it for executing test in the rare case i use an ide like pycharm to run test because i need an interactive debugger that is more capbale the pdb. im kind of surprised that trstr is still used at all since the move to stster was ment to be done 3-4+ releases ago. i actully kind of prefer pytest style of test but we have a lot of famiarliarty with the unittest framework form the standard lib and losing that base of knolage would be harmful to our productivity and ablity to work on different porjects. if its used as a test runner i think that would be ok provide we dont also use it as a test framework. one of the gaps that woudl need to be adress is ensurint we can still generate teh same html summaries which currently are based on the subunit? protofoal i belive whihc pytest does not use so we would need to make suer it works with our existing tooling. i dont think it realistic that pytest is going to die. 
its used way to hevally for that granted its only 55 on the top 100 downloaded package over the last year on pypi but that still put it ahead of lxml(58) and sqlalchemy(71) which we have no problem using, eventlets did not even make the list... https://hugovk.github.io/top-pypi-packages/. its used by too many large projects to realistically die in the way its being implyed it might. the bigger risk form me is mixing tests styles and ending up with test that can only run under pytest. i would avocate for completeing the process of migrating everythign to stestr before considering moving to pytest or anything else. if project do decided to adopt pytest instead i would sugess gating that and makeing sure or other tooling from jobs is updated to work with that or we have an alternitve report generator we can use e.g. https://pypi.org/project/pytest-html/ i do think adopoting pytest however if iit was done should be comunity wide rather then perproject hence why i think we should complete the current goal of moving to stestr first. > > Cheers, > Stephen > > > I know that OpenStack historically loved to redo everything in house and > > minimise involvement with other open source python libraries. There are pros > > and cons on each approach but I personally prefer to bet on projects that are > > thriving and that are unlikely to need me to fix framework problems myself. > > > > Cheers, > > Sorin > > > > On Tue, 9 Feb 2021 at 17:59, Martin Kopec wrote: > > > Hi everyone, > > > > > > testr unit test runner (testrepository package [1]) hasn't been updated for > > > years, therefore during Shanghai PTG [2] we came up with an initiative to > > > migrate from testr to stestr (testr's successor) [3] unit test runner. > > > Here is an etherpad which tracks the effort [4]. However as there is still > > > quite a number of the projects which haven't migrated, we would like to > > > kindly ask you for your help. If you are a maintainer of a project which is > > > mentioned in the etherpad [4] and it's not crossed out yet, please migrate > > > to stestr. > > > > > > [1] https://pypi.org/project/testrepository/ > > > [2] https://etherpad.opendev.org/p/shanghai-ptg-qa > > > [3] https://pypi.org/project/stestr/ > > > [4] https://etherpad.opendev.org/p/enkM4eeDHObSloTjPGAu > > > > > > Have a nice day, > > > > > > -- > > > Martin Kopec > > > Software Quality Engineer > > > Red Hat EMEA > > > > From jonathan.rosser at rd.bbc.co.uk Wed Feb 10 13:00:01 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Wed, 10 Feb 2021 13:00:01 +0000 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: Message-ID: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Hi Amey, The documentation that you linked to is specifically for the ansible deployment host, which can be separate from the OpenStack target hosts if you wish. The only requirement is that it has ssh access to the target hosts, by default this would be via the network attached to br-mgmt on those hosts. In terms of the target hosts networking, you should refer to https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network. OpenStack-Ansible does not prescribe any particular method for setting up the host networking as this tends to be specific to the individual environment or deployer preferences. 
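As one concrete illustration of that point, and of the nmcli question asked above, on a NetworkManager-based CentOS 8 host the br-mgmt bridge from the target hosts table could be created roughly like this. The device name, VLAN ID and address are invented for the example, not values recommended by the docs:

# Hypothetical example only; adjust device names, VLAN and CIDR to your environment.
nmcli connection add type bridge con-name br-mgmt ifname br-mgmt \
    ipv4.method manual ipv4.addresses 172.29.236.11/22
nmcli connection add type vlan con-name eth0.10 ifname eth0.10 dev eth0 id 10 \
    master br-mgmt slave-type bridge
nmcli connection up br-mgmt

The same layout can equally be expressed with systemd-networkd or legacy ifcfg files, which is exactly why the deploy guide leaves the choice open.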
You should pick most appropriate network config tool for your OS/environment and create the bridges and networks listed in the table on the target hosts documentation. I would always recommend starting with an All-In-One deployment https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html. This will generate a complete test environment with all the host networking set up correctly and will serve as a reference example for a production deployment. Do join the #openstack-ansible IRC channel if you would like to discuss any of the options further. Regards, Jonathan. On 10/02/2021 10:09, Amey Abhyankar wrote: > Hello, > > I am trying to install OpenStack by referring following guides = > > 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > In link number 2, there is a step to configure interfaces. > In CentOS 8 we don't have a /etc/network/interfaces file. > Which interface file should I configure? > Or do I need to create all virtual int's manually using nmcli? > > Pls suggest thanks. > > Regards, > Amey. > > From jonathan.rosser at rd.bbc.co.uk Wed Feb 10 13:03:37 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Wed, 10 Feb 2021 13:03:37 +0000 Subject: bootstrap errors in OSA installation in CentOS 8.3 In-Reply-To: References: Message-ID: <3a833f13-0816-ca3e-e67f-f4fd1ec04cd3@rd.bbc.co.uk> Hi Amey, Looks like you may have tried to check out a tag that does not exist. The most recent tag on the stable/victoria branch today is 22.0.1 Regards, Jonathan. On 10/02/2021 10:26, Amey Abhyankar wrote: > Hello, > > I am trying to configure deployment host by referring following guide > =https://docs.openstack.org/project-deploy-guide/openstack-ansible/victoria/deploymenthost.html > > # dnf install https://repos.fedorapeople.org/repos/openstack/openstack-victoria/rdo-release-victoria.el8.rpm > # dnf install git chrony openssh-server python3-devel sudo > # dnf group install "Development Tools" > # git clone -b 22.0.0.0rc1 > https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible > # scripts/bootstrap-ansible.sh [ after executing this command, I am > getting errors. 
Attaching sample error] > > ------------------------------------------------- > hs will be added later\\n --pathspec-from-file \\n > read pathspec from file\\n --pathspec-file-nul with > --pathspec-from-file, pathspec elements are separated with NUL > character\\n'\"], [\"Role {'name': 'os_ironic', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-os_ironic', > 'version': '67733c8f0cb13c467eb10f256a93d878c56d4743', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_ironic'} failed after 2 > retries\\n\"], [\"Role {'name': 'os_cinder', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-os_cinder', > 'version': '00a38c6584c09168faad135f10d265ad9c86efba', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_cinder'} failed after 2 > retries\\n\"], [\"Role {'name': 'lxc_hosts', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-lxc_hosts', > 'version': 'b3bff3289ac2e9510be81f562f6d35500ac47723', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/lxc_hosts'} failed after 2 > retries\\n\"], [\"Role {'name': 'rsyslog_client', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-rsyslog_client', > 'version': 'd616af7883bd6e7a208be3a4f56d129bc2aa92ca', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/rsyslog_client'} failed after > 2 retries\\n\"], [\"Role {'name': 'openstack_openrc', 'scm': 'git', > 'src': 'https://opendev.org/openstack/openstack-ansible-openstack_openrc', > 'version': '242772e99978fe9cd3c50b5b40d5637833f38beb', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/openstack_openrc'} failed > after 2 retries\\n\"], [\"Role {'name': 'os_ceilometer', 'scm': 'git', > 'src': 'https://opendev.org/openstack/openstack-ansible-os_ceilometer', > 'version': '6df1fb0fc5610c2f317b5085188b0a60346e7111', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_ceilometer'} failed after > 2 retries\\n\"], [\"Failed to reset > /etc/ansible/roles/os_keystone\\nCmd('git') failed due to: exit > code(129)\\n cmdline: git reset --force --hard > f1625b38a820dd51cffa66141053a836211695dc\\n stderr: 'error: unknown > option `force'\\nusage: git reset [--mixed | --soft | --hard | --merge > | --keep] [-q] []\\n or: git reset [-q] [] [--] > ...\\n or: git reset [-q] [--pathspec-from-file > [--pathspec-file-nul]] []\\n or: git reset --patch > [] [--] [...]\\n\\n -q, --quiet be > quiet, only report errors\\n --mixed reset HEAD and > index\\n --soft reset only HEAD\\n --hard > reset HEAD, index and working tree\\n --merge > reset HEAD, index and working tree\\n --keep reset > HEAD but keep local changes\\n --recurse-submodules[=]\\n > control recursive updating of submodules\\n > -p, --patch select hunks interactively\\n -N, > --intent-to-add record only the fact that removed paths will be > added later\\n --pathspec-from-file \\n > read pathspec from file\\n --pathspec-file-nul with > --pathspec-from-file, pathspec elements are separated with NUL > character\\n'\"], [\"Role {'name': 'openstack_hosts', 'scm': 'git', > 'src': 
'https://opendev.org/openstack/openstack-ansible-openstack_hosts', > 'version': '8ab8503a15ad5cfe504777647922be301ddac911', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/openstack_hosts'} failed > after 2 retries\\n\"], [\"Role {'name': 'os_sahara', 'scm': 'git', > 'src': 'https://opendev.org/openstack/openstack-ansible-os_sahara', > 'version': '0f9e76292461a5532f6d43927b71ab9fff692dd5', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_sahara'} failed after 2 > retries\\n\"], [\"Role {'name': 'os_tempest', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-os_tempest', > 'version': '49004fb05fa491275848a07020a36d4264ab6d49', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_tempest'} failed after 2 > retries\\n\"], [\"Role {'name': 'ceph-ansible', 'scm': 'git', 'src': > 'https://github.com/ceph/ceph-ansible', 'version': > '7d088320df1c4a6ed458866c61616a21fddccfe8', 'trackbranch': > 'stable-5.0', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': > 10, 'dest': '/etc/ansible/roles/ceph-ansible'} failed after 2 > retries\\n\"], [\"Role {'name': 'galera_server', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-galera_server', > 'version': '0b853b1da7802f6fac8da309c88886186f6a15a6', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/galera_server'} failed after > 2 retries\\n\"], [\"Role {'name': 'os_nova', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-os_nova', 'version': > '43b1b62f22b47e3148adcc4cd2396a4e29522e9b', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_nova'} failed after 2 > retries\\n\"], [\"Role {'name': 'os_keystone', 'scm': 'git', 'src': > 'https://opendev.org/openstack/openstack-ansible-os_keystone', > 'version': 'f1625b38a820dd51cffa66141053a836211695dc', 'trackbranch': > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > 'depth': 10, 'dest': '/etc/ansible/roles/os_keystone'} failed after 2 > retries\\n\"]]\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee > stdout/stderr for the exact error", "rc": 1} > > PLAY RECAP ****************************************************************************************************************************************************************************************** > localhost : ok=4 changed=0 unreachable=0 > failed=1 skipped=8 rescued=0 ignored=0 > > ++ exit_fail 405 0 > ++ set +x > Last metadata expiration check: 2:12:54 ago on Wed 10 Feb 2021 01:34:28 PM IST. > Package iproute-5.3.0-5.el8.x86_64 is already installed. > Dependencies resolved. > Nothing to do. > Complete! > Failed to get DNSSEC supported state: Unit > dbus-org.freedesktop.resolve1.service not found. 
> which: no lxc-ls in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > which: no lxc-checkconfig in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > which: no networkctl in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > [WARNING]: Skipping callback plugin '/dev/null', unable to load > which: no btrfs in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > > > :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > which: no btrfs in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > > > :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > which: no zfs in > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/ > > > usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > ++ info_block 'Error Info - 405' 0 > ++ echo ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ++ print_info 'Error Info - 405' 0 > ++ PROC_NAME='- [ Error Info - 405 0 ] -' > ++ printf '\n%s%s\n' '- [ Error Info - 405 0 ] -' > ------------------------------ > > -------------- > > - [ Error Info - 405 0 ] --------------------------------------------- > ++ echo ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ++ exit_state 1 > ++ set +x > ---------------------------------------------------------------------- > > - [ Run Time = 163 seconds || 2 minutes ] ---------------------------- > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > > - [ Status: Failure ] ------------------------------------------------ > ---------------------------------------------------------------------- > > Pls suggest how to fix these issues thanks. > > Regards, > Amey. > > From thierry at openstack.org Wed Feb 10 13:06:57 2021 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 10 Feb 2021 14:06:57 +0100 Subject: =?UTF-8?B?UmU6IFtyZWxlYXNlXSBQcm9wb3NpbmcgRWzFkWQgSWxsw6lzIChlbG9k?= =?UTF-8?Q?=29_for_Release_Management_Core?= In-Reply-To: References: Message-ID: <530f689e-f56f-5720-29b5-677ad2f73815@openstack.org> Herve Beraud wrote: > Előd has been working on Release management for quite some time now and > in that time  has > shown tremendous growth in his understanding of our processes and on how > deliverables work on Openstack. I think he would make a good addition to > the core team. > > Existing team members, please respond with +1/-1. > If there are no objections we'll add him to the ACL soon. :-) +1 yes please ! -- Thierry Carrez (ttx) From ltoscano at redhat.com Wed Feb 10 13:14:36 2021 From: ltoscano at redhat.com (Luigi Toscano) Date: Wed, 10 Feb 2021 14:14:36 +0100 Subject: [QA] Migrate from testr to stestr In-Reply-To: References: Message-ID: <16279669.geO5KgaWL5@whitebase.usersys.redhat.com> On Tuesday, 9 February 2021 18:52:31 CET Martin Kopec wrote: > Hi everyone, > > testr unit test runner (testrepository package [1]) hasn't been updated for > years, therefore during Shanghai PTG [2] we came up with an initiative to > migrate from testr to stestr (testr's successor) [3] unit test runner. > Here is an etherpad which tracks the effort [4]. 
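For a project that still carries the old configuration, the migration is usually small; a rough sketch follows, with paths and tox stanzas as placeholders rather than anything taken from a specific repository:

# Replace .testr.conf with a minimal .stestr.conf:
cat > .stestr.conf <<'EOF'
[DEFAULT]
test_path=./myproject/tests
top_dir=./
EOF
# Point tox at the new runner (in tox.ini):
#   deps = stestr
#   commands = stestr run {posargs}
stestr run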
However as there is still > quite a number of the projects which haven't migrated, we would like to > kindly ask you for your help. If you are a maintainer of a project which is > mentioned in the etherpad [4] and it's not crossed out yet, please migrate > to stestr. I've sent a fix for cinder-tempest-plugin, but I'd like to note that several packages (probably all tempest plugin) don't really use stestr, because they don't have unit/functional tests. They do define a base "testenv" tox environment which the others environment inherit from (pep8, etc), so maybe the dependency could be just removed (maybe we need some general guidance/ common patterns). Ciao -- Luigi From sco1984 at gmail.com Wed Feb 10 13:45:51 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Wed, 10 Feb 2021 19:15:51 +0530 Subject: bootstrap errors in OSA installation in CentOS 8.3 In-Reply-To: <3a833f13-0816-ca3e-e67f-f4fd1ec04cd3@rd.bbc.co.uk> References: <3a833f13-0816-ca3e-e67f-f4fd1ec04cd3@rd.bbc.co.uk> Message-ID: Hello Jonathan, I can see the particular TAG version at following location thanks = https://opendev.org/openstack/openstack-ansible/src/tag/22.0.0.0rc1 Regards, Amey. On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser wrote: > > Hi Amey, > > Looks like you may have tried to check out a tag that does not exist. > > The most recent tag on the stable/victoria branch today is 22.0.1 > > Regards, > Jonathan. > > On 10/02/2021 10:26, Amey Abhyankar wrote: > > Hello, > > > > I am trying to configure deployment host by referring following guide > > =https://docs.openstack.org/project-deploy-guide/openstack-ansible/victoria/deploymenthost.html > > > > # dnf install https://repos.fedorapeople.org/repos/openstack/openstack-victoria/rdo-release-victoria.el8.rpm > > # dnf install git chrony openssh-server python3-devel sudo > > # dnf group install "Development Tools" > > # git clone -b 22.0.0.0rc1 > > https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible > > # scripts/bootstrap-ansible.sh [ after executing this command, I am > > getting errors. 
Attaching sample error] > > > > ------------------------------------------------- > > hs will be added later\\n --pathspec-from-file \\n > > read pathspec from file\\n --pathspec-file-nul with > > --pathspec-from-file, pathspec elements are separated with NUL > > character\\n'\"], [\"Role {'name': 'os_ironic', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-os_ironic', > > 'version': '67733c8f0cb13c467eb10f256a93d878c56d4743', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_ironic'} failed after 2 > > retries\\n\"], [\"Role {'name': 'os_cinder', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-os_cinder', > > 'version': '00a38c6584c09168faad135f10d265ad9c86efba', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_cinder'} failed after 2 > > retries\\n\"], [\"Role {'name': 'lxc_hosts', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-lxc_hosts', > > 'version': 'b3bff3289ac2e9510be81f562f6d35500ac47723', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/lxc_hosts'} failed after 2 > > retries\\n\"], [\"Role {'name': 'rsyslog_client', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-rsyslog_client', > > 'version': 'd616af7883bd6e7a208be3a4f56d129bc2aa92ca', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/rsyslog_client'} failed after > > 2 retries\\n\"], [\"Role {'name': 'openstack_openrc', 'scm': 'git', > > 'src': 'https://opendev.org/openstack/openstack-ansible-openstack_openrc', > > 'version': '242772e99978fe9cd3c50b5b40d5637833f38beb', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/openstack_openrc'} failed > > after 2 retries\\n\"], [\"Role {'name': 'os_ceilometer', 'scm': 'git', > > 'src': 'https://opendev.org/openstack/openstack-ansible-os_ceilometer', > > 'version': '6df1fb0fc5610c2f317b5085188b0a60346e7111', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_ceilometer'} failed after > > 2 retries\\n\"], [\"Failed to reset > > /etc/ansible/roles/os_keystone\\nCmd('git') failed due to: exit > > code(129)\\n cmdline: git reset --force --hard > > f1625b38a820dd51cffa66141053a836211695dc\\n stderr: 'error: unknown > > option `force'\\nusage: git reset [--mixed | --soft | --hard | --merge > > | --keep] [-q] []\\n or: git reset [-q] [] [--] > > ...\\n or: git reset [-q] [--pathspec-from-file > > [--pathspec-file-nul]] []\\n or: git reset --patch > > [] [--] [...]\\n\\n -q, --quiet be > > quiet, only report errors\\n --mixed reset HEAD and > > index\\n --soft reset only HEAD\\n --hard > > reset HEAD, index and working tree\\n --merge > > reset HEAD, index and working tree\\n --keep reset > > HEAD but keep local changes\\n --recurse-submodules[=]\\n > > control recursive updating of submodules\\n > > -p, --patch select hunks interactively\\n -N, > > --intent-to-add record only the fact that removed paths will be > > added later\\n --pathspec-from-file \\n > > read pathspec from file\\n --pathspec-file-nul with > > --pathspec-from-file, pathspec elements are separated with NUL > > character\\n'\"], [\"Role 
{'name': 'openstack_hosts', 'scm': 'git', > > 'src': 'https://opendev.org/openstack/openstack-ansible-openstack_hosts', > > 'version': '8ab8503a15ad5cfe504777647922be301ddac911', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/openstack_hosts'} failed > > after 2 retries\\n\"], [\"Role {'name': 'os_sahara', 'scm': 'git', > > 'src': 'https://opendev.org/openstack/openstack-ansible-os_sahara', > > 'version': '0f9e76292461a5532f6d43927b71ab9fff692dd5', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_sahara'} failed after 2 > > retries\\n\"], [\"Role {'name': 'os_tempest', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-os_tempest', > > 'version': '49004fb05fa491275848a07020a36d4264ab6d49', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_tempest'} failed after 2 > > retries\\n\"], [\"Role {'name': 'ceph-ansible', 'scm': 'git', 'src': > > 'https://github.com/ceph/ceph-ansible', 'version': > > '7d088320df1c4a6ed458866c61616a21fddccfe8', 'trackbranch': > > 'stable-5.0', 'path': '/etc/ansible/roles', 'refspec': None, 'depth': > > 10, 'dest': '/etc/ansible/roles/ceph-ansible'} failed after 2 > > retries\\n\"], [\"Role {'name': 'galera_server', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-galera_server', > > 'version': '0b853b1da7802f6fac8da309c88886186f6a15a6', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/galera_server'} failed after > > 2 retries\\n\"], [\"Role {'name': 'os_nova', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-os_nova', 'version': > > '43b1b62f22b47e3148adcc4cd2396a4e29522e9b', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_nova'} failed after 2 > > retries\\n\"], [\"Role {'name': 'os_keystone', 'scm': 'git', 'src': > > 'https://opendev.org/openstack/openstack-ansible-os_keystone', > > 'version': 'f1625b38a820dd51cffa66141053a836211695dc', 'trackbranch': > > 'stable/victoria', 'path': '/etc/ansible/roles', 'refspec': None, > > 'depth': 10, 'dest': '/etc/ansible/roles/os_keystone'} failed after 2 > > retries\\n\"]]\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee > > stdout/stderr for the exact error", "rc": 1} > > > > PLAY RECAP ****************************************************************************************************************************************************************************************** > > localhost : ok=4 changed=0 unreachable=0 > > failed=1 skipped=8 rescued=0 ignored=0 > > > > ++ exit_fail 405 0 > > ++ set +x > > Last metadata expiration check: 2:12:54 ago on Wed 10 Feb 2021 01:34:28 PM IST. > > Package iproute-5.3.0-5.el8.x86_64 is already installed. > > Dependencies resolved. > > Nothing to do. > > Complete! > > Failed to get DNSSEC supported state: Unit > > dbus-org.freedesktop.resolve1.service not found. 
> > which: no lxc-ls in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > which: no lxc-checkconfig in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > which: no networkctl in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > [WARNING]: Skipping callback plugin '/dev/null', unable to load > > which: no btrfs in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > > > > > > :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > which: no btrfs in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin > > > > > > :/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > which: no zfs in > > (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/ > > > > > > usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) > > ++ info_block 'Error Info - 405' 0 > > ++ echo ---------------------------------------------------------------------- > > ---------------------------------------------------------------------- > > ++ print_info 'Error Info - 405' 0 > > ++ PROC_NAME='- [ Error Info - 405 0 ] -' > > ++ printf '\n%s%s\n' '- [ Error Info - 405 0 ] -' > > ------------------------------ > > > > -------------- > > > > - [ Error Info - 405 0 ] --------------------------------------------- > > ++ echo ---------------------------------------------------------------------- > > ---------------------------------------------------------------------- > > ++ exit_state 1 > > ++ set +x > > ---------------------------------------------------------------------- > > > > - [ Run Time = 163 seconds || 2 minutes ] ---------------------------- > > ---------------------------------------------------------------------- > > ---------------------------------------------------------------------- > > > > - [ Status: Failure ] ------------------------------------------------ > > ---------------------------------------------------------------------- > > > > Pls suggest how to fix these issues thanks. > > > > Regards, > > Amey. > > > > > From sco1984 at gmail.com Wed Feb 10 14:01:46 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Wed, 10 Feb 2021 19:31:46 +0530 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: Hello Jonathan, On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser wrote: > > Hi Amey, > > The documentation that you linked to is specifically for the ansible > deployment host, which can be separate from the OpenStack target hosts > if you wish. The only requirement is that it has ssh access to the > target hosts, by default this would be via the network attached to > br-mgmt on those hosts. > > In terms of the target hosts networking, you should refer to > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network. Yes I checked this reference. > > OpenStack-Ansible does not prescribe any particular method for > setting up the host networking as this tends to be specific > to the individual environment or deployer preferences. I am looking for an example where to put the interface configurations in CentOS 8. But, I am unable to find any references specifically for CentOS/RHEL. 
Hence I post the question here to get help from some1 who is already running OpenStack on CentOS 8. Regards, Amey. > > You should pick most appropriate network config tool for your > OS/environment and create the bridges and networks listed in the > table on the target hosts documentation. > > I would always recommend starting with an All-In-One deployment > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html. > This will generate a complete test environment > with all the host networking set up correctly and will serve as > a reference example for a production deployment. > > Do join the #openstack-ansible IRC channel if you would like > to discuss any of the options further. > > Regards, > Jonathan. > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > Hello, > > > > I am trying to install OpenStack by referring following guides = > > > > 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > In link number 2, there is a step to configure interfaces. > > In CentOS 8 we don't have a /etc/network/interfaces file. > > Which interface file should I configure? > > Or do I need to create all virtual int's manually using nmcli? > > > > Pls suggest thanks. > > > > Regards, > > Amey. > > > > > From jonathan.rosser at rd.bbc.co.uk Wed Feb 10 14:20:39 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Wed, 10 Feb 2021 14:20:39 +0000 Subject: bootstrap errors in OSA installation in CentOS 8.3 In-Reply-To: References: <3a833f13-0816-ca3e-e67f-f4fd1ec04cd3@rd.bbc.co.uk> Message-ID: <318f40f1-5b47-3720-1ba5-2ca1d1af3db5@rd.bbc.co.uk> Hi Amey, 2.0.0.0rc1 is a release candidate tag (rc) from prior to the victoria release being finalised openstack-ansible. You should use the latest tag on the stable/victoria branch which currently is https://opendev.org/openstack/openstack-ansible/src/tag/22.0.1 On 10/02/2021 13:45, Amey Abhyankar wrote: > Hello Jonathan, > > I can see the particular TAG version at following location thanks = > https://opendev.org/openstack/openstack-ansible/src/tag/22.0.0.0rc1 > > Regards, > Amey. > > From satish.txt at gmail.com Wed Feb 10 14:41:51 2021 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 10 Feb 2021 09:41:51 -0500 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: Hi Amey, I am running my cloud using openstack-ansible on centos 7.5 (on Production) and 8.x (on Lab because centos 8 soon end of life, my future plan is to migrate everything to ubuntu ). Answer to your question is you have two way to configure networking using OSA 1. Use systemd-networkd here is the example - http://paste.openstack.org/show/802517/ 2. Use package network-scripts (This is legacy way to configure network on centOS /etc/sysconfig/network-scripts/* style) - You can see example on my blog how to deploy OSA on centos - https://satishdotpatel.github.io//build-openstack-cloud-using-openstack-ansible/ On Wed, Feb 10, 2021 at 9:14 AM Amey Abhyankar wrote: > > Hello Jonathan, > > On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser > wrote: > > > > Hi Amey, > > > > The documentation that you linked to is specifically for the ansible > > deployment host, which can be separate from the OpenStack target hosts > > if you wish. 
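To flesh out option 2 above, a hypothetical pair of ifcfg files for the management bridge might look like the following; device names, VLAN and addressing are invented for illustration and are not taken from the linked paste or blog post:

# Hypothetical files; real values depend on your NICs and subnets.
cat > /etc/sysconfig/network-scripts/ifcfg-eth0.10 <<'EOF'
DEVICE=eth0.10
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
BRIDGE=br-mgmt
EOF
cat > /etc/sysconfig/network-scripts/ifcfg-br-mgmt <<'EOF'
DEVICE=br-mgmt
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.29.236.11
PREFIX=22
EOF
systemctl restart network   # requires the legacy network-scripts package on CentOS 8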
The only requirement is that it has ssh access to the > > target hosts, by default this would be via the network attached to > > br-mgmt on those hosts. > > > > In terms of the target hosts networking, you should refer to > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network. > Yes I checked this reference. > > > > OpenStack-Ansible does not prescribe any particular method for > > setting up the host networking as this tends to be specific > > to the individual environment or deployer preferences. > I am looking for an example where to put the interface configurations > in CentOS 8. > But, I am unable to find any references specifically for CentOS/RHEL. > Hence I post the question here to get help from some1 who is already > running OpenStack on CentOS 8. > > Regards, > Amey. > > > > You should pick most appropriate network config tool for your > > OS/environment and create the bridges and networks listed in the > > table on the target hosts documentation. > > > > I would always recommend starting with an All-In-One deployment > > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html. > > This will generate a complete test environment > > with all the host networking set up correctly and will serve as > > a reference example for a production deployment. > > > > Do join the #openstack-ansible IRC channel if you would like > > to discuss any of the options further. > > > > Regards, > > Jonathan. > > > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > > Hello, > > > > > > I am trying to install OpenStack by referring following guides = > > > > > > 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > > 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > > > In link number 2, there is a step to configure interfaces. > > > In CentOS 8 we don't have a /etc/network/interfaces file. > > > Which interface file should I configure? > > > Or do I need to create all virtual int's manually using nmcli? > > > > > > Pls suggest thanks. > > > > > > Regards, > > > Amey. > > > > > > > > > From ionut at fleio.com Wed Feb 10 14:53:42 2021 From: ionut at fleio.com (Ionut Biru) Date: Wed, 10 Feb 2021 16:53:42 +0200 Subject: [cinder][ceph] improving storage latency Message-ID: Hi guys, Currently in one locations we have the underlay for ceph running over tcp ip with solarflare networking cards attached to 40G switches. We use OSA to deploy openstack and we follow their network layout e.g br-storage. 3 nodes with mixed storage type, ssd and hdd. We want to deploy a new ceph cluster in a new region but we want to improve our ceph installation overlay to have lower latency. What do you guys use in your infrastructure? -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From roshananvekar at gmail.com Wed Feb 10 15:23:38 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Wed, 10 Feb 2021 20:53:38 +0530 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: I have query regarding the same interface configuration setup. This is for kolla-ansible which has br-ex bridge required on a physical or sub-interface. Do we need to assign IP to the bridge interface both in controller/network nodes and also on compute nodes?? 
Or should both the physical and bridge interface be IP free? Regards, Roshan On Wed, Feb 10, 2021, 8:15 PM Satish Patel wrote: > Hi Amey, > > I am running my cloud using openstack-ansible on centos 7.5 (on > Production) and 8.x (on Lab because centos 8 soon end of life, my > future plan is to migrate everything to ubuntu ). > > Answer to your question is you have two way to configure networking using > OSA > > 1. Use systemd-networkd here is the example - > http://paste.openstack.org/show/802517/ > 2. Use package network-scripts (This is legacy way to configure > network on centOS /etc/sysconfig/network-scripts/* style) - You can > see example on my blog how to deploy OSA on centos - > > https://satishdotpatel.github.io//build-openstack-cloud-using-openstack-ansible/ > > > On Wed, Feb 10, 2021 at 9:14 AM Amey Abhyankar wrote: > > > > Hello Jonathan, > > > > On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser > > wrote: > > > > > > Hi Amey, > > > > > > The documentation that you linked to is specifically for the ansible > > > deployment host, which can be separate from the OpenStack target hosts > > > if you wish. The only requirement is that it has ssh access to the > > > target hosts, by default this would be via the network attached to > > > br-mgmt on those hosts. > > > > > > In terms of the target hosts networking, you should refer to > > > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network > . > > Yes I checked this reference. > > > > > > OpenStack-Ansible does not prescribe any particular method for > > > setting up the host networking as this tends to be specific > > > to the individual environment or deployer preferences. > > I am looking for an example where to put the interface configurations > > in CentOS 8. > > But, I am unable to find any references specifically for CentOS/RHEL. > > Hence I post the question here to get help from some1 who is already > > running OpenStack on CentOS 8. > > > > Regards, > > Amey. > > > > > > You should pick most appropriate network config tool for your > > > OS/environment and create the bridges and networks listed in the > > > table on the target hosts documentation. > > > > > > I would always recommend starting with an All-In-One deployment > > > > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html > . > > > This will generate a complete test environment > > > with all the host networking set up correctly and will serve as > > > a reference example for a production deployment. > > > > > > Do join the #openstack-ansible IRC channel if you would like > > > to discuss any of the options further. > > > > > > Regards, > > > Jonathan. > > > > > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > > > Hello, > > > > > > > > I am trying to install OpenStack by referring following guides = > > > > > > > > 1) > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > > > 2) > https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > > > > > In link number 2, there is a step to configure interfaces. > > > > In CentOS 8 we don't have a /etc/network/interfaces file. > > > > Which interface file should I configure? > > > > Or do I need to create all virtual int's manually using nmcli? > > > > > > > > Pls suggest thanks. > > > > > > > > Regards, > > > > Amey. > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
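On the kolla-ansible side of this question, the documented pattern (stated here as a general sketch rather than a review of any particular deployment) is that the interface handed to Neutron for br-ex stays without an IP address on every node where it is used, while the management/API interface keeps its IP; in globals.yml that typically looks like:

# Hypothetical fragment; interface names are placeholders for your hosts.
cat >> /etc/kolla/globals.yml <<'EOF'
network_interface: "eth0"             # keeps its IP (APIs, tunnels, etc. by default)
neutron_external_interface: "eth1"    # left unnumbered; kolla plugs it into br-ex
EOF

Whether compute nodes also need it depends on whether they run the OVS/L3 agents (for example with DVR), so the answer is per-role rather than global.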
URL: From finarffin at gmail.com Wed Feb 10 10:29:06 2021 From: finarffin at gmail.com (Jan Wasilewski) Date: Wed, 10 Feb 2021 11:29:06 +0100 Subject: [cinder/barbican] LUKS encryption for mounted disk - how to decrypt cinder volume In-Reply-To: <20210209221230.xifstekiw6aucr7l@lyarwood-laptop.usersys.redhat.com> References: <20210209221230.xifstekiw6aucr7l@lyarwood-laptop.usersys.redhat.com> Message-ID: Thank you for a nice description of how everything is organized. It is much easier to understand the full workflow. >> I'll try to find some time to write these up again later in the week. That would be great, I will try to do this by myself, but I'm wondering if it's possible to do "all magic" directly from a payload that is visible from barbican CLI. Anyway, I checked also related conversation at openstack-lists and I'm wondering about this part: http://lists.openstack.org/pipermail/openstack-dev/2017-May/117470.html Is there a way to replace the compromised LUKS passphrase with the current implementation of barbican? I was trying to do this by myself but without luck. I'm also wondering if there is a procedure that can help here(if you know?) Thanks in advance. Best regards, Jan wt., 9 lut 2021 o 23:12 Lee Yarwood napisał(a): > On 09-02-21 12:48:38, Jan Wasilewski wrote: > > Hi All, > > > > I have a question about the possible decryption of LUKS volume. I'm > testing > > currently barbican+cinder, but I'm just wondering if there is a way, to > > somehow decrypt my LUKS volume with payload generated by a barbican. Is > > there any procedure for that? I was doing it by myself, but somehow it > > doesn't work and I got an error: > > > > [TEST]root at barbican-01:/usr/lib/python3/dist-packages# barbican secret > get > > --payload --payload_content_type application/octet-stream > > > http://controller.test:9311/v1/secrets/76631940-9ab6-4b8c-9481-e54c3ffdbbfe > > > +---------+--------------------------------------------------------------------------------------------------------+ > > | Field | Value > > | > > > +---------+--------------------------------------------------------------------------------------------------------+ > > | Payload | b'\xbf!i\x97\xf4\x0c\x12\xa4\xfe4\xf3\x16C\xe8@ > \xdc\x0f\x9d+:\x0c7\xa9\xab[\x8d\xf2\xf1\xae\r\x89\xdc' > > | > > > +---------+--------------------------------------------------------------------------------------------------------+ > > > > cryptsetup luksOpen > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > > my-volume > > Enter passphrase for > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f: * > payload>* > > No key available with this passphrase. 
> > > > I thought that above issue can be related to encoding, so I took payload > > value directly from vault and use it as a key-file, but problem is > exactly > > the same(my encrypted volume is the last volume list by domblklist > option): > > > > vault kv get secret/data/e5baa518207e4f9db4810988d22087ce | grep value | > > awk -F'value:' '{print $2}' > > > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d] > > > > [TEST]root at comp-02:~# cat bbb > > > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d > > [TEST]root at comp-02:~# cat bbb | base64 -d > pass2 > > [TEST]root at comp-02:~# cat pass2 > > ▒▒߻▒▒▒▒▒^<▒N▒▒▒▒~پ5▒▒▒▒▒▒▒z߾▒▒▒▒~▒▒▒▒▒n▒▒▒▒▒]▒[TEST]root at comp-02:~# > > [TEST]root at comp-02:~# virsh domblklist instance-00000da8 > > Target Source > > ------------------------------------------------ > > vda /dev/dm-17 > > vdb /dev/disk/by-id/wwn-0x6e00084100ee7e7e74623bd3000036bc > > vdc /dev/dm-16 > > vde /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > > vdf /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 > > [TEST]root at comp-02:~# udisksctl unlock -b > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 --key-file pass2 > > Error unlocking /dev/dm-21: > > GDBus.Error:org.freedesktop.UDisks2.Error.Failed: Error unlocking > > /dev/dm-21: Failed to activate device: Operation not permitted > > [TEST]root at comp-02:~# cryptsetup luksOpen > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 my-volume > > --master-key-file=pass2 > > Volume key does not match the volume. > > > > > > I see that nova/cinder and barbican are doing this stuff somehow so I > > strongly believe there is a way to decrypt this manually. Maybe I’m doing > > something wrong in my testing-steps. > > Thanks in advance for any help here! Unfortunately, I haven’t found any > > materials on how to do this. > > Yeah this is thanks to a long standing peice of technical debt that I've > wanted to remove for years but I've never had to the change to. > > The tl;dr is that os-brick and n-cpu both turn the associated symmetric key > secret into a passphrase using the following logic, ultimately calling > binascii.hexlify: > > > https://github.com/openstack/nova/blob/944443a7b053957f0b17a5edaa1d0ef14ae48f30/nova/virt/libvirt/driver.py#L1463-L1466 > > > https://github.com/openstack/os-brick/blob/ec70b4092f649d933322820e3003269560df7af9/os_brick/encryptors/cryptsetup.py#L101-L103 > > I'm sure I've written up the steps to manually decrypt a cinder volume > using these steps before but I can't seem to find them at the moment. > I'll try to find some time to write these up again later in the week. > > Obviously it goes without saying that c-vol/c-api should be creating a > passphrase secret for LUKS encrypted volumes to avoid this madness. > > Cinder creating and associating symmetric keys with encrypted volumes when > used with Barbican > https://bugs.launchpad.net/cinder/+bug/1693840 > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 > 2D76 > -------------- next part -------------- An HTML attachment was scrubbed... 
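Putting the hexlify detail quoted above into concrete terms, a minimal sketch, assuming the raw secret bytes have already been retrieved from Barbican/Vault into key.bin; this is not a verified recovery procedure:

# The passphrase nova/os-brick derive is the hex encoding of the raw secret bytes.
python3 -c 'import binascii,sys; sys.stdout.write(binascii.hexlify(open("key.bin","rb").read()).decode())' > pass.hex
# Supply it as a passphrase (key-slot secret), not via --master-key-file;
# note there is no trailing newline, because --key-file content is used verbatim.
cryptsetup luksOpen /dev/disk/by-id/<encrypted-volume> my-volume --key-file pass.hex

This matches the driver.py and cryptsetup.py links above; whether the bytes pulled out of Vault are the raw key or a further base64/hex wrapping of it is exactly the kind of detail that makes the manual path easy to get wrong.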
URL: From arnaud.morin at gmail.com Wed Feb 10 16:31:44 2021 From: arnaud.morin at gmail.com (Arnaud) Date: Wed, 10 Feb 2021 17:31:44 +0100 Subject: [nova] Rescue booting on wrong disk In-Reply-To: <20210209165023.routbszw3hnggrwj@lyarwood-laptop.usersys.redhat.com> References: <20210209112110.GG14971@sync> <20210209165023.routbszw3hnggrwj@lyarwood-laptop.usersys.redhat.com> Message-ID: <80D3768E-CBBD-48B0-992E-35DC8298C17B@gmail.com> Hello, Thanks, we will check these params. Cheers Le 9 février 2021 17:50:23 GMT+01:00, Lee Yarwood a écrit : >On 09-02-21 12:23:58, Sean Mooney wrote: >> On Tue, 2021-02-09 at 11:21 +0000, Arnaud Morin wrote: >> > Hey all, >> > >> > From time to time we are facing an issue when puting instance in >rescue >> > with the same image as the one the instance was booted. >> > >> > E.G. >> > I booted an instance using Debian 10, disk are: >> > >> > debian at testarnaud:~$ lsblk >> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT >> > sr0 11:0 1 486K 0 rom >> > vda 254:0 0 10G 0 disk >> > └─vda1 254:1 0 10G 0 part / >> > debian at testarnaud:~$ cat /etc/fstab >> > # /etc/fstab: static file system information. >> > UUID=5605171d-d590-46d5-85e2-60096b533a18 / ext4 >> > errors=remount-ro 0 1 >> > >> > >> > >> > I rescued the instance: >> > $ openstack server rescue --image >bc73a901-6366-4a69-8ddc-00479b4d647f testarnaud >> > >> > >> > Then, back in the instance: >> > >> > debian at testarnaud:~$ lsblk >> > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT >> > sr0 11:0 1 486K 0 rom >> > vda 254:0 0 2G 0 disk >> > └─vda1 254:1 0 2G 0 part >> > vdb 254:16 0 10G 0 disk >> > └─vdb1 254:17 0 10G 0 part / >> > >> > >> > >> > Instance booted on /dev/vdb1 instead of /dev/vda1 >> > >> > Is there anything we can configure on nova side to avoid this >> > situation? >> >> in ussuri lee yarwood added >> >https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/virt-rescue-stable-disk-devices.html >> to nova which i belive will resolve the issue >> >> from ocata+ i think you can add hw_rescue_bus=usb to get a similar >> effect but from ussuri we cahgne the layout so thtat the rescue disk >> is always used. lee is that right? > >Yeah that's correct, from Ussuri (21.0.0) with the libvirt driver the >instance should continue to show all disks connected as normal with the >rescue disk appended last and always used as the boot device. > >In your case, switching the rescue bus should change the default boot >device type n-cpu tells libvirt to use working around your problem. >Another obvious thing you can try is just using a different image so >the >two MBRs don't conflict? > >Cheers, > >-- >Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 >F672 2D76 -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Feb 10 16:40:14 2021 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 10 Feb 2021 17:40:14 +0100 Subject: [largescale-sig] Next meeting: February 10, 15utc In-Reply-To: <24e58c7c-982c-14d1-e0b8-8fe20f7da50f@openstack.org> References: <24e58c7c-982c-14d1-e0b8-8fe20f7da50f@openstack.org> Message-ID: <0b29f087-ae23-9a60-8d27-279b39e89780@openstack.org> We held our meeting today. We discussed ideas of presentations we could run in video meeting editions of our meeting, as well as how to best collect best practices on Ceph+OpenStack deployments. 
Meeting logs at: http://eavesdrop.openstack.org/meetings/large_scale_sig/2021/large_scale_sig.2021-02-10-15.00.html Action items: * ttx to contact OpenStack TC re: OSarchiver and see which option sounds best * ttx to summarize "how many compute nodes in your typical cluster" thread into the wiki * ttx to create a wiki page for Ceph-related questions and seed it with SG questions Our next meeting will be Wednesday, March 10 at 15utc on Zoom! The theme will be "Regions vs. Cells" and we'll kickstart the discussion using a short presentation from Belmiro Moreira giving the CERN perspective. The link to connect will be posted a few days before the event. -- Thierry Carrez (ttx) From mtreinish at kortar.org Wed Feb 10 17:01:43 2021 From: mtreinish at kortar.org (Matthew Treinish) Date: Wed, 10 Feb 2021 12:01:43 -0500 Subject: [QA] Migrate from testr to stestr In-Reply-To: References: Message-ID: On Wed, Feb 10, 2021 at 08:20:02AM +0000, Sorin Sbarnea wrote: > While switch from testr to stestr is a no brainer short term move, I want > to mention the maintenance risks. > > I personally see the stestr test dependency a liability because the project > is not actively maintained and mainly depends on a single person. It is not > unmaintained either. > > Due to such risks I preferred to rely on pytest for running tests, as I > prefer to depend on an ecosystem that has a *big* pool of maintainers. I have to take exception with this argument. The number of maintainers of a project by itself doesn't indicate anything. It's a poor mask for other concerns like the longevity of support for a project, responsiveness to issues, etc. If you really had issues with projects that only had a single primary maintainer you'd be significantly limiting the pool of software you'd use. A lot of software we use every day has only a single maintainer. So what is your real concern here? I can take some guesses, like issues going unaddressed, but it's hard to actually address the issues you're having with the tool if you just point out the number of maintainers. My suspicion is that this post is actually all a thinly veiled reference to your discontent with the state of the testtools library which is not very actively maintained (despite having **19 people** with write access to the project). But, I should point out that stestr != testtools and there is no hard requirement for a test suite to use testtools to be run with stestr. While testtools is used internally in stestr to handle results streaming (which replacing with a native stestr implementation is on my super long-term plan) the actual unittest compatible framework portion of the library isn't used or required by stestr. The primary feature using testools as your base test class provides is the attachments support for embedding things like stdout and stderr in the result stream and built-in support for fixtures. This can (and I have) be implemented without using testools though, it just requires writing a base test class and result handler that adds the expected (but poorly documented) hook points for passing attachments to python-subunit for serialization (in other words, copying the code that does this from testtools to your local test suite). 
I can say as the "single person" you call out here that I'm committed to the long term support of stestr, it has a large user base outside of just OpenStack (including being used in parts of my current day job) I'm actually constantly surprised when I get contacted by unexpected users that I've never heard about before; it's not just an instance of "not invented here". OpenStack is still by far the largest user of stestr though, so I do prioritize issues that come up in OpenStack. I've also continued to maintain it through several job changes the past 5 years. I'm also not aware of any pressing issues or bugs that are going unaddressed. Especially from OpenStack I haven't seen any issues filed since Stephen dove deep and fixed that nasty short read bug with python3 in python-subunit that we had all been banging our heads on for a long time (which I'm still super thankful that he did the work on that). While I'll admit I haven't had time the past couple years get to some of the feature development I'd like (mainly finishing https://github.com/mtreinish/stestr/pull/271 and adding https://github.com/mtreinish/stestr/issues/224), none of that seems to be a priority for anyone, just nice to have features. That all being said, if your concern is just the bus factor and when I'm no longer around at some future date there's nobody to continue maintenance. I should point out that I'm not a sole maintainer, I'm just the primary maintainer. masayukig is also a maintainer and has all the same access and permissions on the repo and project that I do. We're also open to adding more maintainers, but nobody has ever stepped up and started contributing consistently (or weren't interested if they were contributing in the past). > > Do not take my remark as a proposal to switch to pytest, is only about risk > assessment. I am fully aware of how easy is to write impure unittests with > pytest, but so far I did not regret going this route. > > I know that OpenStack historically loved to redo everything in house and > minimise involvement with other open source python libraries. There are > pros and cons on each approach but I personally prefer to bet on projects > that are thriving and that are unlikely to need me to fix framework > problems myself. I think Stephen and Sean expanded on this well elsewhere in the thread, that using stdlib unittest everywhere has a lot of value, including letting you use pytest if that's your preferred runner. It's also worth pointing out that stestr was originally designed to fit the needs of OpenStack which are pretty unique in all the python projects I've interacted with, because existing actively maintained test runners (so excluding testr) couldn't get the throughput we needed for all the Python testing that goes on daily. None of the other python runners I've used are able to manage the same levels of throughput that stestr does or handle managing parallel execution as well. Especially considering there is another large thread going on right now about how to beter utilize gate resources this seems like a weird time to abandon a tool that tries to maximize our test execution throughput. stestr is also a hard dependency for Tempest and the entire execution model used in tempest is dependent on stestr to handle scheduling and execution of tests. That's unlikely to ever change because it would require basically rewriting the core of Tempest for unclear benefit. -Matt Treinish -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From dtantsur at redhat.com Wed Feb 10 17:21:11 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 10 Feb 2021 18:21:11 +0100 Subject: [ironic] [release] [kolla] RFC: stop doing Extended Maintenance for Bifrost Message-ID: Hi all, Since Bifrost is an installation project that supports several distributions, maintaining its stable branches is a never ending nightmare. We have *nearly* fixed Ussuri (thank you Riccardo!) and just started looking into Train (thank you Iury and Mark), I'm sure we cannot keep EM branches in a decent shape. Personally I feel that Bifrost is more geared towards consumers staying close to master, but my gut feeling may not necessarily match the reality. Based on the above, I'm proposing: 1) EOL all old branches from Ocata to Stein on Bifrost. 2) No longer create EM branches, EOL immediately after a branch leaves the regular maintenance. Opinions? Dmitry P.S. Adding kolla because of a potential impact. -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Wed Feb 10 17:43:03 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 10 Feb 2021 17:43:03 +0000 Subject: [cinder/barbican] LUKS encryption for mounted disk - how to decrypt cinder volume In-Reply-To: References: <20210209221230.xifstekiw6aucr7l@lyarwood-laptop.usersys.redhat.com> Message-ID: <20210210174303.gi2zuaerxsvsk66l@lyarwood-laptop.usersys.redhat.com> On 10-02-21 11:29:06, Jan Wasilewski wrote: > Thank you for a nice description of how everything is organized. It is much > easier to understand the full workflow. > > >> I'll try to find some time to write these up again later in the week. > That would be great, I will try to do this by myself, but I'm wondering if > it's possible to do "all magic" directly from a payload that is visible > from barbican CLI. > > Anyway, I checked also related conversation at openstack-lists and I'm > wondering about this part: > http://lists.openstack.org/pipermail/openstack-dev/2017-May/117470.html > > Is there a way to replace the compromised LUKS passphrase with the current > implementation of barbican? I was trying to do this by myself but without > luck. I'm also wondering if there is a procedure that can help here(if you > know?) AFAIK the only way to do this at present is by retyping [1] the volume to a different encrypted volume type. That should result in a new volume using new secrets etc that for LUKSv1 volumes would mean a new passphrase. [1] https://docs.openstack.org/api-ref/block-storage/v3/index.html?expanded=retype-a-volume-detail#volume-actions-volumes-action > wt., 9 lut 2021 o 23:12 Lee Yarwood napisał(a): > > > On 09-02-21 12:48:38, Jan Wasilewski wrote: > > > Hi All, > > > > > > I have a question about the possible decryption of LUKS volume. I'm > > testing > > > currently barbican+cinder, but I'm just wondering if there is a way, to > > > somehow decrypt my LUKS volume with payload generated by a barbican. Is > > > there any procedure for that? 
I was doing it by myself, but somehow it > > > doesn't work and I got an error: > > > > > > [TEST]root at barbican-01:/usr/lib/python3/dist-packages# barbican secret > > get > > > --payload --payload_content_type application/octet-stream > > > > > http://controller.test:9311/v1/secrets/76631940-9ab6-4b8c-9481-e54c3ffdbbfe > > > > > +---------+--------------------------------------------------------------------------------------------------------+ > > > | Field | Value > > > | > > > > > +---------+--------------------------------------------------------------------------------------------------------+ > > > | Payload | b'\xbf!i\x97\xf4\x0c\x12\xa4\xfe4\xf3\x16C\xe8@ > > \xdc\x0f\x9d+:\x0c7\xa9\xab[\x8d\xf2\xf1\xae\r\x89\xdc' > > > | > > > > > +---------+--------------------------------------------------------------------------------------------------------+ > > > > > > cryptsetup luksOpen > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > > > my-volume > > > Enter passphrase for > > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f: * > > payload>* > > > No key available with this passphrase. > > > > > > I thought that above issue can be related to encoding, so I took payload > > > value directly from vault and use it as a key-file, but problem is > > exactly > > > the same(my encrypted volume is the last volume list by domblklist > > option): > > > > > > vault kv get secret/data/e5baa518207e4f9db4810988d22087ce | grep value | > > > awk -F'value:' '{print $2}' > > > > > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d] > > > > > > [TEST]root at comp-02:~# cat bbb > > > > > 4d4d35676c336567714850663477336d2b415475746b74774c56376b77324b4e73773879724c46704678513d > > > [TEST]root at comp-02:~# cat bbb | base64 -d > pass2 > > > [TEST]root at comp-02:~# cat pass2 > > > ▒▒߻▒▒▒▒▒^<▒N▒▒▒▒~پ5▒▒▒▒▒▒▒z߾▒▒▒▒~▒▒▒▒▒n▒▒▒▒▒]▒[TEST]root at comp-02:~# > > > [TEST]root at comp-02:~# virsh domblklist instance-00000da8 > > > Target Source > > > ------------------------------------------------ > > > vda /dev/dm-17 > > > vdb /dev/disk/by-id/wwn-0x6e00084100ee7e7e74623bd3000036bc > > > vdc /dev/dm-16 > > > vde /dev/disk/by-id/wwn-0x6e00084100ee7e7e7ab0b13c0000386f > > > vdf /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 > > > [TEST]root at comp-02:~# udisksctl unlock -b > > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 --key-file pass2 > > > Error unlocking /dev/dm-21: > > > GDBus.Error:org.freedesktop.UDisks2.Error.Failed: Error unlocking > > > /dev/dm-21: Failed to activate device: Operation not permitted > > > [TEST]root at comp-02:~# cryptsetup luksOpen > > > /dev/disk/by-id/wwn-0x6e00084100ee7e7e7bd45c1b000038b5 my-volume > > > --master-key-file=pass2 > > > Volume key does not match the volume. > > > > > > > > > I see that nova/cinder and barbican are doing this stuff somehow so I > > > strongly believe there is a way to decrypt this manually. Maybe I’m doing > > > something wrong in my testing-steps. > > > Thanks in advance for any help here! Unfortunately, I haven’t found any > > > materials on how to do this. > > > > Yeah this is thanks to a long standing peice of technical debt that I've > > wanted to remove for years but I've never had to the change to. 
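In practical terms the secret's raw bytes just get hex-encoded and that hex string is used as the LUKS passphrase, roughly like this (a minimal sketch; the key file name is only an example of a payload saved from Barbican):

```python
# Turn a saved Barbican secret payload into the passphrase string that
# cryptsetup expects, mirroring the hexlify logic referenced below.
import binascii

with open('mysecret.key', 'rb') as f:   # raw secret bytes fetched from Barbican
    key = f.read()

passphrase = binascii.hexlify(key).decode('utf-8')
print(passphrase)   # paste this as the passphrase for `cryptsetup luksOpen`
```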
> > > > The tl;dr is that os-brick and n-cpu both turn the associated symmetric key > > secret into a passphrase using the following logic, ultimately calling > > binascii.hexlify: > > > > > > https://github.com/openstack/nova/blob/944443a7b053957f0b17a5edaa1d0ef14ae48f30/nova/virt/libvirt/driver.py#L1463-L1466 > > > > > > https://github.com/openstack/os-brick/blob/ec70b4092f649d933322820e3003269560df7af9/os_brick/encryptors/cryptsetup.py#L101-L103 > > > > I'm sure I've written up the steps to manually decrypt a cinder volume > > using these steps before but I can't seem to find them at the moment. > > I'll try to find some time to write these up again later in the week. > > > > Obviously it goes without saying that c-vol/c-api should be creating a > > passphrase secret for LUKS encrypted volumes to avoid this madness. > > > > Cinder creating and associating symmetric keys with encrypted volumes when > > used with Barbican > > https://bugs.launchpad.net/cinder/+bug/1693840 > > > > -- > > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 > > 2D76 > > -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From radoslaw.piliszek at gmail.com Wed Feb 10 17:43:09 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 10 Feb 2021 18:43:09 +0100 Subject: [ironic] [release] [kolla] RFC: stop doing Extended Maintenance for Bifrost In-Reply-To: References: Message-ID: On Wed, Feb 10, 2021 at 6:22 PM Dmitry Tantsur wrote: > > Hi all, > > Since Bifrost is an installation project that supports several distributions, maintaining its stable branches is a never ending nightmare. We have *nearly* fixed Ussuri (thank you Riccardo!) and just started looking into Train (thank you Iury and Mark), I'm sure we cannot keep EM branches in a decent shape. Personally I feel that Bifrost is more geared towards consumers staying close to master, but my gut feeling may not necessarily match the reality. > > Based on the above, I'm proposing: > 1) EOL all old branches from Ocata to Stein on Bifrost. > 2) No longer create EM branches, EOL immediately after a branch leaves the regular maintenance. > > Opinions? If it's not an overkill burden, I would suggest following Kolla's policy where we in fact keep one last EM branch alive in practice (older ones rot much too quickly). That would mean still caring about Stein until Train goes EM too. -yoctozepto From piotrmisiak1984 at gmail.com Wed Feb 10 18:11:21 2021 From: piotrmisiak1984 at gmail.com (Piotr Misiak) Date: Wed, 10 Feb 2021 19:11:21 +0100 Subject: [neutron][ovn] ipv6 in virtual networks Message-ID: <68945afb-935e-f28d-a022-6e0a94af4387@gmail.com> Hi all, I have a test env with OpenStack Ussuri and OVN deployed by kolla-ansible. I'm struggling with IPv6 VMs addressing. Has anyone deployed such configuration successfully? What is working:  - SLAAC for VMs IPv6 addressing - VMs configure IPv6 addresses and can ping each other via IPv6  - VMs can ping virtual router's fe80:: address  - OVN is sending ICMPv6 RA packets periodically on virtual private networks What is not working:  - VMs can't ping virtual router's private network IPv6 address specified in virtual network configuration in Neutron (IPv6 GUA), I see ICMPv6 echo request packets on tapXXXXX interfaces with a correct DEST MAC, but there are no responses.  
- Routing is not working at all Besides those, I can't imagine how upstream router will know how to reach a particular private network with GUA IPv6 addresses (to which virtual router send packets to reach a particular private network?). I have a standard external network with IPv6 GUA /64 subnet and virtual routers which connects private networks with IPv6 GUA /64 subnets with external network. I thought that OVN virtual router will send ICMPv6 RA packets on external network with reachable prefixes and upstream router will learn routing info from those but I don't see any RA packets sent by OVN on external network, I see only RA packets from an upstream router. How this should work and be configured? How to configure GUA IPv6 addresses on virtual private networks? Is it supported by Neutron/OVN? Looking forward any responses regarding this area because documentation does not exist technically. Thanks! From hberaud at redhat.com Wed Feb 10 18:37:23 2021 From: hberaud at redhat.com (Herve Beraud) Date: Wed, 10 Feb 2021 19:37:23 +0100 Subject: =?UTF-8?Q?Re=3A_=5Brelease=5D_Proposing_El=C5=91d_Ill=C3=A9s_=28elod=29_for_Rele?= =?UTF-8?Q?ase_Management_Core?= In-Reply-To: <530f689e-f56f-5720-29b5-677ad2f73815@openstack.org> References: <530f689e-f56f-5720-29b5-677ad2f73815@openstack.org> Message-ID: Welcome on board Elod! We will celebrate that during our next meeting tomorrow evening! I'll bring champagne ;) Thanks everyone for your feedback! Le mer. 10 févr. 2021 à 14:09, Thierry Carrez a écrit : > Herve Beraud wrote: > > Előd has been working on Release management for quite some time now and > > in that time has > > shown tremendous growth in his understanding of our processes and on how > > deliverables work on Openstack. I think he would make a good addition to > > the core team. > > > > Existing team members, please respond with +1/-1. > > If there are no objections we'll add him to the ACL soon. :-) > > +1 yes please ! > > -- > Thierry Carrez (ttx) > > -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dtantsur at redhat.com Wed Feb 10 18:44:42 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 10 Feb 2021 19:44:42 +0100 Subject: [ironic] [release] [kolla] RFC: stop doing Extended Maintenance for Bifrost In-Reply-To: References: Message-ID: On Wed, Feb 10, 2021 at 6:43 PM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > On Wed, Feb 10, 2021 at 6:22 PM Dmitry Tantsur > wrote: > > > > Hi all, > > > > Since Bifrost is an installation project that supports several > distributions, maintaining its stable branches is a never ending nightmare. > We have *nearly* fixed Ussuri (thank you Riccardo!) and just started > looking into Train (thank you Iury and Mark), I'm sure we cannot keep EM > branches in a decent shape. Personally I feel that Bifrost is more geared > towards consumers staying close to master, but my gut feeling may not > necessarily match the reality. > > > > Based on the above, I'm proposing: > > 1) EOL all old branches from Ocata to Stein on Bifrost. > > 2) No longer create EM branches, EOL immediately after a branch leaves > the regular maintenance. > > > > Opinions? > > If it's not an overkill burden, I would suggest following Kolla's > policy where we in fact keep one last EM branch alive in practice > (older ones rot much too quickly). > That would mean still caring about Stein until Train goes EM too. > I'm afraid it is an overkill burden. Please see above, we're struggling even with normally supported branches. Dmitry > > -yoctozepto > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Wed Feb 10 18:49:03 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 10 Feb 2021 19:49:03 +0100 Subject: [ironic] [release] [kolla] RFC: stop doing Extended Maintenance for Bifrost In-Reply-To: References: Message-ID: On Wed, Feb 10, 2021 at 7:44 PM Dmitry Tantsur wrote: > > > > On Wed, Feb 10, 2021 at 6:43 PM Radosław Piliszek wrote: >> >> On Wed, Feb 10, 2021 at 6:22 PM Dmitry Tantsur wrote: >> > >> > Hi all, >> > >> > Since Bifrost is an installation project that supports several distributions, maintaining its stable branches is a never ending nightmare. We have *nearly* fixed Ussuri (thank you Riccardo!) and just started looking into Train (thank you Iury and Mark), I'm sure we cannot keep EM branches in a decent shape. Personally I feel that Bifrost is more geared towards consumers staying close to master, but my gut feeling may not necessarily match the reality. >> > >> > Based on the above, I'm proposing: >> > 1) EOL all old branches from Ocata to Stein on Bifrost. >> > 2) No longer create EM branches, EOL immediately after a branch leaves the regular maintenance. >> > >> > Opinions? >> >> If it's not an overkill burden, I would suggest following Kolla's >> policy where we in fact keep one last EM branch alive in practice >> (older ones rot much too quickly). >> That would mean still caring about Stein until Train goes EM too. > > > I'm afraid it is an overkill burden. Please see above, we're struggling even with normally supported branches. Well, you mentioned EMs down to Ocata so I had to try my luck. 
:-) -yoctozepto From smooney at redhat.com Wed Feb 10 19:32:16 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 Feb 2021 19:32:16 +0000 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: <14c6b60d7478a8244fdd8cfdd3a70af979702426.camel@redhat.com> On Wed, 2021-02-10 at 20:53 +0530, roshan anvekar wrote: > I have query regarding the same interface configuration setup. > > This is for kolla-ansible which has br-ex bridge required on a physical or > sub-interface. > > Do we need to assign IP to the bridge interface both in controller/network > nodes and also on compute nodes?? Or should both the physical and bridge > interface be IP free? if you are usign ovs-dpdk the br-ex should have local tunnel ip used for vxlan/gre tunnels. for kernel ovs that is less important but i prefer to still follow that parttern to keep everything the same. if you dont do that your vxlan tunnels may use the management interface for connectivity instead of the interface added to the ovs bridge. it will make that determination by looking at the kernel routing table to determing which interface to sed it to so if you want to use the same interfaces for you vlan and tunnel traffic the assing the ip to the br-ex > > Regards, > Roshan > > On Wed, Feb 10, 2021, 8:15 PM Satish Patel wrote: > > > Hi Amey, > > > > I am running my cloud using openstack-ansible on centos 7.5 (on > > Production) and 8.x (on Lab because centos 8 soon end of life, my > > future plan is to migrate everything to ubuntu ). > > > > Answer to your question is you have two way to configure networking using > > OSA > > > > 1. Use systemd-networkd here is the example - > > http://paste.openstack.org/show/802517/ > > 2. Use package network-scripts (This is legacy way to configure > > network on centOS /etc/sysconfig/network-scripts/* style) - You can > > see example on my blog how to deploy OSA on centos - > > > > https://satishdotpatel.github.io//build-openstack-cloud-using-openstack-ansible/ > > > > > > On Wed, Feb 10, 2021 at 9:14 AM Amey Abhyankar wrote: > > > > > > Hello Jonathan, > > > > > > On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser > > > wrote: > > > > > > > > Hi Amey, > > > > > > > > The documentation that you linked to is specifically for the ansible > > > > deployment host, which can be separate from the OpenStack target hosts > > > > if you wish. The only requirement is that it has ssh access to the > > > > target hosts, by default this would be via the network attached to > > > > br-mgmt on those hosts. > > > > > > > > In terms of the target hosts networking, you should refer to > > > > > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network > > . > > > Yes I checked this reference. > > > > > > > > OpenStack-Ansible does not prescribe any particular method for > > > > setting up the host networking as this tends to be specific > > > > to the individual environment or deployer preferences. > > > I am looking for an example where to put the interface configurations > > > in CentOS 8. > > > But, I am unable to find any references specifically for CentOS/RHEL. > > > Hence I post the question here to get help from some1 who is already > > > running OpenStack on CentOS 8. > > > > > > Regards, > > > Amey. 
> > > > > > > > You should pick most appropriate network config tool for your > > > > OS/environment and create the bridges and networks listed in the > > > > table on the target hosts documentation. > > > > > > > > I would always recommend starting with an All-In-One deployment > > > > > > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html > > . > > > > This will generate a complete test environment > > > > with all the host networking set up correctly and will serve as > > > > a reference example for a production deployment. > > > > > > > > Do join the #openstack-ansible IRC channel if you would like > > > > to discuss any of the options further. > > > > > > > > Regards, > > > > Jonathan. > > > > > > > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > > > > Hello, > > > > > > > > > > I am trying to install OpenStack by referring following guides = > > > > > > > > > > 1) > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > > > > 2) > > https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > > > > > > > In link number 2, there is a step to configure interfaces. > > > > > In CentOS 8 we don't have a /etc/network/interfaces file. > > > > > Which interface file should I configure? > > > > > Or do I need to create all virtual int's manually using nmcli? > > > > > > > > > > Pls suggest thanks. > > > > > > > > > > Regards, > > > > > Amey. > > > > > > > > > > > > > > > > > > > > > From openstack at nemebean.com Wed Feb 10 19:46:46 2021 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 10 Feb 2021 13:46:46 -0600 Subject: [all] Gate resources and performance In-Reply-To: References: Message-ID: <53f77238-d77e-4b57-57bc-139065b23595@nemebean.com> On 2/9/21 6:59 PM, Dan Smith wrote: >> This seemed like a good time to finally revisit >> https://review.opendev.org/c/openstack/devstack/+/676016 (the OSC as a >> service patch). Turns out it wasn't as much work to reimplement as I >> had expected, but hopefully this version addresses the concerns with >> the old one. >> >> In my local env it takes about 3:45 off my devstack run. Not a huge >> amount by itself, but multiplied by thousands of jobs it could be >> significant. > > I messed with doing this myself, I wish I had seen yours first. I never > really got it to be stable enough to consider it usable because of how > many places in devstack we use the return code of an osc command. I > could get it to trivially work, but re-stacks and other behaviors > weren't quite right. Looks like maybe your version does that properly? It seems to. I had an issue at one point when I wasn't shutting down the systemd service during unstack, but I haven't seen any problems since I fixed that. I've done quite a few devstack runs on the same node with no failures. > > Anyway, I moved on to a full parallelization of devstack, which largely > lets me run all the non-dependent osc commands in parallel, in addition > to all kinds of other stuff (like db syncs and various project > setup). So far, that effort is giving me about a 60% performance > improvement over baseline, and I can do a minimal stack on my local > machine in about five minutes: > > https://review.opendev.org/c/openstack/devstack/+/771505/ Ah, that's nice. The speedup from the parallel execution series alone was pretty comparable to just the client service in my (old and slow) env. 
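To make the idea concrete, here is a toy sketch of the "OSC as a service" pattern (not the actual devstack change; the cloud name, socket path and one-command protocol are made up) — authenticate once in a long-lived process and let cheap local requests reuse that session:

```python
# Toy "OSC as a service": pay the import + auth cost once, then serve requests
# over a local Unix socket instead of forking a fresh client per command.
import socketserver

import openstack  # openstacksdk

conn = openstack.connect(cloud='devstack')  # assumes a clouds.yaml entry


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection, e.g. b"flavor list\n"
        if self.rfile.readline().split() == [b'flavor', b'list']:
            names = '\n'.join(f.name for f in conn.compute.flavors())
            self.wfile.write(names.encode() + b'\n')
        else:
            self.wfile.write(b'unsupported\n')


if __name__ == '__main__':
    with socketserver.UnixStreamServer('/tmp/osc.sock', Handler) as server:
        server.serve_forever()
```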
> > I think we've largely got agreement to get that merged at this point, > which as you say, will definitely make some significant improvements > purely because of how many times we do that in a day. If your OaaS can > support parallel requests, I'd definitely be interested in pursuing that > on top, although I think I've largely squeezed out the startup delay we > see when we run like eight osc instances in parallel during keystone > setup :) Surprisingly, it does seem to work. I suspect it serializes handling the multiple client calls, but it works and is still faster than just the parallel patch alone (again, in my env). The client service took about a minute off the parallel runtime. Here's the timing I see locally: Vanilla devstack: 775 Client service alone: 529 Parallel execution: 527 Parallel client service: 465 Most of the difference between the last two is shorter async_wait times because the deployment steps are taking less time. So not quite as much as before, but still a decent increase in speed. From dms at danplanet.com Wed Feb 10 19:57:22 2021 From: dms at danplanet.com (Dan Smith) Date: Wed, 10 Feb 2021 11:57:22 -0800 Subject: [all] Gate resources and performance In-Reply-To: <53f77238-d77e-4b57-57bc-139065b23595@nemebean.com> (Ben Nemec's message of "Wed, 10 Feb 2021 13:46:46 -0600") References: <53f77238-d77e-4b57-57bc-139065b23595@nemebean.com> Message-ID: > Here's the timing I see locally: > Vanilla devstack: 775 > Client service alone: 529 > Parallel execution: 527 > Parallel client service: 465 > > Most of the difference between the last two is shorter async_wait > times because the deployment steps are taking less time. So not quite > as much as before, but still a decent increase in speed. Yeah, cool, I think you're right that we'll just serialize the calls. It may not be worth the complexity, but if we make the OaaS server able to do a few things in parallel, then we'll re-gain a little more perf because we'll go back to overlapping the *server* side of things. Creating flavors, volume types, networks and uploading the image to glance are all things that should be doable in parallel in the server projects. 465s for a devstack is awesome. Think of all the developer time in $local_fiat_currency we could have saved if we did this four years ago... :) --Dan From haleyb.dev at gmail.com Wed Feb 10 20:08:14 2021 From: haleyb.dev at gmail.com (Brian Haley) Date: Wed, 10 Feb 2021 15:08:14 -0500 Subject: [neutron][ovn] ipv6 in virtual networks In-Reply-To: <68945afb-935e-f28d-a022-6e0a94af4387@gmail.com> References: <68945afb-935e-f28d-a022-6e0a94af4387@gmail.com> Message-ID: On 2/10/21 1:11 PM, Piotr Misiak wrote: > Hi all, > > I have a test env with OpenStack Ussuri and OVN deployed by kolla-ansible. > > I'm struggling with IPv6 VMs addressing. Has anyone deployed such > configuration successfully? > > What is working: > >  - SLAAC for VMs IPv6 addressing - VMs configure IPv6 addresses and can > ping each other via IPv6 > >  - VMs can ping virtual router's fe80:: address > >  - OVN is sending ICMPv6 RA packets periodically on virtual private networks > > What is not working: > >  - VMs can't ping virtual router's private network IPv6 address > specified in virtual network configuration in Neutron (IPv6 GUA), I see > ICMPv6 echo request packets on tapXXXXX interfaces with a correct DEST > MAC, but there are no responses. That should work AFAIK, just don't have a devstack to try it on at the moment, sorry. 
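One quick thing worth checking (sketch below uses openstacksdk; the cloud name is a placeholder) is that the router interface port really carries the GUA address and that the v6 subnets have the RA/address modes you expect:

```python
# List router interface ports with their fixed IPs, plus IPv6 subnet modes,
# to confirm what Neutron thinks the router and subnets look like.
import openstack

conn = openstack.connect(cloud='mycloud')  # placeholder cloud name

for port in conn.network.ports(device_owner='network:router_interface'):
    ips = [ip['ip_address'] for ip in port.fixed_ips]
    print(port.device_id, ips)

for subnet in conn.network.subnets(ip_version=6):
    print(subnet.cidr, subnet.ipv6_ra_mode, subnet.ipv6_address_mode)
```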
>  - Routing is not working at all > > Besides those, I can't imagine how upstream router will know how to > reach a particular private network with GUA IPv6 addresses (to which > virtual router send packets to reach a particular private network?). I > have a standard external network with IPv6 GUA /64 subnet and virtual > routers which connects private networks with IPv6 GUA /64 subnets with > external network. I thought that OVN virtual router will send ICMPv6 RA > packets on external network with reachable prefixes and upstream router > will learn routing info from those but I don't see any RA packets sent > by OVN on external network, I see only RA packets from an upstream > router. How this should work and be configured? How to configure GUA > IPv6 addresses on virtual private networks? Is it supported by Neutron/OVN? IPv6 prefix delegation is what you want, it's one of the 'gaps' with ML2/OVS, https://bugs.launchpad.net/neutron/+bug/1895972 There is a list of known items at https://docs.openstack.org/neutron/latest/ovn/gaps.html So in order to use a globally-reachable IPv6 address you should use a port from a provider network in the instance. > Looking forward any responses regarding this area because documentation > does not exist technically. All the docs were copied over to neutron so should be visible at https://docs.openstack.org/neutron/latest/ -Brian From elod.illes at est.tech Wed Feb 10 20:09:48 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Wed, 10 Feb 2021 21:09:48 +0100 Subject: =?UTF-8?B?UmU6IFtyZWxlYXNlXSBQcm9wb3NpbmcgRWzFkWQgSWxsw6lzIChlbG9k?= =?UTF-8?Q?=29_for_Release_Management_Core?= In-Reply-To: References: <530f689e-f56f-5720-29b5-677ad2f73815@openstack.org> Message-ID: <6c4ea1d0-2ad4-99c3-7b57-693e34ef1c11@est.tech> Thanks Hervé! And thanks Sean, Brian and Thierry, too! :) I'm looking forward to the meeting tomorrow. (And a bit jealous about the champagne, I'll need at least a picture of it! :D) Előd On 2021. 02. 10. 19:37, Herve Beraud wrote: > Welcome on board Elod! > > We will celebrate that during our next meeting tomorrow evening! I'll > bring champagne ;) > > Thanks everyone for your feedback! > > Le mer. 10 févr. 2021 à 14:09, Thierry Carrez > a écrit : > > Herve Beraud wrote: > > Előd has been working on Release management for quite some time > now and > > in that time  has > > shown tremendous growth in his understanding of our processes > and on how > > deliverables work on Openstack. I think he would make a good > addition to > > the core team. > > > > Existing team members, please respond with +1/-1. > > If there are no objections we'll add him to the ACL soon. :-) > > +1 yes please ! 
> > -- > Thierry Carrez (ttx) > > > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.matulis at canonical.com Wed Feb 10 22:06:18 2021 From: peter.matulis at canonical.com (Peter Matulis) Date: Wed, 10 Feb 2021 17:06:18 -0500 Subject: [charms] OpenStack Charms 21.01 release is now available Message-ID: The 21.01 release of the OpenStack Charms is now available. This release brings several new features to the existing OpenStack Charms deployments for Queens, Rocky, Stein, Train, Ussuri, Victoria and many stable combinations of Ubuntu + OpenStack. Please see the Release Notes for full details: https://docs.openstack.org/charm-guide/latest/2101.html == Highlights == * Cinder-Ceph - Image mirroring The cinder-ceph charm now supports image replication to a second Ceph cluster. * Ironic charms Three new tech-preview charms are now available for the deployment of OpenStack Ironic. Ironic provisions bare metal machines, as opposed to virtual machines. * Glance multi-backend usage When multiple storage backends are used with Glance, the glance charm now supports the CLI Glance client option (--store) that enables an operator to specify, at image upload time, which backend will be used to store the image. * Documentation updates Ongoing improvements to the OpenStack Charms Deployment Guide, the OpenStack Charm Guide, and the charm READMEs. * Trusty series deprecation The ‘trusty’ series is no longer actively tested and is now EOL in terms of bug fixes for the OpenStack charms. == OpenStack Charms team == The OpenStack Charms team can be contacted on the #openstack-charms IRC channel on Freenode. == Thank you == Lots of thanks to the below 33 contributors who squashed 52 bugs, enabled new features, and improved the documentation! Alex Kavanagh Alvaro Uria Andrea Ieri Arif Ali Aurelien Lourot Billy Olsen Corey Bryant David Ames Dmitrii Shcherbakov Dongdong Tao Edward Hope-Morley Eric Desrochers Felipe Reyes Frode Nordahl Gabriel Samfira Garrett Thompson Giuseppe Petralia Hemanth Nakkina Ioanna Alifieraki Ionut Balutoiu Jarred Wilson Liam Young Linda Guo Marius Oprin Martin Fiala Martin Kalcok Peter Matulis Peter Sabaini Ponnuvel Palaniyappan Robert Gildein Rodrigo Barbieri Trent Lloyd Xav Paice -- OpenStack Charms Team -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sbaker at redhat.com Wed Feb 10 22:14:39 2021 From: sbaker at redhat.com (Steve Baker) Date: Thu, 11 Feb 2021 11:14:39 +1300 Subject: [baremetal-sig][ironic] Tue Feb 9, 2021, 2pm UTC: Deploy Steps in Ironic In-Reply-To: <1bbe3d44-dc36-d168-fab4-1f81a34b32ac@cern.ch> References: <1bbe3d44-dc36-d168-fab4-1f81a34b32ac@cern.ch> Message-ID: <201f1fd9-edc7-065e-bc93-b23d1aa32d59@redhat.com> The recording for this presentation is now available in the OpenStack Bare Metal SIG Series playlist: https://www.youtube.com/watch?v=uyN481mqdOs&list=PLKqaoAnDyfgoBFAjUvZGjKXQjogWZBLL_&index=4 On 8/02/21 9:04 pm, Arne Wiebalck wrote: > Dear all, > > The Bare Metal SIG will meet tomorrow Tue Feb 9, 2021 at > 2pm UTC on zoom. > > This time there will be a 10 minute "topic-of-the-day" > presentation by Dmitry Tantsur (dtansur) on: > > 'An Introduction to Deploy Steps in Ironic' > > If you've never found the time to understand what deploy steps > are and how they are useful, this is your chance to hear it > from one of the experts! You can find all the details > for this meeting on the SIG's etherpad: > > https://etherpad.opendev.org/p/bare-metal-sig > > Everyone is welcome, don't miss out! > > Cheers, >  Arne > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sco1984 at gmail.com Thu Feb 11 04:07:06 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Thu, 11 Feb 2021 09:37:06 +0530 Subject: bootstrap errors in OSA installation in CentOS 8.3 In-Reply-To: <318f40f1-5b47-3720-1ba5-2ca1d1af3db5@rd.bbc.co.uk> References: <3a833f13-0816-ca3e-e67f-f4fd1ec04cd3@rd.bbc.co.uk> <318f40f1-5b47-3720-1ba5-2ca1d1af3db5@rd.bbc.co.uk> Message-ID: Hello, On Wed, 10 Feb 2021 at 19:52, Jonathan Rosser wrote: > > Hi Amey, > > 2.0.0.0rc1 is a release candidate tag (rc) from prior to the victoria > release being finalised openstack-ansible. > > You should use the latest tag on the stable/victoria branch which > currently is https://opendev.org/openstack/openstack-ansible/src/tag/22.0.1 Thanks. I think we need to update the respective documentation page. For a new user, it'll be otherwise very confusing to understand. Spending too much time to understand which branch to use is not a good impression on a new user specially in the world of automation. Option 1) Add a line above the git clone command saying pls refer to the latest branch. Option 2) Update the TAG number in the respective document time to time. Regards, Amey. > > On 10/02/2021 13:45, Amey Abhyankar wrote: > > Hello Jonathan, > > > > I can see the particular TAG version at following location thanks = > > https://opendev.org/openstack/openstack-ansible/src/tag/22.0.0.0rc1 > > > > Regards, > > Amey. > > > > > From sco1984 at gmail.com Thu Feb 11 04:13:16 2021 From: sco1984 at gmail.com (Amey Abhyankar) Date: Thu, 11 Feb 2021 09:43:16 +0530 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: On Wed, 10 Feb 2021 at 20:12, Satish Patel wrote: > > Hi Amey, > > I am running my cloud using openstack-ansible on centos 7.5 (on > Production) and 8.x (on Lab because centos 8 soon end of life, my > future plan is to migrate everything to ubuntu ). > > Answer to your question is you have two way to configure networking using OSA > > 1. Use systemd-networkd here is the example - > http://paste.openstack.org/show/802517/ Thanks. > 2. 
Use package network-scripts (This is legacy way to configure > network on centOS /etc/sysconfig/network-scripts/* style) - You can > see example on my blog how to deploy OSA on centos - > https://satishdotpatel.github.io//build-openstack-cloud-using-openstack-ansible/ Nicely documented thanks. Few questions = 1) From the switch side, the port is 'access' port or a 'trunk' port? 2) To configure compute node on the controller, what changes should I make in my network config file? Regards, Amey. > > > On Wed, Feb 10, 2021 at 9:14 AM Amey Abhyankar wrote: > > > > Hello Jonathan, > > > > On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser > > wrote: > > > > > > Hi Amey, > > > > > > The documentation that you linked to is specifically for the ansible > > > deployment host, which can be separate from the OpenStack target hosts > > > if you wish. The only requirement is that it has ssh access to the > > > target hosts, by default this would be via the network attached to > > > br-mgmt on those hosts. > > > > > > In terms of the target hosts networking, you should refer to > > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network. > > Yes I checked this reference. > > > > > > OpenStack-Ansible does not prescribe any particular method for > > > setting up the host networking as this tends to be specific > > > to the individual environment or deployer preferences. > > I am looking for an example where to put the interface configurations > > in CentOS 8. > > But, I am unable to find any references specifically for CentOS/RHEL. > > Hence I post the question here to get help from some1 who is already > > running OpenStack on CentOS 8. > > > > Regards, > > Amey. > > > > > > You should pick most appropriate network config tool for your > > > OS/environment and create the bridges and networks listed in the > > > table on the target hosts documentation. > > > > > > I would always recommend starting with an All-In-One deployment > > > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html. > > > This will generate a complete test environment > > > with all the host networking set up correctly and will serve as > > > a reference example for a production deployment. > > > > > > Do join the #openstack-ansible IRC channel if you would like > > > to discuss any of the options further. > > > > > > Regards, > > > Jonathan. > > > > > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > > > Hello, > > > > > > > > I am trying to install OpenStack by referring following guides = > > > > > > > > 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > > > 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > > > > > In link number 2, there is a step to configure interfaces. > > > > In CentOS 8 we don't have a /etc/network/interfaces file. > > > > Which interface file should I configure? > > > > Or do I need to create all virtual int's manually using nmcli? > > > > > > > > Pls suggest thanks. > > > > > > > > Regards, > > > > Amey. 
> > > > > > > > > > > > > From rosmaita.fossdev at gmail.com Thu Feb 11 05:01:02 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 11 Feb 2021 00:01:02 -0500 Subject: [cinder] wallaby R-9 mid-cycle summary available Message-ID: <0a49c0d8-a753-b55c-76b8-a003349d3ed3@gmail.com> In case you missed the exciting cinder R-9 virtual mid-cycle session (held yesterday or earlier today depending on where you are) I've posted a summary: https://wiki.openstack.org/wiki/CinderWallabyMidCycleSummary It will eventually include a link to the recording (in case you want to see what you missed or if you want to re-live the excitement). cheers, brian From skaplons at redhat.com Thu Feb 11 08:32:53 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 11 Feb 2021 09:32:53 +0100 Subject: [neutron] Drivers meeting agenda - Friday 12.02.2021 Message-ID: <3622465.v8UMDG8Cpr@p1> Hi, For the tomorrow's drivers meeting we have 1 RFE to discuss: * https://bugs.launchpad.net/neutron/+bug/1915151 - [RFE] There should be a way to set ethertype for the vlan_transparent networks Please take a look at it and ask/prepare questions for it to be discussed during the meeting :) If anyone has anything else which would like to discuss, please add it to the "On demand" section at [1]. [1] https://wiki.openstack.org/wiki/Meetings/NeutronDrivers#Agenda -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From mark at stackhpc.com Thu Feb 11 08:45:13 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 11 Feb 2021 08:45:13 +0000 Subject: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface In-Reply-To: References: Message-ID: On Tue, 9 Feb 2021 at 15:35, roshan anvekar wrote: > > Is it a hard requirement for br-ex to be configured on physical interface itself?? Or is it okay to configure it on a sub-interface too?? Hi Roshan, * if you plug an interface into a bridge (OVS or Linux), then you will no longer be able to use the IP configuration on that interface. That is likely why you've lost access to the gateway. * as Sean said, if you are using VLAN provider networks then plugging a VLAN interface into br-ex will result in double encapsulation. You can plug the bond in directly, and still make use of a VLAN interface on the bond for the external API. Mark > > Thanks in advance. > > ---------- Forwarded message --------- > From: roshan anvekar > Date: Tue, Feb 9, 2021, 5:54 PM > Subject: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface > To: OpenStack Discuss > > > Hello, > > Below is my scenario. > > I am trying to use a single physical bonded interface ( example: bond1) both for external TLS access traffic and for provider network too ( These are 2 different vlan networks with separate vlan ids) . Bond0 is used as api_interface for management traffic. Bond2 is used for storage traffic. > > For this I created 2 virtual interfaces on this bond1 and used it accordingly while deploying through kolla-ansible. > > Before deployment the gateways for both vlan networks were accessible. > > Post deployment I see that qdhcp-id router is created on one of the controllers. Post creation of this DHCP agent, the gateway is inaccessible. > > Also the vms created are not getting IPs through DHCP agent so failing to be accessible. 
> > I had configured br-ex on virtual interface of provider vlan network and not on physical interface directly. > > Please let me know if I am going wrong in my network configurations. > > Regards, > Roshan > > > From mark at stackhpc.com Thu Feb 11 08:59:56 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 11 Feb 2021 08:59:56 +0000 Subject: [ironic] [release] [kolla] RFC: stop doing Extended Maintenance for Bifrost In-Reply-To: References: Message-ID: On Wed, 10 Feb 2021 at 17:44, Radosław Piliszek wrote: > > On Wed, Feb 10, 2021 at 6:22 PM Dmitry Tantsur wrote: > > > > Hi all, > > > > Since Bifrost is an installation project that supports several distributions, maintaining its stable branches is a never ending nightmare. We have *nearly* fixed Ussuri (thank you Riccardo!) and just started looking into Train (thank you Iury and Mark), I'm sure we cannot keep EM branches in a decent shape. Personally I feel that Bifrost is more geared towards consumers staying close to master, but my gut feeling may not necessarily match the reality. I would hope not the actual master branch, given that it pulls in master branches of all dependencies. > > > > Based on the above, I'm proposing: > > 1) EOL all old branches from Ocata to Stein on Bifrost. > > 2) No longer create EM branches, EOL immediately after a branch leaves the regular maintenance. > > > > Opinions? > > If it's not an overkill burden, I would suggest following Kolla's > policy where we in fact keep one last EM branch alive in practice > (older ones rot much too quickly). > That would mean still caring about Stein until Train goes EM too. In theory, there's no burden. There's no obligation to keep EM branches working. In Kolla we generally have a policy where EM branches lie dormant, and are not actively maintained by the core team. They are however open for patches, with the understanding that the submitter will probably need to fix CI before they can land a patch. With my StackHPC hat on, this works well for us because in general our clients' systems are no older than the latest EM (currently Stein). We do rely on Bifrost (via Kayobe), and it could make things a little difficult for us if we could not rely on Bifrost's older stable branches. While I may not be the most active ironic core, I have spent a fair amount of time patching up Bifrost, and will continue to do so when necessary. I suppose the tl;dr is, if you don't want to maintain Bifrost EM that's fine, but please keep it open for someone else to try. Mark > > -yoctozepto > From roshananvekar at gmail.com Thu Feb 11 09:47:31 2021 From: roshananvekar at gmail.com (roshan anvekar) Date: Thu, 11 Feb 2021 15:17:31 +0530 Subject: [stein][neutron][vlan provider network] How to configure br-ex on virtual interface In-Reply-To: References: Message-ID: Thanks for the reply. It was very informative. In my case I have 2 sub-interfaces on a single bond. One I am using as kolla_extrrnal_vip_interface for TLS configuration. Another one I intended to use as provider network interface. Issue is after successful kolla deploy, VM is not getting IP successfully through DHCP. This is making me wonder if my approach was right ?? Or do I need to put the br-ex directly on bond itself still. Any help and suggestion here is appreciated. PS: I have other 2 bond interfaces dedicated for kolla_internal_vip_interface/network_interface and storage network. 
Thanks, Roshan On Thu, Feb 11, 2021, 2:15 PM Mark Goddard wrote: > On Tue, 9 Feb 2021 at 15:35, roshan anvekar > wrote: > > > > Is it a hard requirement for br-ex to be configured on physical > interface itself?? Or is it okay to configure it on a sub-interface too?? > Hi Roshan, > * if you plug an interface into a bridge (OVS or Linux), then you will > no longer be able to use the IP configuration on that interface. That > is likely why you've lost access to the gateway. > * as Sean said, if you are using VLAN provider networks then plugging > a VLAN interface into br-ex will result in double encapsulation. You > can plug the bond in directly, and still make use of a VLAN interface > on the bond for the external API. > Mark > > > > Thanks in advance. > > > > ---------- Forwarded message --------- > > From: roshan anvekar > > Date: Tue, Feb 9, 2021, 5:54 PM > > Subject: [stein][neutron][vlan provider network] How to configure br-ex > on virtual interface > > To: OpenStack Discuss > > > > > > Hello, > > > > Below is my scenario. > > > > I am trying to use a single physical bonded interface ( example: bond1) > both for external TLS access traffic and for provider network too ( These > are 2 different vlan networks with separate vlan ids) . Bond0 is used as > api_interface for management traffic. Bond2 is used for storage traffic. > > > > For this I created 2 virtual interfaces on this bond1 and used it > accordingly while deploying through kolla-ansible. > > > > Before deployment the gateways for both vlan networks were accessible. > > > > Post deployment I see that qdhcp-id router is created on one of the > controllers. Post creation of this DHCP agent, the gateway is inaccessible. > > > > Also the vms created are not getting IPs through DHCP agent so failing > to be accessible. > > > > I had configured br-ex on virtual interface of provider vlan network and > not on physical interface directly. > > > > Please let me know if I am going wrong in my network configurations. > > > > Regards, > > Roshan > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From piotrmisiak1984 at gmail.com Thu Feb 11 11:20:00 2021 From: piotrmisiak1984 at gmail.com (Piotr Misiak) Date: Thu, 11 Feb 2021 12:20:00 +0100 Subject: [neutron][ovn] ipv6 in virtual networks In-Reply-To: References: <68945afb-935e-f28d-a022-6e0a94af4387@gmail.com> Message-ID: On 10.02.2021 21:08, Brian Haley wrote: > On 2/10/21 1:11 PM, Piotr Misiak wrote: >>   - Routing is not working at all >> >> Besides those, I can't imagine how upstream router will know how to >> reach a particular private network with GUA IPv6 addresses (to which >> virtual router send packets to reach a particular private network?). I >> have a standard external network with IPv6 GUA /64 subnet and virtual >> routers which connects private networks with IPv6 GUA /64 subnets with >> external network. I thought that OVN virtual router will send ICMPv6 RA >> packets on external network with reachable prefixes and upstream router >> will learn routing info from those but I don't see any RA packets sent >> by OVN on external network, I see only RA packets from an upstream >> router. How this should work and be configured? How to configure GUA >> IPv6 addresses on virtual private networks? Is it supported by >> Neutron/OVN? 
> > IPv6 prefix delegation is what you want, it's one of the 'gaps' with > ML2/OVS, https://bugs.launchpad.net/neutron/+bug/1895972 > > There is a list of known items at > https://docs.openstack.org/neutron/latest/ovn/gaps.html > > So in order to use a globally-reachable IPv6 address you should use a > port from a provider network in the instance. > Thanks Brian for the prompt response. Does this mean that the only functional IPv6 scenario in Neutron/OVN is where VMs are directly connected to an IPv6 GUA provider network? BGP peering is not supported in Neutron/OVN, so virtual routers cannot advertise their prefixes (use case where private network prefix is manually specified by the user or it is automatically assigned from a default IPv6 subnet-pool defined in Neutron) IPv6 PD is not supported in Neutron/OVN, so virtual routers cannot request an IPv6 prefix from an upstream router Thanks From lyarwood at redhat.com Thu Feb 11 12:22:55 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Thu, 11 Feb 2021 12:22:55 +0000 Subject: [cinder/barbican] LUKS encryption for mounted disk - how to decrypt cinder volume In-Reply-To: <20210210174303.gi2zuaerxsvsk66l@lyarwood-laptop.usersys.redhat.com> References: <20210209221230.xifstekiw6aucr7l@lyarwood-laptop.usersys.redhat.com> <20210210174303.gi2zuaerxsvsk66l@lyarwood-laptop.usersys.redhat.com> Message-ID: <20210211122255.2v4z36wehqsd6hpi@lyarwood-laptop.usersys.redhat.com> On 10-02-21 17:43:03, Lee Yarwood wrote: > On 10-02-21 11:29:06, Jan Wasilewski wrote: >> Thank you for a nice description of how everything is organized. It is much >> easier to understand the full workflow. >> >>> I'll try to find some time to write these up again later in the week. >> That would be great, I will try to do this by myself, but I'm wondering if >> it's possible to do "all magic" directly from a payload that is visible >> from barbican CLI. My thanks to gcharot for writing the following up downstream a while ago and highlighting some easy ways of achieving this. The following assumes that the volume is already mapped and connected to the localhost, in this case I'm just using the LVs used by the default LVM/iSCSI c-vol backend in my devstack env. It also assumes you have access to the secrets associated with the encrypted volume, by default admins do not. - Starting with an encrypted volume $ sudo qemu-img info --output=json /dev/stack-volumes-lvmdriver-1/volume-d4cc53db-6add-4c29-9f96-42a5498f8bd0 | jq .format "luks" - Fetch and store the key locally $ openstack secret get --payload_content_type 'application/octet-stream' http://192.168.122.208/key-manager/v1/secrets/6fd4f879-005d-4b7d-9e5f-2505f010be7c --file mysecret.key - Use dmcrypt to decrypt the device using the key as a passphrase $ yes $(hexdump -e '16/1 "%02x"' mysecret.key) | sudo cryptsetup luksOpen /dev/stack-volumes-lvmdriver-1/volume-d4cc53db-6add-4c29-9f96-42a5498f8bd0 volume-d4cc53db-6add-4c29-9f96-42a5498f8bd0 - This should leave you with the decrypted volume under /dev/mapper $ sudo qemu-img info /dev/mapper/volume-d4cc53db-6add-4c29-9f96-42a5498f8bd0 image: /dev/mapper/volume-d4cc53db-6add-4c29-9f96-42a5498f8bd0 file format: raw virtual size: 0.998 GiB (1071644672 bytes) disk size: 0 B Hope this helps! -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From christian.rohmann at inovex.de Thu Feb 11 13:12:21 2021 From: christian.rohmann at inovex.de (Christian Rohmann) Date: Thu, 11 Feb 2021 14:12:21 +0100 Subject: [neutron][ovn] ipv6 in virtual networks In-Reply-To: References: <68945afb-935e-f28d-a022-6e0a94af4387@gmail.com> Message-ID: Hey there, On 11/02/2021 12:20, Piotr Misiak wrote: >> So in order to use a globally-reachable IPv6 address you should use a >> port from a provider network in the instance. >> > Thanks Brian for the prompt response. > > Does this mean that the only functional IPv6 scenario in Neutron/OVN is > where VMs are directly connected to an IPv6 GUA provider network? I ran into a similar question a while back: http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015682.html But there I was wondering / discussing if requesting a prefix larger than /64 was possible to allow using that routed prefix to host i.e. a VPN solution. Prefix delegation of GUA prefixes implemented in Neutron apparently: https://blueprints.launchpad.net/neutron/+spec/ipv6-prefix-delegation And to me prefix delegation of a global unicast address prefix (GUA) seems like the cleanest solution. Regards Christian -------------- next part -------------- An HTML attachment was scrubbed... URL: From gthiemonge at redhat.com Thu Feb 11 14:08:04 2021 From: gthiemonge at redhat.com (Gregory Thiemonge) Date: Thu, 11 Feb 2021 15:08:04 +0100 Subject: [tripleo][ci] socializing upstream job removal ( master upgrade, scenario010 ) In-Reply-To: References: Message-ID: On Tue, Feb 9, 2021 at 5:32 PM Wesley Hayutin wrote: > > *Scenario010 - Octavia:* > Scenario010 octavia has been non-voting and not passing at a high enough > rate [2] to justify the use of upstream resources. I would propose we > only run these jobs in the periodic component and integration lines in RDO > softwarefactory. > > Specifically: ( all branches ) > tripleo-ci-centos-8-scenario010-ovn-provider-standalone > tripleo-ci-centos-8-scenario010-standalone > We are still working on fixing scenario010, there were several huge changes in octavia-tempest-plugin that broke scenario010 but two weeks ago we fixed tripleo-ci-centos-8-scenario010-standalone by reducing the number of tests we are running (only smoke tests and perhaps more tests in the future) and by adding some new options for non-devstack environments in octavia-tempest-plugin. Then tripleo-ci-centos-8-scenario010-ovn-provider-standalone failed for another reason and the test class has been added in tempest-skiplist (even for non-ovn-provider jobs), and now all scenario010 jobs are in FAILURE because tempest doesn't run any test ( https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d39/771832/11/check/tripleo-ci-centos-8-scenario010-standalone/d39fce5/logs/undercloud/var/log/tempest/stestr_results.html ). We still want to have those scenario010 jobs, we first need to remove the skiplist for the non-ovn-provider job, then we will figure out why the ovn-provider job fails. > Please review and comment. The CI team will start taking action on these > two items in one week. > > Thank you! > > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020235.html > [2] > http://dashboard-ci.tripleo.org/d/iEDLIiOMz/non-voting-jobs?orgId=1&from=now-30d&to=now > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Thu Feb 11 14:19:10 2021 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 11 Feb 2021 09:19:10 -0500 Subject: [openstack-ansible] Re: How to configure interfaces in CentOS 8.3 for OSA installation? In-Reply-To: References: <8fb90586-01c6-0a31-4c47-787e4f620c24@rd.bbc.co.uk> Message-ID: On Wed, Feb 10, 2021 at 11:13 PM Amey Abhyankar wrote: > > On Wed, 10 Feb 2021 at 20:12, Satish Patel wrote: > > > > Hi Amey, > > > > I am running my cloud using openstack-ansible on centos 7.5 (on > > Production) and 8.x (on Lab because centos 8 soon end of life, my > > future plan is to migrate everything to ubuntu ). > > > > Answer to your question is you have two way to configure networking using OSA > > > > 1. Use systemd-networkd here is the example - > > http://paste.openstack.org/show/802517/ > Thanks. > > 2. Use package network-scripts (This is legacy way to configure > > network on centOS /etc/sysconfig/network-scripts/* style) - You can > > see example on my blog how to deploy OSA on centos - > > https://satishdotpatel.github.io//build-openstack-cloud-using-openstack-ansible/ > Nicely documented thanks. > Few questions = > 1) From the switch side, the port is 'access' port or a 'trunk' port? All my switches run on trunk ports and I have Cisco Nexus vPC that is why I created bond0 interface and created multiple VLAN/Bridges on top of that. > 2) To configure compute node on the controller, what changes should I > make in my network config file? I have the same VLAN/Bridge running on controllers and computes. so no special changes required to run the computer on the controller. > > Regards, > Amey. > > > > > > On Wed, Feb 10, 2021 at 9:14 AM Amey Abhyankar wrote: > > > > > > Hello Jonathan, > > > > > > On Wed, 10 Feb 2021 at 18:35, Jonathan Rosser > > > wrote: > > > > > > > > Hi Amey, > > > > > > > > The documentation that you linked to is specifically for the ansible > > > > deployment host, which can be separate from the OpenStack target hosts > > > > if you wish. The only requirement is that it has ssh access to the > > > > target hosts, by default this would be via the network attached to > > > > br-mgmt on those hosts. > > > > > > > > In terms of the target hosts networking, you should refer to > > > > https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network. > > > Yes I checked this reference. > > > > > > > > OpenStack-Ansible does not prescribe any particular method for > > > > setting up the host networking as this tends to be specific > > > > to the individual environment or deployer preferences. > > > I am looking for an example where to put the interface configurations > > > in CentOS 8. > > > But, I am unable to find any references specifically for CentOS/RHEL. > > > Hence I post the question here to get help from some1 who is already > > > running OpenStack on CentOS 8. > > > > > > Regards, > > > Amey. > > > > > > > > You should pick most appropriate network config tool for your > > > > OS/environment and create the bridges and networks listed in the > > > > table on the target hosts documentation. > > > > > > > > I would always recommend starting with an All-In-One deployment > > > > https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html. > > > > This will generate a complete test environment > > > > with all the host networking set up correctly and will serve as > > > > a reference example for a production deployment. 
> > > > > > > > Do join the #openstack-ansible IRC channel if you would like > > > > to discuss any of the options further. > > > > > > > > Regards, > > > > Jonathan. > > > > > > > > On 10/02/2021 10:09, Amey Abhyankar wrote: > > > > > Hello, > > > > > > > > > > I am trying to install OpenStack by referring following guides = > > > > > > > > > > 1) https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html > > > > > 2) https://docs.openstack.org/openstack-ansible/victoria/user/test/example.html > > > > > > > > > > In link number 2, there is a step to configure interfaces. > > > > > In CentOS 8 we don't have a /etc/network/interfaces file. > > > > > Which interface file should I configure? > > > > > Or do I need to create all virtual int's manually using nmcli? > > > > > > > > > > Pls suggest thanks. > > > > > > > > > > Regards, > > > > > Amey. > > > > > > > > > > > > > > > > > From jsaezdeb at ucm.es Thu Feb 11 11:21:16 2021 From: jsaezdeb at ucm.es (Jaime) Date: Thu, 11 Feb 2021 12:21:16 +0100 Subject: Openstack cannot access to the Internet Message-ID: <85684e9e-b220-33bf-6d52-806c7e03db74@ucm.es> # Openstack instances cannot access Internet [linuxbridge] I am having serious issues in the deployment of the Openstack scenario related to the Linux Bridge. This is the scenario: - Controller machine:     - Management Interface `enp2s0`: 138.100.10.25. - Compute machine:     - Management Interface `enp2s0`: 138.100.10.26.     - Provider Interface `enp0s20f0u4`: 138.100.10.27. Openstack Train scenario has been successfully deployed in Centos 8, choosing networking option 2 (self-service network). To verify the functionality, an image has been uploaded, created an Openstack flavor and security group, and launched a couple of cirrOS instances for connection testing. We have created a provider network following [this tutorial](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html) and a selfservice network following [this one](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-selfservice.html). The network scenario is the next one: As can be seen in the network topology, an external network 138.100.10.0/21 (provider) and an internal network 192.168.1.1 (selfservice) have been created, connected through a router by the interfaces 138.100.10.198 and 192.168.1.1, both active. Our problem is that our Linux bridge is not working as expected: the Openstack cirrOS instances has no internet access. This is the controller `ip a` and `brctl show` command output: This is the compute `ip a` and `brctl show` command output: (The output of `ovs-vsctl show` command is empty in both machines). 
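For completeness, a small openstacksdk sketch (the cloud name is a placeholder) of how the reported state of the Neutron agents on both hosts can be dumped, which may help narrow down whether the bridges or the agents are the problem:

```python
# Print each Neutron agent's binary, host and liveness as Neutron sees it.
import openstack

conn = openstack.connect(cloud='mycloud')  # placeholder clouds.yaml entry

for agent in conn.network.agents():
    print(agent.binary, agent.host, agent.is_alive)
```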
**Are the Linux Bridges correctly created?**

These are the Linux bridge configuration files:

* Controller `/etc/neutron/plugins/ml2/linuxbridge_agent.ini`:

```
[linux_bridge]
physical_interface_mappings = provider:enp2s0     # enp2s0 is the interface associated to 138.100.10.25

[vxlan]
enable_vxlan = true
local_ip = 138.100.10.25    # controller has only 1 IP
l2_population = true
```

* Compute `/etc/neutron/plugins/ml2/linuxbridge_agent.ini`:

```
[linux_bridge]
physical_interface_mappings = provider:enp0s20f0u4        # interface associated to 138.100.10.26

[vxlan]
enable_vxlan = true
local_ip = 138.100.10.27
l2_population = true
```

An **observation** to keep in mind is that the compute management interface (`138.100.10.26`) is inaccessible from anywhere, which I think is not correct, since this prevents us, for example, from accessing the instance console through its URL.

I have made some connection tests and these are the results:

- There is **connection** between Cirros A and Cirros B (in both directions).
- There is **connection** between Cirros A/B and the self-service gateway (192.168.1.1) (in both directions).
- There is **connection** between Cirros A/B and the provider gateway (138.100.10.198) (in both directions).
- There is **connection** between Cirros A/B and the controller management interface (138.100.10.25) (in both directions).
- There is **no connection** between Cirros A/B and the compute management interface (138.100.10.26). This interface is not accessible.
- There is **connection** between Cirros A/B and the compute provider interface (138.100.10.27) (in both directions).

I do not know if there is a problem in the Linux bridge configuration files, or maybe I need another network interface on the controller machine.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
Six non-text attachments (PNG screenshots) were scrubbed...

From jsaezdeb at ucm.es  Thu Feb 11 11:44:42 2021
From: jsaezdeb at ucm.es (Jaime)
Date: Thu, 11 Feb 2021 12:44:42 +0100
Subject: Openstack instances cannot access to Internet [linuxbridge]
Message-ID: 

I am having serious issues with the deployment of the OpenStack scenario, related to the Linux bridge. This is the scenario:

- Controller machine:
    - Management Interface `enp2s0`: 138.100.10.25.
- Compute machine:
    - Management Interface `enp2s0`: 138.100.10.26.
    - Provider Interface `enp0s20f0u4`: 138.100.10.27.

The OpenStack Train scenario has been successfully deployed on CentOS 8, choosing networking option 2 (self-service networks).
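For context, the ML2 plugin on the controller was configured following the networking option 2 guide; this is a sketch of the relevant parts of `/etc/neutron/plugins/ml2/ml2_conf.ini`, assuming the guide defaults were kept (listed only for completeness, it is not something we customised):

```
[ml2]
type_drivers = flat,vlan,vxlan
tenant_network_types = vxlan
mechanism_drivers = linuxbridge,l2population
extension_drivers = port_security

[ml2_type_flat]
flat_networks = provider

[ml2_type_vxlan]
vni_ranges = 1:1000

[securitygroup]
enable_ipset = true
```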
To verify the functionality, an image has been uploaded, an OpenStack flavor and a security group have been created, and a couple of cirrOS instances have been launched for connection testing.

We have created a provider network following [this tutorial](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html) and a selfservice network following [this one](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-selfservice.html). The network scenario is the following:

As can be seen in the network topology, an external network, 138.100.10.0/21 (provider), and an internal selfservice network have been created, connected through a router whose interfaces are 138.100.10.198 and 192.168.1.1, both active.

Our problem is that our Linux bridge is not working as expected: the OpenStack cirrOS instances have no Internet access.

This is the controller `ip a` and `brctl show` command output:

This is the compute `ip a` and `brctl show` command output:

(The output of the `ovs-vsctl show` command is empty in both machines.)

**Are the Linux Bridges correctly created?**

These are the Linux bridge configuration files:

* Controller `/etc/neutron/plugins/ml2/linuxbridge_agent.ini`:

```
[linux_bridge]
physical_interface_mappings = provider:enp2s0     # enp2s0 is the interface associated to 138.100.10.25

[vxlan]
enable_vxlan = true
local_ip = 138.100.10.25    # controller has only 1 IP
l2_population = true
```

* Compute `/etc/neutron/plugins/ml2/linuxbridge_agent.ini`:

```
[linux_bridge]
physical_interface_mappings = provider:enp0s20f0u4        # interface associated to 138.100.10.26

[vxlan]
enable_vxlan = true
local_ip = 138.100.10.27
l2_population = true
```

An **observation** to keep in mind is that the compute management interface (`138.100.10.26`) is inaccessible from anywhere, which I think is not correct, since this prevents us, for example, from accessing the instance console through its URL.

I have made some connection tests and these are the results:

* Cirros_a `ip a` command output:

* Cirros_b `ip a` command output:

- There is **connection** between Cirros A and Cirros B (in both directions).
- There is **connection** between Cirros A/B and the self-service gateway (192.168.1.1) (in both directions).
- There is **connection** between Cirros A/B and the provider gateway (138.100.10.198) (in both directions).
- There is **connection** between Cirros A/B and the controller management interface (138.100.10.25) (in both directions).
- There is **no connection** between Cirros A/B and the compute management interface (138.100.10.26). This interface is not accessible.
- There is **connection** between Cirros A/B and the compute provider interface (138.100.10.27) (in both directions).

I do not know if there is a problem in the Linux bridge configuration files, or maybe I need another network interface on the controller machine.
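For completeness, besides the `[linux_bridge]` and `[vxlan]` sections shown above, the installation guide also sets the following for linuxbridge self-service networking; this is a sketch of the guide values, assuming they were left unchanged in our deployment:

```
# /etc/neutron/plugins/ml2/linuxbridge_agent.ini (guide defaults)
[securitygroup]
enable_security_group = true
firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

# /etc/neutron/l3_agent.ini (the L3 agent runs the 192.168.1.1 / 138.100.10.198 router)
[DEFAULT]
interface_driver = linuxbridge
```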
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
Six non-text attachments (PNG screenshots) were scrubbed...

From jsaezdeb at ucm.es  Thu Feb 11 11:45:40 2021
From: jsaezdeb at ucm.es (Jaime)
Date: Thu, 11 Feb 2021 12:45:40 +0100
Subject: Openstack instances cannot access to Internet [linuxbridge]
Message-ID: <1045b3e6-ac3a-ea98-08d1-039bbfe7078c@ucm.es>

I am having serious issues with the deployment of the OpenStack scenario, related to the Linux bridge. This is the scenario:

- Controller machine:
    - Management Interface `enp2s0`: 138.100.10.25.
- Compute machine:
    - Management Interface `enp2s0`: 138.100.10.26.
    - Provider Interface `enp0s20f0u4`: 138.100.10.27.

The OpenStack Train scenario has been successfully deployed on CentOS 8, choosing networking option 2 (self-service networks). To verify the functionality, an image has been uploaded, an OpenStack flavor and a security group have been created, and a couple of cirrOS instances have been launched for connection testing.

We have created a provider network following [this tutorial](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-provider.html) and a selfservice network following [this one](https://docs.openstack.org/newton/install-guide-rdo/launch-instance-networks-selfservice.html). The network scenario is the following:

As can be seen in the network topology, an external network, 138.100.10.0/21 (provider), and an internal selfservice network have been created, connected through a router whose interfaces are 138.100.10.198 and 192.168.1.1, both active.

Our problem is that our Linux bridge is not working as expected: the OpenStack cirrOS instances have no Internet access.
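From inside the cirrOS instances the symptom looks like this (a sketch of the kind of checks we ran; 8.8.8.8 is just an example external address, any Internet host behaves the same way):

```
# run inside a cirrOS instance
ping -c 3 192.168.1.1       # self-service gateway on the router   -> replies
ping -c 3 138.100.10.198    # provider-side address of the router  -> replies
ping -c 3 8.8.8.8           # any external / Internet address      -> no replies
```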
This is the controller `ip a` and `brctl show` command output:

```
[upm at modena ~]$ ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp2s0: mtu 1500 qdisc fq_codel master brq84f65ccb-c9 state UP group default qlen 1000
    link/ether 24:4b:fe:7c:78:b8 brd ff:ff:ff:ff:ff:ff
3: virbr0: mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:15:5b:02 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:15:5b:02 brd ff:ff:ff:ff:ff:ff
17: tapa467f377-b1 at if2: mtu 1500 qdisc noqueue master brq84f65ccb-c9 state UP group default qlen 1000
    link/ether d6:90:e8:fe:90:23 brd ff:ff:ff:ff:ff:ff link-netns qdhcp-84f65ccb-c945-437b-9013-9c71422bb10e
18: brq84f65ccb-c9: mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 24:4b:fe:7c:78:b8 brd ff:ff:ff:ff:ff:ff
    inet 138.100.10.25/21 brd 138.100.15.255 scope global brq84f65ccb-c9
       valid_lft forever preferred_lft forever
    inet6 fe80::6c31:4cff:fe2d:7820/64 scope link
       valid_lft forever preferred_lft forever
19: tap7a390547-5b at if2: mtu 1450 qdisc noqueue master brqd811bcfa-94 state UP group default qlen 1000
    link/ether 26:6d:a4:fc:73:51 brd ff:ff:ff:ff:ff:ff link-netns qdhcp-d811bcfa-945a-4633-9266-60ccafa28d86
20: vxlan-1: mtu 1450 qdisc noqu