From samuel at cassi.ba Tue Jan 1 02:54:11 2019 From: samuel at cassi.ba (Samuel Cassiba) Date: Mon, 31 Dec 2018 18:54:11 -0800 Subject: [chef] State of the Kitchen: 8th Edition Message-ID: It has been some time since the last State of the Kitchen. Since the release of Chef OpenStack 17 (Queens), we have been operating in a minimal churn mode to give people time to test/upgrade deployments and handle any regressions that emerge. There are three main areas that are still in progress upstream from Chef OpenStack, but can affect its cadence regardless. As a result, this update focuses more on those areas. Consider this more of a year-end review. ### Important Happenings * *fog-openstack*[^1] Beginning in August, we started receiving reports of breakage due to changes in fog-core. As a reactive measure, we implemented upper-level constraints in the client resource cookbook to maintain a consistent outcome. The fog-openstack library has continued to receive changes to further align with fog-core, and we are following its progress to find a good time to move ChefDK and Chef OpenStack to a post-1.0 release of fog-openstack. We are targeting the 18th release of Chef OpenStack, due to Keystone endpoint changes that need to happen. * *Sous Chefs*[^2] One of the biggest strengths of Chef and OpenStack is the collective outcome of their unique communities. Within the Chef ecosystem, the Sous Chefs group was formed in response to a need for the continued existence of Chef components, libraries, and utilities that need a long-term home. Across the globe, Sous Chefs work to keep some of the most heavily used cookbooks in existence, such as [apache2](https://supermarket.chef.io/cookbooks/apache2), [mysql](https://supermarket.chef.io/cookbooks/mysql) (and [mariadb](https://supermarket.chef.io/cookbooks/mariadb)!), [postgresql](https://supermarket.chef.io/cookbooks/postgresql), as well as [redisio](https://supermarket.chef.io/cookbooks/redisio), and many more.
Chef OpenStack depends on MariaDB, Apache, and their related cookbooks, for compatibility without operators needing to plumb those resources internally. * *poise-python*[^3] In early October, pip 18.1 was released, which made some additional waves in the ecosystem. Workarounds were devised and implemented to limit the fallout. Currently, the fix has been merged to poise-python's master, but cannot be released safely due to CI changes in the current workflow. There are limitations on what the Sous Chefs can reasonably maintain. The maintenance of poise is rather beyond that boundary, not to discount or disparage anyone involved. Anyone interested who has spare cycles over the holiday season might consider joining the conversation. ### Upcoming Changes * In Chef OpenStack 18... - The MariaDB version will default to 10.3, consistent with the default in the 2.0 version of the cookbook. Please plan accordingly. - Keystone's endpoint will be changing to drop the hardcoded API version. - The cloud primitives (client) cookbook is in the process of migrating from cookbook-openstackclient to cookbook-openstack-client (named openstack_client, to conform with current best practices in the Chef community). - Ubuntu will be upgraded from 16.04 to 18.04, and as such we will be gating against Bionic at that time. Plainly put, previous Chef OpenStack releases will not be moving to Bionic jobs, and will continue to work at best effort until they succumb to the detritus of time. ### Meetings Since the Summit, a few people have reached out through various means about Chef OpenStack and how to work together to improve the outcome. As a result, I would like to propose holding regular meetings for Chef OpenStack once more: setting aside a dedicated period where we can come together and talk about food, or other things.
We have the IRC channel, but IRC has proven less effective for a small group to dedicate time consistently, so I would prefer something more high-bandwidth for technical conversations, such as video with a publicized method for joining and viewing. I will follow up with an expanded proposal outside this update. ### On The Menu This would not be a State of the Kitchen without something to eat. My partner and I try to cook with recipes that are not overly complicated, but can be infinitely complex with just the right nudge. Sometimes we incorporate our own opinions into someone else's recipe to make it our own thing, and sometimes they're great just as they come. *Dat Dough, Doe* * 170g / 6oz grated mozzarella or Edam, or another mild cheese with similar melting consistency * 85g / 3oz almond meal/flour * 28g / 2 tbsp cream cheese or Neufchatel * 1 egg * pinch of salt to taste 1. Mix the shredded/grated cheese and the almond meal in a microwaveable bowl, then add the cream cheese. Microwave on high for 1 minute. Stir the mixture, then microwave on high for another 30 seconds. 2. Add the egg, salt, and any additional spices or flavorings, then mix or fold gently. 3. Shape using parchment paper into the desired outcome, be it flat like a disc or rounded like a boule. 4. Create vents to ensure that the finished product cooks evenly. 5. Fry, bake, broil or grill as desired. Lipids can be friends here. More commonly known as the "Fat Head" dough, these few ingredients can make food that tastes every bit like pizza, pasta, bread, even pão de queijo. Or perhaps cinnamon rolls or danishes, should one be so inclined. With these basic suggestions, one can apply their own opinions and set of requirements to create complex pieces of work, which can taste every bit like an art form and a science. See you in 2019!
Your humble pastry chef, -scas [^1]: https://github.com/fog/fog-openstack/issues/434 [^2]: https://sous-chefs.org/ [^3]: https://github.com/poise/poise-python/issues/133 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Tue Jan 1 04:15:39 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 1 Jan 2019 15:15:39 +1100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Thanks Ignazio, I'll have a look asap. Cheers On Sun., 30 Dec. 2018, 6:43 pm, Ignazio Cassano wrote: Hi Alfredo, > attached here there is my magnum.conf for the queens release > As you can see my heat sections are empty > When you create your cluster, I suggest checking the heat logs and magnum logs > for verifying what is wrong > Ignazio > > > > On Sun 30 Dec 2018 at 01:31 Alfredo De Luca < > alfredo.deluca at gmail.com> wrote: > >> So, creating a stack either manually or via the dashboard works fine. The problem >> seems to be that when I create a cluster (kubernetes/swarm) I get that >> error. >> Maybe the magnum conf isn't properly set up? >> In the heat section of the magnum.conf I have only >> *[heat_client]* >> *region_name = RegionOne* >> *endpoint_type = internalURL* >> >> Cheers >> >> >> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Yes. Next step is to check with ansible. >>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1. How did you run the last heat template? By dashboard? >>>> 2. Using the openstack command you can check if ansible configured the heat >>>> user/domain correctly >>>> >>>> >>>> It seems to be a problem related to >>>> heat user rights? >>>> >>>> On Fri 28 Dec 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> wrote: >>>> >>>>> Hi Ignazio.
The engine log doesn't say anything... except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is the last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. The problem is >>>>> that I installed openstack with openstack-ansible.... so I can't change >>>>> anything unless I re-run everything. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check that the heat user and domain are configured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> On Thu 27 Dec 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> wrote: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I'll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine?
>>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Sat 22 Dec 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hi Ignazio >>>>>>>>>> The problem is that it doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> In any case, during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/cloud-init output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> On Sat 22 Dec 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Ignazio >>>>>>>>>>>> What do you mean by master? You mean the k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with the keystone public endpoint port >>>>>>>>>>>>> (5000)? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri 21 Dec 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issues >>>>>>>>>>>>>> with a cinder type list error, it got past that issue but now I have another >>>>>>>>>>>>>> one.... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor what to check >>>>>>>>>>>>>> Any clue?
>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From flux.adam at gmail.com Tue Jan 1 12:34:44 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Tue, 1 Jan 2019 04:34:44 -0800 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: Message-ID: I'm just on my phone over the holidays, but it kinda looks like the code for this was just updated 12 days ago: https://review.openstack.org/#/c/619577/ If you're using that new code, I imagine it's possible there could be a bug that wasn't yet caught... If you're NOT using that code, maybe try it and see if it helps? I'm guessing it's related one way or another. If you come to the #openstack-lbaas channel once more people are around (later this week?), we can probably take a look. --Adam Harwell (rm_work) On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq wrote: > I have try creating load balancer with Heat. but always get this error : > > Resource CREATE failed: OctaviaClientException: resources.loadbalancer: > Validation failure: Missing project ID in request where one is required. > (HTTP 400) (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) > > OctaviaClientException: resources.loadbalancer: Validation failure: > Missing project ID in request where one is required. 
(HTTP 400) > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) > > I created 2 openstack environments: > > - Heat with Octavia (Octavia Heat Template: > http://paste.opensuse.org/view//33592182 ) > - Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template: > http://paste.opensuse.org/view//71741503) > > But I always get an error when creating with octavia: > > - Octavia Log (https://imgur.com/a/EsuWvla) > - LBaaS v2 (https://imgur.com/a/BqNGRPH) > > Is the Heat code broken for creating an Octavia Load Balancer? > > Best Regards, > Zufar Dhiyaulhaq > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liliueecg at gmail.com Tue Jan 1 17:28:05 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 1 Jan 2019 12:28:05 -0500 Subject: [Cyborg] no irc meeting this week Message-ID: Hi Team, Since it's a public holiday for folks in the US and Canada, we will not have the irc meeting this week. Enjoy the new year guys :) Thank you Regards Li Liu -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaopengju at cmss.chinamobile.com Tue Jan 1 07:39:38 2019 From: jiaopengju at cmss.chinamobile.com (Pengju Jiao) Date: Tue, 1 Jan 2019 15:39:38 +0800 (CST) Subject: [dev][karbor]No meeting today Message-ID: <2aff5c2b179265e-00007.Richmail.00004030743850933408@cmss.chinamobile.com> Hi Karbor Team, We will skip the karbor weekly meeting today due to holidays. The next meeting will be on 15 January. Thanks. Pengju Jiao -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Wed Jan 2 02:45:13 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 15:45:13 +1300 Subject: [Heat][Octavia] Is autoscaling feature missing? In-Reply-To: References: Message-ID: <26860a5b-38de-cb6d-301e-07fd7b332310@redhat.com> On 21/12/18 1:00 AM, Viktor Shulhin wrote: > Hi all, > > I am trying to create a Heat template with autoscaling and load balancing.
> I didn't find any similar Heat template examples. > Individually, loadbalancing and autoscaling work well, but loadbalancing > OS::Octavia::PoolMember can be added only manually. > Is there any way to use OS::Heat::AutoScalingGroup as server pool for > loadbalancing? Yes. The trick is that the thing you're using as the scaled unit in the Autoscaling group should not be just an OS::Nova::Server, but rather a Heat stack that contains both a server and an OS::Octavia::PoolMember. The example that Rabi linked to shows how to do it. Note that you need both of these files: https://github.com/openstack/heat-templates/blob/master/hot/autoscaling.yaml https://github.com/openstack/heat-templates/blob/master/hot/lb_server.yaml cheers, Zane. From zbitter at redhat.com Wed Jan 2 03:22:31 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 16:22:31 +1300 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: Message-ID: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> On 2/01/19 1:34 AM, Adam Harwell wrote: > I'm just on my phone over the holidays, but it kinda looks like the code > for this was just updated 12 days ago: > https://review.openstack.org/#/c/619577/ > > If you're using that new code, I imagine it's possible there could be a > bug that wasn't yet caught... If you're NOT using that code, maybe try > it and see if it helps? I'm guessing it's related one way or another. If > you come to the #openstack-lbaas channel once more people are around > (later this week?), we can probably take a look. That's only the example template (previously it was an example for LBaaSv2; now it's an example for Octavia); there's been no recent change to the code. >      --Adam Harwell (rm_work) > > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq > wrote: > > I have try creating load balancer with Heat. 
but always get this error : > > Resource CREATE failed: OctaviaClientException: > resources.loadbalancer: Validation failure: Missing project ID in > request where one is required. (HTTP 400) (Request-ID: > req-b45208e1-a200-47f9-8aad-b130c4c12272) > > OctaviaClientException: resources.loadbalancer: Validation failure: > Missing project ID in request where one is required. (HTTP 400) > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) What version of OpenStack are you using? The issue is that Heat is sending a "tenant_id" but Octavia wants a "project_id", which is the new name for the same thing. (I think you likely modified that template after trying it but before uploading it, because there is no "project_id" property in Heat's OS::Octavia::LoadBalancer resource type.) This bug has been reported and there is a patch up for review in Heat: https://storyboard.openstack.org/#!/story/2004650 There was a change to Octavia in Pike (https://review.openstack.org/455442) to add backwards compatibility, but it was either incomplete or the problem reoccurred and was fixed again in Rocky (https://review.openstack.org/569881). My guess is that it's likely broken in Pike and Queens. I'd certainly have expected Heat's gate tests to pick up the problem, and it's a bit of a mystery why they didn't. Perhaps we're not exercising the case where a project_id is required (using it at all is an admin-only feature, so that's not too surprising I guess; it's actually more surprising that there's a case where it's _required_). cheers, Zane. > I create 2 openstack environment : > > * Heat with Octavia (Octavia Heat Template : > http://paste.opensuse.org/view//33592182 ) > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : > http://paste.opensuse.org/view//71741503) > > But always error when creating with octavia : > > * Octavia Log (https://imgur.com/a/EsuWvla) > * LBaaS v2 (https://imgur.com/a/BqNGRPH) > > Are Heat code is broken to create Octavia Load Balancer? 
> > Best Regards, > Zufar Dhiyaulhaq > From yjf1970231893 at gmail.com Wed Jan 2 04:30:30 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Wed, 2 Jan 2019 12:30:30 +0800 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> Message-ID: Please confirm whether "auth_strategy" is set as 'keystone' in the configuration file. I remember that the value of "auth_strategy" is set as "noauth" in "/usr/share/octavia/octavia-dist.conf" by default if you install octavia by rpm. If the value was set as "noauth", you must manually specify "project_id" for octavia. Jeff Yang wrote on Wed, Jan 2, 2019 at 12:26 PM: > Please confirm whether "auth_strategy" is set as 'keystone' in > the configuration file. I remember that the value of "auth_strategy" is set as > "noauth" in "/usr/share/octavia/octavia-dist.conf" by default if you install > octavia by rpm. If the value was set as "noauth", you must manually specify > "project_id" for octavia. > > Zane Bitter wrote on Wed, Jan 2, 2019 at 11:23 AM: > >> On 2/01/19 1:34 AM, Adam Harwell wrote: >> > I'm just on my phone over the holidays, but it kinda looks like the >> code >> > for this was just updated 12 days ago: >> > https://review.openstack.org/#/c/619577/ >> > >> > If you're using that new code, I imagine it's possible there could be a >> > bug that wasn't yet caught... If you're NOT using that code, maybe try >> > it and see if it helps? I'm guessing it's related one way or another. >> If >> > you come to the #openstack-lbaas channel once more people are around >> > (later this week?), we can probably take a look. >> >> That's only the example template (previously it was an example for >> LBaaSv2; now it's an example for Octavia); there's been no recent change >> to the code. >> >> > --Adam Harwell (rm_work) >> > >> > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq > > > wrote: >> > >> > I have tried creating a load balancer with Heat.
but always get this >> error : >> > >> > Resource CREATE failed: OctaviaClientException: >> > resources.loadbalancer: Validation failure: Missing project ID in >> > request where one is required. (HTTP 400) (Request-ID: >> > req-b45208e1-a200-47f9-8aad-b130c4c12272) >> > >> > OctaviaClientException: resources.loadbalancer: Validation failure: >> > Missing project ID in request where one is required. (HTTP 400) >> > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) >> >> What version of OpenStack are you using? >> >> The issue is that Heat is sending a "tenant_id" but Octavia wants a >> "project_id", which is the new name for the same thing. (I think you >> likely modified that template after trying it but before uploading it, >> because there is no "project_id" property in Heat's >> OS::Octavia::LoadBalancer resource type.) >> >> This bug has been reported and there is a patch up for review in Heat: >> https://storyboard.openstack.org/#!/story/2004650 >> >> There was a change to Octavia in Pike >> (https://review.openstack.org/455442) to add backwards compatibility, >> but it was either incomplete or the problem reoccurred and was fixed >> again in Rocky (https://review.openstack.org/569881). My guess is that >> it's likely broken in Pike and Queens. >> >> I'd certainly have expected Heat's gate tests to pick up the problem, >> and it's a bit of a mystery why they didn't. Perhaps we're not >> exercising the case where a project_id is required (using it at all is >> an admin-only feature, so that's not too surprising I guess; it's >> actually more surprising that there's a case where it's _required_). >> >> cheers, >> Zane. 
>> >> > I create 2 openstack environment : >> > >> > * Heat with Octavia (Octavia Heat Template : >> > http://paste.opensuse.org/view//33592182 ) >> > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : >> > http://paste.opensuse.org/view//71741503) >> > >> > But always error when creating with octavia : >> > >> > * Octavia Log (https://imgur.com/a/EsuWvla) >> > * LBaaS v2 (https://imgur.com/a/BqNGRPH) >> > >> > Are Heat code is broken to create Octavia Load Balancer? >> > >> > Best Regards, >> > Zufar Dhiyaulhaq >> > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramishra at redhat.com Wed Jan 2 05:49:25 2019 From: ramishra at redhat.com (Rabi Mishra) Date: Wed, 2 Jan 2019 11:19:25 +0530 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> Message-ID: On Wed, Jan 2, 2019 at 10:03 AM Jeff Yang wrote: > Please confirm whether "auth_strategy" is set as ''keystone" in > configure file. I remember that the value of "auth_strategy" is set as > "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install > octavia by rpm. If the value was set as "noauth", you must manually specify > "project_id" for octavia. > > Yeah, that could be the reason[1] (when deploying with puppet puppet-octavia sets it to keystone[2]), as the error is coming from octavia[3], when you don't specify a project_id in request and the context does not have it either. [1] https://github.com/rdo-packages/octavia-distgit/blob/rpm-master/octavia-dist.conf#L3 [2] https://github.com/openstack/puppet-octavia/blob/master/manifests/api.pp#L65 [3] https://github.com/openstack/octavia/blob/master/octavia/api/v2/controllers/load_balancer.py#L251 Jeff Yang 于2019年1月2日周三 下午12:26写道: > Please confirm whether "auth_strategy" is set as ''keystone" in >> configure file. 
I remember that the value of "auth_strategy" is set as >> "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install >> octavia by rpm. If the value was set as "noauth", you must manually specify >> "project_id" for octavia. >> >> Zane Bitter 于2019年1月2日周三 上午11:23写道: >> >>> On 2/01/19 1:34 AM, Adam Harwell wrote: >>> > I'm just on my phone over the holidays, but it kinda looks like the >>> code >>> > for this was just updated 12 days ago: >>> > https://review.openstack.org/#/c/619577/ >>> > >>> > If you're using that new code, I imagine it's possible there could be >>> a >>> > bug that wasn't yet caught... If you're NOT using that code, maybe try >>> > it and see if it helps? I'm guessing it's related one way or another. >>> If >>> > you come to the #openstack-lbaas channel once more people are around >>> > (later this week?), we can probably take a look. >>> >>> That's only the example template (previously it was an example for >>> LBaaSv2; now it's an example for Octavia); there's been no recent change >>> to the code. >>> >>> > --Adam Harwell (rm_work) >>> > >>> > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq < >>> zufardhiyaulhaq at gmail.com >>> > > wrote: >>> > >>> > I have try creating load balancer with Heat. but always get this >>> error : >>> > >>> > Resource CREATE failed: OctaviaClientException: >>> > resources.loadbalancer: Validation failure: Missing project ID in >>> > request where one is required. (HTTP 400) (Request-ID: >>> > req-b45208e1-a200-47f9-8aad-b130c4c12272) >>> > >>> > OctaviaClientException: resources.loadbalancer: Validation failure: >>> > Missing project ID in request where one is required. (HTTP 400) >>> > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) >>> >>> What version of OpenStack are you using? >>> >>> The issue is that Heat is sending a "tenant_id" but Octavia wants a >>> "project_id", which is the new name for the same thing. 
(I think you >>> likely modified that template after trying it but before uploading it, >>> because there is no "project_id" property in Heat's >>> OS::Octavia::LoadBalancer resource type.) >>> >>> This bug has been reported and there is a patch up for review in Heat: >>> https://storyboard.openstack.org/#!/story/2004650 >>> >>> There was a change to Octavia in Pike >>> (https://review.openstack.org/455442) to add backwards compatibility, >>> but it was either incomplete or the problem reoccurred and was fixed >>> again in Rocky (https://review.openstack.org/569881). My guess is that >>> it's likely broken in Pike and Queens. >>> >>> I'd certainly have expected Heat's gate tests to pick up the problem, >>> and it's a bit of a mystery why they didn't. Perhaps we're not >>> exercising the case where a project_id is required (using it at all is >>> an admin-only feature, so that's not too surprising I guess; it's >>> actually more surprising that there's a case where it's _required_). >>> >>> cheers, >>> Zane. >>> >>> > I create 2 openstack environment : >>> > >>> > * Heat with Octavia (Octavia Heat Template : >>> > http://paste.opensuse.org/view//33592182 ) >>> > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : >>> > http://paste.opensuse.org/view//71741503) >>> > >>> > But always error when creating with octavia : >>> > >>> > * Octavia Log (https://imgur.com/a/EsuWvla) >>> > * LBaaS v2 (https://imgur.com/a/BqNGRPH) >>> > >>> > Are Heat code is broken to create Octavia Load Balancer? >>> > >>> > Best Regards, >>> > Zufar Dhiyaulhaq >>> > >>> >>> >>> -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zbitter at redhat.com Wed Jan 2 06:25:37 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 19:25:37 +1300 Subject: queens heat db deadlock In-Reply-To: <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com> <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com> <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com> <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> Message-ID: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> On 21/12/18 2:07 AM, Jay Pipes wrote: > On 12/20/2018 02:01 AM, Zane Bitter wrote: >> On 19/12/18 6:49 AM, Jay Pipes wrote: >>> On 12/18/2018 11:06 AM, Mike Bayer wrote: >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano >>>> wrote: >>>>> >>>>> Yes, I  tried on yesterday and this workaround solved. >>>>> Thanks >>>>> Ignazio >>>> >>>> OK, so that means this "deadlock" is not really a deadlock but it is a >>>> write-conflict between two Galera masters.      I have a long term >>>> goal to being relaxing this common requirement that Openstack apps >>>> only refer to one Galera master at a time.    If this is a particular >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction >>>> retry decorator for this operation?  This is the standard approach for >>>> other applications that are subject to galera multi-master writeset >>>> conflicts such as Neutron. >> >> The weird thing about this issue is that we actually have a retry >> decorator on the operation that I assume is the problem. It was added >> in Queens and largely fixed this issue in the gate: >> >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py >> >>> Correct. >>> >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big >>> cause of the aforementioned "deadlocks". >> >> AFAIK, no. In fact we were quite careful to design stuff that is >> expected to be subject to write contention to use UPDATE ... 
WHERE (by >> doing query().filter_by().update() in sqlalchemy), but it turned out >> to be those very statements that were most prone to causing deadlocks >> in the gate (i.e. we added retry decorators in those two places and >> the failures went away), according to me in the commit message for >> that patch: https://review.openstack.org/521170 >> >> Are we Doing It Wrong(TM)? > > No, it looks to me like you're doing things correctly. The OP mentioned > that this only happens when deleting a Magnum cluster -- and that it > doesn't occur in normal Heat template usage. > > I wonder (as I really don't know anything about Magnum, unfortunately), > is there something different about the Magnum cluster resource handling > in Heat that might be causing the wonkiness? There's no special-casing for Magnum within Heat. It's likely to be just that there are a lot of resources in a Magnum cluster - or more specifically, a lot of edges in the resource graph, which leads to more write contention (and, in a multi-master setup, more write conflicts). I'd assume that any similarly-complex template would have the same issues, and that Ignazio just didn't have anything else that complex to hand. That gives me an idea, though. I wonder if this would help: https://review.openstack.org/627914 Ignazio, could you possibly test with that ^ patch in multi-master mode to see if it resolves the issue? cheers, Zane. From ignaziocassano at gmail.com Wed Jan 2 07:55:10 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 2 Jan 2019 08:55:10 +0100 Subject: queens heat db deadlock In-Reply-To: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com> <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com> <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com> <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> Message-ID: Hello, I'll try as soon as possible and I will send you a response.
Ignazio Il giorno mer 2 gen 2019 alle ore 07:28 Zane Bitter ha scritto: > On 21/12/18 2:07 AM, Jay Pipes wrote: > > On 12/20/2018 02:01 AM, Zane Bitter wrote: > >> On 19/12/18 6:49 AM, Jay Pipes wrote: > >>> On 12/18/2018 11:06 AM, Mike Bayer wrote: > >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano > >>>> wrote: > >>>>> > >>>>> Yes, I tried on yesterday and this workaround solved. > >>>>> Thanks > >>>>> Ignazio > >>>> > >>>> OK, so that means this "deadlock" is not really a deadlock but it is a > >>>> write-conflict between two Galera masters. I have a long term > >>>> goal to being relaxing this common requirement that Openstack apps > >>>> only refer to one Galera master at a time. If this is a particular > >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction > >>>> retry decorator for this operation? This is the standard approach for > >>>> other applications that are subject to galera multi-master writeset > >>>> conflicts such as Neutron. > >> > >> The weird thing about this issue is that we actually have a retry > >> decorator on the operation that I assume is the problem. It was added > >> in Queens and largely fixed this issue in the gate: > >> > >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py > >> > >>> Correct. > >>> > >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big > >>> cause of the aforementioned "deadlocks". > >> > >> AFAIK, no. In fact we were quite careful to design stuff that is > >> expected to be subject to write contention to use UPDATE ... WHERE (by > >> doing query().filter_by().update() in sqlalchemy), but it turned out > >> to be those very statements that were most prone to causing deadlocks > >> in the gate (i.e. we added retry decorators in those two places and > >> the failures went away), according to me in the commit message for > >> that patch: https://review.openstack.org/521170 > >> > >> Are we Doing It Wrong(TM)? 
> > > > No, it looks to me like you're doing things correctly. The OP mentioned > > that this only happens when deleting a Magnum cluster -- and that it > > doesn't occur in normal Heat template usage. > > > > I wonder (as I really don't know anything about Magnum, unfortunately), > > is there something different about the Magnum cluster resource handling > > in Heat that might be causing the wonkiness? > > There's no special-casing for Magnum within Heat. It's likely to be just > that there's a lot of resources in a Magnum cluster - or more > specifically, a lot of edges in the resource graph, which leads to more > write contention (and, in a multi-master setup, more write conflicts). > I'd assume that any similarly-complex template would have the same > issues, and that Ignazio just didn't have anything else that complex to > hand. > > That gives me an idea, though. I wonder if this would help: > > https://review.openstack.org/627914 > > Ignazio, could you possibly test with that ^ patch in multi-master mode > to see if it resolves the issue? > > cheers, > Zane. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhengzhenyulixi at gmail.com Wed Jan 2 08:57:28 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Wed, 2 Jan 2019 16:57:28 +0800 Subject: [Nova] Suggestion needed for detach-boot-volume design Message-ID: Hi Nova, Happy New Year! I've been working on detach-boot-volume[1] in Stein, we got the initial design merged and while implementing we have met some new problems, and now I'm amending the spec to cover these new problems[2]. The thing I want to discuss for wider opinion is that in the initial design, we planned to support detaching the root volume for only STOPPED and SHELVED/SHELVE_OFFLOADED instances. But then we found out that we already allow detaching volumes for RESIZED/PAUSED/SOFT_DELETED instances as well. Should we allow detaching the root volume for instances in these statuses too?
Cases like RESIZE could be complicated for the revert resize action, and it also seems unnecessary. Thoughts? BR, [1] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/volume-backed-server-rebuild.html [2] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/volume-backed-server-rebuild.html Kevin Zheng From ltoscano at redhat.com Wed Jan 2 10:13:54 2019 From: ltoscano at redhat.com (Luigi Toscano) Date: Wed, 02 Jan 2019 11:13:54 +0100 Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints Message-ID: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> Hi all, I'm working on adding support for APIv2 to the Sahara tempest plugin. If I understand it correctly, there are two main steps: 1) Make sure that the tempest client works with APIv2 (and doesn't regress with APIv1.1). This mainly means implementing the tempest client for Sahara APIv2, which should not be too complicated. On the other hand, we hit an issue with the v1.1 client in an APIv2 environment. A change associated with API v2 is the usage of an unversioned endpoint for the deployment (see https://review.openstack.org/#/c/622330/ , without the /v1.1/$(tenant_id) suffix) which should magically work with both API variants, but it seems that the current tempest client fails in this case: http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544 Does anyone know if this is an issue with the code of the tempest tests (which should maybe have some logic to build the expected endpoint when it's unversioned, like saharaclient does) or somewhere else? 2) Fix the tests to support APIv2. Should I duplicate the tests for APIv1.1 and APIv2? Other projects which support different APIs seem to do this.
But can I freely move the existing tests under a subdirectory (sahara_tempest_plugins/tests/api/ -> sahara_tempest_plugins/tests/api/v1/), or are there any compatibility concerns? Are the test IDs enough to ensure that everything works as before? And what about CLI tests currently under sahara_tempest_plugin/tests/cli/ ? They support both API versions through a configuration flag. Should they be duplicated as well? Ciao (and happy new year if you have a new one in your calendar!) -- Luigi From dtantsur at redhat.com Wed Jan 2 11:18:40 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 2 Jan 2019 12:18:40 +0100 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat Message-ID: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Hi all and happy new year :) As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin has to run tests on all supported branches. Currently it amounts to 16 (!) voting devstack jobs. With each of them having some small probability of a random failure, it is impossible to land anything without at least one recheck, usually more. The bad news is, we only run the master API tests job, and these tests are changed more often than the others. We already had a minor stable branch breakage because of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just spotted a missing master multinode job, which is defined but does not run for some reason :( Here is my proposal to deal with gate bloat on ironic-tempest-plugin: 1. Do not run CI jobs at all for unsupported branches and branches in extended maintenance. For Ocata this has already been done in [2]. 2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus remove them from the gate queue). I have a gut feeling that a change that breaks N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to have N-2 voting. 3. Make the discovery and the multinode jobs from all stable branches non-voting.
These jobs cover the tests that get changed very infrequently (if ever). These are also the jobs with the highest random failure rate. 4. Add the API tests, voting for Queens to master, non-voting for Pike (as proposed above). This should leave us with 20 jobs, but with only 11 of them voting. Which is still a lot, but probably manageable. The corresponding change is [3], please comment here or there. Dmitry [1] https://review.openstack.org/622177 [2] https://review.openstack.org/621537 [3] https://review.openstack.org/627955 From ignaziocassano at gmail.com Wed Jan 2 11:27:13 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 2 Jan 2019 12:27:13 +0100 Subject: queens heat db deadlock In-Reply-To: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com> <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com> <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com> <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> Message-ID: Hello Zane, we applied the patch and modified our haproxy; unfortunately it does not solve the db deadlock issue. Ignazio & Gianpiero On Wed, 2 Jan 2019 at 07:28 Zane Bitter wrote: > On 21/12/18 2:07 AM, Jay Pipes wrote: > > On 12/20/2018 02:01 AM, Zane Bitter wrote: > >> On 19/12/18 6:49 AM, Jay Pipes wrote: > >>> On 12/18/2018 11:06 AM, Mike Bayer wrote: > >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano > >>>> wrote: > >>>>> > >>>>> Yes, I tried on yesterday and this workaround solved. > >>>>> Thanks > >>>>> Ignazio > >>>> > >>>> OK, so that means this "deadlock" is not really a deadlock but it is a > >>>> write-conflict between two Galera masters. I have a long term > >>>> goal to being relaxing this common requirement that Openstack apps > >>>> only refer to one Galera master at a time.
If this is a particular > >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction > >>>> retry decorator for this operation? This is the standard approach for > >>>> other applications that are subject to galera multi-master writeset > >>>> conflicts such as Neutron. > >> > >> The weird thing about this issue is that we actually have a retry > >> decorator on the operation that I assume is the problem. It was added > >> in Queens and largely fixed this issue in the gate: > >> > >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py > >> > >>> Correct. > >>> > >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big > >>> cause of the aforementioned "deadlocks". > >> > >> AFAIK, no. In fact we were quite careful to design stuff that is > >> expected to be subject to write contention to use UPDATE ... WHERE (by > >> doing query().filter_by().update() in sqlalchemy), but it turned out > >> to be those very statements that were most prone to causing deadlocks > >> in the gate (i.e. we added retry decorators in those two places and > >> the failures went away), according to me in the commit message for > >> that patch: https://review.openstack.org/521170 > >> > >> Are we Doing It Wrong(TM)? > > > > No, it looks to me like you're doing things correctly. The OP mentioned > > that this only happens when deleting a Magnum cluster -- and that it > > doesn't occur in normal Heat template usage. > > > > I wonder (as I really don't know anything about Magnum, unfortunately), > > is there something different about the Magnum cluster resource handling > > in Heat that might be causing the wonkiness? > > There's no special-casing for Magnum within Heat. It's likely to be just > that there's a lot of resources in a Magnum cluster - or more > specifically, a lot of edges in the resource graph, which leads to more > write contention (and, in a multi-master setup, more write conflicts). 
> I'd assume that any similarly-complex template would have the same > issues, and that Ignazio just didn't have anything else that complex to > hand. > > That gives me an idea, though. I wonder if this would help: > > https://review.openstack.org/627914 > > Ignazio, could you possibly test with that ^ patch in multi-master mode > to see if it resolves the issue? > > cheers, > Zane. > > From dtantsur at redhat.com Wed Jan 2 13:08:00 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 2 Jan 2019 14:08:00 +0100 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <9487fc46-6957-82b2-49a7-3fc8cca53842@redhat.com> On 1/2/19 12:18 PM, Dmitry Tantsur wrote: > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 (!) > voting devstack jobs. With each of them have some small probability of a random > failure, it is impossible to land anything without at least one recheck, usually > more. > > The bad news is, we only run master API tests job, and these tests are changed > more often that the other. We already had a minor stable branch breakage because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just > spotted a missing master multinode job, which is defined but does not run for > some reason :( Better news: the API tests did not have a separate job before Rocky, so we only need to add Rocky. However, we'll get to 4 jobs in the future. The multinode job is missing because it was renamed on master, and apparently Zuul does not report it Oo > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1.
Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus > remove them from the gate queue). I have a gut feeling that a change that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to > have N-2 voting. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). Only Rocky here for now. > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. > > The corresponding change is [3], please comment here or there. > > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 From eblock at nde.ag Wed Jan 2 13:39:23 2019 From: eblock at nde.ag (Eugen Block) Date: Wed, 02 Jan 2019 13:39:23 +0000 Subject: [Openstack] [Nova][Glance] Nova imports flat images from base file despite ceph backend In-Reply-To: References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> <20181004124417.Horde.py2wEG4JmO1oFXbjX5u1uw3@webmail.nde.ag> <20181009080101.Horde.---iO9LIrKkWvTsNJwWk_Mj@webmail.nde.ag> <679352a8-c082-d851-d8a5-ea7b2348b7d3@gmail.com> <20181012215027.Horde.t5xm_KfkoEE4YEnrewHQZPG@webmail.nde.ag> <9df7167b-ea3b-51d6-9fad-7c9298caa7be@gmail.com> <72242CC2-621E-4037-A8F0-8AE56C4A6F36@italy1.com> Message-ID: <20190102133923.Horde.CY4bM26RNgf_UaNTjY-WZYe@webmail.nde.ag> Hello and a happy new year! 
I need to reopen this thread because there are still things going on that I don't fully understand. I changed the disk_format of all the images that were affected by my mistake a couple of weeks ago. I deleted the base files in /var/lib/nova/instances/_base and launched new instances, leading to expected cow clones: ---cut here--- control:~ # openstack image show 5f486361-5468-42a0-9993-9cdda3450b0e | grep disk_format | disk_format | raw| control:~ # rbd children images/5f486361-5468-42a0-9993-9cdda3450b0e at snap images/029bfb90-cbab-4a7c-a51c-27807ab41ce7_disk images/ccd4498f-c0a0-480a-ab4b-6224d63e78fa_disk images/d5379918-40a0-4119-9e47-773c0ab8c0f3_disk ---cut here--- These are only three clones, but I have 16 instances based on the same image: ---cut here--- +--------------------------------------+ | uuid | +--------------------------------------+ | 87dfbb4f-784d-4390-a0c9-d3162a56ea7e | | bb56995d-d3ea-464d-94fe-382880bf2a92 | | d4fde2fe-140e-4904-90e8-996cc418302d | | 7bd4fdc0-d6b7-47cb-adb3-324abff6a0e5 | | bef8e4fe-b2f4-44f5-a144-36ced37007ac | | bcc005fc-fa12-4dd9-b531-cbccfb7c426a | ###| ccd4498f-c0a0-480a-ab4b-6224d63e78fa | ###| 029bfb90-cbab-4a7c-a51c-27807ab41ce7 | ###| d5379918-40a0-4119-9e47-773c0ab8c0f3 | | abd2c0bc-2e66-4e2a-9f4c-f006937c7b27 | | e07c0f5e-403e-4ad3-b04f-fb306e99869e | | edec4aa0-9459-4444-bbb8-03097bccba4c | | ca4449f5-921d-4ce5-8559-fc719cfbc845 | | 8bff1799-391e-4620-9b11-134962084b84 | | 9d927ac4-d7d3-45cb-8628-b267a6d0e668 | | 54cfeede-44f9-49bd-9f49-ef4dcacff953 | +--------------------------------------+ ---cut here--- Those three cow clones were created after adjusting the disk_format, some of the rest were created before the changes, the others have been created after the changes. Can anyone explain why a new base image is created? 
I downloaded the glance image and double-checked the file format, I also exported the snapshot which should be used to create clones, there's only one discrepancy (disk size), but I don't think this could be relevant: ---cut here--- control:~ # qemu-img info /var/lib/glance/images/image-snap.img image: /var/lib/glance/images/image-snap.img file format: raw virtual size: 5.0G (5368709120 bytes) disk size: 2.5G control:~ # qemu-img info /var/lib/glance/images/image.img image: /var/lib/glance/images/image.img file format: raw virtual size: 5.0G (5368709120 bytes) disk size: 5.0G control:~ # md5sum /var/lib/glance/images/image.img b9b28dd300a6fbb1de2081f1cb8a07d0 /var/lib/glance/images/image.img control:~ # md5sum /var/lib/glance/images/image-snap.img b9b28dd300a6fbb1de2081f1cb8a07d0 /var/lib/glance/images/image-snap.img ---cut here--- I also tried to find something with the glance-cache-manage cli, but I can't even connect to that service, I guess it has never been used. control:~ # glance-cache-manage list-cached Failed to show cached images. Got error: Connect error/bad request to Auth service at URL http://control1.cloud.hh.nde.ag:5000/v3/tokens. I don't know how to fix that yet, but could this be a way to resolve that issue? Regards, Eugen Zitat von melanie witt : > On Fri, 12 Oct 2018 20:06:04 -0700, Remo Mattei wrote: >> I do not have it handy now but you can verify that the image is >> indeed raw or qcow2 >> >> As soon as I get home I will dig the command and pass it on. I have >> seen where images have extensions thinking it is raw and it is not. 
> > You could try 'qemu-img info ' and get output like this, > notice "file format": > > $ qemu-img info test.vmdk > (VMDK) image open: flags=0x2 filename=test.vmdk > image: test.vmdk > file format: vmdk > virtual size: 20M (20971520 bytes) > disk size: 17M > > [1] https://en.wikibooks.org/wiki/QEMU/Images#Getting_information > > -melanie > > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From bence.romsics at gmail.com Wed Jan 2 14:20:10 2019 From: bence.romsics at gmail.com (Bence Romsics) Date: Wed, 2 Jan 2019 15:20:10 +0100 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <2a13e4e3-2add-76ef-9fd9-018dfc493cdb@gmail.com> References: <1545231821.28650.2@smtp.office365.com> <2a13e4e3-2add-76ef-9fd9-018dfc493cdb@gmail.com> Message-ID: Hi Matt, Sorry for the slow response over the winter holidays. First I have to correct myself: > On 12/20/2018 8:02 AM, Bence Romsics wrote: > > ... in neutron we > > don't directly control the list of extensions loaded. Instead what we > > control (through configuration) is the list of service plugins loaded. > > The 'resource_request' extension is implemented by the 'qos' service > > plugin. But the 'qos' service plugin implements other API extensions > > and features too. A cloud admin may want to use these other features > > of the 'qos' service plugin, but not the guaranteed minimum bandwidth. This is the default behavior, but it can be overcome by a small patch like this: https://review.openstack.org/627978 With a patch like that we could control loading the port-resource-request extension (and by that the presence of the resource_request attribute) on its own (independently of all other extensions implemented by the qos service plugin). 
On Thu, Dec 20, 2018 at 6:58 PM Matt Riedemann wrote: > Can't the resource_request part of this be controlled via policy rules > or something similar? Is this question still relevant given the above? Even if we could control the resource-request attribute via policy rules wouldn't that be just as undiscoverable as a config-only feature flag? > Barring that, are policy rules something that > could be used for deployers could decide which users can use this > feature while it's being rolled out? Using a standalone neutron extension (controlled on its own by a neutron config option) as a feature flag (and keeping it not loaded by default until the feature is past experimental) would lessen the number of cloud deployments (probably to zero) where the experimental feature is unintentionally exposed. On the other hand - now that Jay called my attention to the undiscoverability of feature flags - I realize that this approach is not enough to distinguish the complete and experimental versions of the feature, given the experimental version was exposed intentionally. Cheers, Bence From jaypipes at gmail.com Wed Jan 2 14:47:01 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 2 Jan 2019 09:47:01 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> Message-ID: <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> On 12/21/2018 03:45 AM, Rui Zang wrote: > It was advised in today's nova team meeting to bring this up by email. > > There has been some discussion on the how to track persistent memory > resource in placement on the spec review [1]. > > Background: persistent memory (PMEM) needs to be partitioned to > namespaces to be consumed by VMs. Due to fragmentation issues, the spec > proposed to use fixed sized PMEM namespaces. 
The spec proposed to use fixed sized namespaces that are controllable by the deployer, not fixed-size-for-everyone :) Just want to make sure we're being clear here. > The spec proposed way to represent PMEM namespaces is to use one > Resource Provider (RP) for one PMEM namespace. A new standard Resource > Class (RC) -- 'VPMEM_GB' is introduced to classify PMEM namespace RPs. > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > 'total' and 'step_size' are all set to the size of the PMEM namespace. > In this way, it is guaranteed each RP will be consumed as a whole at one > time. > > An alternative was brought out in the review. Different Custom Resource > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM namespaces of > different sizes. The size of the PMEM namespace is encoded in the name > of the custom Resource Class. And multiple PMEM namespaces of the same > size  (say 128G) can be represented by one RP of the same Not represented by "one RP of the same CUSTOM_PMEM_128G". There would be only one resource provider: the compute node itself. It would have an inventory of, say, 8 CUSTOM_PMEM_128G resources. > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit' and 'total' > as the total number of the PMEM namespaces of the certain size. And the > values of 'min_unit' and 'step_size' could be set to 1. No, the max_unit, min_unit, step_size and total would refer to the number of *PMEM namespaces*, not the amount of GB of memory represented by those namespaces. Therefore, min_unit and step_size would be 1, max_unit would be the total number of *namespaces* that could simultaneously be attached to a single consumer (VM), and total would be 8 in our example where the compute node had 8 of these pre-defined 128G PMEM namespaces.
Custom resource classes were invented for precisely this kind of use case. The resource being represented is a namespace. The resource is not "a Gibibyte of persistent memory". Best, -jay > Regards, > Zang, Rui > > > [1] https://review.openstack.org/#/c/601596/ > From openstack at nemebean.com Wed Jan 2 16:23:14 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 10:23:14 -0600 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: > Dear all, > > I have a problem when use 'notification listener' oslo-message for HA > Openstack. > > It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to > connect to AMQP server on 172.16.4.125:5672 >  after inf tries: Exchange.declare: (406) > PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in > vhost '/': received 'false' but current is 'true''. > > How can i fix this?. I think settings default in my program set > 'durable' is False so it can't listen RabbitMQ Openstack? It probably depends on which rabbit client library you're using to listen for notifications. Presumably there should be some way to configure it to set durable to True. I guess the other option is to disable durable queues in the Nova config, but then you lose the contents of any queues when Rabbit gets restarted. It would be better to figure out how to make the consuming application configure durable queues instead. 
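To make the 406 failure above concrete: RabbitMQ rejects a re-declaration of an existing exchange whose arguments (here, `durable`) do not match what the broker already has. The following is a toy, pure-Python model of that broker-side precondition check (an illustration only, not RabbitMQ or oslo.messaging code; the class and names are made up):

```python
# Simplified model of RabbitMQ's exchange.declare precondition check.
# A real broker compares attributes such as durable/auto_delete/type of
# an existing exchange against a re-declaration and rejects mismatches
# with 406 PRECONDITION_FAILED.

class PreconditionFailed(Exception):
    """Stands in for AMQP 406 PRECONDITION_FAILED."""

class Broker:
    def __init__(self):
        self.exchanges = {}  # exchange name -> attributes

    def exchange_declare(self, name, durable):
        existing = self.exchanges.get(name)
        if existing is None:
            # First declaration wins and fixes the attributes.
            self.exchanges[name] = {"durable": durable}
        elif existing["durable"] != durable:
            raise PreconditionFailed(
                "inequivalent arg 'durable' for exchange %r: "
                "received %r but current is %r"
                % (name, durable, existing["durable"]))

broker = Broker()
broker.exchange_declare("nova", durable=True)       # nova, amqp_durable_queues=True
try:
    broker.exchange_declare("nova", durable=False)  # listener using defaults
except PreconditionFailed as exc:
    print(exc)
broker.exchange_declare("nova", durable=True)       # matching declaration is fine
```

The fix Ben describes maps onto the last line: the consuming application must declare the exchange with the same durable flag it already has on the broker.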
> > This is my nova.conf > > http://paste.openstack.org/show/738813/ > > > And section [oslo_messaging_rabbit] > > [oslo_messaging_rabbit] > rabbit_ha_queues = true > rabbit_retry_interval = 1 > rabbit_retry_backoff = 2 > amqp_durable_queues= true > > > > *Nguyễn Bá Thành* > > *Mobile*:    0128 748 0391 > > *Email*: bathanhtlu at gmail.com > From hongbin.lu at huawei.com Wed Jan 2 16:51:00 2019 From: hongbin.lu at huawei.com (Hongbin Lu) Date: Wed, 2 Jan 2019 16:51:00 +0000 Subject: [neutron] bug deputy report (Dec 24 - Dec 30) Message-ID: <0957CD8F4B55C0418161614FEC580D6B308D68A4@yyzeml705-chm.china.huawei.com> Hi all, Below is the bug deputy report for last week. Since it was the holiday week, there are not many reported bugs. REFs: * https://bugs.launchpad.net/neutron/+bug/1809628 [RFE] Enable driver field for the api of service_providers * https://bugs.launchpad.net/neutron/+bug/1809878 [RFE] Move sanity-checks to neutron-status CLI tool Incomplete: * https://bugs.launchpad.net/neutron/+bug/1809907 Unstable ping during zuul testing Best regards, Hongbin From doug at doughellmann.com Wed Jan 2 17:05:05 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 12:05:05 -0500 Subject: [tc] agenda for Technical Committee Meeting 3 Jan 2019 @ 1400 UTC Message-ID: TC Members, Our next meeting will be this Thursday, 3 Jan at 1400 UTC in #openstack-tc. This email contains the agenda for the meeting, based on the content of the wiki [0]. If you will not be able to attend, please include your name in the "Apologies for Absence" section of the wiki page [0]. [0] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee We have several items of old business to wrap up: * technical vision for openstack We have approved the first draft of the vision [1], and have one update proposed by gmann [2]. What are the next steps for this change to the vision?
Are we ready to remove this topic from the tracker now, and treat further updates as individual work items? [1] https://governance.openstack.org/tc/reference/technical-vision.html [2] https://review.openstack.org/#/c/621516/ * next step in TC vision/defining the role of the TC We also approved a document explaining the role of the TC [3]. Is there more work to do here, or are we ready to remove this from the tracker now? [3] https://governance.openstack.org/tc/reference/role-of-the-tc.html * keeping up with python 3 releases We have approved all of the patches for documenting the policy and for selecting the versions to be covered in Stein. What are the next steps for ensuring that any implementation work is handled? * Reviewing TC Office Hour Times and Locations The most recent mailing list thread [4] was resolved with no changes to the number of office hours. Does anyone want to propose different times, or are we happy with the current schedule? [4] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000542.html and we also have a couple of items of new business to take up: * Train cycle goals selection update Checking with lbragstad and evrardjp for any updates to share with us this month. * health check status for stein How is it going contacting the PTLs for the health check for stein? Does anyone have anything to raise based on what they have learned in their conversations? -- Doug From doug at doughellmann.com Wed Jan 2 17:30:39 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 12:30:39 -0500 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: Ben Nemec writes: > On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: >> Dear all, >> >> I have a problem when use 'notification listener' oslo-message for HA >> Openstack. 
>> >> It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to >> connect to AMQP server on 172.16.4.125:5672 >>  after inf tries: Exchange.declare: (406) >> PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in >> vhost '/': received 'false' but current is 'true''. >> >> How can i fix this?. I think settings default in my program set >> 'durable' is False so it can't listen RabbitMQ Openstack? > > It probably depends on which rabbit client library you're using to > listen for notifications. Presumably there should be some way to > configure it to set durable to True. IIRC, the "exchange" needs to be declared consistently among all listeners because the first client to connect causes the exchange to be created. > I guess the other option is to disable durable queues in the Nova > config, but then you lose the contents of any queues when Rabbit gets > restarted. It would be better to figure out how to make the consuming > application configure durable queues instead. > >> >> This is my nova.conf >> >> http://paste.openstack.org/show/738813/ >> >> >> And section [oslo_messaging_rabbit] >> >> [oslo_messaging_rabbit] >> rabbit_ha_queues = true >> rabbit_retry_interval = 1 >> rabbit_retry_backoff = 2 >> amqp_durable_queues= true You say that is your nova.conf. Is that the same configuration file your client is using when it connects? >> >> >> >> *Nguyễn Bá Thành* >> >> *Mobile*:    0128 748 0391 >> >> *Email*: bathanhtlu at gmail.com >> > -- Doug From doug at doughellmann.com Wed Jan 2 18:09:12 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 13:09:12 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: Doug Hellmann writes: > Today devstack requires each project to explicitly indicate that it can > be installed under python 3, even when devstack itself is running with > python 3 enabled. 
> > As part of the python3-first goal, I have proposed a change to devstack > to modify that behavior [1]. With the change in place, when devstack > runs with python3 enabled all services are installed under python 3, > unless explicitly listed as not supporting python 3. > > If your project has a devstack plugin or runs integration or functional > test jobs that use devstack, please test your project with the patch > (you can submit a trivial change to your project and use Depends-On to > pull in the devstack change). > > [1] https://review.openstack.org/#/c/622415/ > -- > Doug > We have had a few +1 votes on the patch above with comments that indicate at least a couple of projects have taken the time to test and verify that things won't break for them with the change. Are we ready to proceed with merging the change? -- Doug From openstack at nemebean.com Wed Jan 2 18:17:12 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 12:17:12 -0600 Subject: [oslo] Parallel Privsep is Proposed for Release Message-ID: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Yay alliteration! :-) I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. 
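The reentrancy concern can be sketched generically (plain threads, not oslo.privsep's actual dispatch machinery; a minimal illustration under the assumption that privileged calls now run concurrently): once two calls can execute at the same time, any shared state they touch needs a lock.

```python
# Parallel dispatch means two "privileged calls" may now run at once.
# A helper that mutates shared state without a lock can lose updates;
# guarding the critical section restores correctness.
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
lock = threading.Lock()

def privileged_call():
    # Stand-in for a privsep-decorated function touching shared state.
    global counter
    with lock:  # without this, increments can be lost under contention
        counter += 1

with ThreadPoolExecutor(max_workers=8) as pool:
    for _ in range(1000):
        pool.submit(privileged_call)
# The executor's context manager waits for all submitted calls.
print(counter)  # 1000 -- every call accounted for
```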
-Ben 1: https://review.openstack.org/628019 2: https://review.openstack.org/#/c/593556/ From cboylan at sapwetik.org Wed Jan 2 18:24:09 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 02 Jan 2019 10:24:09 -0800 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> On Wed, Jan 2, 2019, at 3:18 AM, Dmitry Tantsur wrote: > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic- > tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 > (!) > voting devstack jobs. With each of them have some small probability of a > random > failure, it is impossible to land anything without at least one recheck, > usually > more. > > The bad news is, we only run master API tests job, and these tests are > changed > more often that the other. We already had a minor stable branch breakage > because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And > I've just > spotted a missing master multinode job, which is defined but does not > run for > some reason :( > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1. Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and > thus > remove them from the gate queue). I have a gut feeling that a change > that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's > enough to > have N-2 voting. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. 
Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. > > The corresponding change is [3], please comment here or there. > > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 > From dtantsur at redhat.com Wed Jan 2 18:39:00 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 2 Jan 2019 19:39:00 +0100 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> Message-ID: On 1/2/19 7:24 PM, Clark Boylan wrote: > On Wed, Jan 2, 2019, at 3:18 AM, Dmitry Tantsur wrote: >> Hi all and happy new year :) >> >> As you know, tempest plugins are branchless, so the CI of ironic- >> tempest-plugin >> has to run tests on all supported branches. Currently it amounts to 16 >> (!) >> voting devstack jobs. With each of them have some small probability of a >> random >> failure, it is impossible to land anything without at least one recheck, >> usually >> more. >> >> The bad news is, we only run master API tests job, and these tests are >> changed >> more often that the other. We already had a minor stable branch breakage >> because >> of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And >> I've just >> spotted a missing master multinode job, which is defined but does not >> run for >> some reason :( >> >> Here is my proposal to deal with gate bloat on ironic-tempest-plugin: >> >> 1. 
Do not run CI jobs at all for unsupported branches and branches in extended >> maintenance. For Ocata this has already been done in [2]. >> >> 2. Make jobs running with N-3 (currently Pike) and older non-voting (and >> thus >> remove them from the gate queue). I have a gut feeling that a change >> that breaks >> N-3 is very likely to break N-2 (currently Queens) as well, so it's >> enough to >> have N-2 voting. >> >> 3. Make the discovery and the multinode jobs from all stable branches >> non-voting. These jobs cover the tests that get changed very infrequently (if >> ever). These are also the jobs with the highest random failure rate. > > Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. From my experience it's PXE failing or just generic timeout on slow nodes. Note that they still don't fail too often, it's their total number that makes it problematic. When you have 20 jobs each failing with, say, 5% rate it's just 35% chance of passing (unless I cannot do math). But to answer your question, yes, we do put work in that. We just never got to 0% of random failures. > >> >> 4. Add the API tests, voting for Queens to master, non-voting for Pike (as >> proposed above). >> >> This should leave us with 20 jobs, but with only 11 of them voting. Which is >> still a lot, but probably manageable. >> >> The corresponding change is [3], please comment here or there. 
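Dmitry's back-of-the-envelope figure above checks out: assuming independent random failures, a change passes every job with probability (1-p)^n, so 20 jobs each failing 5% of the time pass together only about 36% of the time. A quick sketch of the arithmetic:

```python
# Chance that a change passes all voting jobs in a single attempt,
# assuming each job fails independently at the same rate.
def pass_probability(num_jobs, failure_rate):
    return (1 - failure_rate) ** num_jobs

# 20 jobs at a 5% random failure rate each:
p = pass_probability(20, 0.05)
print(round(p, 2))  # roughly 0.36 -- about one in three changes lands
                    # without a recheck
```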
>> >> Dmitry >> >> [1] https://review.openstack.org/622177 >> [2] https://review.openstack.org/621537 >> [3] https://review.openstack.org/627955 >> > From openstack at nemebean.com Wed Jan 2 18:49:32 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 12:49:32 -0600 Subject: [nova] Granular locks in the API In-Reply-To: References: <7e570da6-af7a-1439-9172-e454590d52cf@gmail.com> Message-ID: On 12/20/18 4:58 PM, Lance Bragstad wrote: > > > On Thu, Dec 20, 2018 at 3:50 PM Matt Riedemann > wrote: > > On 12/20/2018 1:45 PM, Lance Bragstad wrote: > > > > One way you might be able to do this is by shoveling off the policy > > check using oslo.policy's http_check functionality [0]. But, it > still > > doesn't fix the problem that users have roles on projects, and > that's > > the standard for relaying information from keystone to services > today. > > > > Hypothetically, the external policy system *could* be an API that > allows > > operators to associate users to different policies that are more > > granular than what OpenStack offers today (I could POST to this > policy > > system that a specific user can do everything but resize up this > > *specific* instance). When nova parses a policy check, it hands > control > > to oslo.policy, which shuffles it off to this external system for > > enforcement. This external policy system evaluates the policies > based on > > what information nova passes it, which would require the policy > check > > string, context of the request like the user, and the resource > they are > > trying operate on (the instance in this case). The external policy > > system could query it's own policy database for any policies > matching > > that data, run the decisions, and return the enforcement decision > per > > the oslo.limit API. > > One thing I'm pretty sure of in nova is we do not do a great job of > getting the target of the policy check before actually doing the check. 
> In other words, our target is almost always the project/user from the > request context, and not the actual resource upon which the action is > being performed (the server in most cases). I know John Garbutt had a > spec for this before. It always confused me. > > > I doubt nova is alone in this position. I would bet there are a lot of > cases across OpenStack where we could be more consistent in how this > information is handed to oslo.policy. We attempted to solve this for the > other half of the equation, which is the `creds` dictionary. Turns out a > lot of what was in this arbitrary `creds` dict, was actually just > information from the request context object. The oslo.policy library now > supports context objects directly [0], as opposed to hoping services > build the dictionary properly. Target information will be a bit harder > to do because it's different across services and even APIs within the > same service. But yeah, I totally sympathize with the complexity it puts > on developers. > > [0] https://review.openstack.org/#/c/578995/ > > > > > > Conversely, you'll have a performance hit since the policy > decision and > > policy enforcement points are no longer oslo.policy *within* > nova, but > > some external system being called by oslo.policy... > > Yeah. The other thing is if I'm just looking at my server, I can see if > it's locked or not since it's an attribute of the server resource. With > policy I would only know if I can perform a certain action if I get a > 403 or not, which is fine in most cases. Being able to see via some > list > of locked actions per server is arguably more user friendly. This also > reminds me of reporting / capabilities APIs we've talked about over the > years, e.g. what I can do on this cloud, on this host, or with this > specific server? > > > Yeah - I wouldn't mind picking that conversation up, maybe in a separate > thread. 
An idea we had with keystone was to run a user's request through > all registered policies and return a list of the ones they could access > (e.g., take my token and tell me what I can do with it.) There are > probably other issues with this, since policy names are mostly operator > facing and end users don't really care at the moment. > > > > > > Might not be the best idea, but food for thought based on the > > architecture we have today. > > Definitely, thanks for the alternative. This is something one could > implement per-provider based on need if we don't have a standard > solution. > > > Right, I always thought it would be a good fit for people providing > super-specific policy checks or have a custom syntax they want to > implement. It keeps most of that separate from the services and > oslo.policy. So long as we pass target and context information > consistently, they essentially have an API they can write policies against. I know we fixed a number of bugs in services around the time of the first Denver PTG because a user wanted to offload policy checks to an external system and used HTTPCheck for it. They ran across a number of places where the data passed to oslo.policy was either missing or incorrect, which meant their policy system didn't have enough to make a decision. I haven't heard anything new about this in a while, so it's either still working for them or they gave up on the idea. There's also a spec proposing that we add more formal support for external policy engines to oslo.policy: https://review.openstack.org/#/c/578719/ It probably doesn't solve this problem any more than the HTTPCheck option does, but if one were to go down that path it would make external policy engines easier to use (no need to write a custom policy file to replace every rule with HTTPCheck, for example). 
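For reference, offloading a single rule to an external engine via oslo.policy's generic HTTPCheck looks roughly like this in a policy file (the URL and external service are hypothetical): oslo.policy POSTs the target and credential data to the URL and grants access only if the service answers "True".

```json
{
    "os_compute_api:servers:resize": "http://policy.example.com:8082/check",
    "os_compute_api:servers:create": "rule:admin_or_owner"
}
```

This is the "custom policy file replacing rules with HTTPCheck" approach mentioned above; the proposed external-policy-engine support would make the same offload possible without rewriting every rule this way.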
> > > -- > > Thanks, > > Matt > From chris.friesen at windriver.com Wed Jan 2 19:31:03 2019 From: chris.friesen at windriver.com (Chris Friesen) Date: Wed, 2 Jan 2019 13:31:03 -0600 Subject: [nova] Granular locks in the API In-Reply-To: References: Message-ID: On 12/20/2018 1:07 PM, Matt Riedemann wrote: > I wanted to float something that we talked about in the public cloud SIG > meeting today [1] which is the concept of making the lock API more > granular to lock on a list of actions rather than globally locking all > actions that can be performed on a server. > > The primary use case we discussed was around a pre-paid pricing model > for servers. A user can pre-pay resources at a discount if let's say > they are going to use them for a month at a fixed rate. However, once > they do, they can't resize those servers without going through some kind > of approval (billing) process to resize up. With this, the provider > could lock the user from performing the resize action on the server but > the user could do other things like stop/start/reboot/snapshot/etc. On the operator side, it seems like you could just auto-switch the user from fixed-rate to variable-rate for that instance (assuming you have their billing info). It almost sounds like this is just a convenience thing for the user, so they don't accidentally resize the instance. Looking at it more generally, are there any other user-callable Compute API calls that would make sense to selectively disable for a specific resource? Chris From juliaashleykreger at gmail.com Wed Jan 2 21:44:42 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 2 Jan 2019 13:44:42 -0800 Subject: [ironic] Mid-cycle call times Message-ID: Greetings everyone, During our ironic team meeting in December, we discussed if we should go ahead and have a "mid-cycle" call in order to try sync up on where we are at during this cycle, and the next steps for us to take as a team. 
With that said, I have created a doodle poll[1] in an attempt to identify some days that might work. Largely the days available on the poll are geared around my availability this month. Ideally, I would like to find three days where we can schedule some 2-4 hour blocks of time. I've gone ahead and started an etherpad[2] to get us started on brainstorming. Once we have some ideas, we will be able to form a schedule and attempt to identify the amount of time required. -Julia [1]: https://doodle.com/poll/uqwywaxuxsiu7zde [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Wed Jan 2 21:49:54 2019 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 3 Jan 2019 10:49:54 +1300 Subject: [all][ptl][heat][senlin][magnum]New SIG for Autoscaling? plus Session Summary: Autoscaling Integration, improvement, and feedback In-Reply-To: References: Message-ID: On 29/11/18 1:00 AM, Rico Lin wrote: > Dear all > Tl;dr; > I gonna use this ML to give a summary of the forum [1] and asking for > feedback for the idea of new SIG. > We have a plan to create a new SIG for autoscaling which to cover the > common library, docs, and tests for cross-project services (Senlin, > Heat, Monasca, etc.) and cross-community (OpenStack, Kubernetes, etc). > And the goal is not just to have a place to keep those resources that > make sure we can guarantee the availability for use cases, also to have > a force to keep push the effort to integrate across services or at least > make sure we don't go to a point where everyone just do their own > service and don't care about any duplication. > So if you have any thoughts for the new SIG (good or bad) please share > it here. So +1, obviously (for the benefit of those who weren't at the Forum, I... may have suggested it?). 
Even if there were only one way to do autoscaling in OpenStack, it would be an inherently cross-project thing - and in fact there are multiple options for each piece of the puzzle. I really like Rico's idea of building documentation and test suites to make user experience of autoscaling better, and the best place to host those would be a SIG rather than any individual project. We know there are a *lot* of folks out there under the radar using autoscaling in OpenStack, so maybe a SIG will also provide a way to draw some of them out of the woodwork to tell us more about their use cases. > Here I summarize our discussion in the forum session `Autoscaling > Integration, improvement, and feedback`. If you like to learn more or > input your thoughts, feel free to put it in etherpad [1] or simply > reply to this email. > In the forum, we have been discussed the scope and possibility to > integrate effort from Heat, Senlin, and also autoscaling across > OpenStack to K8s. There are some long-term goals that we can start on > like create a document for general Autoscaling on OpenStack, Common > library for cross-project usage, or create real scenario cases to test > on our CI. > And the most important part is how can we help users and satisfied use > cases without confuse them or making too much duplication effort across > communities/projects. > So here's an action we agree on, is to trigger discussion for either we > need to create a new SIG for autoscaling. We need to define the scope > and the goal for this new SIG before we go ahead and create one. > The new SIG will cover the common library, docs, and tests for > cross-project services (Senlin, Heat, Monasca, etc.) and cross-community > (OpenStack, Kubernetes, etc). 
And the goal is not just to have a place > to keep those resources that make sure we can guarantee the availability > for use cases, also to have a force to keep push the effort to integrate > across services or at least make sure we don't go to a point where > everyone just do their own service and don't care about any duplication. > For example, we can have a document about do autoscaling in OpenStack, > but we need a place to put it and keep maintain it. And we can even have > a place to set up CI to test all scenario for autoscaling. > I think it's possible to extend the definition of this SIG, but we have > to clear our goal and make sure we actually doing a good thing and make > everyone's life easier. On the other hand we also need to make sure we > do not duplicate the effort of other SIGs/WGs. > Also The reason I add `ptl` tag for this ML is that this SIG or the > concept of `autoscaling` might be very deferent to different projects. > So I really wish to hear from anyone and any projects who are > interested in this topic. > > [1] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback > > > > -- > May The Force of OpenStack Be With You, > */Rico Lin > /*irc: ricolin > > From pawel at suder.info Wed Jan 2 22:06:41 2019 From: pawel at suder.info (=?UTF-8?Q?Pawe=C5=82?= Suder) Date: Wed, 02 Jan 2019 23:06:41 +0100 Subject: [neutron] Bug deputy report 17th~23rd Dec 2018 In-Reply-To: <1545647749.6273.1.camel@suder.info> References: <1545647749.6273.1.camel@suder.info> Message-ID: <1546466801.15393.1.camel@suder.info> Hello, I noticed that email sent to openstack-dev was not sent.. Resending it, Paweł Date 24.12.2018, time 11∶35 +0100, Paweł Suder wrote: > Hello Neutrons, > > Following bugs/RFE/issues have been raised during last week. 
Some of > them were already recognized, triaged, checked: > > From oldest: > > https://bugs.launchpad.net/neutron/+bug/1808731 [RFE] Needs to > restart > metadata proxy with the start/restart of l3/dhcp agent - IMO need to > discus how to do that. > > https://bugs.launchpad.net/neutron/+bug/1808916 Update mailinglist > from > dev to discuss - it is done. > > https://bugs.launchpad.net/neutron/+bug/1808917 RetryRequest > shouldn't > log stack trace by default, or it should be configurable by the > exception - confirmed by Sławek ;) > > https://bugs.launchpad.net/neutron/+bug/1809037 [RFE] Add > anti_affinity_group to binding:profile - RFE connected with another > RFE > from Nova related to NFV and (anti)affinity of resources like PFs. > > https://bugs.launchpad.net/neutron/+bug/1809080 reload_cfg doesn't > work > correctly - change in review, I tried to review, but no comments left > - > new contributor > > https://bugs.launchpad.net/neutron/+bug/1809134 - TypeError in QoS > gateway_ip code in l3-agent logs - review in progress > > https://bugs.launchpad.net/neutron/+bug/1809238 - [l3] > `port_forwarding` cannot be set before l3 `router` in service_plugins > - > review in progress > > https://bugs.launchpad.net/neutron/+bug/1809447 - performance > regression from mitaka to ocata - not triaged, I am not sure how to > handle it - it is a wide thing.. > > https://bugs.launchpad.net/neutron/+bug/1809497 - bug noticed on > gates > related to another bug opened last week: https://bugs.launchpad.net/n > eu > tron/+bug/1809134 > > Wish you a good time this week and for the next year! 
:) > > Cheers, > Paweł From jeremyfreudberg at gmail.com Wed Jan 2 22:29:27 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Wed, 2 Jan 2019 17:29:27 -0500 Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints In-Reply-To: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> References: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> Message-ID: Hey Luigi. I poked around in Tempest and saw these code bits: https://github.com/openstack/tempest/blob/master/tempest/lib/common/rest_client.py#L210 https://github.com/openstack/tempest/blob/f9650269a32800fdcb873ff63f366b7bc914b3d7/tempest/lib/auth.py#L53 Here's a patch which takes advantage of those bits to append the version to the unversioned base URL: https://review.openstack.org/#/c/628056/ Hope it works without regression (I'm a bit worried since Tempest does its own URL mangling rather than nicely use keystoneauth...) On Wed, Jan 2, 2019 at 5:19 AM Luigi Toscano wrote: > > Hi all, > > I'm working on adding support for APIv2 to the Sahara tempest plugin. > > If I get it correctly, there are two main steps > > 1) Make sure that that tempest client works with APIv2 (and don't regress with > APIv1.1). > > This mainly mean implementing the tempest client for Sahara APIv2, which > should not be too complicated. > > On the other hand, we hit an issue with the v1.1 client in an APIv2 > environment. 
> A change associated with API v2 is usage of an unversioned endpoint for the > deployment (see https://review.openstack.org/#/c/622330/ , without the /v1.1/$ > (tenant_id) suffix) which should magically work with both API variants, but it > seems that the current tempest client fails in this case: > > http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544 > > Does anyone know if this is an issue with the code of the tempest tests (which > should maybe have some logic to build the expected endpoint when it's > unversioned, like saharaclient does) or somewhere else? > > > 2) fix the tests to support APIv2. > > Should I duplicate the tests for APIv1.1 and APIv2? Other projects which > support different APIs seem to do this. > But can I freely move the existing tests under a subdirectory > (sahara_tempest_plugins/tests/api/ -> sahara_tempest_plugins/tests/api/v1/), > or are there any compatibility concerns? Are the test IDs enough to ensure that > everything works as before? > > And what about CLI tests currently under sahara_tempest_plugin/tests/cli/ ? > They support both API versions through a configuration flag. Should they be > duplicated as well? > > > Ciao > (and happy new year if you have a new one in your calendar!) > -- > Luigi > > > From yongli.he at intel.com Thu Jan 3 03:15:26 2019 From: yongli.he at intel.com (yonglihe) Date: Thu, 3 Jan 2019 11:15:26 +0800 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 2018/12/18 4:20 PM, yonglihe wrote: > Hi, guys > > This spec needs input and discussion to move on. Jay suggested we might be better off using a new sub-node to hold the topology stuff; that is option 2 here. And splitting the PCI stuff out of this NUMA spec, using a /devices node to hold all 'devices' stuff instead; that node is then generic and not only for PCI itself. 
I'm OK with Jay's suggestion; it contains more key words and seems crystal clear and straightforward. The problem is that we need to align on this; this spec needs more input. Thanks, Jay and Matt. Regards Yongli He > > Currently the spec is under review: > https://review.openstack.org/#/c/612256/8 > > Plus with POC code: > https://review.openstack.org/#/c/621476/3 > > and the related Stein PTG discussion: > https://etherpad.openstack.org/p/nova-ptg-stein > starting from line 897 > > NUMA topology has lots of information to expose. To save you time jumping into the spec, the information to return includes NUMA-related fields like: > numa_node, cpu_pinning, cpu_thread_policy, cpuset, siblings, > mem, pagesize, sockets, cores, > threads, and PCI device information. > > Based on the IRC discussion, we have 3 options for how to deal with those blobs: > > 1) include them directly in the server response details, like the released POC does: > https://review.openstack.org/#/c/621476/3 > > 2) add a new sub-resource endpoint to servers, most likely using the keyword 'topology'; then: > "GET /servers/{server_id}/topology" returns the NUMA information for one server. > > 3) put the NUMA info under the existing 'diagnostics' API: > "GET /servers/{server_id}/diagnostics" > This is an admin-only API, so normal users lose the ability to check their topology. > > When the information is put into diagnostics, it will look like:
>
> {
>   ...
>   "numa_topology": {
>     "cells": [
>       {
>         "numa_node": 3,
>         "cpu_pinning": {"0": 5, "1": 6},
>         "cpu_thread_policy": "prefer",
>         "cpuset": [0, 1, 2, 3],
>         "siblings": [[0, 1], [2, 3]],
>         "mem": 1024,
>         "pagesize": 4096,
>         "sockets": 0,
>         "cores": 2,
>         "threads": 2
>       },
>       ...
>     ]
>   },
>   "emulator_threads_policy": "share",
>   "pci_devices": [
>     {
>       "address": "00:1a.0",
>       "type": "VF",
>       "vendor": "8086",
>       "product": "1526"
>     }
>   ]
> }
>
> Regards > Yongli He > > > From soulxu at gmail.com Thu Jan 3 04:08:26 2019 From: soulxu at gmail.com (Alex Xu) Date: Thu, 3 Jan 2019 12:08:26 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> Message-ID: Jay Pipes wrote on Wed, Jan 2, 2019 at 10:48 PM: > On 12/21/2018 03:45 AM, Rui Zang wrote: > > It was advised in today's nova team meeting to bring this up by email. > > > > There has been some discussion on how to track the persistent memory > > resource in placement on the spec review [1]. > > > > Background: persistent memory (PMEM) needs to be partitioned into > > namespaces to be consumed by VMs. Due to fragmentation issues, the spec > > proposed to use fixed sized PMEM namespaces. > > The spec proposed to use fixed sized namespaces that is controllable by > the deployer, not fixed-size-for-everyone :) Just want to make sure > we're being clear here. > > > The spec's proposed way to represent PMEM namespaces is to use one > > Resource Provider (RP) for one PMEM namespace. A new standard Resource > > Class (RC) -- 'VPMEM_GB' is introduced to classify PMEM namespace RPs. > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > 'total' and 'step_size' are all set to the size of the PMEM namespace. > > In this way, it is guaranteed each RP will be consumed as a whole at one > > time. > > > An alternative was brought out in the review. Different Custom Resource > > Classes (CUSTOM_PMEM_XXXGB) can be used to designate PMEM namespaces of > > different sizes. 
The size of the PMEM namespace is encoded in the > name > > of the custom Resource Class. And multiple PMEM namespaces of the > same > > size (say 128G) can be represented by one RP of the same > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There would be > only one resource provider: the compute node itself. It would have an > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit' and 'total' > > as the total number of the PMEM namespaces of that size. And the > > values of 'min_unit' and 'step_size' could be set to 1. > > No, the max_unit, min_unit, step_size and total would refer to the > number of *PMEM namespaces*, not the amount of GB of memory represented > by those namespaces. > > Therefore, min_unit and step_size would be 1, max_unit would be the > total number of *namespaces* that could simultaneously be attached to a > single consumer (VM), and total would be 8 in our example where the > compute node had 8 of these pre-defined 128G PMEM namespaces. > > > We believe both ways could work. We would like to have a community > > consensus on which way to use. > > Email replies and review comments to the spec [1] are both welcomed. > > Custom resource classes were invented for precisely this kind of use > case. The resource being represented is a namespace. The resource is not > "a Gibibyte of persistent memory". > The point of the initial design is to avoid encoding the `size` in the resource class name. If that is OK for you (I remember people hate to encode sizes and numbers into trait names), then we will update the design. Probably, based on the namespace configuration, nova will be responsible for creating those custom RCs first. Sounds workable. > > Best, > -jay > > > Regards, > > Zang, Rui > > > > > > [1] https://review.openstack.org/#/c/601596/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
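To make the two candidate models concrete, here is how the inventory records might look in placement for a host with eight pre-created 128G namespaces (illustrative values only, not output from a real deployment):

```json
{
    "option_1_one_rp_per_namespace": {
        "resource_class": "VPMEM_GB",
        "inventory_on_each_namespace_rp": {
            "total": 128, "min_unit": 128, "max_unit": 128, "step_size": 128
        }
    },
    "option_2_custom_rc_on_compute_node_rp": {
        "resource_class": "CUSTOM_PMEM_128G",
        "inventory": {
            "total": 8, "min_unit": 1, "max_unit": 8, "step_size": 1
        }
    }
}
```

In option 1 every namespace RP is all-or-nothing by construction; in option 2 the units counted are whole namespaces, so min_unit and step_size are 1.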
URL: From andre at florath.net Thu Jan 3 08:07:45 2019 From: andre at florath.net (Andre Florath) Date: Thu, 3 Jan 2019 09:07:45 +0100 Subject: [glance] Question about container_format and disk_format In-Reply-To: References: Message-ID: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> Hello! After digging through the source code, I'd answer my own question: image disk_format and container_format are (A) the formats of the image file that is passed in. Reasoning: Glance's so called flows use those parameters as input like ovf_process.py [1]: if image.container_format == 'ova': When there is a conversion done in the flow, the target format is the one from the configuration (like [2]): target_format = CONF.image_conversion.output_format After a possible conversion, the new disk and container formats are set (e.g. [3]): image.disk_format = target_format image.container_format = 'bare' (At some points instead of using the disk and container format parameters, a call to 'qemu-img info' is done to extract those information from the image - like in [4]: stdout, stderr = putils.trycmd("qemu-img", "info", "--output=json", ... ... metadata = json.loads(stdout) source_format = metadata.get('format') ) So it looks that the idea is, that the disk_format and container_format should always reflect the current format of the image. Can anybody please confirm / comment? Kind regards Andre [1] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/ovf_process.py#n87 [2] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n78 [3] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n129 [4] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n87 On 12/18/18 11:07 AM, Andre Florath wrote: > Hello! > > I do not completely understand the parameters 'container_format' > and 'disk_format' as described in [1]. 
The documentation always > uses 'the format' but IMHO there might be two formats involved. > > Are those formats either > > (A) the formats of the image file that is passed in. > > Like (from the official documentation [2]) > > $ openstack image create --disk-format qcow2 --container-format bare \ > --public --file ./centos63.qcow2 centos63-image > > qcow2 / bare are the formats of the passed-in image. > > or > > (B) the formats that are used internally to store the image > > Like > > $ openstack image create --disk-format vmdk --container-format ova \ > --public --file ./centos63.qcow2 centos63-image > > vmdk / ova are formats that are used internally in OpenStack glance > to store the image. > In this case there must be an auto-detection of the image file format > that is passed in and an automatic conversion into the new format. > > Kind regards > > Andre > > > [1] https://developer.openstack.org/api-ref/image/v2/index.html?expanded=create-image-detail#create-image > [2] https://docs.openstack.org/glance/pike/admin/manage-images.html > From jaypipes at gmail.com Thu Jan 3 12:39:40 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Thu, 3 Jan 2019 07:39:40 -0500 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 01/02/2019 10:15 PM, yonglihe wrote: > On 2018/12/18 4:20 PM, yonglihe wrote: >> Hi, guys >> >> This spec needs input and discussion to move on. > > Jay suggested we might be better off using a new sub-node to hold topology > stuff; that is option 2 here. And splitting > > the PCI stuff out of this NUMA spec, using a /devices node to hold > all 'devices' stuff instead; that node is then > > generic and not only for PCI itself. > > I'm OK with Jay's suggestion; it contains more key words and seems > crystal clear and straightforward. > > The problem is that we need to align on this; this spec needs more > input. Thanks, Jay and Matt. 
Also, I mentioned that you need not (IMHO) combine both PCI/devices and NUMA topology in a single spec. We could proceed with the /topology API endpoint and work out the more generic /devices API endpoint in a separate spec. Best, -jay From jaypipes at gmail.com Thu Jan 3 13:31:51 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Thu, 3 Jan 2019 08:31:51 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> Message-ID: <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> On 01/02/2019 11:08 PM, Alex Xu wrote: > Jay Pipes > 于2019年1月2 > 日周三 下午10:48写道: > > On 12/21/2018 03:45 AM, Rui Zang wrote: > > It was advised in today's nova team meeting to bring this up by > email. > > > > There has been some discussion on the how to track persistent memory > > resource in placement on the spec review [1]. > > > > Background: persistent memory (PMEM) needs to be partitioned to > > namespaces to be consumed by VMs. Due to fragmentation issues, > the spec > > proposed to use fixed sized PMEM namespaces. > > The spec proposed to use fixed sized namespaces that is controllable by > the deployer, not fixed-size-for-everyone :) Just want to make sure > we're being clear here. > > > The spec proposed way to represent PMEM namespaces is to use one > > Resource Provider (RP) for one PMEM namespace. An new standard > Resource > > Class (RC) -- 'VPMEM_GB` is introduced to classify PMEM namspace > RPs. > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > 'total' and 'step_size` are all set to the size of the PMEM > namespace. > > In this way, it is guaranteed each RP will be consumed as a whole > at one > > time. >  > > > An alternative was brought out in the review. Different Custom > Resource > > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM > namespaces of > > different sizes. 
The size of the PMEM namespace is encoded in the > name > > of the custom Resource Class. And multiple PMEM namespaces of the > same > > size  (say 128G) can be represented by one RP of the same > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There > would be > only one resource provider: the compute node itself. It would have an > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit'  and > 'total' > > as the total number of the PMEM namespaces of the certain size. > And the > > values of 'min_unit' and 'step_size' could set to 1. > > No, the max_unit, min_unit, step_size and total would refer to the > number of *PMEM namespaces*, not the amount of GB of memory represented > by those namespaces. > > Therefore, min_unit and step_size would be 1, max_unit would be the > total number of *namespaces* that could simultaneously be attached to a > single consumer (VM), and total would be 8 in our example where the > compute node had 8 of these pre-defined 128G PMEM namespaces. > > > We believe both way could work. We would like to have a community > > consensus on which way to use. > > Email replies and review comments to the spec [1] are both welcomed. > > Custom resource classes were invented for precisely this kind of use > case. The resource being represented is a namespace. The resource is > not > "a Gibibyte of persistent memory". > > > The point of the initial design is avoid to encode the `size` in the > resource class name. If that is ok for you(I remember people hate to > encode size and number into the trait name), then we will update the > design. Probably based on the namespace configuration, nova will be > responsible for create those custom RC first. Sounds works. A couple points... 1) I was/am opposed to putting the least-fine-grained size in a resource class name. For example, I would have preferred DISK_BYTE instead of DISK_GB. And MEMORY_BYTE instead of MEMORY_MB. 
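To make the custom-class option concrete, the inventory described above could be sketched as follows (illustrative values only, following the 8 x 128G example; this is not actual nova or placement code):

```python
# Illustrative inventory for the CUSTOM_PMEM_128G approach: a single
# resource provider (the compute node) exposes 8 pre-created 128G
# namespaces. All units count namespaces, not gigabytes.
inventory = {
    "CUSTOM_PMEM_128G": {
        "total": 8,       # 8 namespaces of this size exist on the node
        "min_unit": 1,    # allocations are whole namespaces...
        "step_size": 1,   # ...in whole-namespace increments
        "max_unit": 8,    # at most all 8 to a single consumer (VM)
        "reserved": 0,
    }
}

# A VM requesting two such namespaces would consume:
allocation = {"CUSTOM_PMEM_128G": 2}
remaining = (inventory["CUSTOM_PMEM_128G"]["total"]
             - allocation["CUSTOM_PMEM_128G"])
print(remaining)  # 6 namespaces left for other consumers
```
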
2) After reading the original Intel PMEM specification (http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf), it seems to me that what you are describing with a generic PMEM_GB (or PMEM_BYTE) resource class is more appropriate for the block mode translation system described in the PDF versus the PMEM namespace system described therein. From a lay person's perspective, I see the difference between the two as similar to the difference between describing the bytes that are in block storage versus a filesystem that has been formatted, wiped, cleaned, etc on that block storage. In Nova, the DISK_GB resource class describes the former: it's a bunch of blocks that are reserved in the underlying block storage for use by the virtual machine. The virtual machine manager then formats that bunch of blocks as needed and lays down a formatted image. We don't have a resource class that represents "a filesystem" or "a partition" (yet). But the proposed PMEM namespaces in your spec definitely seem to be more like a "filesystem resource" than a "GB of block storage" resource. Best, -jay From sean.mcginnis at gmx.com Thu Jan 3 13:34:26 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:34:26 -0600 Subject: Fwd: [cinder] Is the =?utf-8?Q?cinder_?= =?utf-8?Q?Active-Active_feature_OK=EF=BC=9F?= In-Reply-To: References: Message-ID: <20190103133426.GA27473@sm-workstation> On Tue, Dec 25, 2018 at 05:26:54PM +0800, Jaze Lee wrote: > Hello, > In my opinion, all rest api will get to manager of cinder volume. > If the manager is using DLM, we can say cinder volume can support > active-active. So, can we rewrite the comments and option's help in > page https://github.com/openstack/cinder/blob/master/cinder/cmd/volume.py#L78 > ? > > Thanks a lot. The work isn't entirely complete for active-active. 
The last remaining step is for backend storage vendors to validate that their storage and Cinder drivers work well when running with the higher concurrency of operations that active-active HA would allow. As far as I'm aware, none of the vendors have enabled active-active with their drivers as described in [1]. The other services (not cinder-volume) should be fine, but to complete the work we need vendors on board to support it with their drivers. Sean [1] https://docs.openstack.org/cinder/latest/contributor/high_availability.html#enabling-active-active-on-drivers From sean.mcginnis at gmx.com Thu Jan 3 13:35:33 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:35:33 -0600 Subject: [Infra][Cinder] Request for voting permission from LINBIT LINSTOR CI (linbit_ci@linbit.com) for Cinder projects In-Reply-To: References: Message-ID: <20190103133532.GB27473@sm-workstation> On Fri, Dec 28, 2018 at 01:55:30PM -0800, Woojay Poynter wrote: > Hello, > > I would like to request voting permission for the LINBIT LINSTOR CI on the Cinder > project. We are looking to satisfy the third-party testing requirement for the > Cinder volume driver for LINSTOR (https://review.openstack.org/#/c/624233/). > The CI's test result for the latest patch is at > http://us.linbit.com:8080/CI-LINSTOR/33/456/ > Thanks Woojay, but in Cinder we do not allow any third-party CI to be voting. Only Zuul gets to vote. All third-party CIs should comment on the pass/fail results, but they do not need to vote. Sean From rosmaita.fossdev at gmail.com Thu Jan 3 13:46:29 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 3 Jan 2019 08:46:29 -0500 Subject: [glance] Question about container_format and disk_format In-Reply-To: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> References: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> Message-ID: On 1/3/19 3:07 AM, Andre Florath wrote: > Hello!
> > After digging through the source code, I'd answer my own question: Sorry you had to answer your own question, but glad you were willing to dig into the source code! > image disk_format and container_format are > > (A) the formats of the image file that is passed in. This is "sort of" correct. In general, Glance does not verify either the disk_format or container_format of the image data, so these values are whatever the image owner has specified. Glance doesn't verify these because disk/container formats are developed independently of OpenStack, and in the heady days of 2010, it seemed like a good idea that new disk/container formats be usable without having to wait for a new Glance release. (There isn't much incentive for an image owner to lie about the disk/container format, because specifying the wrong one could make the image unusable by any consuming service that relies on these image properties.) > > Reasoning: > > Glance's so called flows use those parameters as input > like ovf_process.py [1]: > > if image.container_format == 'ova': > > When there is a conversion done in the flow, the target > format is the one from the configuration (like [2]): > > target_format = CONF.image_conversion.output_format > > After a possible conversion, the new disk and container formats are > set (e.g. [3]): > > image.disk_format = target_format > image.container_format = 'bare' > > (At some points instead of using the disk and container format > parameters, a call to 'qemu-img info' is done to extract those > information from the image - like in [4]: > > > stdout, stderr = putils.trycmd("qemu-img", "info", > "--output=json", ... Note to fans of CVE 2015-5162: the above call to qemu-img is time restricted. > ... > metadata = json.loads(stdout) > source_format = metadata.get('format') > ) Remember that the "flows" are optional, so in general you cannot rely upon Glance setting these values correctly for you. 
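As an aside, the qemu-img probe used by the conversion flow is easy to reproduce by hand; here is a minimal sketch that parses a captured (abridged) sample of the JSON output rather than shelling out, assuming qemu-img's usual output shape:

```python
import json

# Captured sample of `qemu-img info --output=json ./centos63.qcow2`
# (abridged; the real output contains additional keys).
sample_output = """
{
    "virtual-size": 10737418240,
    "filename": "./centos63.qcow2",
    "format": "qcow2"
}
"""

def detected_format(qemu_img_json):
    # qemu-img reports the actual on-disk format, regardless of the
    # disk_format value the image owner declared at upload time.
    return json.loads(qemu_img_json).get("format")

print(detected_format(sample_output))  # qcow2
```

This is exactly why the conversion flow trusts qemu-img over the user-supplied disk_format when it needs the real source format.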
> > So it looks that the idea is, that the disk_format and > container_format should always reflect the current format of the > image. > > Can anybody please confirm / comment? Yes, the image properties associated with an image are meant to describe the image data associated with that image record. > Kind regards > > Andre Happy new year! brian > > > [1] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/ovf_process.py#n87 > [2] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n78 > [3] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n129 > [4] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n87 > > > > On 12/18/18 11:07 AM, Andre Florath wrote: >> Hello! >> >> I do not completely understand the parameters 'container_format' >> and 'disk_format' as described in [1]. The documentation always >> uses 'the format' but IMHO there might be two formats involved. >> >> Are those formats either >> >> (A) the formats of the image file that is passed in. >> >> Like (from the official documentation [2]) >> >> $ openstack image create --disk-format qcow2 --container-format bare \ >> --public --file ./centos63.qcow2 centos63-image >> >> qcow2 / bare are the formats of the passed in image. >> >> or >> >> (B) the formats that are used internally to store the image >> >> Like >> >> $ openstack image create --disk-format vmdk --container-format ova \ >> --public --file ./centos63.qcow2 centos63-image >> >> vmdk / ova are formats that are used internally in OpenStack glance >> to store the image. >> In this case there must be an auto-detection of the image file format >> that is passed in and an automatic conversion into the new format. 
>> >> Kind regards >> >> Andre >> >> >> [1] https://developer.openstack.org/api-ref/image/v2/index.html?expanded=create-image-detail#create-image >> [2] https://docs.openstack.org/glance/pike/admin/manage-images.html >> > > From sean.mcginnis at gmx.com Thu Jan 3 13:51:55 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:51:55 -0600 Subject: Review-Priority for Project Repos In-Reply-To: References: Message-ID: <20190103135155.GC27473@sm-workstation> On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > Dear All, > > There are many occasions when we want to prioritize some of the patches, > whether to unblock the gates or to block non-freeze > patches during RC. > > So adding Review-Priority will allow a more precise dashboard. As the > Designate and Cinder projects are already experiencing this[1][2], and after > discussion with Jeremy, I brought this to the ML to interact with these teams > before landing [3], as there is a possibility that reapplying the priority vote > following any substantive update to a change could make it more cumbersome > than it is worth. With Cinder this is fairly new, but I think it is working well so far. The oddity we've run into, that I think you're referring to here, is how those votes carry forward with updates. I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when a patchset is updated, the -1 and +2 carry forward. But for some reason we can't get the +1 to be sticky. So far, that's just a slight inconvenience. It would be great if we can figure out a way to have them all be sticky, but if we need to live with reapplying +1 votes, that's manageable to me. The one thing I have been slightly concerned with is the process around using these priority votes.
It hasn't been an issue, but I could see a scenario where one core (in Cinder we have it set up so all cores can use the priority voting) has set something like a procedural -1, then been pulled away or is absent for an extended period. Like a Workflow -2, another core cannot override that vote. So until that person is back to remove the -1, that patch would not be able to be merged. Granted, we've lived with this with Workflow -2's for years and it's never been a major issue, but I think as far as centralizing control, it may make sense to have a separate smaller group (just the PTL, or PTL and a few "deputies") that are able to set priorities on patches just to make sure the folks setting it are the ones that are actively tracking what the priorities are for the project. Anyway, my 2 cents. I can imagine this would work really well for some teams, less well for others. So if you think it can help you manage your project priorities, I would recommend giving it a shot and seeing how it goes. You can always drop it if it ends up not being effective or causing issues. Sean From fungi at yuggoth.org Thu Jan 3 14:22:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 14:22:29 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: <20190103142229.y4o2syjwrq5jqfsp@yuggoth.org> On 2019-01-03 07:51:55 -0600 (-0600), Sean McGinnis wrote: [...] > The one thing I have been slightly concerned with is the process > around using these priority votes. It hasn't been an issue, but I > could see a scenario where one core (in Cinder we have it set up > so all cores can use the priority voting) has set something like a > procedural -1, then been pulled away or is absent for an extended > period. Like a Workflow -2, another core cannot override that > vote. So until that person is back to remove the -1, that patch > would not be able to be merged. [...] 
Please treat it as only a last resort, but the solution to this is that a Gerrit admin (find us in #openstack-infra on Freenode or the openstack-infra ML on lists.openstack.org or here on openstack-discuss with an [infra] subject tag) can selectively delete votes on a change at the request of a project leader (PTL, infra liaison, TC member...) to unblock your work. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Thu Jan 3 14:52:23 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 03 Jan 2019 09:52:23 -0500 Subject: [tc] agenda for Technical Committee Meeting 3 Jan 2019 @ 1400 UTC In-Reply-To: References: Message-ID: Doug Hellmann writes: > TC Members, > > Our next meeting will be this Thursday, 3 Jan at 1400 UTC in > #openstack-tc. This email contains the agenda for the meeting, based on > the content of the wiki [0]. > The logs from the meeting can be found at: Minutes: http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-01-03-14.01.html Log: http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-01-03-14.01.log.html -- Doug From marios at redhat.com Thu Jan 3 15:19:26 2019 From: marios at redhat.com (Marios Andreou) Date: Thu, 3 Jan 2019 17:19:26 +0200 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline Message-ID: o/ TripleO's & Happy New Year \o/ if you are tracking the ci squad you may know that one area of focus recently is moving the scenario-multinode jobs to harder/better/faster/stronger (most importantly *smaller* but not as cool) scenario-standalone equivalents. Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is doing cleanup on those. 
This includes making sure the new jobs are used in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will follow) and fixing any missing services or any other nits we find. Once done we can move on to the rest - scenarios 5/6 etc. We are looking for any feedback about the jobs in general or any one in particular if you have some special interest in a particular service (see [5] for reminder about services and scenarios). Most importantly those jobs are now being set as voting (e.g. already done for 1/4 at [1]) and the next natural step once voting is to add them into the master promotion pipeline. Please let us know if you think this is a bad idea or with any other feedback or suggestion. regards &thanks for reading! marios [1] https://github.com/openstack-infra/tripleo-ci/blob/3d634dc2874f95a9d4fd97a1ac87e0b07f20bd80/zuul.d/standalone-jobs.yaml#L85-L181 [2] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-3 [3] https://review.openstack.org/#/q/topic:replace-scen1 [4] https://review.openstack.org/#/q/topic:replace-scen4 [5] https://github.com/openstack/tripleo-heat-templates#service-testing-matrix -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfidente at redhat.com Thu Jan 3 15:26:23 2019 From: gfidente at redhat.com (Giulio Fidente) Date: Thu, 3 Jan 2019 16:26:23 +0100 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline In-Reply-To: References: Message-ID: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> On 1/3/19 4:19 PM, Marios Andreou wrote: > o/ TripleO's & Happy New Year \o/  > > if you are tracking the ci squad you may know that one area of focus > recently is moving the scenario-multinode jobs to > harder/better/faster/stronger (most importantly *smaller* but not as > cool) scenario-standalone equivalents. > > Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is > doing cleanup on those. 
This includes making sure the new jobs are used > in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will > follow) and fixing any missing services or any other nits we find. Once > done we can move on to the rest - scenarios 5/6 etc. > > We are looking for any feedback about the jobs in general or any one in > particular if you have some special interest in a particular service > (see [5] for reminder about services and scenarios). > > Most importantly those jobs are now being set as voting (e.g. already > done for 1/4 at [1]) and the next natural step once voting is to add > them into the master promotion pipeline. > Please let us know if you think this is a bad idea or with any other > feedback or suggestion. thanks a lot! note that these are the two scenarios testing ceph: I guess what I'm trying to say, is that if I can change, and you can change, everybody can change! (rocky balboa TM) -- Giulio Fidente GPG KEY: 08D733BA From cdent+os at anticdent.org Thu Jan 3 16:54:00 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 3 Jan 2019 16:54:00 +0000 (GMT) Subject: [placement] [packaging] [docs] WIP placement install docs Message-ID: I've started a very wippy work in progress [1] for placement installation docs within the extracted placement repository, using the existing docs from nova as a base. I'd like to draw the attention of rdo, ubuntu, and obs packagers as the docs follow the existing pattern of documenting install on those three distros. However, since placement is not fully packaged yet, I'm making some guesses in the docs about how things will be set up and how packages will be named. So if people who are interested in such things could have a look that would be helpful. I've stubbed out, but not yet completed, a from-pypi page as well, as with placement, installation can become as simple as pip install + run a single command line with the right args, if that's what you want. I think people should be able to see that too. 
Thanks. [1] https://review.openstack.org/628220 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From sean.mcginnis at gmx.com Thu Jan 3 16:59:45 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 10:59:45 -0600 Subject: [release] Release countdown for week R-13, Jan 7-11 Message-ID: <20190103165944.GA5101@sm-workstation> Welcome back from the holiday lull! Development Focus ----------------- Teams should be focused on completing milestone 2 activities by January 10 and checking overall progress for the development cycle. General Information ------------------- The Stein membership freeze coincides with the Stein-2 milestone on the 10th. While doing an audit of all officially governed team repos, there are quite a few that have not had an official release done yet. If your team has added any repos for deliverables you would like to have included in the Stein coordinated release, please add at least an empty template deliverable file for now so we can help track those. We understand some may not be quite ready for a full release yet, but if you have something minimally viable to get released it would be good to do a 0.x release to exercise the release tooling for your deliverables. Another reminder about the changes this cycle with library deliverables that follow the cycle-with-milestones release model. As announced, we will be automatically proposing releases for these libraries at milestone 2 if there have been any functional changes since the last release to help ensure those changes are picked up by consumers with plenty of time to identify and correct issues. 
More detail can be found in the original mailing list post describing the changes: http://lists.openstack.org/pipermail/openstack-dev/2018-October/135689.html Any other cycle-with-intermediary deliverables that have not requested a release by January 10 will be switched to the cycle-with-rc model as discussed previously: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000465.html Upcoming Deadlines & Dates -------------------------- Stein-2 Milestone: January 10 -- Sean McGinnis (smcginnis) From mriedemos at gmail.com Thu Jan 3 17:40:22 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 11:40:22 -0600 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1545992000.14055.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On 12/28/2018 4:13 AM, Balázs Gibizer wrote: > I'm wondering whether introducing an API microversion could act as the > feature flag I need and at the same time still make the feature > discoverable as you would like to see it. Something like: Create a > feature flag in the code but do not put it in the config as a settable > flag. Instead add an API microversion patch to the top of the series, > and when the new version is requested it enables the feature via the > feature flag. This API patch can be small and simple enough to > cherry-pick earlier in the series for local end-to-end testing if > needed. Also, in functional tests I can set the flag via a mock so I can > add and run functional tests patch by patch. That may work. It's not how I would have done this; I would have started from the bottom and worked my way up, with the end-to-end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now, so that's not really something you want to change at this point.
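To illustrate gibi's idea in a few lines (the names and the version number here are hypothetical placeholders, not actual nova internals):

```python
# The "feature flag" is just a microversion comparison: clients that
# request an older version never see the new field, so the series can
# merge patch by patch without a settable config option.
FEATURE_MICROVERSION = (2, 72)  # hypothetical placeholder version

def parse_version(header):
    major, minor = header.split(".")
    return int(major), int(minor)

def show_server(requested_version, server):
    resp = {"id": server["id"], "name": server["name"]}
    # Only expose the new field at or above the feature's microversion.
    if parse_version(requested_version) >= FEATURE_MICROVERSION:
        resp["topology"] = server.get("numa_topology", {})
    return resp

print("topology" in show_server("2.71", {"id": "a", "name": "vm"}))  # False
print("topology" in show_server("2.72", {"id": "a", "name": "vm"}))  # True
```

The same comparison also makes the feature discoverable: a client can probe the supported version range instead of guessing at deployment configuration.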
I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions. -- Thanks, Matt From me at not.mn Thu Jan 3 18:41:26 2019 From: me at not.mn (John Dickinson) Date: Thu, 03 Jan 2019 10:41:26 -0800 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: On 3 Jan 2019, at 5:51, Sean McGinnis wrote: > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: >> Dear All, >> >> There are many occasion when we want to priorities some of the patches >> whether it is related to unblock the gates or blocking the non freeze >> patches during RC. >> >> So adding the Review-Priority will allow more precise dashboard. As >> Designate and Cinder projects already experiencing this[1][2] and after >> discussion with Jeremy brought this to ML to interact with these team >> before landing [3], as there is possibility that reapply the priority vote >> following any substantive updates to change could make it more cumbersome >> than it is worth. > > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > a patchset is updates, the -1 and +2 carry forward. But for some reason we > can't get the +1 to be sticky. > > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me. > > The one thing I have been slightly concerned with is the process around using > these priority votes. 
It hasn't been an issue, but I could see a scenario where > one core (in Cinder we have it set up so all cores can use the priority voting) > has set something like a procedural -1, then been pulled away or is absent for > an extended period. Like a Workflow -2, another core cannot override that vote. > So until that person is back to remove the -1, that patch would not be able to > be merged. > > Granted, we've lived with this with Workflow -2's for years and it's never been > a major issue, but I think as far as centralizing control, it may make sense to > have a separate smaller group (just the PTL, or PTL and a few "deputies") that > are able to set priorities on patches just to make sure the folks setting it > are the ones that are actively tracking what the priorities are for the > project. > > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean This looks pretty interesting. I have a few questions about how it's practically working out. I get the impression that the values of the votes are configurable? So you've chosen -1, +1, and +2, but you could have chosen 1, 2, 3, 4, 5 (for example)? Do you have an example of a dashboard that's using these values? IMO, gerrit's display of votes is rather bad. I'd prefer that votes like this could be aggregated. How do you manage discovering what patches are priority or not? I guess that's where the dashboards come in? I get the impression that particular priority votes have the ability to block a merge. How does that work? half-plug/half-context, I've attempted to solve priority discovery in the Swift community with some custom dashboards. We've got a review dashboard in gerrit [1] that shows "starred by the ptl".
I've also created a tool that finds all the starred patches by all contributors, weights each contributor by how much they've contributed recently, and then sorts the resulting totals as a list of "stuff the community thinks is important"[2]. As a community, we also manage our own wiki page for prioritization[3]. I'd love to see if some functionality in gerrit, eg these priority review votes, could supplant some of our other tools. [1] http://not.mn/reviews.html [2] http://d.not.mn/swift_community_dashboard.html [3] https://wiki.openstack.org/wiki/Swift/PriorityReviews --John -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 850 bytes Desc: OpenPGP digital signature URL: From bcafarel at redhat.com Thu Jan 3 19:06:53 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Thu, 3 Jan 2019 20:06:53 +0100 Subject: [neutron] Switching Ocata branch to Extended maintenance Message-ID: Hello and happy new year, as discussed in our final 2018 meeting [0], we plan to switch the Ocata branch of Neutron to Extended maintenance [1]. There will be a final Ocata release for Neutron itself. Stadium projects backports only included test fixes so do not really need a final release. We will then mark the branch as ocata-em. Any comments or objections? Especially from driver projects maintainers, as these have independent releases. [0] http://eavesdrop.openstack.org/meetings/networking/2018/networking.2018-12-18-14.00.log.html [1] https://docs.openstack.org/project-team-guide/stable-branches.html#extended-maintenance -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mriedemos at gmail.com Thu Jan 3 19:12:11 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 13:12:11 -0600 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 1/3/2019 6:39 AM, Jay Pipes wrote: > On 01/02/2019 10:15 PM, yonglihe wrote: >> On 2018/12/18 下午4:20, yonglihe wrote: >>> Hi, guys >>> >>> This spec needs input and discuss for move on. >> >> Jay suggest we might be good to use a new sub node to hold topology >> stuff,  it's option 2, here. And split >> >> the PCI stuff out of this NUMA thing spec, use a /devices node to hold >> all 'devices' stuff instead, then this node >> >> is generic and not only for PCI itself. >> >> I'm OK for Jay's suggestion,  it contains more key words and seems >> crystal clear and straight forward. >> >> The problem is we need aligned about this. This spec need gain more >> input thanks, Jay, Matt. > > Also, I mentioned that you need not (IMHO) combine both PCI/devices and > NUMA topology in a single spec. We could proceed with the /topology API > endpoint and work out the more generic /devices API endpoint in a > separate spec. > > Best, > -jay I said earlier in the email thread that I was OK with option 2 (sub-resource) or the diagnostics API, and leaned toward the diagnostics API since it was already admin-only. As long as this information is admin-only by default, not part of the main server response body and therefore not parting of listing servers with details (GET /servers/detail) then I'm OK either way and GET /servers/{server_id}/topology is OK with me also. 
-- Thanks, Matt From whayutin at redhat.com Thu Jan 3 19:12:59 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Thu, 3 Jan 2019 12:12:59 -0700 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline In-Reply-To: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> References: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> Message-ID: On Thu, Jan 3, 2019 at 8:33 AM Giulio Fidente wrote: > On 1/3/19 4:19 PM, Marios Andreou wrote: > > o/ TripleO's & Happy New Year \o/ > > > > if you are tracking the ci squad you may know that one area of focus > > recently is moving the scenario-multinode jobs to > > harder/better/faster/stronger (most importantly *smaller* but not as > > cool) scenario-standalone equivalents. > > > > Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is > > doing cleanup on those. This includes making sure the new jobs are used > > in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will > > follow) and fixing any missing services or any other nits we find. Once > > done we can move on to the rest - scenarios 5/6 etc. > > > > We are looking for any feedback about the jobs in general or any one in > > particular if you have some special interest in a particular service > > (see [5] for reminder about services and scenarios). > > > > Most importantly those jobs are now being set as voting (e.g. already > > done for 1/4 at [1]) and the next natural step once voting is to add > > them into the master promotion pipeline. > > Please let us know if you think this is a bad idea or with any other > > feedback or suggestion. > thanks a lot! > > note that these are the two scenarios testing ceph: I guess what I'm > trying to say, is that if I can change, and you can change, everybody > can change! > > (rocky balboa TM) > -- > Giulio Fidente > GPG KEY: 08D733BA > Well said Giulio, lolz. Folks, please do take time to help review and merge these patches. 
Noting this is a big part of reducing TripleO's upstream resource footprint. You may recall upstream infra detailing our very large consumption of upstream resources [1][2]. 99% of the multinode scenario 1-4 jobs should be able to be moved to the standalone deployment. If you think you have an exception please reach out to the team in #tripleo. Looking ahead, once scenarios 1-4 are updated across all the projects we'll start to tackle the other scenario jobs one at a time [3]. Thanks [1] https://gist.github.com/notmyname/8bf3dbcb7195250eb76f2a1a8996fb00 [2] http://lists.openstack.org/pipermail/openstack-dev/2018-September/134867.html [3] https://github.com/openstack/tripleo-heat-templates/blob/master/README.rst#service-testing-matrix From fungi at yuggoth.org Thu Jan 3 19:41:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 19:41:52 +0000 Subject: [all] One month with openstack-discuss (a progress report) Message-ID: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> First, I want to thank everyone here for the remarkably smooth transition to openstack-discuss at the end of November. It's been exactly one month today since we shuttered the old openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists and forwarded all subsequent posts for them to the new list address instead. The number of posts from non-subscribers has dwindled to the point where it's now only a few each day (many of whom also subscribe immediately after receiving the moderation autoresponse). As of this moment, we're up to 708 subscribers. Unfortunately it's hard to compare raw subscriber counts because the longer a list is in existence the more dead addresses it accumulates.
Mailman does its best to unsubscribe addresses which explicitly reject/bounce multiple messages in a row, but these days many E-mail addresses grow defunct without triggering any NDRs (perhaps because they've simply been abandoned, or because their MTAs just blackhole new messages for deleted accounts). Instead, it's a lot more concrete to analyze active participants on mailing lists, especially since ours are consistently configured to require a subscription if you want to avoid your messages getting stuck in the moderation queue. Over the course of 2018 (at least until the lists were closed on December 3) there were 1075 unique E-mail addresses posting to one or more of the openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists. Now, a lot of those people sent one or maybe a handful of messages to ask some question they had, and then disappeared again... they didn't really follow ongoing discussions, so probably won't subscribe to openstack-discuss until they have something new to bring up. On the other hand, if we look at addresses which sent 10 or more messages in 2018 (an arbitrary threshold admittedly), there were 245. Comparing those to the list of addresses subscribed to openstack-discuss today, there are 173 matches. That means we now have *at least* 70% of the people who sent 10 or more messages to the old lists subscribed to the new one. I say "at least" because we don't have an easy way to track address changes, and even if we did that's never going to get us to 100% because there are always going to be people who leave the lists abruptly for various reasons (perhaps even disappearing from our community entirely). Seems like a good place to be after only one month, especially considering the number of folks who may not have even been paying attention at all during end-of-year holidays.
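The overlap arithmetic described above can be sketched as a simple set intersection. This is purely illustrative; the addresses below are synthetic stand-ins for the real subscriber data:

```python
def resubscribed(frequent_posters, subscribers):
    """Return how many frequent posters from the old lists appear on
    the new list, and the fraction of them that represents."""
    matches = frequent_posters & subscribers
    return len(matches), len(matches) / len(frequent_posters)

# Synthetic stand-ins: 245 addresses posted 10+ times in 2018,
# and 173 of those also appear among today's subscribers.
frequent = {"poster%03d@example.org" % i for i in range(245)}
current = {"poster%03d@example.org" % i for i in range(173)}

count, fraction = resubscribed(frequent, current)
print(count, round(fraction, 3))  # 173 0.706 -- the "at least 70%" figure
```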
As for message volume, we had a total of 912 posts to openstack-discuss in the month of December; comparing to the 1033 posts in total we saw to the four old lists in December of 2017, that's a 12% drop. Consider, though, that right at 10% of the messages on the old lists were duplicates from cross-posting, so that's really more like a 2% drop in actual (deduplicated) posting volume. It's far less of a reduction than I would have anticipated based on year-over-year comparisons (for example, December of 2016 had 1564 posts across those four lists). I think based on this, it's safe to say the transition to openstack-discuss hasn't hampered discussion, at least for its first full month in use. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jimmy at openstack.org Thu Jan 3 19:44:15 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 03 Jan 2019 13:44:15 -0600 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: <5C2E660F.4010609@openstack.org> Thanks for the update on this, Jeremy! I was curious about the details behind those numbers :) > Jeremy Stanley > January 3, 2019 at 1:41 PM > First, I want to thank everyone here for the remarkably smooth > transition to openstack-discuss at the end of November. It's been > exactly one month today since we shuttered the old openstack, > openstack-dev, openstack-operators and openstack-sigs mailing lists > and forwarded all subsequent posts for them to the new list address > instead. The number of posts from non-subscribers has dwindled to > the point where it's now only a few each day (many of whom also > subscribe immediately after receiving the moderation autoresponse). > > As of this moment, we're up to 708 subscribers. 
Unfortunately it's > hard to compare raw subscriber counts because the longer a list is > in existence the more dead addresses it accumulates. Mailman does > its best to unsubscribe addresses which explicitly reject/bounce > multiple messages in a row, but these days many E-mail addresses > grow defunct without triggering any NDRs (perhaps because they've > simply been abandoned, or because their MTAs just blackhole new > messages for deleted accounts). Instead, it's a lot more concrete to > analyze active participants on mailing lists, especially since ours > are consistently configured to require a subscription if you want to > avoid your messages getting stuck in the moderation queue. > > Over the course of 2018 (at least until the lists were closed on > December 3) there were 1075 unique E-mail addresses posting to one > of more of the openstack, openstack-dev, openstack-operators and > openstack-sigs mailing lists. Now, a lot of those people sent one or > maybe a handful of messages to ask some question they had, and then > disappeared again... they didn't really follow ongoing discussions, > so probably won't subscribe to openstack-discuss until they have > something new to bring up. > > On the other hand, if we look at addresses which sent 10 or more > messages in 2018 (an arbitrary threshold admittedly), there were > 245. Comparing those to the list of addresses subscribed to > openstack-discuss today, there are 173 matches. That means we now > have *at least* 70% of the people who sent 10 or more messages to > the old lists subscribed to the new one. I say "at least" because we > don't have an easy way to track address changes, and even if we did > that's never going to get us to 100% because there are always going > to be people who leave the lists abruptly for various reasons > (perhaps even disappearing from our community entirely). 
Seems like > a good place to be after only one month, especially considering the > number of folks who may not have even been paying attention at all > during end-of-year holidays. > > As for message volume, we had a total of 912 posts to > openstack-discuss in the month of December; comparing to the 1033 > posts in total we saw to the four old lists in December of 2017, > that's a 12% drop. Consider, though, that right at 10% of the > messages on the old lists were duplicates from cross-posting, so > that's really more like a 2% drop in actual (deduplicated) posting > volume. It's far less of a reduction than I would have anticipated > based on year-over-year comparisons (for example, December of 2016 > had 1564 posts across those four lists). I think based on this, it's > safe to say the transition to openstack-discuss hasn't hampered > discussion, at least for its first full month in use. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Jan 3 19:50:48 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 13:50:48 -0600 Subject: Review-Priority for Project Repos In-Reply-To: References: <20190103135155.GC27473@sm-workstation> Message-ID: <20190103195048.GB10975@sm-workstation> > > This looks pretty interesting. I have a few question about how it's practically working out. > > > I get the impression that the values of the votes are configurable? So you've chosen -1, +1, and +2, but you could have chosen 1, 2, 3, 4, 5 (for example)? > Yes, I believe so. Maybe someone more familiar with how this works in Gerrit can correct me if I'm misstating that. > > Do you have an example of a dashboard that's using these values? > Being able to easily query and create a dashboard for this is the big benefit. Here's what I came up with for Cinder: https://tiny.cc/CinderPriorities > > IMO, gerrit's display of votes is rather bad. I'd prefer that votes like this could be aggregated. 
How do you manage discovering what patches are priority or not? I guess that's where the dashboards come in? > We haven't been aggregating votes, rather just a core can decide to flag things as priority or not. > > I get the impression that particular priority votes have the ability to block a merge. How does that work? > I believe this is what controls that: http://git.openstack.org/cgit/openstack-infra/project-config/tree/gerrit/acls/openstack/cinder.config#n29 So I believe that could be NoBlock or NoOp to allow just ranking without enforcing any kind of blocking. From fungi at yuggoth.org Thu Jan 3 20:04:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 20:04:29 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190103195048.GB10975@sm-workstation> References: <20190103135155.GC27473@sm-workstation> <20190103195048.GB10975@sm-workstation> Message-ID: <20190103200429.pwafqqohb7lxccw7@yuggoth.org> On 2019-01-03 13:50:48 -0600 (-0600), Sean McGinnis wrote: [...] > > I get the impression that the values of the votes are > > configurable? So you've chosen -1, +1, and +2, but you could > > have chosen 1, 2, 3, 4, 5 (for example)? > > Yes, I believe so. Maybe someone more familiar with how this works > in Gerrit can correct me if I'm misstating that. [...] The value ranges are entirely arbitrary as far as I know. Keep in mind though that the Gerrit configuration to carry over votes to new patch sets under specific conditions can apparently only be set to carry over the highest and lowest possible values, but none in between. I really don't understand that design choice on their part, but that's what it does. > So I believe that could be NoBlock or NoOp to allow just ranking > without enforcing any kind of blocking. Yes, that should work if it's the behavior you're looking for. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Thu Jan 3 20:38:24 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 3 Jan 2019 14:38:24 -0600 Subject: [tripleo] OVB stable/1.0 branch created Message-ID: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> Just a quick update on the status of OVB. There's some discussion on [1] about how to handle importing it to Gerrit, and it looks like we first want to wrap up the 2.0-dev feature branch so we don't have to mess with that after import. As a result, I've cut the stable/1.0 branch from current OVB master. Anyone who's not ready to try out 2.0 should start pointing at stable/1.0 instead of master as the 2.0-dev branch will soon be merged back to master. This will leave us with a branch structure that better matches most other OpenStack projects. We should probably give consumers some time to get switched over, but it sounds like TripleO CI already pins to a specific commit in OVB so it may not affect that. Once a little time has passed we can do the 2.0-dev merge and proceed with the import. I also made a wordier blog post about this[2]. Let me know if you have any feedback. Thanks. -Ben 1: https://review.openstack.org/#/c/620613/ 2: http://blog.nemebean.com/content/openstack-virtual-baremetal-import-plans From msm at redhat.com Thu Jan 3 20:59:47 2019 From: msm at redhat.com (Michael McCune) Date: Thu, 3 Jan 2019 15:59:47 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <5C2E660F.4010609@openstack.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> <5C2E660F.4010609@openstack.org> Message-ID: On Thu, Jan 3, 2019 at 2:47 PM Jimmy McArthur wrote: > > Thanks for the update on this, Jeremy! I was curious about the details behind those numbers :) seconded, i really appreciate the update and all the work that went into the transition. it's been completely smooth and painless on my end. 
kudos peace o/ From whayutin at redhat.com Thu Jan 3 21:04:16 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Thu, 3 Jan 2019 14:04:16 -0700 Subject: [tripleo] OVB stable/1.0 branch created In-Reply-To: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> References: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> Message-ID: Thanks for the update Ben! On Thu, Jan 3, 2019 at 1:43 PM Ben Nemec wrote: > Just a quick update on the status of OVB. There's some discussion on [1] > about how to handle importing it to Gerrit, and it looks like we first > want to wrap up the 2.0-dev feature branch so we don't have to mess with > that after import. As a result, I've cut the stable/1.0 branch from > current OVB master. Anyone who's not ready to try out 2.0 should start > pointing at stable/1.0 instead of master as the 2.0-dev branch will soon > be merged back to master. This will leave us with a branch structure > that better matches most other OpenStack projects. > > We should probably give consumers some time to get switched over, but it > sounds like TripleO CI already pins to a specific commit in OVB so it > may not affect that. Once a little time has passed we can do the 2.0-dev > merge and proceed with the import. > > I also made a wordier blog post about this[2]. > > Let me know if you have any feedback. Thanks. > > -Ben > > 1: https://review.openstack.org/#/c/620613/ > 2: > http://blog.nemebean.com/content/openstack-virtual-baremetal-import-plans > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Thu Jan 3 21:57:39 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 13:57:39 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> (Matt Riedemann's message of "Tue, 18 Dec 2018 20:04:00 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> Message-ID: > 1. 
This can already be done using existing APIs (as noted) client-side > if monitoring the live migration and it times out for whatever you > consider a reasonable timeout at the time. There's another thing to point out here, which is that this is also already doable by adjusting (rightly libvirt-specific) config tunables on a compute node that is being evacuated. Those could be hot-reloadable, meaning they could be changed without restarting the compute service when the evac process begins. It doesn't let you control it per-instance, granted, but there *is* a server-side solution to this based on existing stuff. > 2. The libvirt driver is the only one that currently supports abort > and force-complete. > > For #1, while valid as a workaround, is less than ideal since it would > mean having to orchestrate that into any tooling that needs that kind > of workaround, be that OSC, openstacksdk, python-novaclient, > gophercloud, etc. I think it would be relatively simple to pass those > parameters through with the live migration request down to > nova-compute and have the parameters override the config options and > then it's natively supported in the API. > > For #2, while also true, I think is not a great reason *not* to > support per-instance timeouts/actions in the API when we already have > existing APIs that do the same thing and have the same backend compute > driver limitations. To ease this, I think we can sort out two things: > > a) Can other virt drivers that support live migration (xenapi, hyperv, > vmware in tree, and powervm out of tree) also support abort and > force-complete actions? John Garbutt at least thought it should be > possible for xenapi at the Stein PTG. I don't know about the others - > driver maintainers please speak up here. The next challenge would be > getting driver maintainers to actually add that feature parity, but > that need not be a priority for Stein as long as we know it's possible > to add the support eventually. 
I think that we asked Eric and he said that powervm would/could not support such a thing because they hand the process off to the hypervisor and don't pay attention to what happens after that (and/or can't cancel it). I know John said he thought it would be doable for xenapi, but even if it is, I'm not expecting it will happen. I'd definitely like to hear from the others. > b) There are pre-live migration checks that happen on the source > compute before we initiate the actual guest transfer. If a user > (admin) specified these new parameters and the driver does not support > them, we could fail the live migration early. This wouldn't change the > instance status but the migration would fail and an instance action > event would be recorded to explain why it didn't work, and then the > admin can retry without those parameters. This would shield us from > exposing something in the API that could give a false sense of > functionality when the backend doesn't support it. This is better than nothing, granted. What I'm concerned about is not that $driver never supports these, but rather that $driver shows up later and wants *different* parameters. Or even that libvirt/kvm migration changes in such a way that these no longer make sense even for it. We already have an example of this in-tree today, where the recently-added libvirt post-copy mode makes the 'abort' option invalid. > Given all of this, are these reasonable compromises to continue trying > to drive this feature forward, and more importantly, are other > operators looking to see this functionality added to nova? Huawei > public cloud operators want it because they routinely are doing live > migrations as part of maintenance activities and want to be able to > control these values per-instance. I assume there are other > deployments that would like the same. I don't need to hold this up if everyone else is on board, but I don't really want to +2 it. 
I'll commit to not -1ing it if it specifically confirms support before starting a migration that won't honor the requested limits. --Dan From mriedemos at gmail.com Thu Jan 3 22:17:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 16:17:33 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> Message-ID: <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> On 1/3/2019 3:57 PM, Dan Smith wrote: > Or even that libvirt/kvm > migration changes in such a way that these no longer make sense even for > it. We already have an example this in-tree today, where the > recently-added libvirt post-copy mode makes the 'abort' option invalid. I'm not following you here. As far as I understand, post-copy in the libvirt driver is triggered on the force complete action and only if (1) it's available and (2) nova is configured to allow it, otherwise the force complete action for the libvirt driver pauses the VM. The abort operation aborts the job in libvirt [1] which I believe triggers a rollback [2]. [1] https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7388 [2] https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7454 -- Thanks, Matt From dms at danplanet.com Thu Jan 3 22:37:03 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 14:37:03 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> (Matt Riedemann's message of "Thu, 3 Jan 2019 16:17:33 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> Message-ID: >> Or even that libvirt/kvm >> migration changes in such a way that these no longer make sense even for >> it. 
We already have an example this in-tree today, where the >> recently-added libvirt post-copy mode makes the 'abort' option invalid. > > I'm not following you here. As far as I understand, post-copy in the > libvirt driver is triggered on the force complete action and only if > (1) it's available and (2) nova is configured to allow it, otherwise > the force complete action for the libvirt driver pauses the VM. The > abort operation aborts the job in libvirt [1] which I believe triggers > a rollback [2]. > > [1] > https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7388 > [2] > https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7454 Because in nova[0] we currently only switch to post-copy after we decide we're not making progress right? If we later allow a configuration where post-copy is the default from the start (as I believe is the actual current recommendation from the virt people[1]), and someone triggers a migration with a short timeout and abort action, we'll not be able to actually do the abort. I'm guessing we'd just need to refuse a request where abort is specified with any timeout if post-copy will be used from the beginning. 
Since the API user can't know how the virt driver is configured, we just have to refuse to do the migration and hope they'll understand :) 0: Sorry, I shouldn't have said "in tree" because I meant "in the libvirt world" 1: look for "in summary" here: https://www.berrange.com/posts/2016/05/12/analysis-of-techniques-for-ensuring-migration-completion-with-kvm/ --Dan From mriedemos at gmail.com Thu Jan 3 23:23:21 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 17:23:21 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> Message-ID: <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> On 1/3/2019 4:37 PM, Dan Smith wrote: > Because in nova[0] we currently only switch to post-copy after we decide > we're not making progress right? If you're referring to the "live_migration_progress_timeout" option that has been deprecated and was replaced in Stein with the live_migration_timeout_action option, which was a pre-requisite for the per-instance timeout + action spec. In Stein, we only switch to post-copy if we hit live_migration_completion_timeout and live_migration_timeout_action=force_complete and live_migration_permit_post_copy=True (and libvirt/qemu are new enough for post-copy), otherwise we pause the guest. So I don't think the stalled progress stuff has applied for awhile (OSIC found problems with it in Ocata and disabled/deprecated it). > If we later allow a configuration where > post-copy is the default from the start (as I believe is the actual > current recommendation from the virt people[1]), and someone triggers a > migration with a short timeout and abort action, we'll not be able to > actually do the abort. Sorry but I don't understand this, how does "post-copy from the start" apply? 
If I specify a short timeout and abort action in the API, and the timeout is reached before the migration is complete, it should abort, just like if I abort it via the API. As noted above, post-copy should only be triggered once we reach the timeout, and if you overwrite that action to abort (per instance, in the API), it should abort rather than switch to post-copy. -- Thanks, Matt From dms at danplanet.com Thu Jan 3 23:45:25 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 15:45:25 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> (Matt Riedemann's message of "Thu, 3 Jan 2019 17:23:21 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> Message-ID: Matt Riedemann writes: > On 1/3/2019 4:37 PM, Dan Smith wrote: >> Because in nova[0] we currently only switch to post-copy after we decide >> we're not making progress right? > > If you're referring to the "live_migration_progress_timeout" option > that has been deprecated and was replaced in Stein with the > live_migration_timeout_action option, which was a pre-requisite for > the per-instance timeout + action spec. > > In Stein, we only switch to post-copy if we hit > live_migration_completion_timeout and > live_migration_timeout_action=force_complete and > live_migration_permit_post_copy=True (and libvirt/qemu are new enough > for post-copy), otherwise we pause the guest. > > So I don't think the stalled progress stuff has applied for awhile > (OSIC found problems with it in Ocata and disabled/deprecated it). Yeah, I'm trying to point out something _other_ than what is currently nova behavior. 
>> If we later allow a configuration where >> post-copy is the default from the start (as I believe is the actual >> current recommendation from the virt people[1]), and someone triggers a >> migration with a short timeout and abort action, we'll not be able to >> actually do the abort. > > Sorry but I don't understand this, how does "post-copy from the start" > apply? If I specify a short timeout and abort action in the API, and > the timeout is reached before the migration is complete, it should > abort, just like if I abort it via the API. As noted above, post-copy > should only be triggered once we reach the timeout, and if you > overwrite that action to abort (per instance, in the API), it should > abort rather than switch to post-copy. You can't abort a post-copy migration once it has started. If we were to add an "always do post-copy" mode to Nova, per the recommendation from the post I linked, then we would start a migration in post-copy mode, which would make it un-cancel-able. That means not only could you not cancel it, but we would have to refuse to start the migration if the user requested an abort action via this new proposed API with any timeout value. Anyway, my point here is just that libvirt already (but not nova/libvirt yet) has a live migration mode where we would not be able to honor a request of "abort after N seconds". If config specified that, we could warn or fail on startup, but via the API all we'd be able to do is refuse to start the migration. I'm just trying to highlight that baking "force/abort after N seconds" into our API is not only just libvirt-specific at the moment, but even libvirt-pre-copy specific. 
--Dan From mriedemos at gmail.com Fri Jan 4 00:02:16 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 18:02:16 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> Message-ID: <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> On 1/3/2019 5:45 PM, Dan Smith wrote: > You can't abort a post-copy migration once it has started. If we were to > add an "always do post-copy" mode to Nova, per the recommendation from > the post I linked, then we would start a migration in post-copy mode, > which would make it un-cancel-able. That means not only could you not > cancel it, but we would have to refuse to start the migration if the > user requested an abort action via this new proposed API with any > timeout value. > > Anyway, my point here is just that libvirt already (but not nova/libvirt > yet) has a live migration mode where we would not be able to honor a > request of "abort after N seconds". If config specified that, we could > warn or fail on startup, but via the API all we'd be able to do is > refuse to start the migration. I'm just trying to highlight that > baking "force/abort after N seconds" into our API is not only just > libvirt-specific at the moment, but even libvirt-pre-copy specific. OK, sorry, I'm following you now. I didn't make the connection that you were talking about something we could do in the future (in nova) to initiate the live migration in post-copy mode. Yeah I agree in that case if the user said abort we'd just have to reject it and say you can't do that based on how the source host is configured. 
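For reference, the client-side workaround from point 1 of this thread — monitor the migration yourself and abort once your own deadline passes — could be sketched roughly as below. This is illustrative only, not the proposed spec behavior; it assumes a python-novaclient handle negotiated at microversion 2.24 or later (when the abort action was added), and `nova`, the server ID, and the timeout values are placeholders.

```python
import time

def abort_if_too_slow(nova, server_id, timeout=300, poll=5):
    """Watch a server's in-progress live migration and abort it if it
    is still running once the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not nova.server_migrations.list(server_id):
            return  # migration finished (or was never started)
        time.sleep(poll)
    # Deadline hit: abort whatever is still in flight.
    for migration in nova.server_migrations.list(server_id):
        nova.server_migrations.live_migration_abort(server_id, migration.id)
```

As the thread notes, this only helps where the driver actually supports abort (libvirt pre-copy today), which is exactly the parity gap the per-instance API proposal has to contend with.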
-- Thanks, Matt From jungleboyj at gmail.com Fri Jan 4 00:26:04 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Thu, 3 Jan 2019 18:26:04 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. It is unfortunate that we can't get +1's to carry forward but I don't think this negates the value of having the priorities.  I have been using our review dashboard quite a bit lately and plan to set up processes that involve it as we move forward to using/documenting Storyboard for Cinder. > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me.  Is there some way that we could allow the owner to reset this priority after pushing up a new patch?  That would lower the dependence on the cores in that case. > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean > As usual, the biggest problem I am seeing is getting enough people to do the reviews and really set up all the priorities appropriately.  There are just a couple of us doing it right now. I am hoping to see more participation in the coming months to make the output more beneficial for all. 
From chris at openstack.org Fri Jan 4 00:27:25 2019 From: chris at openstack.org (Chris Hoge) Date: Thu, 3 Jan 2019 16:27:25 -0800 Subject: [loci] Loci Meeting for January 3, 2019 Message-ID: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> On Friday, January 3 we will resume our Loci team meetings at 7 AM PT/ 15 UTC after an extended end-of-year break. We will revisit our plan for building stable branches (in light of several failures to build stable branches on a few distributions because of requirements failures), coming up with a more robust testing strategy, and planning new development priorities for the remainder of the cycle. Thanks, Chris https://etherpad.openstack.org/p/loci-meeting From chris at openstack.org Fri Jan 4 00:29:20 2019 From: chris at openstack.org (Chris Hoge) Date: Thu, 3 Jan 2019 16:29:20 -0800 Subject: [loci] Loci Meeting for January 3, 2019 In-Reply-To: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> References: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> Message-ID: <096E5BE5-2F28-4BBE-A6DA-69E95B0D7FB1@openstack.org> Correction, the date of the meeting is January 4. > On Jan 3, 2019, at 4:27 PM, Chris Hoge wrote: > > On Friday, January 3 we will resume our Loci team meetings at 7 AM PT/ 15 > UTC after an extended end-of-year break. We will revisit our plan for > building stable branches (in light of several failures to build stable > branches on a few distributions because of requirements failures), coming > up with a more robust testing strategy, and planning new development > priorities for the remainder of the cycle. 
> > Thanks, > Chris > > https://etherpad.openstack.org/p/loci-meeting > > From cboylan at sapwetik.org Fri Jan 4 01:22:26 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 03 Jan 2019 17:22:26 -0800 Subject: Review-Priority for Project Repos In-Reply-To: References: <20190103135155.GC27473@sm-workstation> Message-ID: <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> On Thu, Jan 3, 2019, at 4:26 PM, Jay Bryant wrote: > > snip > > So far, that's just a slight inconvenience. It would be great if we can figure > > out a way to have them all be sticky, but if we need to live with reapplying +1 > > votes, that's manageable to me. > > > >  Is there someway that we could allow the owner to reset this priority > after pushing up a new patch.  That would lower the dependence on the > cores in that case. If you use a three-value label: [-1: +1] then you could set copy min and max scores so all values are carried forward on new patchsets. This would allow you to have -1 "Don't review", 0 "default no special priority", and +1 "this is a priority please review now". This may have to take advantage of the fact that if you don't set a value it's roughly the same as 0 (I don't know if this is explicitly true in Gerrit but we can approximate it since -1 and +1 would be explicitly set and query on those values). If you need an explicit copy-all-values function in Gerrit you'll want to get that merged upstream first, then we could potentially backport it to our Gerrit. This will likely require writing Java. For some reason I thought that Prolog predicates could be written for these value copying functions, but docs seem to say otherwise. Prolog is only for determining if a label's value allows a change to be submitted (merged). 
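The three-value label sketched above might look roughly like this in a project's Gerrit ACL file. This is a hypothetical fragment modeled on the cinder.config link earlier in the thread; exact keys and copy behavior vary across Gerrit versions:

```ini
# Hypothetical Review-Priority label (Gerrit project.config syntax).
# copyMinScore/copyMaxScore carry the -1 and +1 votes forward to new
# patch sets; NoBlock means the label ranks changes without gating them.
[label "Review-Priority"]
    function = NoBlock
    defaultValue = 0
    value = -1 Don't review
    value = 0 No special priority
    value = +1 Priority review
    copyMinScore = true
    copyMaxScore = true
```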
Clark From bathanhtlu at gmail.com Fri Jan 4 02:09:47 2019 From: bathanhtlu at gmail.com (=?UTF-8?B?VGjDoG5oIE5ndXnhu4VuIELDoQ==?=) Date: Fri, 4 Jan 2019 09:09:47 +0700 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: No, it isn't. It was raised when I used the default settings in my client based on "oslo_messaging". When I created the config file and passed it with "oslo_config" to the transport (get_notification_transport), it worked. :D Thanks for your help. *Nguyễn Bá Thành* *Mobile*: 0128 748 0391 *Email*: bathanhtlu at gmail.com On Thu, Jan 3, 2019 at 00:30, Doug Hellmann wrote: > Ben Nemec writes: > > > On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: > >> Dear all, > >> > >> I have a problem when use 'notification listener' oslo-message for HA > >> Openstack. > >> > >> It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to > >> connect to AMQP server on 172.16.4.125:5672 > >> after inf tries: Exchange.declare: (406) > >> PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in > >> vhost '/': received 'false' but current is 'true''. > >> > >> How can i fix this?. I think settings default in my program set > >> 'durable' is False so it can't listen RabbitMQ Openstack? > > > > It probably depends on which rabbit client library you're using to > > listen for notifications. Presumably there should be some way to > > configure it to set durable to True. > > IIRC, the "exchange" needs to be declared consistently among all > listeners because the first client to connect causes the exchange to be > created. > > > I guess the other option is to disable durable queues in the Nova > > config, but then you lose the contents of any queues when Rabbit gets > > restarted. It would be better to figure out how to make the consuming > > application configure durable queues instead. 
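As a concrete sketch of that suggestion, the consuming application can load the same RabbitMQ options Nova uses, so its exchange declarations match the broker's existing ones. The section and option names below are real oslo.messaging options; the file itself is illustrative:

```ini
# listener.conf (illustrative) -- loaded via oslo.config before calling
# oslo_messaging.get_notification_transport(conf)
[oslo_messaging_rabbit]
# Declare exchanges/queues durable, matching the Nova side, to avoid
# the PRECONDITION_FAILED "inequivalent arg 'durable'" error:
amqp_durable_queues = true
rabbit_ha_queues = true
```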
> > > >> > >> This is my nova.conf > >> > >> http://paste.openstack.org/show/738813/ > >> > >> > >> And section [oslo_messaging_rabbit] > >> > >> [oslo_messaging_rabbit] > >> rabbit_ha_queues = true > >> rabbit_retry_interval = 1 > >> rabbit_retry_backoff = 2 > >> amqp_durable_queues= true > > You say that is your nova.conf. Is that the same configuration file > your client is using when it connects? > > >> > >> > >> > >> *Nguyễn Bá Thành* > >> > >> *Mobile*: 0128 748 0391 > >> > >> *Email*: bathanhtlu at gmail.com > >> > > > > -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Fri Jan 4 02:14:01 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Thu, 3 Jan 2019 20:14:01 -0600 Subject: [openstack-dev] [neutron] Cancelling Neutron drivers meeting on January 4th Message-ID: Hi Neutrinos, Today I spent time triaging RFEs. I posted comments and questions in several of them in order to get them in good shape to be discussed by the Drivers team. None are ready to be discussed, though. As a consequence, I am cancelling the meeting on January 4th. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Fri Jan 4 07:19:04 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Fri, 4 Jan 2019 15:19:04 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? Message-ID: Dear Octavia team: This email aims to ask about the development progress of the l3-active-active blueprint. I noticed that the work in this area has been stagnant for eight months. https://review.openstack.org/#/q/l3-active-active I want to know the community's next work plan in this regard. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From melwittt at gmail.com Fri Jan 4 08:07:33 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:07:33 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> Message-ID: On Thu, 3 Jan 2019 18:02:16 -0600, Matt Riedemann wrote: > On 1/3/2019 5:45 PM, Dan Smith wrote: >> You can't abort a post-copy migration once it has started. If we were to >> add an "always do post-copy" mode to Nova, per the recommendation from >> the post I linked, then we would start a migration in post-copy mode, >> which would make it un-cancel-able. That means not only could you not >> cancel it, but we would have to refuse to start the migration if the >> user requested an abort action via this new proposed API with any >> timeout value. >> >> Anyway, my point here is just that libvirt already (but not nova/libvirt >> yet) has a live migration mode where we would not be able to honor a >> request of "abort after N seconds". If config specified that, we could >> warn or fail on startup, but via the API all we'd be able to do is >> refuse to start the migration. I'm just trying to highlight that >> baking "force/abort after N seconds" into our API is not only just >> libvirt-specific at the moment, but even libvirt-pre-copy specific. > > OK, sorry, I'm following you now. I didn't make the connection that you > were talking about something we could do in the future (in nova) to > initiate the live migration in post-copy mode. Yeah I agree in that case > if the user said abort we'd just have to reject it and say you can't do > that based on how the source host is configured. 
This seems like a reasonable way to handle the future case of a live migration initiated in post-copy mode. Overall, I'm in support of the idea of adding finer-grained control over live migrations, being that we have multiple operators who've expressed the usefulness they'd get from it and it seems like a relatively simple change. It also sounds like we have answers for the concerns about bad UX by checking pre-live-migration whether the driver supports the new parameters and fail fast in that case. And in the future if we have live migrations able to be initiated in post-copy mode, fail fast with instance action info similarly. -melanie From melwittt at gmail.com Fri Jan 4 08:48:45 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:48:45 -0800 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann wrote: > On 12/28/2018 4:13 AM, Balázs Gibizer wrote: >> I'm wondering that introducing an API microversion could act like a >> feature flag I need and at the same time still make the feautre >> discoverable as you would like to see it. Something like: Create a >> feature flag in the code but do not put it in the config as a settable >> flag. Instead add an API microversion patch to the top of the series >> and when the new version is requested it enables the feature via the >> feature flag. This API patch can be small and simple enough to >> cherry-pick to earlier into the series for local end-to-end testing if >> needed. Also in functional test I can set the flag via a mock so I can >> add and run functional tests patch by patch. > > That may work. 
It's not how I would have done this, I would have started > from the bottom and worked my way up with the end to end functional > testing at the end, as already noted, but I realize you've been pushing > this boulder for a couple of releases now so that's not really something > you want to change at this point. > > I guess the question is should this change have a microversion at all? > That's been wrestled in the spec review and called out in this thread. I > don't think a microversion would be *wrong* in any sense and could only > help with discoverability on the nova side, but am open to other opinions. Sorry to be late to this discussion, but this was brought up in the nova meeting today to get more thoughts. I'm going to briefly summarize my thoughts here. IMHO, I think this change should have a microversion, to help with discoverability. I'm thinking, how will users be able to detect they're able to leverage the new functionality otherwise? A microversion would signal the availability. As for dealing with the situation where a user specifies an older microversion combined with resource requests, I think it should behave similarly to how multiattach works, where the request will be rejected straight away if the microversion is too low and resource requests are passed. Current behavior today is that the resource requests are ignored. If we only ignored the resource requests when they're passed with an older microversion, it seems like it would be an unnecessarily poor UX to have their parameters ignored and likely lead them on a debugging journey if and when they realize things aren't working the way they expect given the resource requests they specified. 
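The two behaviors discussed here (reject outright versus today's silent ignore) can be sketched in plain Python; the microversion value and the function itself are placeholders for illustration, not Nova's actual code:

```python
# Placeholder for whichever microversion would introduce the feature.
BANDWIDTH_MICROVERSION = (2, 72)

def plan_ports(ports, requested_version, reject_old=True):
    """Decide what to do with each port's resource_request.

    reject_old=True models the multiattach-style rejection;
    reject_old=False models today's silent-ignore behavior.
    """
    plan = []
    for port in ports:
        if not port.get("resource_request"):
            plan.append(("no-op", port["id"]))
        elif requested_version >= BANDWIDTH_MICROVERSION:
            plan.append(("allocate-bandwidth", port["id"]))
        elif reject_old:
            raise ValueError(
                "port %s has a resource_request; microversion %d.%d or "
                "newer is required" % ((port["id"],) + BANDWIDTH_MICROVERSION))
        else:
            plan.append(("ignore-resource-request", port["id"]))
    return plan
```

Either way the decision happens up front in the API layer, so a user gets immediate feedback (or at least deterministic behavior) instead of discovering mid-debugging that their request parameters were dropped.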
-melanie From melwittt at gmail.com Fri Jan 4 08:53:47 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:53:47 -0800 Subject: [nova][dev] tracking spec reviews ahead of spec freeze In-Reply-To: <67c6b11f-a6a5-67cf-1b31-8ae8b5e580ae@gmail.com> References: <67c6b11f-a6a5-67cf-1b31-8ae8b5e580ae@gmail.com> Message-ID: On Wed, 19 Dec 2018 08:49:10 -0800, Melanie Witt wrote: > Hey all, > > Spec freeze is coming up shortly after the holidays on January 10, 2019. > Since we don't have much time after returning to work next year before > spec freeze, let's keep track of specs that are close to approval or > otherwise candidates to focus on in the last stretch before freeze. > > Here's an etherpad where I've collected a list of possible candidates > for focus the first week of January. Feel free to add notes and specs I > might have missed: > > https://etherpad.openstack.org/p/nova-stein-blueprint-spec-freeze Just wanted to bump up this message now that we're back from the holiday break. Milestone s-2 and our spec/blueprint freeze is next Thursday January 10. Please use this etherpad to help with spec reviews and blueprint approvals ahead of the freeze. Best, -melanie From rafaelweingartner at gmail.com Fri Jan 4 12:13:24 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 4 Jan 2019 10:13:24 -0200 Subject: Openstack CLI using a non-zero return code for successful command. Message-ID: Hello OpenStackers, I have been using "openstack" CLI for a while now, and for most of the commands (the ones I used so far), when it is successful, I receive a return code (RC) 0 (ZERO). However, when using the command "openstack federation protocol set --identity-provider --mapping ", I am getting an RC 1 (ONE) as the exit code for successful executions as well. Is this a known bug? -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Fri Jan 4 13:20:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 04 Jan 2019 13:20:54 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote: > On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann > wrote: > > On 12/28/2018 4:13 AM, Balázs Gibizer wrote: > > > I'm wondering that introducing an API microversion could act like a > > > feature flag I need and at the same time still make the feautre > > > discoverable as you would like to see it. Something like: Create a > > > feature flag in the code but do not put it in the config as a settable > > > flag. Instead add an API microversion patch to the top of the series > > > and when the new version is requested it enables the feature via the > > > feature flag. This API patch can be small and simple enough to > > > cherry-pick to earlier into the series for local end-to-end testing if > > > needed. Also in functional test I can set the flag via a mock so I can > > > add and run functional tests patch by patch. > > > > That may work. It's not how I would have done this, I would have started > > from the bottom and worked my way up with the end to end functional > > testing at the end, as already noted, but I realize you've been pushing > > this boulder for a couple of releases now so that's not really something > > you want to change at this point. > > > > I guess the question is should this change have a microversion at all? > > That's been wrestled in the spec review and called out in this thread. I > > don't think a microversion would be *wrong* in any sense and could only > > help with discoverability on the nova side, but am open to other opinions. > > Sorry to be late to this discussion, but this brought up in the nova > meeting today to get more thoughts. 
I'm going to briefly summarize my > thoughts here. > > IMHO, I think this change should have a microversion, to help with > discoverability. I'm thinking, how will users be able to detect they're > able to leverage the new functionality otherwise? A microversion would > signal the availability. As for dealing with the situation where a user > specifies an older microversion combined with resource requests, I think > it should behave similarly to how multiattach works, where the request > will be rejected straight away if microversion too low + resource > requests are passed. This has implications for upgrades and version compatibility. If a newer version of Neutron is used with older Nova, behavior will change when Nova is upgraded to a version that has the new microversion. My concern is as follows. A given deployment has Rocky Nova and Rocky Neutron. A tenant defines a minimum bandwidth policy and applies it to a network, then creates a port on that network. Neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network, but we could also assume the tenant applied the policy to the port directly. The tenant then boots a VM with that port. When the VM is scheduled to a node, Neutron will ask the network backend, via the ML2 driver, to configure the minimum bandwidth policy, if the backend supports it, as part of the bind-port call. The ML2 driver can refuse to bind the port at this point if it cannot fulfill the request, to prevent the VM from spawning. Assuming the binding succeeds, the backend will configure the minimum bandwidth policy on the interface. Nova in Rocky will not schedule based on the QoS policy, as there is no resource request on the port, and Placement will not model bandwidth availability. 
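The bind-time refusal described above can be sketched as follows; the function, field names, and capacity accounting are all illustrative stand-ins, not an actual ML2 driver API:

```python
def try_bind_port(port, available_kbps):
    """Sketch of an ML2-style backend refusing a bind it cannot satisfy.

    available_kbps stands in for whatever capacity accounting the
    backend does internally; without Placement involved, this check
    is the only thing preventing oversubscription.
    """
    wanted = port.get("min_bandwidth_kbps", 0)
    if wanted > available_kbps:
        return None  # refuse the binding; the VM will not spawn here
    return {"port": port["id"], "reserved_kbps": wanted}
```

The key point is that the refusal happens per host at bind time, after scheduling, rather than being modeled up front as a Placement inventory.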
Note: this is how minimum bandwidth was originally planned to be implemented with ML2/ODL and other SDN controller backends several years ago, but ODL did not implement the required features, so this mechanism was never used. I am not aware of any ML2 driver that actually implemented the bandwidth check, but before Placement was created this was the mechanism that at least my team at Intel, and some others, had been planning to use. So in Rocky the VM should boot, there will be no prevention of oversubscription in Placement, and Neutron will configure the minimum bandwidth policy if the network backend supports it. Ingress QoS minimum bandwidth rules were only added to Neutron recently, but egress QoS minimum bandwidth support was added in Newton with https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec so there will be a lot of existing cases where ports have minimum bandwidth policies from before Stein. If we repeat the same exercise with Rocky Nova and Stein Neutron, this changes slightly in that Neutron will look at the QoS policy associated with the port and add a resource request. As Rocky Nova will not have code to parse the resource requests from the Neutron port, they will be ignored and the VM will boot; Neutron will configure minimum bandwidth enforcement on the port, and Placement will model the bandwidth as an inventory, but no allocation will be created for the VM. Note: I have not checked the Neutron code to confirm the QoS plugin will still work without the Placement allocation, but if it does not, that is a bug, as Stein Neutron would no longer work with pre-Stein Nova; we would have broken the ability to upgrade Nova and Neutron separately. If you use Stein Nova and Stein Neutron and the new microversion, then the VM boots, we allocate the bandwidth in Placement, and we configure the enforcement in the networking backend if it supports it, which is our end goal. 
The last configuration is Stein Nova and Stein Neutron with an old microversion. This will happen in two cases: either no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest microversion, or an explicit old microversion is passed. If the last Rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then it would work the way it did with Rocky Nova and Stein Neutron above. If we choose to reject the request instead, anyone who tries to perform instance actions on an existing instance will break after Nova is upgraded to Stein. While the fact that oversubscription may happen could be problematic to debug for some, I think that UX cost is less than the cost of updating all software that has used egress QoS since it was introduced in Newton to explicitly pass the latest microversion. I am in favor of adding a microversion, by the way; I just think we should ignore the resource request if an old microversion is used. > Current behavior today would be, the resource > requests are ignored. If we only ignored the resource requests when > they're passed with an older microversion, it seems like it would be an > unnecessarily poor UX to have their parameters ignored and likely lead > them on a debugging journey if and when they realize things aren't > working the way they expect given the resource requests they specified. > > -melanie > > > > From skaplons at redhat.com Fri Jan 4 13:28:04 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Fri, 4 Jan 2019 14:28:04 +0100 Subject: [oslo] Parallel Privsep is Proposed for Release In-Reply-To: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Hi, I just found that functional tests in Neutron have been failing since today or maybe yesterday. See [1] I was able to reproduce it locally and it looks like it happens with oslo.privsep==1.31. 
With oslo.privsep==1.30.1 tests are fine. [1] https://bugs.launchpad.net/neutron/+bug/1810518 — Slawek Kaplonski Senior software engineer Red Hat > On 02.01.2019, at 19:17, Ben Nemec wrote: > > Yay alliteration! :-) > > I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. > > While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. > > So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. > > -Ben > > 1: https://review.openstack.org/628019 > 2: https://review.openstack.org/#/c/593556/ > From doug at doughellmann.com Fri Jan 4 14:26:22 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:26:22 -0500 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: Thành Nguyễn Bá writes: > No, it isn't. It raised when i use default settings on my client base on > "olso_messaging". And when i create the config file and use "oslo_config" > passed to tranport (get_notification_transport), it work :D > > Thank for your help. I'm glad to hear that it is working! 
-- Doug From doug at doughellmann.com Fri Jan 4 14:41:28 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:41:28 -0500 Subject: Review-Priority for Project Repos In-Reply-To: <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> References: <20190103135155.GC27473@sm-workstation> <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> Message-ID: Clark Boylan writes: > On Thu, Jan 3, 2019, at 4:26 PM, Jay Bryant wrote: >> >> > > snip > >> > So far, that's just a slight inconvenience. It would be great if we can figure >> > out a way to have them all be sticky, but if we need to live with reapplying +1 >> > votes, that's manageable to me. >> >> >> >>  Is there someway that we could allow the owner to reset this priority >> after pushing up a new patch.  That would lower the dependence on the >> cores in that case. > > If you use a three value label: [-1: +1] then you could set copy min > and max scores so all values are carried forward on new > patchsets. This would allow you to have -1 "Don't review", 0 "default > no special priority", and +1 "this is a priority please review > now". This may have to take advantage of the fact that if you don't > set a value its roughly the same as 0 (I don't know if this is > explicitly true in Gerrit but we can approximate it since -1 and +1 > would be explicitly set and query on those values). It is possible to tell the difference between not having a value set and having the default set, but as you point out if the dashboards are simply configured to look for +1 and -1 then the other distinction isn't important. > > If you need an explicit copy all values function in Gerrit you'll want > to get that merged upstream first then we could potentially backport > it to our Gerrit. This will likely require writing Java. We could also have a bot do it. 
The history of each patch is available, so it's possible to determine that a priority was set but lost when a new patch is submitted. The first step to having a bot would be to write the logic to fix the lost priorities, and if someone does that as a CLI then teams could use that by hand until someone configures the bot. > > For some reason I thought that Prolog predicates could be written for these value copying functions, but docs seem to say otherwise. Prolog is only for determining if a label's value allows a change to be submitted (merged). > > Clark > -- Doug From doug at doughellmann.com Fri Jan 4 14:46:55 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:46:55 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: Jeremy Stanley writes: > First, I want to thank everyone here for the remarkably smooth > transition to openstack-discuss at the end of November. It's been > exactly one month today since we shuttered the old openstack, > openstack-dev, openstack-operators and openstack-sigs mailing lists > and forwarded all subsequent posts for them to the new list address > instead. The number of posts from non-subscribers has dwindled to > the point where it's now only a few each day (many of whom also > subscribe immediately after receiving the moderation autoresponse). > > As of this moment, we're up to 708 subscribers. Unfortunately it's > hard to compare raw subscriber counts because the longer a list is > in existence the more dead addresses it accumulates. Mailman does > its best to unsubscribe addresses which explicitly reject/bounce > multiple messages in a row, but these days many E-mail addresses > grow defunct without triggering any NDRs (perhaps because they've > simply been abandoned, or because their MTAs just blackhole new > messages for deleted accounts). 
Instead, it's a lot more concrete to analyze active participants on mailing lists, especially since ours are consistently configured to require a subscription if you want to avoid your messages getting stuck in the moderation queue. > > Over the course of 2018 (at least until the lists were closed on > December 3) there were 1075 unique E-mail addresses posting to one > or more of the openstack, openstack-dev, openstack-operators and > openstack-sigs mailing lists. Now, a lot of those people sent one or > maybe a handful of messages to ask some question they had, and then > disappeared again... they didn't really follow ongoing discussions, > so probably won't subscribe to openstack-discuss until they have > something new to bring up. > > On the other hand, if we look at addresses which sent 10 or more > messages in 2018 (an arbitrary threshold admittedly), there were > 245. Comparing those to the list of addresses subscribed to > openstack-discuss today, there are 173 matches. That means we now > have *at least* 70% of the people who sent 10 or more messages to > the old lists subscribed to the new one. I say "at least" because we > don't have an easy way to track address changes, and even if we did > that's never going to get us to 100% because there are always going > to be people who leave the lists abruptly for various reasons > (perhaps even disappearing from our community entirely). Seems like > a good place to be after only one month, especially considering the > number of folks who may not have even been paying attention at all > during end-of-year holidays. > > As for message volume, we had a total of 912 posts to > openstack-discuss in the month of December; comparing to the 1033 > posts in total we saw to the four old lists in December of 2017, > that's a 12% drop. 
Consider, though, that right at 10% of the > messages on the old lists were duplicates from cross-posting, so > that's really more like a 2% drop in actual (deduplicated) posting > volume. It's far less of a reduction than I would have anticipated > based on year-over-year comparisons (for example, December of 2016 > had 1564 posts across those four lists). I think based on this, it's > safe to say the transition to openstack-discuss hasn't hampered > discussion, at least for its first full month in use. > -- > Jeremy Stanley Thank you, Jeremy, both for producing those reassuring stats and for managing the transition. The change has been much less disruptive than I was worried it would be (even though I considered it necessary from the start) and much of the credit for that goes to you for the careful way you have planned and implemented the merge. Nice job! -- Doug From cdent+os at anticdent.org Fri Jan 4 15:29:19 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 4 Jan 2019 15:29:19 +0000 (GMT) Subject: [placement] update 19-00 Message-ID: HTML: https://anticdent.org/placement-update-19-00.html Welcome to the first placement update of 2019. May all your placements have sufficient resources this year. # Most Important A few different people have mentioned that we're approaching crunch time on pulling the trigger on deleting the placement code from nova. The week of the 14th there will be a meeting to iron out the details of what needs to be done prior to that. If this is important to you, watch out for an announcement of when it will be. This is a separate issue from putting placement under its own governance, but some of the requirements [declared](http://lists.openstack.org/pipermail/openstack-dev/2018-September/134541.html) for that, notably a deployment tool demonstrating an upgrade from placement-in-nova to placement-alone, are relevant. Therefore, reviewing and tracking the deployment tool related work remains critical. Those are listed below. 
Also, it is spec freeze next week. There are quite a lot of specs that are relevant to placement and scheduling that are close, but not quite. Mel has sent out [an email](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001408.html) about which specs most need attention. # What's Changed * There's an [os-resource-classes](https://pypi.org/p/os-resource-classes) now, already merged in placement, with a change posted [for nova](https://review.openstack.org/#/c/628278/). It's effectively the same as os-traits, but for resource classes. * There's a release of a 0.1.0 of placement [pending](https://review.openstack.org/628400). This won't have complete documentation, but will mean that there's an actually usable openstack-placement on PyPI, with what we expect to be the final python module requirements. * This has been true for a while, but it seems worth mentioning, via coreycb: "you can install placement-api on bionic with the stein cloud archive enabled". * A `db stamp` command has been added to `placement-manage` tool which makes it possible for someone who has migrated their data from nova to say "I'm at version X". * placement functional tests have been removed from nova. * Matt did a mess of work to make initializing the scheduler report client in nova less expensive and redundant. * Improving the handling of [allocation ratios](https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#allocation-ratios) has merged, allowing for "initial allocation ratios". # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 15. -2. * [In progress placement bugs](https://goo.gl/vzGGDQ) 15. +2 # Specs Spec freeze next week! Only one of the previously listed specs has merged since early December and a new one has been added (at the end). 
* Account for host agg allocation ratio in placement (Still in rocky/) * Add subtree filter for GET /resource_providers * Resource provider - request group mapping in allocation candidate * VMware: place instances on resource pool (still in rocky/) * Standardize CPU resource tracking * Allow overcommit of dedicated CPU (Has an alternative which changes allocations to a float) * Modelling passthrough devices for report to placement * Nova Cyborg interaction specification. * supporting virtual NVDIMM devices * Proposes NUMA topology with RPs * Count quota based on resource class * Adds spec for instance live resize * Provider config YAML file * Propose counting quota usage from placement and API database * Resource modeling in cyborg. * Support filtering of allocation_candidates by forbidden aggregates * support virtual persistent memory # Main Themes ## Making Nested Useful Progress continues on gpu-reshaping for libvirt and xen: * Also making use of nested is bandwidth-resource-provider: * There's a [review guide](http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001129.html) for those patches. Eric's in the process of doing lots of cleanups to how often the ProviderTree in the resource tracker is checked against placement, and a variety of other "let's make this more right" changes in the same neighborhood: * Stack at: ## Extraction The [extraction etherpad](https://etherpad.openstack.org/p/placement-extract-stein-4) is starting to contain more strikethrough text than not. Progress is being made. I'll refactor that soon so it is more readable, before the week of the 14th meeting. 
The main tasks are the reshaper work mentioned above and the work to get deployment tools operating with an extracted placement: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) Loci's change to have an extracted placement has merged. Kolla has a patch to [include the upgrade script](https://review.openstack.org/#/q/topic:upgrade-placement). It raises the question of how or if the `mysql-migrate-db.sh` should be distributed. Should it maybe end up in the pypi distribution? Documentation tuneups: * Release-notes: This is blocked until we refactor the release notes to reflect _now_ better. * The main remaining task here is participating in [openstack-manuals](https://docs.openstack.org/doc-contrib-guide/doc-index.html), to that end: * A stack of changes to nova to remove placement from the install docs. * Install docs in placement. I wrote to the [mailing list](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001379.html) asking for input on making sure these things are close to correct, especially with regard to distro-specific things like package names. * Change to openstack-manuals to assert that placement is publishing install docs. Depends on the above. * There is a patch to [delete placement](https://review.openstack.org/#/c/618215/) from nova that we've put an administrative -2 on while we determine where things are (see about the meeting above). * There's a pending patch to support [online data migrations](https://review.openstack.org/#/c/624942/). This is important to make sure that fixup commands like `create_incomplete_consumers` can be safely removed from nova and implemented in placement. 
# Other There are currently 13 [open changes](https://review.openstack.org/#/q/project:openstack/placement+status:open) in placement itself. Most of the time-critical work is happening elsewhere (notably the deployment tool changes listed above). Of those placement changes, the [database-related](https://review.openstack.org/#/q/owner:nakamura.tetsuro%2540lab.ntt.co.jp+status:open+project:openstack/placement) ones from Tetsuro are the most important. Outside of placement: * Neutron minimum bandwidth implementation * Add OWNERSHIP $SERVICE traits * zun: Use placement for unified resource management * WIP: add Placement aggregates tests (in tempest) * blazar: Consider the number of reservation inventory * Add placement client for basic GET operations (to tempest) # End Lots of good work in progress. Our main task is making sure it all gets reviewed and merged. The sooner we do, the sooner people get to use it and find all the bugs we're sure to have left lying around. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From mriedemos at gmail.com Fri Jan 4 15:50:46 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 09:50:46 -0600 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: Message-ID: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: > I've been working on detach-boot-volume[1] in Stein, we got the initial > design merged and while implementing we have met some new problems and > now I'm amending the spec to cover these new problems[2]. [2] is https://review.openstack.org/#/c/619161/ > > The thing I want to discuss for wider opinion is that in the initial > design, we planned to support detaching the root volume for only STOPPED and > SHELVED/SHELVE_OFFLOADED instances. But then we found out that we > allow detaching volumes for RESIZED/PAUSED/SOFT_DELETED instances as > well. 
Should we allow detaching the root volume for instances in these > statuses too? Cases like RESIZE could be complicated for the revert resize > action, and it also seems unnecessary. The full set of allowed states for attaching and detaching is here: https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 Concerning those other states: RESIZED: There might be a case for attaching/detaching volumes based on flavor during a resize, but I'm not sure about the root volume in that case (that really sounds more like rebuild with a new image to me, which is a different blueprint). I'm also not sure how much people know about the ability to do this or what the behavior is on revert if you have changed the volumes while the server is resized. If we consider that when a user reverts a resize, they want to go back to the way things were for the root disk image, then I would think we should not allow changing out the root volume while resized. PAUSED: First, I'm not sure how much anyone uses the pause API (or suspend for that matter) although most of the virt drivers implement it. At one point you could attach volumes to suspended servers as well, but because libvirt didn't support it that was removed from the API (yay for non-discoverable backend-specific API behavior changes): https://review.openstack.org/#/c/83505/ Anyway, swapping the root volume on a paused instance seems dangerous to me, so until someone really has a good use case for it, then I think we should avoid that one as well. SOFT_DELETED: I really don't understand the use case for attaching/detaching volumes to/from a (soft) deleted server. If the server is deleted and only hanging around because it hasn't been reclaimed yet, there are really no guarantees that this would work, so again, I would just skip this one for the root volume changes. 
If the user really wants to play with the volumes attached to a soft deleted server, they should restore it first. So in summary, I think we should just not support any of those other states for attach/detach root volumes and only focus on stopped or shelved instances. -- Thanks, Matt From juliaashleykreger at gmail.com Fri Jan 4 15:53:54 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 4 Jan 2019 07:53:54 -0800 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision Message-ID: As some of you may or may not have heard, recently the Technical Committee approved a technical vision document [1]. The goal of the technical vision document is to try to provide a reference point for cloud infrastructure software in an ideal universe. It is naturally recognized that not all items will apply to all projects. With that in mind, we want to encourage projects to leverage the vision by performing a realistic self-evaluation to determine how their individual project compares to the technical vision: What gaps exist in the project that could be closed to be more in alignment with the vision? Are there aspects of the vision which are inappropriate for the project to such a degree that the vision itself should change, not the project? We envision the results of the evaluation to be added to each project's primary contributor documentation tree (/doc/source/contributor/vision-reflection.rst) as a list of bullet points detailing areas where a project feels they need adjustment to better align with the technical vision, and if the project already has visibility into a path forward, that as well. As with all things of this nature, we anticipate projects to treat the document as a living document and update it as each project's contributors feel necessary. If an individual project community feels something in the overall OpenStack community technical vision does not apply, that is okay. 
If the project community feels that something in the vision is wrong for the whole of OpenStack, please feel free to submit a revision to gerrit in order to start that discussion. Once projects have performed a realistic self-evaluation, we ask each project to then consider those items they identified in their future planning as areas that could use the attention of contributors. To be very explicit about this, the intent is to help enable projects to identify areas for improved alignment with the rest of OpenStack using a short, concise, easily consumable list that can be referenced in planning, or even by drive-by contributors if they are intrigued by a specific problem. Thanks, Julia Kreger & Chris Dent [1] https://governance.openstack.org/tc/reference/technical-vision.html From mriedemos at gmail.com Fri Jan 4 17:11:03 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 11:11:03 -0600 Subject: [goals][upgrade-checkers] Week R-14 Update Message-ID: <3bbb7683-1581-5414-1698-a08a0abed10b@gmail.com> There has not been much progress since the R-16 update [1] let's assume because of the holiday break. There is a new trove patch which replaces the placeholder check with a real upgrade check [2]. I have left comments on that review. I am not sure if that is due to some new changes in trove which require the upgrade check, or if this is just something that has always been needed when upgrading trove. [1] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001266.html [2] https://review.openstack.org/#/c/627555/ -- Thanks, Matt From miguel at mlavalle.com Fri Jan 4 17:56:49 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Fri, 4 Jan 2019 11:56:49 -0600 Subject: [openstack-dev] [neutron] Changing the meeting channel for Neutron upgrades weekly meeting Message-ID: Lujin and Nate and Neutrinos, Please be aware that the meeting room for the Neutron upgrades channel is being changed: https://review.openstack.org/#/c/626182. 
due to infra optimization. Cheers Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Fri Jan 4 18:12:25 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 04 Jan 2019 10:12:25 -0800 Subject: Update on gate status for the new year Message-ID: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> I'm still not entirely caught up on everything after the holidays, but thought I would attempt to do another update on gate reliability issues since those were well received last month. Overall things look pretty good based on elastic-recheck data. That said I think this is mostly due to low test volume over the holidays and our 10 day index window. We should revisit this next week or the week after to get a more accurate view of things. On the infra team side of things we've got quota issues in a cloud region that has decreased our test node capacity. Waiting on people to return from holidays to take a look at that. We also started tracking hypervisor IDs for our test instances (thank you pabelanger) to try and help identify when specific hypervisors might be the cause of some of our issues. https://review.openstack.org/628642 is a followup to index that data with our job log data in Elasticsearch. We've seen some ssh failures in tripleo jobs on limestone [0] and neutron and zuul report constrained IOPS there resulting in failed database migrations. I think the idea with 628642 is to see if we can narrow that down to specific hypervisors. On the project side of things our categorization rates are quite low [1][2]. If your changes are evicted from the gate due to failures it would be helpful if you could spend a few minutes to try and identify and fingerprint those failures. We'll check back in a week or two when we should have a much better data set to look at. 
[0] http://status.openstack.org/elastic-recheck/index.html#18100542 [1] http://status.openstack.org/elastic-recheck/data/integrated_gate.html [2] http://status.openstack.org/elastic-recheck/data/others.html Clark From mihalis68 at gmail.com Fri Jan 4 18:44:32 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Fri, 4 Jan 2019 13:44:32 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: Yes Jeremy actually did an amazing job, agree with all the positive comments above. Chris On Fri, Jan 4, 2019 at 9:50 AM Doug Hellmann wrote: > Jeremy Stanley writes: > > > First, I want to thank everyone here for the remarkably smooth > > transition to openstack-discuss at the end of November. It's been > > exactly one month today since we shuttered the old openstack, > > openstack-dev, openstack-operators and openstack-sigs mailing lists > > and forwarded all subsequent posts for them to the new list address > > instead. The number of posts from non-subscribers has dwindled to > > the point where it's now only a few each day (many of whom also > > subscribe immediately after receiving the moderation autoresponse). > > > > As of this moment, we're up to 708 subscribers. Unfortunately it's > > hard to compare raw subscriber counts because the longer a list is > > in existence the more dead addresses it accumulates. Mailman does > > its best to unsubscribe addresses which explicitly reject/bounce > > multiple messages in a row, but these days many E-mail addresses > > grow defunct without triggering any NDRs (perhaps because they've > > simply been abandoned, or because their MTAs just blackhole new > > messages for deleted accounts). Instead, it's a lot more concrete to > > analyze active participants on mailing lists, especially since ours > > are consistently configured to require a subscription if you want to > > avoid your messages getting stuck in the moderation queue. 
> > > > Over the course of 2018 (at least until the lists were closed on > > December 3) there were 1075 unique E-mail addresses posting to one > > or more of the openstack, openstack-dev, openstack-operators and > > openstack-sigs mailing lists. Now, a lot of those people sent one or > > maybe a handful of messages to ask some question they had, and then > > disappeared again... they didn't really follow ongoing discussions, > > so probably won't subscribe to openstack-discuss until they have > > something new to bring up. > > > > On the other hand, if we look at addresses which sent 10 or more > > messages in 2018 (an arbitrary threshold admittedly), there were > > 245. Comparing those to the list of addresses subscribed to > > openstack-discuss today, there are 173 matches. That means we now > > have *at least* 70% of the people who sent 10 or more messages to > > the old lists subscribed to the new one. I say "at least" because we > > don't have an easy way to track address changes, and even if we did > > that's never going to get us to 100% because there are always going > > to be people who leave the lists abruptly for various reasons > > (perhaps even disappearing from our community entirely). Seems like > > a good place to be after only one month, especially considering the > > number of folks who may not have even been paying attention at all > > during end-of-year holidays. > > > > As for message volume, we had a total of 912 posts to > > openstack-discuss in the month of December; comparing to the 1033 > > posts in total we saw to the four old lists in December of 2017, > > that's a 12% drop. Consider, though, that right at 10% of the > > messages on the old lists were duplicates from cross-posting, so > > that's really more like a 2% drop in actual (deduplicated) posting > > volume. 
It's far less of a reduction than I would have anticipated > > based on year-over-year comparisons (for example, December of 2016 > > had 1564 posts across those four lists). I think based on this, it's > > safe to say the transition to openstack-discuss hasn't hampered > > discussion, at least for its first full month in use. > > -- > > Jeremy Stanley > > Thank you, Jeremy, both for producing those reassuring stats and for > managing the transition. The change has been much less disruptive than I > was worried it would be (even though I considered it necessary from the > start) and much of the credit for that goes to you for the careful way > you have planned and implemented the merge. Nice job! > > -- > Doug > > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ken at jots.org Fri Jan 4 19:45:43 2019 From: ken at jots.org (Ken D'Ambrosio) Date: Fri, 04 Jan 2019 14:45:43 -0500 Subject: Per-VM CPU & RAM allocation? Message-ID: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Hi! If I go into the UI, I can easily see how much each VM is allocated for RAM and CPU. However, I've googled until I'm blue in the face, and can't seem to see a way -- either through CLI or API -- to get that info. "nova limits --tenant " SEEMS like it should... except (at least on my Juno cloud), the "used" column is either full of zeros or dashes. Clearly, if it's in the UI, it's possible... somehow. But it seemed like it might be easier to ask the list than go down the rabbit hole of tcp captures. Any ideas? Thanks! -Ken P.S. Not interested in CPU-hours/usage -- I'm just looking for how much is actually allocated. From jungleboyj at gmail.com Fri Jan 4 19:53:39 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 4 Jan 2019 13:53:39 -0600 Subject: [cinder] Addition mid-cycle details ... Message-ID: Team, We are at about a month away from the Cinder Mid-Cycle in Raleigh.  I have started requesting drinks/snacks and have rooms reserved.  
I soon will need a firm number of people attending so that I can finalize the various requests. If you are planning to physically attend and have not yet added your name to our planning etherpad [1] please do so ASAP. I have reserved a room at the Hyatt House Raleigh Durham Airport (10962 Chapel Hill Road, Morrisville, NC, 27560, USA ) if people want to stay at the same hotel as myself.  It is close to the Lenovo site which will make it easier to travel if we have unexpected snowy weather there.  We can also carpool to reduce the problem of finding parking. I am arriving the afternoon of 2/4/19 and leaving the morning of 2/9/19. Also a reminder to please add topics to the mid-cycle planning etherpad regardless of whether you are able to attend or not. Look forward to seeing you in Raleigh next month! Jay (jungleboyj) [1] https://etherpad.openstack.org/p/cinder-stein-mid-cycle-planning -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri Jan 4 19:59:11 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 13:59:11 -0600 Subject: Update on gate status for the new year In-Reply-To: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> References: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> Message-ID: On 1/4/2019 12:12 PM, Clark Boylan wrote: > Overall things look pretty good based on elastic-recheck data. That said I think this is mostly due to low test volume over the holidays and our 10 day index window. We should revisit this next week or the week after to get a more accurate view of things. > > On the infra team side of things we've got quota issues in a cloud region that has decreased our test node capacity. Waiting on people to return from holidays to take a look at that. 
We also started tracking hypervisor IDs for our test instances (thank you pabelanger) to try and help identify when specific hypervisors might be the cause of some of our issues.https://review.openstack.org/628642 is a followup to index that data with our job log data in Elasticsearch. > > We've seen some ssh failures in tripleo jobs on limestone [0] and neutron and zuul report constrained IOPS there resulting in failed database migrations. I think the idea with 628642 is to see if we can narrow that down to specific hypervisors. > > On the project side of things our categorization rates are quite low [1][2]. If your changes are evicted from the gate due to failures it would be helpful if you could spend a few minutes to try and identify and fingerprint those failures. On a side note, I've noticed tempest jobs failing and elastic-recheck wasn't commenting on the changes. Turns out that's because we're using a really limited regex for the jobs that e-r will process in order to comment on a change in gerrit. The following patch should help with that: https://review.openstack.org/#/c/628669/ But since "dsvm" isn't standard in job names anymore it's clear that e-r is going to be skipping a lot of project-specific jobs which otherwise have categorized failures. -- Thanks, Matt From sean.mcginnis at gmx.com Fri Jan 4 20:03:00 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 4 Jan 2019 14:03:00 -0600 Subject: Per-VM CPU & RAM allocation? In-Reply-To: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> References: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Message-ID: <20190104200300.GB22595@sm-workstation> On Fri, Jan 04, 2019 at 02:45:43PM -0500, Ken D'Ambrosio wrote: > Hi! If I go into the UI, I can easily see how much each VM is allocated for > RAM and CPU. However, I've googled until I'm blue in the face, and can't > seem to see a way -- either through CLI or API -- to get that info. "nova > limits --tenant " SEEMS like it should... 
except (at least on my Juno > cloud), the "used" column is either full of zeros or dashes. Clearly, if > it's in the UI, it's possible... somehow. But it seemed like it might be > easier to ask the list than go down the rabbit hole of tcp captures. > > Any ideas? > > Thanks! > > -Ken > > P.S. Not interested in CPU-hours/usage -- I'm just looking for how much is > actually allocated. > Hey Ken, Those values are set based on the flavor that is chosen when creating the instance. You can get that information for a running instance by running: openstack server show and looking at the flavor of the instance. I believe it's in the format of "flavor.name (id)". You can then do: openstack flavor show or just: openstack flavor list to get the RAM and VCPUs values defined for that flavor. There are corresponding API calls you can make. Add "--debug" to those CLI calls to get the debug output that shows curl examples of the REST APIs being called. Hope that helps. Sean From mriedemos at gmail.com Fri Jan 4 20:05:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 14:05:08 -0600 Subject: Per-VM CPU & RAM allocation? In-Reply-To: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> References: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Message-ID: <8bd9b015-48d5-62a0-dafc-035e9407ff76@gmail.com> On 1/4/2019 1:45 PM, Ken D'Ambrosio wrote: > Hi!  If I go into the UI, I can easily see how much each VM is allocated > for RAM and CPU.  However, I've googled until I'm blue in the face, and > can't seem to see a way -- either through CLI or API -- to get that > info.  "nova limits --tenant " SEEMS like it should... except (at > least on my Juno cloud), the "used" column is either full of zeros or > dashes.  Clearly, if it's in the UI, it's possible... somehow.  But it > seemed like it might be easier to ask the list than go down the rabbit > hole of tcp captures. You might be looking for the os-simple-tenant-usages API [1]. 
That is per-tenant but the response has per-server results in it. If you were on something newer (Ocata+) you could use [2] to get per-instance resource allocations for VCPU and MEMORY_MB. I'm not sure what the UI is doing, but it might simply be getting the flavor used for each VM and showing the details about that flavor, which you could also get (more reliably) with the server details themselves starting in microversion 2.47 (added in Pike). [1] https://developer.openstack.org/api-ref/compute/?expanded=show-usage-statistics-for-tenant-detail#show-usage-statistics-for-tenant [2] https://developer.openstack.org/api-ref/placement/?expanded=list-allocations-detail#list-allocations -- Thanks, Matt From melwittt at gmail.com Fri Jan 4 23:33:00 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 15:33:00 -0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann wrote: > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: >> I've been working on detach-boot-volume[1] in Stein, we got the initial >> design merged and while implementing we have met some new problems and >> now I'm amending the spec to cover these new problems[2]. > > [2] is https://review.openstack.org/#/c/619161/ > >> >> The thing I want to discuss for wider opinion is that in the initial >> design, we planned to support detaching the root volume for only STOPPED and >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we >> allow detaching volumes for RESIZED/PAUSED/SOFT_DELETED instances as >> well. Should we allow detaching the root volume for instances in these >> statuses too? Cases like RESIZE could be complicated for the revert resize >> action, and it also seems unnecessary. 
> > The full set of allowed states for attaching and detaching are here: > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 > > Concerning those other states: > > RESIZED: There might be a case for attaching/detaching volumes based on > flavor during a resize, but I'm not sure about the root volume in that > case (that really sounds more like rebuild with a new image to me, which > is a different blueprint). I'm also not sure how much people know about > the ability to do this or what the behavior is on revert if you have > changed the volumes while the server is resized. If we consider that > when a user reverts a resize, they want to go back to the way things > were for the root disk image, then I would think we should not allow > changing out the root volume while resized. Yeah, if someone attaches/detaches a regular volume while the instance is in VERIFY_RESIZE state and then reverts the resize, I assume we probably don't attempt to change or restore anything with the volume attachments to put them back to how they were attached before the resize. But as you point out, the situation does seem different regarding a root volume. If a user changes that while in VERIFY_RESIZE and reverts the resize, and we leave the root volume alone, then they end up with a different root disk image than they had before the resize. Which seems weird. I agree it seems better not to allow this for now and come back to it later if people start asking for it. > PAUSED: First, I'm not sure how much anyone uses the pause API (or > suspend for that matter) although most of the virt drivers implement it. 
> At one point you could attach volumes to suspended servers as well, but > because libvirt didn't support it that was removed from the API (yay for > non-discoverable backend-specific API behavior changes): > > https://review.openstack.org/#/c/83505/ > > Anyway, swapping the root volume on a paused instance seems dangerous to > me, so until someone really has a good use case for it, then I think we > should avoid that one as well. > > SOFT_DELETED: I really don't understand the use case for > attaching/detaching volumes to/from a (soft) deleted server. If the > server is deleted and only hanging around because it hasn't been > reclaimed yet, there are really no guarantees that this would work, so > again, I would just skip this one for the root volume changes. If the > user really wants to play with the volumes attached to a soft deleted > server, they should restore it first. > > So in summary, I think we should just not support any of those other > states for attach/detach root volumes and only focus on stopped or > shelved instances. Again, agree, I think we should just not allow the other states for the initial implementation and revisit later if it turns out people need these. -melanie From aspiers at suse.com Fri Jan 4 23:45:30 2019 From: aspiers at suse.com (Adam Spiers) Date: Fri, 4 Jan 2019 23:45:30 +0000 Subject: [docs] question about deprecation badges Message-ID: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Hi all, I'm currently hacking on the deprecation badges in openstack-manuals, and there's a couple of things I don't understand. Any chance someone could explain why www/latest/badge.html doesn't just do: {% include 'templates/deprecated_badge.tmpl' %} like all the others? 
https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 I'm also wondering what exactly would be wrong with the included CSS path if CSSDIR was used in www/templates/deprecated_badge.tmpl instead of heeding this caveat: https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 It would be good to fix it to use CSSDIR because currently it's awkward to test CSS changes. Thanks! Adam From johnsomor at gmail.com Sat Jan 5 00:02:43 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Fri, 4 Jan 2019 16:02:43 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi Jeff, Unfortunately the team that was working on that code had to stop due to internal reasons. I hope to make the reference active/active blueprint a priority again during the Train cycle. Following that I may be able to look at the L3 distributor option, but I cannot commit to that at this time. If you are interested in picking up that work, please let me know and we can sync up on the status of the WIP patches, etc. Michael On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > Dear Octavia team: > This email aims to ask about the development progress of the l3-active-active blueprint. I > noticed that the work in this area has been stagnant for eight months. > https://review.openstack.org/#/q/l3-active-active > I want to know the community's next work plan in this regard. > Thanks. 
From melwittt at gmail.com Sat Jan 5 00:35:21 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 16:35:21 -0800 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> On Fri, 04 Jan 2019 13:20:54 +0000, Sean Mooney wrote: > On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote: >> On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann >> wrote: >>> On 12/28/2018 4:13 AM, Balázs Gibizer wrote: >>>> I'm wondering whether introducing an API microversion could act like the >>>> feature flag I need and at the same time still make the feature >>>> discoverable as you would like to see it. Something like: Create a >>>> feature flag in the code but do not put it in the config as a settable >>>> flag. Instead add an API microversion patch to the top of the series >>>> and when the new version is requested it enables the feature via the >>>> feature flag. This API patch can be small and simple enough to >>>> cherry-pick earlier into the series for local end-to-end testing if >>>> needed. Also, in functional tests I can set the flag via a mock so I can >>>> add and run functional tests patch by patch. >>> >>> That may work. It's not how I would have done this, I would have started >>> from the bottom and worked my way up with the end to end functional >>> testing at the end, as already noted, but I realize you've been pushing >>> this boulder for a couple of releases now so that's not really something >>> you want to change at this point. >>> >>> I guess the question is should this change have a microversion at all? >>> That's been wrestled with in the spec review and called out in this thread. I >>> don't think a microversion would be *wrong* in any sense and could only >>> help with discoverability on the nova side, but am open to other opinions. 
>> Sorry to be late to this discussion, but this was brought up in the nova >> meeting today to get more thoughts. I'm going to briefly summarize my >> thoughts here. >> >> IMHO, I think this change should have a microversion, to help with >> discoverability. I'm thinking, how will users be able to detect they're >> able to leverage the new functionality otherwise? A microversion would >> signal the availability. As for dealing with the situation where a user >> specifies an older microversion combined with resource requests, I think >> it should behave similarly to how multiattach works, where the request >> will be rejected straight away if the microversion is too low + resource >> requests are passed. > > this has implications for upgrades and version compatibility. > if a newer version of neutron is used with older nova then > behavior will change when nova is upgraded to a version of > nova that has the new microversion. > > my concern is as follows. > a given deployment has rocky nova and rocky neutron. > a tenant defines a minimum bandwidth policy and applies it to a network. > they create a port on that network. > neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network. > but we could also assume the tenant applied the policy to the port if we liked. > the tenant then boots a vm with that port. > > when the vm is scheduled to a node neutron will ask the network backend via the ml2 driver to configure the minimum > bandwidth policy, if the network backend supports it, as part of the bind port call. the ml2 driver can refuse to bind the > port at this point if it cannot fulfill the request, to prevent the vm from spawning. assuming the binding succeeds the > backend will configure the minimum bandwidth policy on the interface. nova in rocky will not schedule based on the qos > policy as there is no resource request in the port and placement will not model bandwidth availability. 
> > note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other sdn controller > backends several years ago, but odl did not implement the required features so this mechanism was never used. > i am not aware of any ml2 driver that actually implemented the bandwidth check, but before placement was created this was > the mechanism that at least my team at intel and some others had been planning to use. > > > so in rocky the vm should boot, there will be no prevention of oversubscription in placement and neutron will configure > the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules were only added to > neutron recently, but egress qos minimum bandwidth support was added in newton with > https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec > so there will be a lot of existing cases where ports will have minimum bandwidth policies before stein. > > if we repeat the same exercise with rocky nova and stein neutron this changes slightly in that > neutron will look at the qos policy associated with the port and add a resource request. as rocky nova > will not have code to parse the resource requests from the neutron port they will be ignored and > the vm will boot, neutron will configure minimum bandwidth enforcement on the port, placement will > model the bandwidth as an inventory but no allocation will be created for the vm. > > note: i have not checked the neutron code to confirm the qos plugin will still work without the placement allocation, > but if it does not it is a bug, as stein neutron would no longer work with pre-stein nova. as such we would have > broken the ability to upgrade nova and neutron separately. > > if you use stein nova and stein neutron and the new microversion then the vm boots, we allocate the bandwidth in > placement and configure the enforcement in the networking backend if it supports it, which is our end goal.
> > the last configuration is stein nova and stein neutron with an old microversion. > this will happen in two cases. > first, no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest > microversion, or an explicit older microversion is passed. > > if the last rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then > it would work the way it did with nova rocky and neutron stein above. if we choose to reject the request instead, > anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to stein. > > while the fact that oversubscription may happen could be problematic to debug for some, i think the ux cost is less than > the cost of updating all software that has used egress qos since it was introduced in newton to explicitly pass the latest > microversion. > > i am in favor of adding a microversion by the way, i just think we should ignore the resource request if an old > microversion is used. Thanks for describing this detailed scenario -- I didn't realize that today, you can get _some_ QoS support by pre-creating ports in neutron with resource requests attached and specifying those ports when creating a server. I understand now the concern with the idea of rejecting requests < new microversion + port.resource_request existing on pre-created ports. And there's no notion of being able to request QoS support via ports created by Nova (no change in Nova API or flavor extra-specs in the design). So, I could see this situation being reason enough not to reject requests when an old microversion is specified. But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this. >> Current behavior today would be, the resource >> requests are ignored.
If we only ignored the resource requests when >> they're passed with an older microversion, it seems like it would be an >> unnecessarily poor UX to have their parameters ignored and likely lead >> them on a debugging journey if and when they realize things aren't >> working the way they expect given the resource requests they specified. >> >> -melanie >> >> >> >> > From yjf1970231893 at gmail.com Sat Jan 5 03:29:11 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Sat, 5 Jan 2019 11:29:11 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Yes, I want to reboot this work recently. And, I want to replace exabgp with os-ken, So I may need to rewrite some patches. Michael Johnson 于2019年1月5日周六 上午8:02写道: > Hi Jeff, > > Unfortunately the team that was working on that code had stopped due > to internal reasons. > > I hope to make the reference active/active blueprint a priority again > during the Train cycle. Following that I may be able to look at the L3 > distributor option, but I cannot commit to that at this time. > > If you are interesting in picking up that work, please let me know and > we can sync up on that status of the WIP patches, etc. > > Michael > > On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > > > Dear Octavia team: > > The email aims to ask the development progress about > l3-active-active blueprint. I > > noticed that the work in this area has been stagnant for eight months. > > https://review.openstack.org/#/q/l3-active-active > > I want to know the community's next work plan in this regard. > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saphi070 at gmail.com Sat Jan 5 03:58:35 2019 From: saphi070 at gmail.com (Sa Pham) Date: Sat, 5 Jan 2019 10:58:35 +0700 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? 
In-Reply-To: References: Message-ID: <3F4AE7E0-CE23-4098-8C79-225781E0BBBF@gmail.com> Hi Jeff, Do you have design and specs for that. Best, Sa Pham Dang Cloud RnD Team - VCCLOUD Phone: 0986849582 Skype: great_bn > On Jan 5, 2019, at 10:29 AM, Jeff Yang wrote: > > Yes, I want to reboot this work recently. And, I want to replace exabgp with > os-ken, So I may need to rewrite some patches. > > Michael Johnson 于2019年1月5日周六 上午8:02写道: >> Hi Jeff, >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. >> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smarcet at gmail.com Sat Jan 5 04:19:21 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:19:21 -0300 Subject: [docs] question about deprecation badges In-Reply-To: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: Hi Adam, that approach is followed so the badge can be consumed from other projects like the docs theme, using an ajax call that does a GET and includes its content dynamically on the page. Also, the css dir is not used because otherwise it would not point to the right css path with the aforementioned approach; instead it uses the absolute path in order to load the css correctly through the ajax call (check https://review.openstack.org/#/c/585516/). In the docs theme the deprecation badge is loaded using this snippet in the file openstackdocstheme/theme/openstackdocs/layout.html. Hope that sheds some light; keep in mind that it is a first iteration and any better approach is welcome. regards El vie., 4 ene. 2019 a las 20:45, Adam Spiers () escribió: > Hi all, > > I'm currently hacking on the deprecation badges in openstack-manuals, > and there's a couple of things I don't understand. Any chance someone > could explain why www/latest/badge.html doesn't just do: > > {% include 'templates/deprecated_badge.tmpl' %} > > like all the others? > > > https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 > > I'm also wondering what exactly would be wrong with the included CSS path if > CSSDIR was used in www/templates/deprecated_badge.tmpl instead of > heeding this caveat: > > > https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 > > It would be good to fix it to use CSSDIR because currently it's > awkward to test CSS changes. > > Thanks!
> Adam > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Sat Jan 5 04:31:28 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Sat, 5 Jan 2019 12:31:28 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: <3F4AE7E0-CE23-4098-8C79-225781E0BBBF@gmail.com> Message-ID: I have no new specs, I plan to follow the original blueprint. https://docs.openstack.org/octavia/latest/contributor/specs/version1.1/active-active-l3-distributor.html Jeff Yang 于2019年1月5日周六 下午12:16写道: > I have no new specs, I plan to follow the original blueprint. > > https://docs.openstack.org/octavia/latest/contributor/specs/version1.1/active-active-l3-distributor.html > > Sa Pham 于2019年1月5日周六 上午11:58写道: > >> Hi Jeff, >> >> Do you have design and specs for that. >> >> >> Best, >> >> Sa Pham Dang >> Cloud RnD Team - VCCLOUD >> Phone: 0986849582 >> Skype: great_bn >> >> On Jan 5, 2019, at 10:29 AM, Jeff Yang wrote: >> >> Yes, I want to reboot this work recently. And, I want to replace >> exabgp with >> os-ken, So I may need to rewrite some patches. >> >> Michael Johnson 于2019年1月5日周六 上午8:02写道: >> >>> Hi Jeff, >>> >>> Unfortunately the team that was working on that code had stopped due >>> to internal reasons. >>> >>> I hope to make the reference active/active blueprint a priority again >>> during the Train cycle. Following that I may be able to look at the L3 >>> distributor option, but I cannot commit to that at this time. >>> >>> If you are interesting in picking up that work, please let me know and >>> we can sync up on that status of the WIP patches, etc. >>> >>> Michael >>> >>> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang >>> wrote: >>> > >>> > Dear Octavia team: >>> > The email aims to ask the development progress about >>> l3-active-active blueprint. 
I >>> > noticed that the work in this area has been stagnant for eight months. >>> > https://review.openstack.org/#/q/l3-active-active >>> > I want to know the community's next work plan in this regard. >>> > Thanks. >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smarcet at gmail.com Sat Jan 5 04:34:11 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:34:11 -0300 Subject: [docs] question about deprecation badges In-Reply-To: References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: and the latest release is different because it's a corner case: there is no way to figure out from the template logic whether the currently navigated release is the latest, so that is why it has its own template and logic. regards El sáb., 5 ene. 2019 a las 1:19, Sebastian Marcet () escribió: > Hi Adam, that approach is followed in order to be consumed from other > projects like theme docs, using an ajax call doing a get and including its > content dynamically on the page, also the css dir its not used bc otherwise > it will not point to the rite css path on the fore-mentioned approach, > instead its using the absolute path in order to load the css correctly > through the ajax call ( check https://review.openstack.org/#/c/585516/) > > on docs theme the deprecation badge is loaded using this snippet > > > > on file openstackdocstheme/theme/openstackdocs/layout.html > > hope that shed some lite , have in mind that its a first iteration and any > better approach its welcome > > regards > > > El vie., 4 ene. 2019 a las 20:45, Adam Spiers () > escribió: > >> Hi all, >> >> I'm currently hacking on the deprecation badges in openstack-manuals, >> and there's a couple of things I don't understand. Any chance someone >> could explain why www/latest/badge.html doesn't just do: >> >> {% include 'templates/deprecated_badge.tmpl' %} >> >> like all the others?
>> >> >> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 >> >> I'm also what exactly would be wrong with the included CSS path if >> CSSDIR was used in www/templates/deprecated_badge.tmpl instead of >> heeding this caveat: >> >> >> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 >> >> It would be good to fix it to use CSSDIR because currently it's >> awkward to test CSS changes. >> >> Thanks! >> Adam >> > > > -- > Sebastian Marcet > https://ar.linkedin.com/in/smarcet > SKYPE: sebastian.marcet > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From smarcet at gmail.com Sat Jan 5 04:40:21 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:40:21 -0300 Subject: [docs] question about deprecation badges In-Reply-To: References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: you could check the reason here https://review.openstack.org/#/c/585517/ https://review.openstack.org/#/c/585517/1/openstackdocstheme/theme/openstackdocs/layout.html regards El sáb., 5 ene. 2019 a las 1:34, Sebastian Marcet () escribió: > and the latest release its different bc its a corner case, there is not > way to figure it out from the template logic if current navigated release > its the latest so that is why its has its own template and logic regards > > El sáb., 5 ene. 
2019 a las 1:19, Sebastian Marcet () > escribió: > >> Hi Adam, that approach is followed in order to be consumed from other >> projects like theme docs, using an ajax call doing a get and including its >> content dynamically on the page, also the css dir its not used bc otherwise >> it will not point to the rite css path on the fore-mentioned approach, >> instead its using the absolute path in order to load the css correctly >> through the ajax call ( check https://review.openstack.org/#/c/585516/) >> >> on docs theme the deprecation badge is loaded using this snippet >> >> >> >> on file openstackdocstheme/theme/openstackdocs/layout.html >> >> hope that shed some lite , have in mind that its a first iteration and >> any better approach its welcome >> >> regards >> >> >> El vie., 4 ene. 2019 a las 20:45, Adam Spiers () >> escribió: >> >>> Hi all, >>> >>> I'm currently hacking on the deprecation badges in openstack-manuals, >>> and there's a couple of things I don't understand. Any chance someone >>> could explain why www/latest/badge.html doesn't just do: >>> >>> {% include 'templates/deprecated_badge.tmpl' %} >>> >>> like all the others? >>> >>> >>> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 >>> >>> I'm also what exactly would be wrong with the included CSS path if >>> CSSDIR was used in www/templates/deprecated_badge.tmpl instead of >>> heeding this caveat: >>> >>> >>> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 >>> >>> It would be good to fix it to use CSSDIR because currently it's >>> awkward to test CSS changes. >>> >>> Thanks! 
>>> Adam >>> >> >> >> -- >> Sebastian Marcet >> https://ar.linkedin.com/in/smarcet >> SKYPE: sebastian.marcet >> > > > -- > Sebastian Marcet > https://ar.linkedin.com/in/smarcet > SKYPE: sebastian.marcet > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From qianbiao.ng at turnbig.net Sat Jan 5 09:11:26 2019 From: qianbiao.ng at turnbig.net (qianbiao.ng at turnbig.net) Date: Sat, 5 Jan 2019 17:11:26 +0800 Subject: Ironic ibmc driver for Huawei server Message-ID: <201901051711257416397@turnbig.net> Hi julia, According to the comments on the story: 1. The spec for the huawei ibmc driver has been posted here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. 2. About the third-party CI part, we provide mocked unit tests for our driver's code. I am not sure what role third-party CI plays in this case. What else should we do? Thanks Qianbiao.NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Mon Jan 7 03:22:55 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Mon, 7 Jan 2019 11:22:55 +0800 Subject: [murano]Retire of murano-deployment project Message-ID: Hi, teams, The murano-deployment project holds the third-party CI scripts and automation tools. Since the third-party CI has already switched to the openstack CI, this project is not needed anymore, so we decided to retire the murano-deployment project. Thanks, Rong Zhu -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed...
URL: From yongli.he at intel.com Mon Jan 7 05:06:24 2019 From: yongli.he at intel.com (yonglihe) Date: Mon, 7 Jan 2019 13:06:24 +0800 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 2019/1/4 上午3:12, Matt Riedemann wrote: > On 1/3/2019 6:39 AM, Jay Pipes wrote: >> On 01/02/2019 10:15 PM, yonglihe wrote: >>> On 2018/12/18 下午4:20, yonglihe wrote: >>>> Hi, guys >>>> >>>> This spec needs input and discussion to move on. >>> >>> Jay suggested we might do well to use a new sub node to hold the topology >>> stuff, it's option 2 here. And split >>> >>> the PCI stuff out of this NUMA spec, using a /devices node to >>> hold all 'devices' stuff instead; then this node >>> >>> is generic and not only for PCI itself. >>> >>> I'm OK with Jay's suggestion, it contains more key words and seems >>> crystal clear and straightforward. >>> >>> The problem is we need to be aligned about this. This spec needs to gain more >>> input. Thanks, Jay, Matt. >> >> Also, I mentioned that you need not (IMHO) combine both PCI/devices >> and NUMA topology in a single spec. We could proceed with the >> /topology API endpoint and work out the more generic /devices API >> endpoint in a separate spec. >> >> Best, >> -jay > > I said earlier in the email thread that I was OK with option 2 > (sub-resource) or the diagnostics API, and leaned toward the > diagnostics API since it was already admin-only. > > As long as this information is admin-only by default, not part of the > main server response body and therefore not part of listing servers > with details (GET /servers/detail) then I'm OK either way and GET > /servers/{server_id}/topology is OK with me also. Thanks. The spec has been updated to use topology.
Regards Yongli he From rui.zang at yandex.com Mon Jan 7 08:23:51 2019 From: rui.zang at yandex.com (rui zang) Date: Mon, 07 Jan 2019 16:23:51 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> Message-ID: <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> An HTML attachment was scrubbed... URL: From zhengzhenyulixi at gmail.com Mon Jan 7 08:37:50 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Mon, 7 Jan 2019 16:37:50 +0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: Thanks a lot for the replies. Let's wait for some more comments, and I will update the follow-up spec about this within two days. On Sat, Jan 5, 2019 at 7:37 AM melanie witt wrote: > On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann > wrote: > > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: > >> I've been working on detach-boot-volume[1] in Stein, we got the initial > >> design merged, and while implementing we have met some new problems and > >> now I'm amending the spec to cover these new problems[2]. > > > > [2] is https://review.openstack.org/#/c/619161/ > > > >> > >> The thing I want to discuss for wider opinion is that in the initial > >> design, we planned to support detaching the root volume for only STOPPED and > >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we > >> allowed detaching volumes for RESIZED/PAUSED/SOFT_DELETED instances as > >> well. Should we allow detaching the root volume for instances in these > >> statuses too? Cases like RESIZE could be complicated for the revert resize > >> action, and it also seems unnecessary.
> > > > The full set of allowed states for attaching and detaching are here: > > > > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 > > > > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 > > > > Concerning those other states: > > > > RESIZED: There might be a case for attaching/detaching volumes based on > > flavor during a resize, but I'm not sure about the root volume in that > > case (that really sounds more like rebuild with a new image to me, which > > is a different blueprint). I'm also not sure how much people know about > > the ability to do this or what the behavior is on revert if you have > > changed the volumes while the server is resized. If we consider that > > when a user reverts a resize, they want to go back to the way things > > were for the root disk image, then I would think we should not allow > > changing out the root volume while resized. > > Yeah, if someone attaches/detaches a regular volume while the instance > is in VERIFY_RESIZE state and then reverts the resize, I assume we > probably don't attempt to change or restore anything with the volume > attachments to put them back to how they were attached before the > resize. But as you point out, the situation does seem different > regarding a root volume. If a user changes that while in VERIFY_RESIZE > and reverts the resize, and we leave the root volume alone, then they > end up with a different root disk image than they had before the resize. > Which seems weird. > > I agree it seems better not to allow this for now and come back to it > later if people start asking for it. > > > PAUSED: First, I'm not sure how much anyone uses the pause API (or > > suspend for that matter) although most of the virt drivers implement it. 
> > At one point you could attach volumes to suspended servers as well, but > > because libvirt didn't support it that was removed from the API (yay for > > non-discoverable backend-specific API behavior changes): > > > > https://review.openstack.org/#/c/83505/ > > > > Anyway, swapping the root volume on a paused instance seems dangerous to > > me, so until someone really has a good use case for it, then I think we > > should avoid that one as well. > > > > SOFT_DELETED: I really don't understand the use case for > > attaching/detaching volumes to/from a (soft) deleted server. If the > > server is deleted and only hanging around because it hasn't been > > reclaimed yet, there are really no guarantees that this would work, so > > again, I would just skip this one for the root volume changes. If the > > user really wants to play with the volumes attached to a soft deleted > > server, they should restore it first. > > > > So in summary, I think we should just not support any of those other > > states for attach/detach root volumes and only focus on stopped or > > shelved instances. > > Again, agree, I think we should just not allow the other states for the > initial implementation and revisit later if it turns out people need these. > > -melanie > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Mon Jan 7 09:17:53 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Mon, 7 Jan 2019 17:17:53 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi Michael, I found that you forbid importing eventlet in octavia.[1] I guess eventlet has a conflict with gunicorn, is that right? But I need to import eventlet for os-ken, which is used to implement the bgp speaker.[2] I am studying eventlet and gunicorn deeply. Do you have any suggestions to resolve this conflict?
[1] https://review.openstack.org/#/c/462334/ [2] https://review.openstack.org/#/c/628915/ Michael Johnson 于2019年1月5日周六 上午8:02写道: > Hi Jeff, > > Unfortunately the team that was working on that code had stopped due > to internal reasons. > > I hope to make the reference active/active blueprint a priority again > during the Train cycle. Following that I may be able to look at the L3 > distributor option, but I cannot commit to that at this time. > > If you are interesting in picking up that work, please let me know and > we can sync up on that status of the WIP patches, etc. > > Michael > > On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > > > Dear Octavia team: > > The email aims to ask the development progress about > l3-active-active blueprint. I > > noticed that the work in this area has been stagnant for eight months. > > https://review.openstack.org/#/q/l3-active-active > > I want to know the community's next work plan in this regard. > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jan 7 09:32:50 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 7 Jan 2019 01:32:50 -0800 Subject: [neutron] Functional tests job broken Message-ID: Hi Neutrinos, For a few days now we have had an issue with the neutron-functional job [1]. Please don’t recheck your patches now. It will not help until this bug is fixed/worked around.
[1] https://bugs.launchpad.net/neutron/+bug/1810518 — Slawek Kaplonski Senior software engineer Red Hat From smooney at redhat.com Mon Jan 7 10:05:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 10:05:54 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> Message-ID: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> On Mon, 2019-01-07 at 16:23 +0800, rui zang wrote: > Hey Jay, > > I replied to your comments on the spec but missed this email. > Please see my replies in line. > > Thanks, > Zang, Rui > > > > 03.01.2019, 21:31, "Jay Pipes" : > > On 01/02/2019 11:08 PM, Alex Xu wrote: > > > Jay Pipes > 于2019年1月2 > > > 日周三 下午10:48写道: > > > > > > On 12/21/2018 03:45 AM, Rui Zang wrote: > > > > It was advised in today's nova team meeting to bring this up by > > > email. > > > > > > > > There has been some discussion on how to track the persistent memory > > > > resource in placement on the spec review [1]. > > > > > > > > Background: persistent memory (PMEM) needs to be partitioned into > > > > namespaces to be consumed by VMs. Due to fragmentation issues, > > > the spec > > > > proposed to use fixed sized PMEM namespaces. > > > > > > The spec proposed to use fixed sized namespaces that are controllable by > > > the deployer, not fixed-size-for-everyone :) Just want to make sure > > > we're being clear here. > > > > > > > The spec's proposed way to represent PMEM namespaces is to use one > > > > Resource Provider (RP) for one PMEM namespace. A new standard > > > Resource > > > > Class (RC) -- 'VPMEM_GB' -- is introduced to classify PMEM namespace > > > RPs.
> > > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > > > 'total' and 'step_size` are all set to the size of the PMEM > > > namespace. > > > > In this way, it is guaranteed each RP will be consumed as a whole > > > at one > > > > time. > > > > > > > > An alternative was brought out in the review. Different Custom > > > Resource > > > > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM > > > namespaces of > > > > different sizes. The size of the PMEM namespace is encoded in the > > > name > > > > of the custom Resource Class. And multiple PMEM namespaces of the > > > same > > > > size (say 128G) can be represented by one RP of the same > > > > > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There > > > would be > > > only one resource provider: the compute node itself. It would have an > > > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > > > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit' and > > > 'total' > > > > as the total number of the PMEM namespaces of the certain size. > > > And the > > > > values of 'min_unit' and 'step_size' could set to 1. > > > > > > No, the max_unit, min_unit, step_size and total would refer to the > > > number of *PMEM namespaces*, not the amount of GB of memory represented > > > by those namespaces. > > > > > > Therefore, min_unit and step_size would be 1, max_unit would be the > > > total number of *namespaces* that could simultaneously be attached to a > > > single consumer (VM), and total would be 8 in our example where the > > > compute node had 8 of these pre-defined 128G PMEM namespaces. > > > > > > > We believe both way could work. We would like to have a community > > > > consensus on which way to use. > > > > Email replies and review comments to the spec [1] are both welcomed. > > > > > > Custom resource classes were invented for precisely this kind of use > > > case. The resource being represented is a namespace. 
The resource is > not > "a Gibibyte of persistent memory". > > > > > > The point of the initial design is to avoid encoding the `size` in the > > > resource class name. If that is ok for you (I remember people hate to > > > encode sizes and numbers into trait names), then we will update the > > > design. Probably, based on the namespace configuration, nova will be > > > responsible for creating those custom RCs first. Sounds workable. > > > > A couple points... > > > > 1) I was/am opposed to putting the least-fine-grained size in a resource > > class name. For example, I would have preferred DISK_BYTE instead of > > DISK_GB. And MEMORY_BYTE instead of MEMORY_MB. > > I agree the more precise the better as far as resource tracking is concerned. > However, as for persistent memory, it usually comes in large capacity -- > terabytes are normal. And the targeted applications are also expected to use > persistent memory in that quantity. GB is a reasonable unit that does not make > the numbers too nasty. so i'm honestly not that concerned with large numbers. if we want to improve the user experience we can do what we do with hugepage memory: we support passing a suffix, so we can say 2M or 1G. if you are concerned with capacity it's a relatively simple exercise to show that if we use a 64 bit int or even 48 bits we have plenty of headroom over where the technology is. NVDIMMs are spec'd for a max capacity of 512GB per module. if i recall correctly you can also only have 12 nvdimms with 4 ram dimms per socket acting as a cache, so that effectively limits you to 6TB per socket or 12 TB per 1/2U with standard density servers. modern x86 processors i believe still use a 48 bit physical address space with the last 16 bits reserved for future use, meaning a host can address a maximum of 2^48 bytes or 256 TiB of memory in such a system.
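For what it's worth, the headroom arithmetic in that paragraph can be sanity-checked in a few lines. This is only an illustrative calculation: the 512GB module capacity and 12-modules-per-socket limit are taken from the email as stated, treated here as base-2 quantities.

```python
# Sanity check of the capacity figures quoted above (illustrative only;
# the module capacity and per-socket module count come from the email).
GiB = 2 ** 30
TiB = 2 ** 40

module_capacity = 512 * GiB       # quoted max spec'd NVDIMM module capacity
modules_per_socket = 12
per_socket = modules_per_socket * module_capacity
assert per_socket == 6 * TiB      # 6 TiB per socket, so 12 TiB for two sockets

addressable = 2 ** 48             # 48-bit physical address space
assert addressable == 256 * TiB   # far above 12 TiB, so plenty of headroom
```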
note persistent memory is stream memory so it base 2 not base 10 so when we state it 1GB we technically mean 1 GiB or 2^10 bytes not 10^9 bytes. while it is unlikely we will ever need byte-level granularity in allocations to guests, i'm not sure i buy the argument that this will only be used by applications with large allocations in the 100GB or TB range. i think i share jay's preference here in increasing the granularity and either tracking the allocation in MiBs or bytes. i do somewhat agree that bytes is likely too fine grained, hence my preference for mebibytes. > > > 2) After reading the original Intel PMEM specification > > (http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf), it seems to me > > that what you are describing with a generic PMEM_GB (or PMEM_BYTE) > > resource class is more appropriate for the block mode translation system > > described in the PDF versus the PMEM namespace system described therein. > > > > From a lay person's perspective, I see the difference between the two > > as similar to the difference between describing the bytes that are in > > block storage versus a filesystem that has been formatted, wiped, > > cleaned, etc on that block storage. > > First let's talk about "block mode" v.s. "persistent memory mode". > They are not tiered up, they are counterparts. Each of them describes an access > method to the underlying hardware. Quote some sections from > https://www.kernel.org/doc/Documentation/nvdimm/nvdimm.txt > inside the dash line block. > > ------------------------------8<------------------------------------------------------------------- > Why BLK? > -------- > > While PMEM provides direct byte-addressable CPU-load/store access to > NVDIMM storage, it does not provide the best system RAS (recovery, > availability, and serviceability) model.
An access to a corrupted > system-physical-address address causes a CPU exception while an access > to a corrupted address through an BLK-aperture causes that block window > to raise an error status in a register. The latter is more aligned with > the standard error model that host-bus-adapter attached disks present. > Also, if an administrator ever wants to replace a memory it is easier to > service a system at DIMM module boundaries. Compare this to PMEM where > data could be interleaved in an opaque hardware specific manner across > several DIMMs. > > PMEM vs BLK > BLK-apertures solve these RAS problems, but their presence is also the > major contributing factor to the complexity of the ND subsystem. They > complicate the implementation because PMEM and BLK alias in DPA space. > Any given DIMM's DPA-range may contribute to one or more > system-physical-address sets of interleaved DIMMs, *and* may also be > accessed in its entirety through its BLK-aperture. Accessing a DPA > through a system-physical-address while simultaneously accessing the > same DPA through a BLK-aperture has undefined results. For this reason, > DIMMs with this dual interface configuration include a DSM function to > store/retrieve a LABEL. The LABEL effectively partitions the DPA-space > into exclusive system-physical-address and BLK-aperture accessible > regions. For simplicity a DIMM is allowed a PMEM "region" per each > interleave set in which it is a member. The remaining DPA space can be > carved into an arbitrary number of BLK devices with discontiguous > extents. > ------------------------------8<------------------------------------------------------------------- > > You can see that "block mode" does not provide "direct access", thus not the best > performance. That is the reason "persistent memory mode" is proposed in the spec. 
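As a concrete illustration of the modes discussed above, NVDIMM namespaces are typically managed with the `ndctl` tool. This is a sketch only -- the region and size values are made up for the example, and the commands require real (or emulated) NVDIMM hardware:

```shell
# Illustrative ndctl usage (region/size values are made up).

# "Persistent memory mode" namespace exposed as /dev/pmemN; a
# DAX-capable filesystem can then be created on top of it.
ndctl create-namespace --mode=fsdax --region=region0 --size=128G

# Device-DAX: a byte-addressable character device (/dev/daxN.M)
# suitable for mapping directly into an application's address space.
ndctl create-namespace --mode=devdax --region=region0 --size=128G

# Sector/BLK-style access: better error isolation via the aperture,
# at the cost of direct CPU load/store performance.
ndctl create-namespace --mode=sector --region=region0 --size=128G

# Inspect the resulting namespaces.
ndctl list --namespaces
```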
the block mode will allow any existing application that is coded to work with a block device to just use the NVDIMM storage as a faster form of solid state storage. direct mode requires applications to be specifically coded to support it. from an openstack perspective we will eventually want to support exposing the device both as a block device (e.g. via virtio-blk or virtio-scsi devices if/when qemu supports that) and as a direct mode pmem device to the guest. i understand why persistent memory mode is more appealing from a vendor perspective to lead with, but practically speaking there are very few applications that actually support pmem to date, and supporting app direct mode only seems like it would hurt adoption of this feature more generally than encourage it. > > However, people can still create a block device out of a "persistent memory mode" > namespace. And further more, create a file system on top of that block device. > Applications can map files from that file system into their memory namespaces, > and if the file system is DAX (direct-access) capable. The application's access to > the hardware is still direct-access which means direct byte-addressable > CPU-load/store access to NVDIMM storage. > This is perfect so far, as one can think of why not just track the DAX file system > and let the VM instances map the files of the file system? > However, this usage model is reported to have severe issues with hardware > passed through. So the recommended model is still mapping namespaces > of "persistent memory mode" into applications' address space. > intel's nvdimm technology works in 3 modes: app direct, block and system memory. the direct and block modes were discussed at some length in the spec and this thread. does libvirt support using an nvdimm's pmem namespace in devdax mode to back guest memory instead of system ram?
based on https://docs.pmem.io/getting-started-guide/creating-development-environments/virtualization/qemu qemu does support such a configuration, and honestly having the capability to alter the guest memory backing to run my vms with 100s of GB of ram would be as compelling as app direct mode, as it would allow all my legacy applications to work without modification and would deliver effectively the same performance. perhaps we should also consider a hw:mem_page_backing extra spec to complement the hw:mem_page_size we already have for hugepages today. this would probably be a separate spec, but i would hope we don't make decisions today that would block other usage models in the future. > > > > > In Nova, the DISK_GB resource class describes the former: it's a bunch > > of blocks that are reserved in the underlying block storage for use by > > the virtual machine. The virtual machine manager then formats that bunch > > of blocks as needed and lays down a formatted image. > > > > We don't have a resource class that represents "a filesystem" or "a > > partition" (yet). But the proposed PMEM namespaces in your spec > > definitely seem to be more like a "filesystem resource" than a "GB of > > block storage" resource. > > > > Best, > > -jay From balazs.gibizer at ericsson.com Mon Jan 7 12:52:35 2019 From: balazs.gibizer at ericsson.com (Balázs Gibizer) Date: Mon, 7 Jan 2019 12:52:35 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> Message-ID: <1546865551.29530.0@smtp.office365.com> > But, let's chat more about it via a hangout the week after next (week > of January 14 when Matt is back), as suggested in #openstack-nova > today.
We'll be able to have a high-bandwidth discussion then and > agree on a decision on how to move forward with this. Thank you all for the discussion. I agree that we should have a real-time discussion about the way forward. Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangout[2]? I see the following topics we need to discuss:

* backward compatibility with already existing SRIOV ports having min bandwidth
* introducing microversion(s) for this feature in Nova
* allowing partial support for this feature in Nova in Stein (e.g.: only server create/delete but no migrate support)
* step-by-step verification of the really long commit chain in Nova

I will post a summary of each issue to the ML during this week. Cheers, gibi [1] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190114T170000 [2] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI From fungi at yuggoth.org Mon Jan 7 13:11:37 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Jan 2019 13:11:37 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <20190107131137.4xc4lue7t333iosu@yuggoth.org> On 2019-01-07 10:05:54 +0000 (+0000), Sean Mooney wrote: [...] > note persistent memory is stream memory so it base 2 not base 10 > so when we state it 1GB we technically mean 1 GiB or 2^10 bytes > not 10^9 bytes [...] Not to get pedantic, but a gibibyte is 2^30 bytes (2^10 is a kibibyte). I'm quite sure you (and most of the rest of us) know this, just pointing it out for the sake of clarity.
-- Jeremy Stanley From smooney at redhat.com Mon Jan 7 13:47:28 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 13:47:28 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <20190107131137.4xc4lue7t333iosu@yuggoth.org> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> <20190107131137.4xc4lue7t333iosu@yuggoth.org> Message-ID: <37db2f34f4f39b302df1a30662013c75f4f61853.camel@redhat.com> On Mon, 2019-01-07 at 13:11 +0000, Jeremy Stanley wrote: > On 2019-01-07 10:05:54 +0000 (+0000), Sean Mooney wrote: > [...] > > note persistent memory is stream memory so it base 2 not base 10 > > so when we state it 1GB we technically mean 1 GiB or 2^10 bytes > > not 10^9 bytes > > [...] > > Not to get pedantic, but a gibibyte is 2^30 bytes (2^10 is a > kibibyte). I'm quite sure you (and most of the rest of us) know > this, just pointing it out for the sake of clarity. yep i spotted that when i was reading the mail back after i sent it :) i kind of wanted to fix it but i assumed most would see it's a typo and didn't want to spam. the main point i wanted to convey was that nvdimm-p is being standardised by JEDEC and will be using their unit definitions rather than the IEC definitions typically used by block storage.
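The unit-handling point can be made concrete with a small sketch (plain Python, not Nova code) of the SI-versus-binary distinction, together with the hugepage-style suffix parsing ("2M", "1G") suggested earlier in the thread:

```python
# Sketch: base-10 (SI) vs base-2 (binary/JEDEC-style) interpretations
# of a size string, plus hugepage-style suffix parsing.

BINARY = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}
SI = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

def parse_size(text, units=BINARY):
    """Parse a size like '1G' or '512' into bytes using a unit table."""
    text = text.strip().upper()
    if text[-1].isdigit():
        return int(text)  # no suffix: already a byte count
    return int(text[:-1]) * units[text[-1]]

print(parse_size("1G"))      # 1073741824 -- a gibibyte, 2**30 bytes
print(parse_size("1G", SI))  # 1000000000 -- a decimal gigabyte
```

Note the two readings of "1G" differ by about 7%, which is exactly the ambiguity being clarified in the exchange above.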
thanks for giving me the opportunity to clarify :) From jaypipes at gmail.com Mon Jan 7 14:02:40 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 7 Jan 2019 09:02:40 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> On 01/07/2019 05:05 AM, Sean Mooney wrote: > i think i share jay's preference here in increasing the granularity and either tracking > the allocation in MiBs or bytes. i do somewhat agree that bytes is likely too fine grained > hence my preference for mebibytes. Actually, that's not at all my preference for PMEM :) My preference is to use custom resource classes like "CUSTOM_PMEM_NAMESPACE_1TB" because the resource is the namespace, not the bunch of blocks/bytes of storage. With regards to the whole "finest-grained unit" thing, I was just responding to Alex Xu's comment: "The point of the initial design is avoid to encode the `size` in the resource class name. If that is ok for you(I remember people hate to encode size and number into the trait name), then we will update the design. Probably based on the namespace configuration, nova will be responsible for create those custom RC first. Sounds works."
Best, -jay From ignaziocassano at gmail.com Mon Jan 7 15:22:03 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 7 Jan 2019 16:22:03 +0100 Subject: Queens octavia error Message-ID: Hello All, I installed octavia on queens with centos 7, but when I create a load balance with the command openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I got some errors in octavia worker.log: 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server failures[0].reraise() 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in reraise 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server six.reraise(*self._exc_info) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result = task.execute(**arguments) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 192, in execute 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server raise exceptions.ComputeBuildException(fault=fault) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server ComputeBuildException: Failed to build compute instance due to: {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} Anyone could help me, please ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mthode at mthode.org Mon Jan 7 15:25:43 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Jan 2019 09:25:43 -0600 Subject: [neutron][oslo] Functional tests job broken (oslo.privsep) In-Reply-To: References: Message-ID: <20190107152543.kugiskrwk4kuawtf@mthode.org> On 19-01-07 01:32:50, Slawomir Kaplonski wrote: > Hi Neutrinos, > > Since a few days we have an issue with the neutron-functional job [1]. > Please don’t recheck your patches now. It will not help until this bug > is fixed or worked around. > > [1] https://bugs.launchpad.net/neutron/+bug/1810518 > Adding an oslo tag. As far as can be determined, the new oslo.privsep code impacts neutron. There is a requirements review out to restrict the version of oslo.privsep but I'd like an ack from oslo people before we take a step back. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From derekh at redhat.com Mon Jan 7 16:24:13 2019 From: derekh at redhat.com (Derek Higgins) Date: Mon, 7 Jan 2019 16:24:13 +0000 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default Message-ID: Hi All, Shortly before the holidays CI jobs moved from xenial to bionic; for Ironic this meant a bunch of failures[1], all of which have now been dealt with, with the exception of the UEFI job. It turns out that during this job our (virtual) baremetal nodes use tftp to download an ipxe image. In order to track these tftp connections we have been making use of the fact that nf_conntrack_helper has been enabled by default. In newer kernel versions[2] this is no longer the case and I'm now trying to figure out the best way to deal with the new behaviour. I've put together some possible solutions, along with some details on why they are not ideal, and would appreciate some opinions: 1.
Why not enable the conntrack helper with echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper The router namespace is still created with nf_conntrack_helper==0, as it follows the default the nf_conntrack module was loaded with. 2. Enable it in modprobe.d # cat /etc/modprobe.d/conntrack.conf options nf_conntrack nf_conntrack_helper=1 This works but requires the nf_conntrack module to be unloaded if it has already been loaded; for devstack, and I guess in the majority of cases (including CI nodes), this means a reboot stage or a potentially error-prone sequence of stopping the firewall and unloading nf_conntrack modules. This also globally turns on the helper on the host, reintroducing the security concerns it comes with. 3. Enable the conntrack helper in the router network namespace when it is created[3] This works for ironic CI, but there may be better solutions that can be worked within neutron that I'm not aware of. Of the 3 options above this would be most transparent to other operators as the original behaviour would be maintained. Thoughts on any of the above? Or better solutions? 1 - https://storyboard.openstack.org/#!/story/2004604 2 - https://kernel.googlesource.com/pub/scm/linux/kernel/git/horms/ipvs-next/+/3bb398d925ec73e42b778cf823c8f4aecae359ea 3 - https://review.openstack.org/#/c/628493/1 From mihalis68 at gmail.com Mon Jan 7 16:29:32 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Mon, 7 Jan 2019 11:29:32 -0500 Subject: [Ops] ops meetups team meeting 2018-12-18 In-Reply-To: References: Message-ID: Hello Everyone, The next OpenStack ops meetup team meeting will be tomorrow (2019-1-8) at 10am EST on #openstack-operators on freenode. It is important that we get back to work, as there is an ops meetup to organise!
The only offer received in written form was the Deutsche Telekom offer to host in Berlin (see https://etherpad.openstack.org/p/ops-meetup-venue-discuss-1st-2019-berlin) and those present in previous meetings favored opting for Thursday March 7th and Friday March 8th so as to adjoin the weekend, making a personal weekend in berlin immediately after the meetup more doable. The OpenStack "ops meetups team" is charged with making these events happen, but we could definitely do with some help. If you'd like to be involved, see our charter here https://wiki.openstack.org/wiki/Ops_Meetups_Team and/or attend the meeting on IRC tomorrow. A draft agenda is now posted at the top of our agenda etherpad, see https://etherpad.openstack.org/p/ops-meetups-team All being well, I hope we can formally agree on the Berlin proposal and get moving with all the usual prep work. We have two months to get it done. See you tomorrow! Chris On Tue, Dec 18, 2018 at 10:54 AM Chris Morgan wrote: > Meeting minutes from our meeting today on IRC are linked below. > > Key points: > - next meeting Jan 8th 2019 > - ops meetup #1 looks likely to be at Deutsche Telekom, Berlin, March 7,8 > 2019 > - the team hopes to confirm this soon and commence technical agenda > planning early January > - team meetings will continue to be 10AM EST Tuesdays on > #openstack-operators > > Meeting ended Tue Dec 18 15:44:05 2018 UTC. > 10:44 AM Minutes: > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.html > 10:44 AM Minutes (text): > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.txt > 10:44 AM Log: > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.log.html > > About the OpenStack Operators Meetups team: > https://wiki.openstack.org/wiki/Ops_Meetups_Team > > Chris > > -- > Chris Morgan > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Mon Jan 7 16:44:05 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 7 Jan 2019 08:44:05 -0800 Subject: [neutron] Bug deputy report - week 1 2019 Message-ID: Hi Neutrinos, I was bug deputy last week. Below is a summary of bugs reported this week: Critical bugs: * https://bugs.launchpad.net/neutron/+bug/1810314 - neutron objects base get_values() fails with KeyError - I marked it as Critical because it causes gate failures for the Tricircle project; there is a patch proposed for that https://review.openstack.org/#/c/628857/ but some OVO expert should take a look at it, * https://bugs.launchpad.net/neutron/+bug/1810518 - neutron-functional tests failing with oslo.privsep 1.31 - set to Critical as it causes gate failures, Ben Nemec is looking at it from the oslo.privsep side. I found that all problems are caused by tests from neutron.tests.functional.agent.linux.test_netlink_lib.NetlinkLibTestCase so maybe someone familiar with this code in Neutron can take a look at it too. High bugs: * https://bugs.launchpad.net/neutron/+bug/1809238 - [l3] `port_forwarding` cannot be set before l3 `router` in service_plugins - I set it to High - it looks like we now require a proper order of service plugins in the config file, it is in progress, LIU Yulong is working on it. We also discussed that on the last L3 sub team meeting.
* https://bugs.launchpad.net/neutron/+bug/1810504 - neutron-tempest-iptables_hybrid job failing with internal server error while listing ports - set to High as it causes gate failures from time to time, patch proposed already: https://review.openstack.org/#/c/628492/ * https://bugs.launchpad.net/neutron/+bug/1810764 - XenServer cannot enable tunneling - this is a fresh bug from today, I set it to High and it needs attention from someone familiar with XenServer, Other bugs: * https://bugs.launchpad.net/neutron/+bug/1810025 - ovs-agent does not clear QoS rules after restart - in progress, I set it to Medium, patch proposed https://review.openstack.org/#/c/627779/ * https://bugs.launchpad.net/neutron/+bug/1810349 - agent gw ports created on non dvr destination hosts - DVR related issue, I set it to Medium - patch proposed: https://review.openstack.org/628071 * https://bugs.launchpad.net/neutron/+bug/1810563 - adding rules to security groups is slow - set to Medium, patch proposed https://review.openstack.org/628691 — Slawek Kaplonski Senior software engineer Red Hat From juliaashleykreger at gmail.com Mon Jan 7 16:48:48 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 08:48:48 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: Message-ID: Thanks for bringing this up Derek! Comments below. On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins wrote: > > Hi All, > > Shortly before the holidays CI jobs moved from xenial to bionic, for > Ironic this meant a bunch failures[1], all have now been dealt with, > with the exception of the UEFI job. It turns out that during this job > our (virtual) baremetal nodes use tftp to download a ipxe image. In > order to track these tftp connections we have been making use of the > fact that nf_conntrack_helper has been enabled by default.
In newer > kernel versions[2] this is no longer the case and I'm now trying to > figure out the best way to deal with the new behaviour. I've put > together some possible solutions along with some details on why they > are not ideal and would appreciate some opinions The git commit message suggests that users should explicitly put in rules such that the traffic is matched. I feel like the kernel change ends up being a behavior change in this case. I think the reasonable path forward is to have a configuration parameter that the l3 agent can use to determine whether to set the netfilter connection tracking helper. Doing so, allows us to raise this behavior change to operators minimizing the need of them having to troubleshoot it in production, and gives them a choice in the direction that they wish to take. [trim] > 3. Enable the conntrack helper in the router network namespace when it > is created[3] > This works for ironic CI, but there may be better solutions that can > be worked within neutron that I'm not aware of. Of the 3 options above > this would be most transparent to other operators as the original > behaviour would be maintained. > My thoughts exactly. > thoughts on any of the above? or better solutions? I think we should just raise it as a configuration option. Coupled with a release note, it provides operators visibility into the kernel change. > > 1 - https://storyboard.openstack.org/#!/story/2004604 > 2 - https://kernel.googlesource.com/pub/scm/linux/kernel/git/horms/ipvs-next/+/3bb398d925ec73e42b778cf823c8f4aecae359ea > 3 - https://review.openstack.org/#/c/628493/1 > From cboylan at sapwetik.org Mon Jan 7 17:05:38 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 07 Jan 2019 09:05:38 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: Message-ID: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > Thanks for bringing this up Derek!
> Comments below. > > On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins wrote: > > > > Hi All, > > > > Shortly before the holidays CI jobs moved from xenial to bionic, for > > Ironic this meant a bunch failures[1], all have now been dealt with, > > with the exception of the UEFI job. It turns out that during this job > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > order to track these tftp connections we have been making use of the > > fact that nf_conntrack_helper has been enabled by default. In newer > > kernel versions[2] this is no longer the case and I'm now trying to > > figure out the best way to deal with the new behaviour. I've put > > together some possible solutions along with some details on why they > > are not ideal and would appreciate some opinions > > The git commit message suggests that users should explicitly put in rules such > that the traffic is matched. I feel like the kernel change ends up > being a behavior > change in this case. > > I think the reasonable path forward is to have a configuration > parameter that the > l3 agent can use to determine to set the netfilter connection tracker helper. > > Doing so, allows us to raise this behavior change to operators minimizing the > need of them having to troubleshoot it in production, and gives them a choice > in the direction that they wish to take. https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? 
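Concretely, the explicit-assignment pattern from the linked article looks something like the following sketch. The port and helper name are the standard tftp ones, but any interface or subnet scoping would be deployment-specific:

```shell
# Sketch of explicit helper assignment (the secure alternative to the
# old automatic behaviour). Scoping to specific interfaces/subnets is
# deployment-specific and omitted here.

# Keep automatic helper assignment off (the new kernel default).
sysctl -w net.netfilter.nf_conntrack_helper=0

# Load the tftp helper module so the CT target can reference it.
modprobe nf_conntrack_tftp

# Attach the tftp helper only to UDP/69 traffic, in the raw table
# before connection tracking takes place.
iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp
```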
> > [trim] > [more trimming] From rui.zang at yandex.com Mon Jan 7 17:17:59 2019 From: rui.zang at yandex.com (rui zang) Date: Tue, 08 Jan 2019 01:17:59 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <15042191546881479@iva4-031ea4da33a1.qloud-c.yandex.net> An HTML attachment was scrubbed... URL: From rui.zang at yandex.com Mon Jan 7 17:20:17 2019 From: rui.zang at yandex.com (rui zang) Date: Tue, 08 Jan 2019 01:20:17 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> Message-ID: <48252031546881617@sas1-d856b3d759c7.qloud-c.yandex.net> An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jan 7 17:28:39 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 7 Jan 2019 11:28:39 -0600 Subject: [all] [uc] OpenStack UC Meeting @ 1900 UTC Message-ID: Hi everyone, Just a reminder that the UC meeting will be in #openstack-uc in a little more than an hour and a half from now. Please feel empowered to add to the agenda here - https://etherpad.openstack.org/p/uc - and we hope to see you there! -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sfinucan at redhat.com Mon Jan 7 17:32:32 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 07 Jan 2019 17:32:32 +0000 Subject: [nova] Mempage fun Message-ID: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> We've been looking at a patch that landed some months ago and have spotted some issues: https://review.openstack.org/#/c/532168 In summary, that patch is intended to make the memory check for instances memory pagesize aware. The logic it introduces looks something like this:

  If the instance requests a specific pagesize
    (#1) Check if each host cell can provide enough memory of the
         pagesize requested for each instance cell
  Otherwise
    If the host has hugepages
      (#2) Check if each host cell can provide enough memory of the
           smallest pagesize available on the host for each instance cell
    Otherwise
      (#3) Check if each host cell can provide enough memory for each
           instance cell, ignoring pagesizes

This also has the side-effect of allowing instances with hugepages and instances with a NUMA topology but no hugepages to co-exist on the same host, because the latter will now be aware of hugepages and won't consume them. However, there are a couple of issues with this:

1. It breaks overcommit for instances without a pagesize request running on hosts with different pagesizes. This is because we don't allow overcommit for hugepages, but case (#2) above means we are now reusing the same functions previously used for actual hugepage checks to check for regular 4k pages

2. It doesn't fix the issue when non-NUMA instances exist on the same host as NUMA instances with hugepages. The non-NUMA instances don't run through any of the code above, meaning they're still not pagesize aware

We could probably fix issue (1) by modifying those hugepage functions we're using to allow overcommit via a flag that we pass for case (#2).
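The branching described above can be sketched in a few lines. This is illustrative only -- host and instance NUMA cells are reduced to simple dicts, and the real Nova implementation is considerably more involved:

```python
# Sketch only -- not the real Nova code: the three-way branch the patch
# introduces, with cells reduced to plain dicts.

def cell_fits(host_cell, instance_cell):
    """Mimic the pagesize-aware memory check described above."""
    requested = instance_cell.get("pagesize")    # e.g. 2048 (KiB)
    host_pages = host_cell.get("pagesizes", {})  # size -> free KiB

    if requested is not None:
        # Case 1: explicit pagesize request -- that exact size must fit.
        return host_pages.get(requested, 0) >= instance_cell["memory"]
    if host_pages:
        # Case 2: no request, but the host has pagesize info -- check
        # against the smallest pagesize available. Reusing the hugepage
        # check here is what silently disables overcommit (issue 1).
        smallest = min(host_pages)
        return host_pages[smallest] >= instance_cell["memory"]
    # Case 3: no pagesize information at all -- plain memory check.
    return host_cell["memory_free"] >= instance_cell["memory"]
```

Non-NUMA instances never reach this function at all, which is issue (2) above.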
We can mitigate issue (2) by advising operators to split hosts into aggregates for 'hw:mem_page_size' set or unset (in addition to 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but I think this may be the case in some docs (sean-k-mooney said Intel used to do this. I don't know about Red Hat's docs or upstream). In addition, we did actually call that out in the original spec: https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact However, if we're doing that for non-NUMA instances, one would have to question why the patch is necessary/acceptable for NUMA instances. For what it's worth, a longer-term fix would be to start tracking hugepages in a non-NUMA-aware way too, but that's a lot more work and doesn't fix the issue now. As such, my question is this: should we look at fixing issue (1) and documenting issue (2), or should we revert the thing wholesale until we work on a solution that could e.g. let us track hugepages via placement and resolve issue (2) too? Thoughts? Stephen From juliaashleykreger at gmail.com Mon Jan 7 17:42:23 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 09:42:23 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: On Mon, Jan 7, 2019 at 9:11 AM Clark Boylan wrote: > > On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: [trim] > > > > Doing so, allows us to raise this behavior change to operators minimizing the > > need of them having to troubleshoot it in production, and gives them a choice > > in the direction that they wish to take. > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this.
Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. > > Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? > Great link Clark, thanks! It could be viable to ask operators to explicitly set their security groups for tftp to be passed. I guess we actually have multiple cases where there are issues and the only non-impacted case is when the ironic conductor host is directly attached to the flat network the machine is booting from. In the case of a flat network, it doesn't seem viable for us to change rules ad-hoc since we would need to be able to signal that the helper is needed, but it does seem viable to say "make sure connectivity works x way". Whereas with multitenant networking, we use dedicated networks, so conceivably it is just a static security group setting that an operator can keep in place. Explicit static rules like that seem less secure to me without conntrack helpers. :( Does anyone in Neutron land have any thoughts?
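For reference, the explicit per-flow helper assignment that the linked article recommends looks roughly like this. This is a sketch assuming the standard TFTP helper and plain iptables on the host; exactly where Neutron would need to install such rules (e.g. inside a router namespace) is the open question in this thread:

```shell
# Load the TFTP helper module; with the new kernel default it is no
# longer bound to traffic automatically.
modprobe nf_conntrack_tftp

# Keep automatic helper assignment off (the new kernel default).
sysctl -w net.netfilter.nf_conntrack_helper=0

# Explicitly attach the tftp helper to matching control traffic via the
# raw table CT target, then accept the related data connections.
iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp
iptables -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```

The rules need root and the relevant kernel modules, so treat them as a starting point rather than a drop-in configuration.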
It turns out that during this job > > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > > order to track these tftp connections we have been making use of the > > > fact that nf_conntrack_helper has been enabled by default. In newer > > > kernel versions[2] this is no longer the case and I'm now trying to > > > figure out the best way to deal with the new behaviour. I've put > > > together some possible solutions along with some details on why they > > > are not ideal and would appreciate some opinions > > > > The git commit message suggests that users should explicitly put in rules such > > that the traffic is matched. I feel like the kernel change ends up > > being a behavior > > change in this case. > > > > I think the reasonable path forward is to have a configuration > > parameter that the > > l3 agent can use to determine to set the netfilter connection tracker helper. > > > > Doing so, allows us to raise this behavior change to operators minimizing the > > need of them having to troubleshoot it in production, and gives them a choice > > in the direction that they wish to take. > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. Thanks, I forgot to point out the option of adding these rules. If I understand it correctly, they would need to be added inside the router namespace when Neutron creates it; somebody from Neutron might be able to indicate if this is a workable solution. > > Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used?
> > > > > [trim] > > > > [more trimming] > From openstack at nemebean.com Mon Jan 7 18:11:21 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Jan 2019 12:11:21 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Renamed the thread to be more descriptive. Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: > Hi, > > I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] > I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. > > [1] https://bugs.launchpad.net/neutron/+bug/1810518 > > — > Slawek Kaplonski > Senior software engineer > Red Hat > >> Wiadomość napisana przez Ben Nemec w dniu 02.01.2019, o godz. 19:17: >> >> Yay alliteration! :-) >> >> I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. >> >> While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. 
Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. >> >> So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. >> >> -Ben >> >> 1: https://review.openstack.org/628019 >> 2: https://review.openstack.org/#/c/593556/ >> > From smooney at redhat.com Mon Jan 7 19:19:46 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 19:19:46 +0000 Subject: [nova] [placement] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: On Mon, 2019-01-07 at 17:32 +0000, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware. The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. 
This is because we don't > allow overcommit for hugepages, but case (#2) above means we are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages > 2. It doesn't fix the issue when non-NUMA instances exist on the same > host as NUMA instances with hugepages. The non-NUMA instances don't > run through any of the code above, meaning they're still not > pagesize aware > > We could probably fix issue (1) by modifying those hugepage functions > we're using to allow overcommit via a flag that we pass for case (#2). > We can mitigate issue (2) by advising operators to split hosts into > aggregates for 'hw:mem_page_size' set or unset (in addition to > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > I think this may be the case in some docs (sean-k-mooney said Intel > used to do this. I don't know about Red Hat's docs or upstream). In > addition, we did actually called that out in the original spec: > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > However, if we're doing that for non-NUMA instances, one would have to > question why the patch is necessary/acceptable for NUMA instances. For > what it's worth, a longer fix would be to start tracking hugepages in a > non-NUMA aware way too but that's a lot more work and doesn't fix the > issue now. > > As such, my question is this: should be look at fixing issue (1) and > documenting issue (2), or should we revert the thing wholesale until we > work on a solution that could e.g. let us track hugepages via placement > and resolve issue (2) too. 
For what it's worth, the review in question https://review.openstack.org/#/c/532168 actually attempts to implement option 1 from https://bugs.launchpad.net/nova/+bug/1439247 The first time I tried to fix issue 2 was with my proposal for the AggregateTypeExtraSpecsAffinityFilter https://review.openstack.org/#/c/183876/4/specs/liberty/approved/aggregate-flavor-extra-spec-affinity-filter.rst which became the out-of-tree AggregateInstanceTypeFilter after 3 cycles of trying to get it upstream. https://github.com/openstack/nfv-filters/blob/master/nfv_filters/nova/scheduler/filters/aggregate_instance_type_filter.py The AggregateTypeExtraSpecsAffinityFilter, or AggregateInstanceTypeFilter, was a filter we developed specifically to enforce separation of instances that use explicit memory pages from those that do not, and also to cater for the DPDK hugepage requirement and enforce separation of pinned and unpinned guests. We finally got approval to publish a blog on the topic in January of 2017 https://software.intel.com/en-us/blogs/2017/01/04/filter-by-host-aggregate-metadata-or-by-image-extra-specs based on the content in the second version of the spec https://review.openstack.org/#/c/314097/12/specs/newton/approved/aggregate-instance-type-filter.rst This filter was used in a semi-production 4G trial deployment, in addition to lab use with some partners I was working with at the time, but we decided to stop supporting it on the assumption placement would solve it :) A lot of the capabilities of the out-of-tree filter could likely be achieved with some extensions to placement, but are not supported by placement today. I have raised the topic in the past of required traits on a resource provider that need to be present in the request for an allocation to be made against the resource provider. Similarly, I have raised the idea of forbidden traits on a resource provider that eliminate the resource provider as a candidate if present in the request. This is the inverse relationship of the required and forbidden traits we have today, but it is what the filter we implemented in 2015 did before placement, using aggregate metadata. I think there is a generalised problem statement here that would be a legitimate use case for placement outside of simply tracking hugepages (or preferably memory of all page sizes) in placement. I would be in favor of fixing oversubscription (issue 1) this cycle, as that is clearly a bug, as a short-term solution which we could backport, and of exploring addressing both issue 1 and 2 with placement, or by reproposing the out-of-tree filter if placement deemed it out of scope. That said, I too am interested to hear what others think, especially the placement folks. You can just use host aggregates and existing filters to address issue 2, but it's really easy to get wrong and it's not very well documented that it is required. > > Thoughts? > Stephen > > From haleyb.dev at gmail.com Mon Jan 7 20:05:06 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 7 Jan 2019 15:05:06 -0500 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Hi Ben, On 1/7/19 1:11 PM, Ben Nemec wrote: > Renamed the thread to be more descriptive. > > Just to update the list on this, it looks like the problem is a segfault > when the netlink_lib module makes a C call. Digging into that code a > bit, it appears there is a callback being used[1]. I've seen some > comments that when you use a callback with a Python thread, the thread > needs to be registered somehow, but this is all uncharted territory for > me. Suggestions gratefully accepted. :-) > > 1: > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 Maybe it's something as mentioned in the end of this section?
https://docs.python.org/2/library/ctypes.html#callback-functions "Note Make sure you keep references to CFUNCTYPE() objects as long as they are used from C code. ctypes doesn’t, and if you don’t, they may be garbage collected, crashing your program when a callback is made. Also, note that if the callback function is called in a thread created outside of Python’s control (e.g. by the foreign code that calls the callback), ctypes creates a new dummy Python thread on every invocation. This behavior is correct for most purposes, but it means that values stored with threading.local will not survive across different callbacks, even when those calls are made from the same C thread." I can try keeping a reference to the callback function and see if it makes any difference, but I'm assuming it's not that easy. -Brian > On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >> Hi, >> >> I just found that functional tests in Neutron are failing since today >> or maybe yesterday. See [1] >> I was able to reproduce it locally and it looks that it happens with >> oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >>> Wiadomość napisana przez Ben Nemec w dniu >>> 02.01.2019, o godz. 19:17: >>> >>> Yay alliteration! :-) >>> >>> I wanted to draw attention to this release[1] in particular because >>> it includes the parallel privsep change[2]. While it shouldn't have >>> any effect on the public API of the library, it does significantly >>> affect how privsep will process calls on the back end. Specifically, >>> multiple calls can now be processed at the same time, so if any >>> privileged code is not reentrant it's possible that new race bugs >>> could pop up. >>> >>> While this sounds scary, it's a necessary change to allow use of >>> privsep in situations where a privileged call may take a non-trivial >>> amount of time.  
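The garbage-collection failure mode in the ctypes note Brian quotes can be demonstrated in isolation with the classic libc qsort callback example. This is purely illustrative (whether this is what bites netlink_lib is exactly what is being investigated); it assumes a Unix-like system where libc's qsort is loadable:

```python
import ctypes
from ctypes.util import find_library

# Load libc (on Linux this resolves to libc.so.6).
libc = ctypes.CDLL(find_library("c"))

# Comparator signature for qsort: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp(a, b):
    return a[0] - b[0]

# The crucial part per the ctypes docs: bind the CFUNCTYPE wrapper to a
# name that outlives the C call. Passing CMPFUNC(py_cmp) inline leaves the
# wrapper with no Python reference, so it can be garbage collected while C
# code still holds the raw function pointer -- a recipe for a segfault.
cmp_callback = CMPFUNC(py_cmp)

arr = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), cmp_callback)
print(list(arr))  # [1, 2, 3, 4, 5]
```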
Cinder in particular has some privileged calls that >>> are long-running and can't afford to block all other privileged calls >>> on them. >>> >>> So if you're a consumer of oslo.privsep please keep your eyes open >>> for issues related to this new release and contact the Oslo team if >>> you find any. Thanks. >>> >>> -Ben >>> >>> 1: https://review.openstack.org/628019 >>> 2: https://review.openstack.org/#/c/593556/ >>> >> > From doug at doughellmann.com Mon Jan 7 20:12:21 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 07 Jan 2019 15:12:21 -0500 Subject: [goal][python3] week R-13 update Message-ID: This is the weekly update for the "Run under Python 3 by default" goal (https://governance.openstack.org/tc/goals/stein/python3-first.html). == Ongoing and Completed Work == This week is the second milestone for the Stein cycle. By this point I hoped to have python 3 functional test jobs in place for all projects, but we still have quite a ways to go to achieve that. I have added a few missing projects to the wiki page [1] and there is a *lot* of red on that page. I count 21 projects without functional test jobs running under python 3. We also have a few projects who don't seem to have voting python 3 unit test jobs, still. [1] https://wiki.openstack.org/wiki/Python3#Other_OpenStack_Applications_and_Projects Now that folks are mostly back from the holidays, my patch to change the default version of python in devstack [2] is ready for approval. See the other thread on this list [3] for details. [2] https://review.openstack.org/#/c/622415/ [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001356.html == Next Steps == PTLs, please review the information for your projects on that page. If you have a functional test job, please update that part of the table with the name of the job. If you do not have a functional test job, please add any information you have about plans to implement one (blue prints, bugs, etc.) 
to the comments column in the table. == How can you help? == 1. Choose a patch that has failing tests and help fix it. https://review.openstack.org/#/q/topic:python3-first+status:open+(+label:Verified-1+OR+label:Verified-2+) 2. Review the patches for the zuul changes. Keep in mind that some of those patches will be on the stable branches for projects. 3. Work on adding functional test jobs that run under Python 3. == How can you ask for help? == If you have any questions, please post them here to the openstack-dev list with the topic tag [python3] in the subject line. Posting questions to the mailing list will give the widest audience the chance to see the answers. We are using the #openstack-dev IRC channel for discussion as well, but I'm not sure how good our timezone coverage is so it's probably better to use the mailing list. == Reference Material == Goal description: https://governance.openstack.org/tc/goals/stein/python3-first.html Open patches needing reviews: https://review.openstack.org/#/q/topic:python3-first+is:open Storyboard: https://storyboard.openstack.org/#!/board/104 Zuul migration notes: https://etherpad.openstack.org/p/python3-first Zuul migration tracking: https://storyboard.openstack.org/#!/story/2002586 Python 3 Wiki page: https://wiki.openstack.org/wiki/Python3 -- Doug From jungleboyj at gmail.com Mon Jan 7 21:43:12 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 7 Jan 2019 15:43:12 -0600 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: References: Message-ID: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> Julia and Chris, Thanks for putting this together.  Wanted to share some thoughts in-line below: On 1/4/2019 9:53 AM, Julia Kreger wrote: > As some of you may or may not have heard, recently the Technical > Committee approved a technical vision document [1]. 
> > The goal of the technical vision document is to try to provide a > reference point for cloud infrastructure software in an ideal > universe. It is naturally recognized that not all items will apply to > all projects. The document is a really good high level view of what each OpenStack project should hopefully conform to.  I think it would be good to get this into the Upstream Institute education in some way as I think it is something that new contributors should understand and keep in mind.  It certainly would have helped me as a newbie to think about this. > We envision the results of the evaluation to be added to each > project's primary contributor documentation tree > (/doc/source/contributor/vision-reflection.rst) as a list of bullet > points detailing areas where a project feels they need adjustment to > better align with the technical vision, and if the project already has > visibility into a path forward, that as well. Good idea to have teams go through this.  I will work on doing the above for Cinder. Jay From juliaashleykreger at gmail.com Mon Jan 7 22:38:14 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 14:38:14 -0800 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> References: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> Message-ID: On Mon, Jan 7, 2019 at 1:48 PM Jay Bryant wrote: [trim] > > > > We envision the results of the evaluation to be added to each > > project's primary contributor documentation tree > > (/doc/source/contributor/vision-reflection.rst) as a list of bullet > > points detailing areas where a project feels they need adjustment to > > better align with the technical vision, and if the project already has > > visibility into a path forward, that as well. > > > > Good idea to have teams go through this. I will work on doing the above > for Cinder. > > Jay > > Thanks Jay! 
Putting on my Ironic TL hat for a while, I ended up with a fairly short list [1]. Maybe some naming/words should change, but overall I hope that it kind of gets the level ideas across to a reader. [1]: https://review.openstack.org/#/c/629060/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From codeology.lab at gmail.com Mon Jan 7 23:35:29 2019 From: codeology.lab at gmail.com (Cody) Date: Mon, 7 Jan 2019 18:35:29 -0500 Subject: [openstack-ansible]Enable DVR support with routed network (L3 spine) Message-ID: Hi OSA, Greetings! I wish to enable DVR in a routed network environment (i.e. a Leaf/Spine topology) using OSA. Has this feature been production ready under the OSA project? An example for the user_variables.yml would be great. Thank you very much. Regards, Cody From codeology.lab at gmail.com Mon Jan 7 23:45:19 2019 From: codeology.lab at gmail.com (Cody) Date: Mon, 7 Jan 2019 18:45:19 -0500 Subject: [openstack-ansible]Enable DVR with routed network (Spine/Leaf) Message-ID: Hi OSA, Greetings! I wish to enable DVR in a routed network environment (i.e. a Leaf/Spine topology) using OSA. Has this feature been production ready under the OSA project? An example for the user_variables.yml and openstack_user_config.yml would be much appreciated. Thank you very much. Regards, Cody From liliueecg at gmail.com Tue Jan 8 06:31:36 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 8 Jan 2019 01:31:36 -0500 Subject: [Cyborg] IRC meeting Message-ID: The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) This week's agenda: 1. Review Sundar's feature branch 2. Review and try to merge CoCo's patches https://review.openstack.org/#/c/625630/ https://review.openstack.org/#/c/624138/ 3. Track status updates -- Thank you Regards Li -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chkumar246 at gmail.com Tue Jan 8 07:11:07 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Tue, 8 Jan 2019 12:41:07 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update V - Jan 08, 2019 Message-ID: Hello, Happy New Year all! Here is the first update (Dec 19, 2018 to Jan 08, 2019) of 2019 on collaboration on the os_tempest[1] role between the TripleO and OpenStack-Ansible projects. Things got merged: os_tempest: * Update all plugin urls to use https rather than git - https://review.openstack.org/625670 * Remove octavia in favor of octavia-tempest-plugin - https://review.openstack.org/625828 * Add the manila-tempest-plugin - https://review.openstack.org/626181 * Added support for installing python-tempestconf from git - https://review.openstack.org/625904 * Use tempest_tempestconf_profile for handling named args - https://review.openstack.org/623187 * Use tempest_cloud_name for setting cloudname - https://review.openstack.org/628610 python-tempestconf * Add profile argument - https://review.openstack.org/621567 * Add unit test for profile feature - https://review.openstack.org/626889 * Fixed SafeConfigParser deprecation warning for py3 - https://review.openstack.org/628130 * Fix diff in gates - https://review.openstack.org/628180 * Added python-tempestconf-tempest-devstack-admin/demo-py3 - https://review.openstack.org/622865 Summary: * On the os_tempest side, we have finished the python-tempestconf support and introduced the tempest_cloud_name var in order to set the cloud name from clouds.yaml for tempest tests. * python-tempestconf got the --profile feature and added py3 based devstack jobs. Note: when we use tempest run --subunit it always returns exit status 0; this is the desired behaviour of stestr [https://github.com/mtreinish/stestr/issues/210]. The docs are now getting updated. We are working on implementing the tempest last subcommand [https://review.openstack.org/#/c/511172/] related to the same.
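As a rough illustration of how the two newly merged variables would be consumed, a user_variables.yml fragment might look like this. The values (and the exact shape of the profile mapping) are invented examples for illustration, not tested configuration; check the os_tempest role defaults for the authoritative names:

```yaml
# Select which clouds.yaml entry tempest should run against.
tempest_cloud_name: devstack-admin

# Named arguments passed through to python-tempestconf's new profile
# feature (hypothetical keys shown).
tempest_tempestconf_profile:
  debug: true
  create: true
```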
Things In-progress: os_tempest * Better tempest black and whitelist management - https://review.openstack.org/621605 * Add support for aarch64 images - https://review.openstack.org/620032 * Fix tempest workspace path - https://review.openstack.org/628182 * Configuration drives don't appear to work on aarch64+kvm - https://review.openstack.org/626592 * Use the inventory to enable/disable services by default - https://review.openstack.org/628979 * Synced tempest plugins names and services - https://review.openstack.org/628926 python-tempestconf * Create functional-tests role - https://review.openstack.org/626539 * Enable manila plugin in devstack - https://review.openstack.org/625191 Apart from this, we have started working on integrating os_tempest with the devstack and TripleO standalone jobs. * Devstack - https://review.openstack.org/627482 * Tripleo CI - https://review.openstack.org/627500 We will try to finish the os_tempest docs cleanup, whitelist/blacklist tests management and the os_tempest integration specs. I would like to thank mkopec, arxcruz, cloudnull (for reviewing patches over the holidays), mnaser, jrosser, odyssey4me, marios & quiquell (from the TripleO CI team) for helping on os_tempest integration with TripleO CI. Here is the 4th update [2]. Have queries? Feel free to ping us on the #tripleo or #openstack-ansible channel. Links: [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest [2.] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001116.html Thanks, Chandan Kumar From skaplons at redhat.com Tue Jan 8 08:06:11 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 8 Jan 2019 09:06:11 +0100 Subject: [neutron][oslo] Functional tests job broken (oslo.privsep) In-Reply-To: <20190107152543.kugiskrwk4kuawtf@mthode.org> References: <20190107152543.kugiskrwk4kuawtf@mthode.org> Message-ID: <97336A61-5F37-4A0D-A08F-6D2BE5B1F131@redhat.com> Hi, The requirements patch is now merged and the oslo.privsep version is now lowered to 1.30.1.
The Neutron functional job should be good for now; you can recheck your patches. — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Matthew Thode w dniu 07.01.2019, o godz. 16:25: > > On 19-01-07 01:32:50, Slawomir Kaplonski wrote: >> Hi Neutrinos, >> >> Since few days we have an issue with neutron-functional job [1]. >> Please don’t recheck Your patches now. It will not help until this bug >> will be fixed/workarouded. >> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> > > Adding an oslo tag. As far as can be determined the new oslo.privsep > code impacts neutron. There is a requirements review out to restict the > version of oslo.privsep but I'd like an ack from oslo people before we > take a step back. > > -- > Matthew Thode From ignaziocassano at gmail.com Tue Jan 8 08:11:15 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 09:11:15 +0100 Subject: openstack queens octavia security group not found Message-ID: Hello everyone, I installed octavia with centos 7 queens. When I create a load balancer the amphora instance is not created because nova conductor cannot find the security group specified in octavia.conf.
I am sure the security group id is correct but the nova conductor reports: 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils [req-75df2561-4bc3-4bde-86d0-40469058250c 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2108, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] Please, what is wrong? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Tue Jan 8 08:54:39 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 8 Jan 2019 08:54:39 +0000 Subject: [nova] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: <1546937673.17763.2@smtp.office365.com> On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware.
The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance > cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the > same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. This is because we > don't > allow overcommit for hugepages, but case (#2) above means we > are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages > 2. It doesn't fix the issue when non-NUMA instances exist on the > same > host as NUMA instances with hugepages. The non-NUMA instances > don't > run through any of the code above, meaning they're still not > pagesize aware > > We could probably fix issue (1) by modifying those hugepage functions > we're using to allow overcommit via a flag that we pass for case (#2). > We can mitigate issue (2) by advising operators to split hosts into > aggregates for 'hw:mem_page_size' set or unset (in addition to > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > I think this may be the case in some docs (sean-k-mooney said Intel > used to do this. I don't know about Red Hat's docs or upstream). 
In > addition, we did actually called that out in the original spec: > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > However, if we're doing that for non-NUMA instances, one would have to > question why the patch is necessary/acceptable for NUMA instances. For > what it's worth, a longer fix would be to start tracking hugepages in > a > non-NUMA aware way too but that's a lot more work and doesn't fix the > issue now. > > As such, my question is this: should be look at fixing issue (1) and > documenting issue (2), or should we revert the thing wholesale until > we > work on a solution that could e.g. let us track hugepages via > placement > and resolve issue (2) too. If you feel that fixing (1) is pretty simple then I suggest to do that and document the limitation of (2) while we think about a proper solution. gibi > > Thoughts? > Stephen > > From jean-philippe at evrard.me Tue Jan 8 09:27:09 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Jan 2019 10:27:09 +0100 Subject: [openstack-ansible]Enable DVR with routed network (Spine/Leaf) In-Reply-To: References: Message-ID: On Mon, 2019-01-07 at 18:45 -0500, Cody wrote: > Hi OSA, > > Greetings! > > I wish to enable DVR in a routed network environment (i.e. a > Leaf/Spine topology) using OSA. Has this feature been production > ready > under the OSA project? An example for the user_variables.yml and > openstack_user_config.yml would be much appreciated. > > Thank you very much. > > Regards, > Cody > That might have changed, but there are not many people that are using OVS + DVR in OSA. We don't have a full scenario testing of this (only testing in neutron), AFAIK. I would say if you are looking for example files, you might want to discuss with people on our channel (#openstack-ansible), maybe you'll find more people that might help you there. Also it might be worth checking in neutron role tests. 
Patches are always welcome to test full end-to-end coverage of this feature :) NB: Is there a particular reason you want DVR? Would calico fit the bill? Wouldn't other SDN solutions fit the bill better than DVR? Before going the DVR route, I am generally asking why :) Regards, Jean-Philippe Evrard (evrardjp) From ignaziocassano at gmail.com Tue Jan 8 09:39:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 10:39:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: Hello, I do not have an octavia project but only a service project. Octavia user belongs to admin and service project :-( Documentation does not seem clear about it Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann ha scritto: > Hi, > > did you create the security group in the octavia project? > > Can you see the sg if you login with the octavia credentials? > > > Fabian > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > Hello everyone, > > I installed octavia with centos 7 queens. > > When I crreate a load balancer the amphora instance is not created > > because nova conductor cannot find the security group specified in > > octavia.conf. 
> > I am sure the security group id is correct but the nova condictor > reports: > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent > > call last):\n', u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, > > in _do_build_and_run_instance\n filter_properties, request_spec)\n', > > u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > line 2108, in _build_and_run_instance\n instance_uuid=instance.uuid, > > reason=six.text_type(e))\n', u'RescheduledException: Build of instance > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > Please, what is wrong ? > > > > Regards > > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 09:44:47 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 10:44:47 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: Hi, in which project should octavia start its amphora instances? In this project you should create a suitable sg. Fabian Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > Hello, I do not have an octavia project but only a service project. > Octavia user belongs to admin and service project :-( > Documentation  does not seem clear about it > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > ha scritto: > > Hi, > > did you create the security group in the octavia project? > > Can you see the sg if you login with the octavia credentials? 
> > >   Fabian > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > Hello everyone, > > I installed octavia with centos 7 queens. > > When I crreate a load balancer the amphora instance is not created > > because nova conductor cannot find the security group specified in > > octavia.conf. > > I am sure the security group id is correct but the nova condictor > reports: > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > Error > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > recent > > call last):\n', u'  File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > 1828, > > in _do_build_and_run_instance\n    filter_properties, > request_spec)\n', > > u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > line 2108, in _build_and_run_instance\n > instance_uuid=instance.uuid, > > reason=six.text_type(e))\n', u'RescheduledException: Build of > instance > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > group > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > Please, what is wrong ? 
> > > > Regards > > Ignazio > From jean-philippe at evrard.me Tue Jan 8 09:57:03 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Jan 2019 10:57:03 +0100 Subject: [loci] Stable Branches in Loci In-Reply-To: References: <3855B170-6E38-4DB9-A91C-9389D16D387F@openstack.org> <64a34fd9-d31b-5d7d-ae94-053d9bdebbad@openstack.org> <20181218054413.GG6373@thor.bakeyournoodle.com> Message-ID: <6572b304b7a340a798bad518061dd95d71efc04a.camel@evrard.me> On Thu, 2018-12-20 at 09:58 -0800, Chris Hoge wrote: > > On Dec 17, 2018, at 9:44 PM, Tony Breeds > > wrote: > > > > On Thu, Dec 13, 2018 at 09:43:37AM -0800, Chris Hoge wrote: > > > There is a need for us to work out whether Loci is even > > > appropriate for > > > stable branch development. Over the last week or so the CentOS > > > libvirt > > > update has broken all stable branch builds as it introduced an > > > incompatibility between the stable upper contraints of python- > > > libvirt and > > > libvirt. > > > > Yup, as we've seen on https://review.openstack.org/#/c/622262 this > > is a > > common thing and happens with every CentOS minor release. We're > > working > > the update to make sure we don't cause more breakage as we try to > > fix > > this thing. > > > > > libvirt. If we start running stable builds, it might provide a > > > useful > > > gate signal for when stable source builds break against upstream > > > distributions. It's something for the Loci team to think about as > > > we > > > work through refactoring our gate jobs. > > > > That's interesting idea. Happy to discuss how we can do that in a > > way > > that makes sense for each project. How long does LOCI build take? > > Loci makes one build for each OpenStack project you want to deploy. 
> The requirements container takes the most time, as it does a pip wheel
> of every requirement listed in the openstack/requirements repository,
> then bind-mounts the complete set of wheels into the service containers
> during those builds to ensure a complete and consistent set of
> dependencies. Requirements must be done serially, but the rest of the
> builds can be done in parallel.
>
> What I'm thinking is, if we do stable builds of Loci that stand up a
> simplified all-in-one environment we can run Tempest against, we would
> both get a signal for the Loci stable build (as well as master) and a
> signal for requirements. Co-gating means we can check that an update to
> requirements to fix one distribution does not negatively impact the
> stability of other distributions.
>
> I have some very initial work on this in a personal project (this is how
> I like to spend some of my holiday down time), and we can bring it up as
> an agenda item for the Loci meeting tomorrow morning.
>
> -Chris
>

I like the idea of having REAL testing of the loci images. Currently we
just install software, and it's up to deployment tools to configure the
images to match their needs. Doing a real test for all distros would be
very nice, and a positive addition.

I am curious about how we'd do this, though. I suppose it might require a
new job, which will take far more time: after building the necessary
images (more than one project!), we need to deploy them together and run
tempest on them (therefore ensuring proper image building and
co-installability). Or did you mean that you wanted to test each image
build separately by running the minimum smoke tests for each image?

What about reusing a deployment project job that's using loci in an
experimental pipeline?
I'm not sure I understand what you have written :)

Regards,
JP

From ignaziocassano at gmail.com Tue Jan 8 09:57:40 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Tue, 8 Jan 2019 10:57:40 +0100
Subject: openstack queens octavia security group not found
In-Reply-To:
References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com>
Message-ID:

It started in the service project, but the service project could not read
the security groups that I created in the admin project. So I modified the
/etc/octavia.conf sections service_auth and keystone_authtoken and put
project_name = admin instead of project_name = service.

With the above modifications the amphora instance starts in the admin
project and can read the security group id from it. But the load balancer
remains in pending create and then the amphora instance is automatically
deleted.

Another problem is that in both cases it does not start to create the
amphora instance when I specify amp_ssh_key_name in octavia.conf. In the
admin project case it should read it, because this key is in the admin
project :-( So I started without an ssh key.

Could you help me with my wrong configuration, please?

Regards
Ignazio

Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto:
> Hi,
>
> in which project should octavia start its amphora instances?
>
> In this project you should create a suitable sg.
>
> Fabian
>
> Am 08.01.19 um 10:39 schrieb Ignazio Cassano:
> > Hello, I do not have an octavia project but only a service project.
> > Octavia user belongs to admin and service project :-(
> > Documentation does not seem clear about it
> >
> > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann
> > ha scritto:
> >
> > Hi,
> >
> > did you create the security group in the octavia project?
> >
> > Can you see the sg if you login with the octavia credentials?
> >
> > Fabian
> >
> > Am 08.01.19 um 09:11 schrieb Ignazio Cassano:
> > > Hello everyone,
> > > I installed octavia with centos 7 queens.
> > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. > > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 10:06:31 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 11:06:31 +0100 Subject: [nova] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: <20190108100631.GA4852@canonical> On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware. The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. This is because we don't > allow overcommit for hugepages, but case (#2) above means we are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages I think that we should not accept any overcommit. Only instances with an InstanceNUMATopology associated pass to this part of check. 
Such instances want to use features like guest NUMA topology, so their
memory is mapped on host NUMA nodes, or CPU pinning. Both cases are used
for performance reasons and to avoid any cross-NUMA memory latency.

> 2. It doesn't fix the issue when non-NUMA instances exist on the same
> host as NUMA instances with hugepages. The non-NUMA instances don't
> run through any of the code above, meaning they're still not
> pagesize aware

That is another issue. We report to the resource tracker all the physical
memory (small pages + hugepages allocated). The difficulty is that we
can't just change the virt driver to report only small pages: some
instances won't be able to get scheduled. We should basically change the
resource tracker so it can take into account the different kinds of page
memory.

But it's not really an issue, since instances that use "NUMA features" (in
the Nova world) should be isolated to an aggregate and not be mixed with
no-NUMA instances. The reason is simple: no-NUMA instances do not have
boundaries and break the rules of NUMA instances.

> We could probably fix issue (1) by modifying those hugepage functions
> we're using to allow overcommit via a flag that we pass for case (#2).
> We can mitigate issue (2) by advising operators to split hosts into
> aggregates for 'hw:mem_page_size' set or unset (in addition to
> 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but
> I think this may be the case in some docs (sean-k-mooney said Intel
> used to do this. I don't know about Red Hat's docs or upstream). In
> addition, we actually called that out in the original spec:
>
> https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact
>
> However, if we're doing that for non-NUMA instances, one would have to
> question why the patch is necessary/acceptable for NUMA instances.
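[Editor's note: Stephen's suggested fix for issue (1) — threading an overcommit flag through the pagesize-checking path so that case (#2)'s regular 4K pages can still be overcommitted — might look roughly like the sketch below. The names are assumed for illustration; the 1.5 ratio mirrors nova's default ram_allocation_ratio, and none of this is the actual nova code.]

```python
# Sketch of the proposed issue-(1) fix: the same fit check, but with an
# overcommit flag passed only for the implicit small-page case (#2).
# All names here are illustrative stand-ins, not nova's real API.

RAM_ALLOCATION_RATIO = 1.5  # assumed; mirrors nova's default ram_allocation_ratio

def pages_fit(free_pages, total_pages, needed_pages, allow_overcommit=False):
    if allow_overcommit:
        # case (#2): regular 4K pages may be overcommitted against the
        # cell's total capacity, scaled by the allocation ratio
        return needed_pages <= int(total_pages * RAM_ALLOCATION_RATIO)
    # explicit pagesize / hugepage request: never overcommit
    return needed_pages <= free_pages
```

An explicit hugepage request would call `pages_fit(free, total, needed)` and fail as soon as free pages run out, while the case (#2) path would pass `allow_overcommit=True` and keep the pre-patch overcommit behaviour for small pages.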
> For what it's worth, a longer fix would be to start tracking hugepages
> in a non-NUMA-aware way too, but that's a lot more work and doesn't fix
> the issue now.
>
> As such, my question is this: should we look at fixing issue (1) and
> documenting issue (2), or should we revert the thing wholesale until we
> work on a solution that could e.g. let us track hugepages via placement
> and resolve issue (2) too.
>
> Thoughts?
> Stephen
>

From jan.vondra at ultimum.io Tue Jan 8 10:08:43 2019
From: jan.vondra at ultimum.io (Jan Vondra)
Date: Tue, 8 Jan 2019 11:08:43 +0100
Subject: [Kolla] Queens for debian images
Message-ID:

Dear Kolla team,

during a project for one of our customers we have upgraded the Debian part
of the Kolla project using the Queens Debian repositories
(http://stretch-queens.debian.net/debian stretch-queens-backports) and we
would like to share this work with the community.

I would like to ask what's the proper process of contributing, since the
patches affect both the kolla and kolla-ansible repositories.

Also any other comments regarding Debian in Kolla would be appreciated.

Thanks,
Jan Vondra

Ultimum Technologies s.r.o.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic
http://ultimum.io

From jean-philippe at evrard.me Tue Jan 8 10:09:53 2019
From: jean-philippe at evrard.me (Jean-Philippe Evrard)
Date: Tue, 08 Jan 2019 11:09:53 +0100
Subject: [tc][all] Train Community Goals
In-Reply-To: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz>
References: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz>
Message-ID: <6b498008e71b7dae651e54e29717f3ccedea50d1.camel@evrard.me>

On Wed, 2018-12-19 at 06:58 +1300, Adrian Turjak wrote:
> I put my hand up during the summit for being at least one of the
> champions for the deletion of project resources effort.
>
> I have been meaning to do a follow-up email and options as well as
> steps for how the goal might go, but my working holiday in Europe after
> the summit turned into more of a holiday than originally planned.
> > I'll get a thread going around what I (and the public cloud working > group) think project resource deletion should look like, and what the > options are, and where we should aim to be with it. We can then turn > that discussion into a final 'spec' of sorts. > > Great news! Do you need any help to get started there? Regards, JP From ignaziocassano at gmail.com Tue Jan 8 10:14:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 11:14:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: PS Now I added the ssh key to octavia user and it assignes it to amphora instance. Still load balancer remains in pending create and after 3 minutes the amphora instance is automatically deleted. Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto: > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > > Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. > > Octavia user belongs to admin and service project :-( > > Documentation does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > did you create the security group in the octavia project? > > > > Can you see the sg if you login with the octavia credentials? > > > > > > Fabian > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > Hello everyone, > > > I installed octavia with centos 7 queens. > > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. 
> > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 10:24:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 11:24:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: PS on the amphore instance there is nothng on port 9443 Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto: > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > > Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. 
> > Octavia user belongs to admin and service project :-( > > Documentation does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > did you create the security group in the octavia project? > > > > Can you see the sg if you login with the octavia credentials? > > > > > > Fabian > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > Hello everyone, > > > I installed octavia with centos 7 queens. > > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. > > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sfinucan at redhat.com Tue Jan 8 10:47:47 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 10:47:47 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108100631.GA4852@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> Message-ID: <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > I think that we should not accept any overcommit. 
> Only instances with an InstanceNUMATopology associated pass to this part
> of the check. Such instances want to use features like guest NUMA
> topology, so their memory is mapped on host NUMA nodes, or CPU pinning.
> Both cases are used for performance reasons and to avoid any cross-NUMA
> memory latency.

The issue with this is that we had previously designed everything *to*
allow overcommit:

https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065

The only time this doesn't apply is if CPU pinning is also in action
(remembering that CPU pinning and NUMA topologies are tightly bound and
CPU pinning implies a NUMA topology, much to Jay's consternation). As
noted below, our previous advice was not to mix hugepage instances and
non-hugepage instances, meaning hosts handling non-hugepage instances
should not have hugepages (or should mark the memory consumed by them as
reserved for the host). We have in effect broken previous behaviour in the
name of solving a bug that didn't necessarily have to be fixed yet.

> > 2. It doesn't fix the issue when non-NUMA instances exist on the same
> > host as NUMA instances with hugepages. The non-NUMA instances don't
> > run through any of the code above, meaning they're still not
> > pagesize aware
>
> That is another issue. We report to the resource tracker all the
> physical memory (small pages + hugepages allocated). The difficulty is
> that we can't just change the virt driver to report only small pages:
> some instances won't be able to get scheduled. We should basically
> change the resource tracker so it can take into account the different
> kinds of page memory.

Agreed (likely by moving tracking of this resource to placement, I
assume). It's a longer-term fix though.

> But it's not really an issue, since instances that use "NUMA features"
> (in the Nova world) should be isolated to an aggregate and not be mixed
> with no-NUMA instances.
The reason is simple no-NUMA instances do not > have boundaries and break rules of NUMA instances. Again, we have to be careful not to mix up NUMA and CPU pinning. It's perfectly fine to have NUMA without CPU pinning, though not the other way around. For example: $ openstack flavor set --property hw:numa_nodes=2 FLAVOR >From what I can tell, there are three reasons that an instance will have a NUMA topology: the user explicitly requested one, the user requested CPU pinning and got one implicitly, or the user requested a specific pagesize and, again, got one implicitly. We handle the latter two with the advice given below, but I don't think anyone has ever said we must separate instances that had a user-specified NUMA topology from those that had no NUMA topology. If we're going down this path, we need clear docs. Stephen > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). > > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we did actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in a > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > issue now. 
> >
> > As such, my question is this: should we look at fixing issue (1) and
> > documenting issue (2), or should we revert the thing wholesale until
> > we work on a solution that could e.g. let us track hugepages via
> > placement and resolve issue (2) too.
> >
> > Thoughts?
> > Stephen
> >

From marcin.juszkiewicz at linaro.org Tue Jan 8 11:00:48 2019
From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz)
Date: Tue, 8 Jan 2019 12:00:48 +0100
Subject: [Kolla] Queens for debian images
In-Reply-To:
References:
Message-ID:

W dniu 08.01.2019 o 11:08, Jan Vondra pisze:
> Dear Kolla team,
>
> during a project for one of our customers we have upgraded the Debian
> part of the Kolla project using the Queens Debian repositories
> (http://stretch-queens.debian.net/debian stretch-queens-backports) and
> we would like to share this work with the community.

Thanks for doing that. Is there an option to provide arm64 packages next
time?

> I would like to ask what's the proper process of contributing, since
> the patches affect both the kolla and kolla-ansible repositories.

Send patches for review [1] and then we can discuss changing them.
Remember that we target Stein now.

1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing

> Also any other comments regarding Debian in Kolla would be appreciated.

Love to see someone else caring about Debian in Kolla. I took it over two
years ago, revived it and moved it to 'stretch', but skipped support for
binary packages as there were no up-to-date packages available.

In the next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster'
as it will enter final freeze. I had some discussion with the Debian
OpenStack team about providing preliminary Stein packages, so support for
the 'binary' type of images could be possible.
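[Editor's note: for anyone who wants to try what Jan describes, a minimal kolla-build configuration for Debian might look like the following. This is a sketch only: `base`, `install_type`, `namespace`, and `tag` are the option names as I understand kolla's build configuration, and the backports repository mentioned above would only matter for 'binary' builds — verify both against your kolla version.]

```ini
# kolla-build.conf -- sketch: build Queens images on a Debian 'stretch' base
[DEFAULT]
base = debian
# 'source' builds services from upstream tarballs; 'binary' would rely on
# packaged services, e.g. the stretch-queens-backports repository above
install_type = source
namespace = kolla
tag = queens
```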
From dev.faz at gmail.com Tue Jan 8 11:32:33 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 12:32:33 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Hi, are you able to connect to the amphora via ssh? Could you paste your octavia.log and the log of the amphora somewhere? Fabian Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > PS > on the amphore instance there is nothng on port 9443 > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > ha scritto: > > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > >   Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. > > Octavia user belongs to admin and service project :-( > > Documentation  does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > >> ha scritto: > > > >     Hi, > > > >     did you create the security group in the octavia project? > > > >     Can you see the sg if you login with the octavia credentials? > > > > > >        Fabian > > > >     Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > >      > Hello everyone, > >      > I installed octavia with centos 7 queens. > >      > When I crreate a load balancer the amphora instance is not > created > >      > because nova conductor cannot find the security group > specified in > >      > octavia.conf. 
> >      > I am sure the security group id is correct but the nova > condictor > >     reports: > >      > > >      > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > >      > [req-75df2561-4bc3-4bde-86d0-40469058250c > >      > 62ed0b7f336b479ebda6f8587c4dd608 > 2a33760772ab4b0381a27735443ec4bd - > >      > default default] [instance: > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > >     Error > >      > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > (most > >     recent > >      > call last):\n', u'  File > >      > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > >     1828, > >      > in _do_build_and_run_instance\n    filter_properties, > >     request_spec)\n', > >      > u'  File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > >      > line 2108, in _build_and_run_instance\n > >     instance_uuid=instance.uuid, > >      > reason=six.text_type(e))\n', u'RescheduledException: Build of > >     instance > >      > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > Security > >     group > >      > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > >      > > >      > Please, what is wrong ? > >      > > >      > Regards > >      > Ignazio > > > From ellorent at redhat.com Tue Jan 8 11:48:56 2019 From: ellorent at redhat.com (Felix Enrique Llorente Pastora) Date: Tue, 8 Jan 2019 12:48:56 +0100 Subject: Make tripleo-ci-fedora-28-standalone voting Message-ID: Hi All, The hibrid job to test fedora28 host and centos7 containers is working now and running tempest correctly (well it miss junit xml generation, but it's a matter of updating one RPM), so maybe is time for the job to be voting. What do you think? BR -- Quique Llorente Openstack TripleO CI -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 11:50:27 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 12:50:27 +0100 Subject: [nova] Mempage fun In-Reply-To: <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> Message-ID: <20190108115027.GA7825@canonical> On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > We've been looking at a patch that landed some months ago and have > > > spotted some issues: > > > > > > https://review.openstack.org/#/c/532168 > > > > > > In summary, that patch is intended to make the memory check for > > > instances memory pagesize aware. The logic it introduces looks > > > something like this: > > > > > > If the instance requests a specific pagesize > > > (#1) Check if each host cell can provide enough memory of the > > > pagesize requested for each instance cell > > > Otherwise > > > If the host has hugepages > > > (#2) Check if each host cell can provide enough memory of the > > > smallest pagesize available on the host for each instance cell > > > Otherwise > > > (#3) Check if each host cell can provide enough memory for > > > each instance cell, ignoring pagesizes > > > > > > This also has the side-effect of allowing instances with hugepages and > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > host, because the latter will now be aware of hugepages and won't > > > consume them. However, there are a couple of issues with this: > > > > > > 1. It breaks overcommit for instances without pagesize request > > > running on hosts with different pagesizes. 
This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > I think that we should not accept any overcommit. Only instances with > an InstanceNUMATopology associated pass to this part of check. Such > instances want to use features like guest NUMA topology so their > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > for performance reason and to avoid any cross memory latency. > > This issue with this is that we had previously designed everything *to* > allow overcommit: > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 This code never worked, Stephen; it was written that way only to satisfy the related unit tests. I would not recommend using it as a reference. > The only time this doesn't apply is if CPU pinning is also in action > (remembering that CPU pinning and NUMA topologies are tightly bound and > CPU pinning implies a NUMA topology, much to Jay's consternation). As > noted below, our previous advice was not to mix hugepage instances and > non-hugepage instances, meaning hosts handling non-hugepage instances > should not have hugepages (or should mark the memory consumed by them > as reserved for host). We have in effect broken previous behaviour in > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > run through any of the code above, meaning they're still not > > > pagesize aware > > > > That is an other issue. We report to the resource tracker all the > > physical memory (small pages + hugepages allocated). The difficulty is > > that we can't just change the virt driver to report only small > > pages. Some instances wont be able to get scheduled. 
We should > > basically change the resource tracker so it can take into account the > > different kind of page memory. > > Agreed (likely via move tracking of this resource to placement, I > assume). It's a longer term fix though. > > > But it's not really an issue since instances that use "NUMA features" > > (in Nova world) should be isolated to an aggregate and not be mixed > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > have boundaries and break rules of NUMA instances. > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > perfectly fine to have NUMA without CPU pinning, though not the other > way around. For example: > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > >From what I can tell, there are three reasons that an instance will > have a NUMA topology: the user explicitly requested one, the user > requested CPU pinning and got one implicitly, or the user requested a > specific pagesize and, again, got one implicitly. We handle the latter > two with the advice given below, but I don't think anyone has ever said > we must separate instances that had a user-specified NUMA topology from > those that had no NUMA topology. If we're going down this path, we need > clear docs. The implementation is pretty old and was a first design from scratch; not all situations have been taken into account or documented. If we want to create specific behaviors, we are going to add more complexity to something which is already complex, and which is not completely stable (for example, the patch you mentioned, which was merged last release). I agree that documenting is probably where we should go: don't try to mix instances with an InstanceNUMATopology and instances without one, since Nova computes their resources differently, and don't try to overcommit such instances. We basically recommend using aggregates for pinning, realtime and hugepages, so it looks reasonable to add guest NUMA topology to that list. 
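[Editor's note: the three-case memory check described at the top of this thread can be sketched roughly as follows. This is a minimal illustration with hypothetical names and data structures; it is not Nova's actual implementation, which lives in nova/virt/hardware.py.]

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class HostCell:
    # free memory per pagesize pool, in KiB: {pagesize_kb: free_kb}
    free_memory_kb: Dict[int, int]
    # total free memory on the cell, ignoring pagesize pools
    free_memory_kb_total: int

    @property
    def has_hugepages(self) -> bool:
        # anything bigger than the 4 KiB base page counts as a hugepage
        return any(pagesize > 4 for pagesize in self.free_memory_kb)

@dataclass
class InstanceCell:
    memory_kb: int
    pagesize: Optional[int] = None  # explicit hw:mem_page_size request, if any

def cell_can_fit(host: HostCell, instance: InstanceCell) -> bool:
    """Check one instance NUMA cell against one host NUMA cell."""
    if instance.pagesize is not None:
        # (#1) explicit pagesize requested: check that exact pool, no overcommit
        return host.free_memory_kb.get(instance.pagesize, 0) >= instance.memory_kb
    if host.has_hugepages:
        # (#2) no explicit pagesize, but the host has hugepages: fall back to
        # the smallest pagesize pool; this still forbids overcommit, which is
        # exactly issue (1) raised in the thread
        smallest = min(host.free_memory_kb)
        return host.free_memory_kb[smallest] >= instance.memory_kb
    # (#3) no hugepages on the host: plain memory check, pagesizes ignored
    return host.free_memory_kb_total >= instance.memory_kb
```

In this toy model, an instance cell with no pagesize request landing on a hugepage host is checked against the smallest-page pool only, so it can no longer overcommit even though small pages traditionally allowed it.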
> Stephen > > > > We could probably fix issue (1) by modifying those hugepage functions > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > We can mitigate issue (2) by advising operators to split hosts into > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > addition, we did actually called that out in the original spec: > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > question why the patch is necessary/acceptable for NUMA instances. For > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > issue now. > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > documenting issue (2), or should we revert the thing wholesale until we > > > work on a solution that could e.g. let us track hugepages via placement > > > and resolve issue (2) too. > > > > > > Thoughts? > > > Stephen > > > > From ignaziocassano at gmail.com Tue Jan 8 12:05:55 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:05:55 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: Yes, I can connect to amphora instance for a short time because it is removed automatically. For the amphora instance which log do you need? For octavia worker log is enough? 
Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann ha scritto: > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? > > Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > in which project should octavia start its amphora instances? > > > > In this project you should create a suitable sg. > > > > Fabian > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > Hello, I do not have an octavia project but only a service > project. > > > Octavia user belongs to admin and service project :-( > > > Documentation does not seem clear about it > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > did you create the security group in the octavia project? > > > > > > Can you see the sg if you login with the octavia credentials? > > > > > > > > > Fabian > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > Hello everyone, > > > > I installed octavia with centos 7 queens. > > > > When I crreate a load balancer the amphora instance is not > > created > > > > because nova conductor cannot find the security group > > specified in > > > > octavia.conf. 
> > > > I am sure the security group id is correct but the nova > > condictor > > > reports: > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > 2a33760772ab4b0381a27735443ec4bd - > > > > default default] [instance: > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > Error > > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > > (most > > > recent > > > > call last):\n', u' File > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > > 1828, > > > > in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', > > > > u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > line 2108, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, > > > > reason=six.text_type(e))\n', u'RescheduledException: Build > of > > > instance > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > > Security > > > group > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > Please, what is wrong ? > > > > > > > > Regards > > > > Ignazio > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 12:06:54 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 13:06:54 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Well, more logs are always better ;) Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > Yes, I can connect to amphora instance for a short time because it is > removed automatically. > For the amphora instance which log do you need? > For octavia worker log is enough? 
> > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > ha scritto: > > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? > >   Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > >> ha scritto: > > > >     Hi, > > > >     in which project should octavia start its amphora instances? > > > >     In this project you should create a suitable sg. > > > >        Fabian > > > >     Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > >      > Hello, I do not have an octavia project but only a service > project. > >      > Octavia user belongs to admin and service project :-( > >      > Documentation  does not seem clear about it > >      > > >      > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > >      > > > > >      > >>> ha scritto: > >      > > >      >     Hi, > >      > > >      >     did you create the security group in the octavia project? > >      > > >      >     Can you see the sg if you login with the octavia > credentials? > >      > > >      > > >      >        Fabian > >      > > >      >     Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > >      >      > Hello everyone, > >      >      > I installed octavia with centos 7 queens. > >      >      > When I crreate a load balancer the amphora instance > is not > >     created > >      >      > because nova conductor cannot find the security group > >     specified in > >      >      > octavia.conf. 
> >      >      > I am sure the security group id is correct but the nova > >     condictor > >      >     reports: > >      >      > > >      >      > 2019-01-08 09:06:06.803 11872 ERROR > nova.scheduler.utils > >      >      > [req-75df2561-4bc3-4bde-86d0-40469058250c > >      >      > 62ed0b7f336b479ebda6f8587c4dd608 > >     2a33760772ab4b0381a27735443ec4bd - > >      >      > default default] [instance: > >     83f2fd75-8069-47a5-9572-8949ec9b5cee] > >      >     Error > >      >      > from last host: tst2-kvm02 (node tst2-kvm02): > [u'Traceback > >     (most > >      >     recent > >      >      > call last):\n', u'  File > >      >      > > >     "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > >      >     1828, > >      >      > in _do_build_and_run_instance\n    filter_properties, > >      >     request_spec)\n', > >      >      > u'  File > >     "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > >      >      > line 2108, in _build_and_run_instance\n > >      >     instance_uuid=instance.uuid, > >      >      > reason=six.text_type(e))\n', > u'RescheduledException: Build of > >      >     instance > >      >      > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > >     Security > >      >     group > >      >      > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > >      >      > > >      >      > Please, what is wrong ? > >      >      > > >      >      > Regards > >      >      > Ignazio > >      > > > > From ignaziocassano at gmail.com Tue Jan 8 12:34:59 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:34:59 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Hi, attached here there are logs. 
As you can see the amphora messages reports that amphora-agent service fails. Thanks a lot for your help Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. 
> > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: worker.log Type: text/x-log Size: 2718 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: messages-amphora-instace.log Type: text/x-log Size: 1077 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: housekeeping.log Type: text/x-log Size: 1146 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: api.log Type: text/x-log Size: 3921 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: healt-manager.log Type: text/x-log Size: 1205 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jan 8 12:42:59 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:42:59 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Running journalctl -u on the amphora instance gives: gen 08 12:40:12 amphora-1e35a2d5-c3ab-4016-baf6-e6bf1dd061ac.novalocal amphora-agent[2884]: 2019-01-08 12:40:12.013 2884 ERROR octavia raise ValueError('certfile "%s" does not exist' % conf.certfile) gen 08 12:40:12 amphora-1e35a2d5-c3ab-4016-baf6-e6bf1dd061ac.novalocal amphora-agent[2884]: 2019-01-08 12:40:12.013 2884 ERROR octavia ValueError: certfile "/etc/octavia/certs/client.pem" does not exist Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? 
> > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. 
> > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Tue Jan 8 12:50:48 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:50:48 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Probably I must set client_ca.pem in octavia.conf end not client.pem in section haproxy_amphora Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? 
> > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Tue Jan 8 13:06:36 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 14:06:36 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: I am becoming crazy. Amphora agent in amphora instance search /etc/octavia/certs/client.pem but in /etc/octavia/certs there is client_ca.pem :-( Probably must I modify the amphora_agent section ? Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. 
> > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? 
> > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 13:14:47 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 14:14:47 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: OK I solved the certificate issue modifying the right section . Now amphora agent starts in the instance but controller canno reach it on 9443 port. I think this is a firewall problem on our network. I am going to check Il giorno mar 8 gen 2019 alle ore 12:32 Fabian Zimmermann ha scritto: > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? > > Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > in which project should octavia start its amphora instances? > > > > In this project you should create a suitable sg. > > > > Fabian > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > Hello, I do not have an octavia project but only a service > project. > > > Octavia user belongs to admin and service project :-( > > > Documentation does not seem clear about it > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > did you create the security group in the octavia project? > > > > > > Can you see the sg if you login with the octavia credentials? > > > > > > > > > Fabian > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > Hello everyone, > > > > I installed octavia with centos 7 queens. 
> > > > When I crreate a load balancer the amphora instance is not > > created > > > > because nova conductor cannot find the security group > > specified in > > > > octavia.conf. > > > > I am sure the security group id is correct but the nova > > condictor > > > reports: > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > 2a33760772ab4b0381a27735443ec4bd - > > > > default default] [instance: > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > Error > > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > > (most > > > recent > > > > call last):\n', u' File > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > > 1828, > > > > in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', > > > > u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > line 2108, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, > > > > reason=six.text_type(e))\n', u'RescheduledException: Build > of > > > instance > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > > Security > > > group > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > Please, what is wrong ? > > > > > > > > Regards > > > > Ignazio > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Tue Jan 8 13:39:16 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Jan 2019 13:39:16 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108100631.GA4852@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> Message-ID: On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > I think that we should not accept any overcommit. Only instances with > an InstanceNUMATopology associated pass to this part of check. 
Such > instances want to use features like guest NUMA topology so their > memory is mapped to host NUMA nodes, or CPU pinning. Both cases are used > for performance reasons and to avoid any cross-memory latency. That is not necessarily correct. If I request CPU pinning, that does not imply that I don't want the ability to oversubscribe memory; that is an artifact of how we chose to implement pinning in the libvirt driver. For the case of CPU pinning specifically, I have always felt it is wrong that we create a NUMA topology for the guest implicitly. In the case of hw:numa_nodes=1, in the absence of any other extra spec or image metadata, I also do not think it is correct to disable oversubscription retroactively after supporting it for several years. Requesting a NUMA topology outside of explicitly requesting huge pages should never have disabled oversubscription, and changing that behavior should have required both a microversion and a nova spec. https://review.openstack.org/#/c/532168 was simply a bug fix and therefore should not have changed the meaning of requesting a NUMA topology. > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > host as NUMA instances with hugepages. The non-NUMA instances don't > > run through any of the code above, meaning they're still not > > pagesize aware > > That is another issue. We report to the resource tracker all the > physical memory (small pages + hugepages allocated). The difficulty is > that we can't just change the virt driver to report only small > pages. Some instances won't be able to get scheduled. We should > basically change the resource tracker so it can take into account the > different kinds of page memory. > > But it's not really an issue since instances that use "NUMA features" > (in the Nova world) should be isolated to an aggregate and not be mixed > with no-NUMA instances. 
It is true that today we should partition deployments using host aggregates, separating hosts with NUMA instances from hosts with non-NUMA instances. The reason issue 2 was raised is that the commit message implied that the patch addressed mixing NUMA and non-NUMA guests on the same host: "Also when no pagesize is requested we should consider to compute memory usage based on small pages since the amount of physical memory available may also include some large pages." But the logic in the patch does not actually get triggered when the guest does not have a NUMA topology, so it does not actually consider the total number of small pages in that case. This was linked to a downstream bugzilla you filed, https://bugzilla.redhat.com/show_bug.cgi?id=1625119, and another for OSP 10, https://bugzilla.redhat.com/show_bug.cgi?id=1519540, which has 3 customer cases associated with it. On closer inspection, the patch does not address the downstream bug at all: it expressly states that nova does not consider small pages when mem_page_size is not set, but since we don't execute this code for non-NUMA guests we don't actually resolve the issue. The 3 customer issues the downstream bug claims to resolve are: 1.) a scheduler race where two pinned instances get scheduled to the same set of resources (this can only be fixed with placement); 2.) mixing hugepage and non-hugepage guests resulted in OOM events; 3.) instances with a NUMA topology no longer respect the RAM allocation ratio. The third customer issue was directly caused by backporting this patch. The second issue would be resolved by using host aggregates to segregate hugepage hosts from non-NUMA hosts, and the first can't be addressed without preemptively claiming CPUs/hugepages in the scheduler/placement. > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). 
> > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in a > > non-NUMA-aware way too, but that's a lot more work and doesn't fix the > > issue now. > > > > As such, my question is this: should we look at fixing issue (1) and > > documenting issue (2), or should we revert the thing wholesale until we > > work on a solution that could e.g. let us track hugepages via placement > > and resolve issue (2) too. > > > > Thoughts? > > Stephen > > > > From fungi at yuggoth.org Tue Jan 8 14:00:26 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 8 Jan 2019 14:00:26 +0000 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: <20190108140026.p4462df5otnyizm2@yuggoth.org> On 2019-01-08 12:00:48 +0100 (+0100), Marcin Juszkiewicz wrote: [...] > Send patches for review [1] and then we can discuss changing them. > Remember that we target Stein now. > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing [...] These days it's probably better to recommend https://docs.openstack.org/contributors/ since I expect we're about ready to retire that old wiki page. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jan 8 14:03:08 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 15:03:08 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Hello, I solved firewall issues. Now controllers can access amphora instance on 9443 port but worker.log reports: Could not connect to instance. Retrying.: SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",) :-( Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. 
> > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? 
> > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 14:17:55 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 15:17:55 +0100 Subject: [nova] Mempage fun In-Reply-To: <20190108115027.GA7825@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> Message-ID: <20190108141755.GA9289@canonical> On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > We've been looking at a patch that landed some months ago and have > > > > spotted some issues: > > > > > > > > https://review.openstack.org/#/c/532168 > > > > > > > > In summary, that patch is intended to make the memory check for > > > > instances memory pagesize aware. 
The logic it introduces looks > > > > something like this: > > > > > > > > If the instance requests a specific pagesize > > > > (#1) Check if each host cell can provide enough memory of the > > > > pagesize requested for each instance cell > > > > Otherwise > > > > If the host has hugepages > > > > (#2) Check if each host cell can provide enough memory of the > > > > smallest pagesize available on the host for each instance cell > > > > Otherwise > > > > (#3) Check if each host cell can provide enough memory for > > > > each instance cell, ignoring pagesizes > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > host, because the latter will now be aware of hugepages and won't > > > > consume them. However, there are a couple of issues with this: > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > running on hosts with different pagesizes. This is because we don't > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > reusing the same functions previously used for actual hugepage > > > > checks to check for regular 4k pages > > > > > > I think that we should not accept any overcommit. Only instances with > > > an InstanceNUMATopology associated pass to this part of check. Such > > > instances want to use features like guest NUMA topology so their > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > for performance reason and to avoid any cross memory latency. > > > > This issue with this is that we had previously designed everything *to* > > allow overcommit: > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > This code never worked Stephen, that instead of to please unit tests > related. I would not recommend to use it as a reference. 
> > > The only time this doesn't apply is if CPU pinning is also in action > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > CPU pinning implies a NUMA topology, much to Jay's consternation). As > > noted below, our previous advice was not to mix hugepage instances and > > non-hugepage instances, meaning hosts handling non-hugepage instances > > should not have hugepages (or should mark the memory consumed by them > > as reserved for host). We have in effect broken previous behaviour in > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > run through any of the code above, meaning they're still not > > > > pagesize aware > > > > > > That is an other issue. We report to the resource tracker all the > > > physical memory (small pages + hugepages allocated). The difficulty is > > > that we can't just change the virt driver to report only small > > > pages. Some instances wont be able to get scheduled. We should > > > basically change the resource tracker so it can take into account the > > > different kind of page memory. > > > > Agreed (likely via move tracking of this resource to placement, I > > assume). It's a longer term fix though. > > > > > But it's not really an issue since instances that use "NUMA features" > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > > have boundaries and break rules of NUMA instances. > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > perfectly fine to have NUMA without CPU pinning, though not the other > > way around. 
For example: > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > >From what I can tell, there are three reasons that an instance will > > have a NUMA topology: the user explicitly requested one, the user > > requested CPU pinning and got one implicitly, or the user requested a > > specific pagesize and, again, got one implicitly. We handle the latter > > two with the advice given below, but I don't think anyone has ever said > > we must separate instances that had a user-specified NUMA topology from > > those that had no NUMA topology. If we're going down this path, we need > > clear docs. Now I remember why we can't support it. When defining guest NUMA topology (hw:numa_node) the memory is mapped to the assigned host NUMA nodes meaning that the guest memory can't swap out. If a non-NUMA instance starts using memory from host NUMA nodes used by a guest with NUMA it can result that the guest with NUMA run out of memory and be killed. > > The implementation is pretty old and it was a first design from > scratch, all the situations have not been take into account or been > documented. If we want create specific behaviors we are going to add > more complexity on something which is already, and which is not > completely stable, as an example the patch you have mentioned which > has been merged last release. > > I agree documenting is probably where we should go; don't try to mix > instances with InstanceNUMATopology and without, Nova uses a different > way to compute their resources, like don't try to overcommit such > instances. > > We basically recommend to use aggregate for pinning, realtime, > hugepages, so it looks reasonable to add guest NUMA topology to that > list. > > > Stephen > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > we're using to allow overcommit via a flag that we pass for case (#2). 
> > > > We can mitigate issue (2) by advising operators to split hosts into > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > > addition, we did actually called that out in the original spec: > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > issue now. > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > work on a solution that could e.g. let us track hugepages via placement > > > > and resolve issue (2) too. > > > > > > > > Thoughts? > > > > Stephen > > > > > > From ignaziocassano at gmail.com Tue Jan 8 14:17:57 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 15:17:57 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Solved hanshake error commenting in amphora_agent section the following lines: #agent_server_ca = /etc/octavia/certs/ca_01.pem #agent_server_cert = /etc/octavia/certs/client_ca.pem Il giorno mar 8 gen 2019 alle ore 15:03 Ignazio Cassano < ignaziocassano at gmail.com> ha scritto: > Hello, > I solved firewall issues. 
> Now controllers can access amphora instance on 9443 port but worker.log > reports: > Could not connect to instance. Retrying.: SSLError: ("bad handshake: > SysCallError(-1, 'Unexpected EOF')",) > :-( > > > Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann < > dev.faz at gmail.com> ha scritto: > >> Well, more logs are always better ;) >> >> Am 08.01.19 um 13:05 schrieb Ignazio Cassano: >> > Yes, I can connect to amphora instance for a short time because it is >> > removed automatically. >> > For the amphora instance which log do you need? >> > For octavia worker log is enough? >> > >> > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > > ha scritto: >> > >> > Hi, >> > >> > are you able to connect to the amphora via ssh? >> > >> > Could you paste your octavia.log and the log of the amphora >> somewhere? >> > >> > Fabian >> > >> > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: >> > > PS >> > > on the amphore instance there is nothng on port 9443 >> > > >> > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann >> > > >> > >> ha scritto: >> > > >> > > Hi, >> > > >> > > in which project should octavia start its amphora instances? >> > > >> > > In this project you should create a suitable sg. >> > > >> > > Fabian >> > > >> > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: >> > > > Hello, I do not have an octavia project but only a service >> > project. >> > > > Octavia user belongs to admin and service project :-( >> > > > Documentation does not seem clear about it >> > > > >> > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann >> > > > >> > > >> > > >> > >>> ha scritto: >> > > > >> > > > Hi, >> > > > >> > > > did you create the security group in the octavia >> project? >> > > > >> > > > Can you see the sg if you login with the octavia >> > credentials? >> > > > >> > > > >> > > > Fabian >> > > > >> > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: >> > > > > Hello everyone, >> > > > > I installed octavia with centos 7 queens. 
>> > > > > When I crreate a load balancer the amphora instance >> > is not >> > > created >> > > > > because nova conductor cannot find the security >> group >> > > specified in >> > > > > octavia.conf. >> > > > > I am sure the security group id is correct but the >> nova >> > > condictor >> > > > reports: >> > > > > >> > > > > 2019-01-08 09:06:06.803 11872 ERROR >> > nova.scheduler.utils >> > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c >> > > > > 62ed0b7f336b479ebda6f8587c4dd608 >> > > 2a33760772ab4b0381a27735443ec4bd - >> > > > > default default] [instance: >> > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] >> > > > Error >> > > > > from last host: tst2-kvm02 (node tst2-kvm02): >> > [u'Traceback >> > > (most >> > > > recent >> > > > > call last):\n', u' File >> > > > > >> > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", >> line >> > > > 1828, >> > > > > in _do_build_and_run_instance\n >> filter_properties, >> > > > request_spec)\n', >> > > > > u' File >> > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", >> > > > > line 2108, in _build_and_run_instance\n >> > > > instance_uuid=instance.uuid, >> > > > > reason=six.text_type(e))\n', >> > u'RescheduledException: Build of >> > > > instance >> > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was >> re-scheduled: >> > > Security >> > > > group >> > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] >> > > > > >> > > > > Please, what is wrong ? >> > > > > >> > > > > Regards >> > > > > Ignazio >> > > > >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Tue Jan 8 15:55:27 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 8 Jan 2019 16:55:27 +0100 Subject: [ansible-openstack] Magnum/Heat k8s failed Message-ID: Hi all. I have installed ansible-openstal AIO. 
Now I'd like to create a k8s cluster, but heat always gives me the error below: 2019-01-08 15:35:03Z [alf-k8s-nerea4mr3b2c.kube_masters]: *CREATE_IN_PROGRESS state changed* 2019-01-08 15:35:24Z [alf-k8s-nerea4mr3b2c.kube_masters]: *CREATE_FAILED AuthorizationFailure: resources.kube_masters.resources[0].resources.master_wait_handle: Authorization failed.* 2019-01-08 15:35:24Z [alf-k8s-nerea4mr3b2c]: CREATE_FAILED Resource CREATE failed: AuthorizationFailure: resources.kube_masters.resources[0].resources.master_wait_handle: Authorization failed. Any idea what to check? Cheers -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue Jan 8 16:31:16 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 8 Jan 2019 08:31:16 -0800 Subject: [Kolla] Queens for debian images In-Reply-To: <20190108140026.p4462df5otnyizm2@yuggoth.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> Message-ID: Another useful link, which Zane put together a while back but which is more up to date/complete than the wiki, is the Reviewing the OpenStack Way guide[1]. -Kendall (diablo_rojo) [1] https://docs.openstack.org/project-team-guide/review-the-openstack-way.html On Tue, Jan 8, 2019 at 6:01 AM Jeremy Stanley wrote: > On 2019-01-08 12:00:48 +0100 (+0100), Marcin Juszkiewicz wrote: > [...] > > Send patches for review [1] and then we can discuss about changing them. > > Remember that we target Stein now. > > > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > [...] > > These days it's probably better to recommend > https://docs.openstack.org/contributors/ since I expect we're about > ready to retire that old wiki page. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sokoban at foxmail.com Tue Jan 8 06:47:38 2019 From: sokoban at foxmail.com (=?gb18030?B?eG11Zml2ZUBxcS5jb20=?=) Date: Tue, 8 Jan 2019 14:47:38 +0800 Subject: Ironic ibmc driver for Huawei server Message-ID: Hi julia, According to the comments on the story: 1. The spec for the huawei ibmc driver has been posted here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. 2. About the third-party CI part, we provide mocked unit tests for our driver's code. Not sure what third-party CI is expected to cover in this case. What else should we do? Thanks Qianbiao.NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 09:30:44 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 10:30:44 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: Message-ID: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Hi, did you create the security group in the octavia project? Can you see the sg if you login with the octavia credentials? Fabian Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > Hello everyone, > I installed octavia with centos 7 queens. > When I create a load balancer the amphora instance is not created > because nova conductor cannot find the security group specified in > octavia.conf. 
> I am sure the security group id is correct but the nova condictor reports: > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > [req-75df2561-4bc3-4bde-86d0-40469058250c > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent > call last):\n', u'  File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, > in _do_build_and_run_instance\n    filter_properties, request_spec)\n', > u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line 2108, in _build_and_run_instance\n    instance_uuid=instance.uuid, > reason=six.text_type(e))\n', u'RescheduledException: Build of instance > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > Please, what is wrong ? > > Regards > Ignazio From smooney at redhat.com Tue Jan 8 16:48:36 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Jan 2019 16:48:36 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108141755.GA9289@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> <20190108141755.GA9289@canonical> Message-ID: <913e061bf714036ff26bfc268822054ec9878ede.camel@redhat.com> On Tue, 2019-01-08 at 15:17 +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > > We've been looking at a patch that landed some months ago and have > > > > > spotted some issues: > > > > > > > > > > 
https://review.openstack.org/#/c/532168 > > > > > > > > > > In summary, that patch is intended to make the memory check for > > > > > instances memory pagesize aware. The logic it introduces looks > > > > > something like this: > > > > > > > > > > If the instance requests a specific pagesize > > > > > (#1) Check if each host cell can provide enough memory of the > > > > > pagesize requested for each instance cell > > > > > Otherwise > > > > > If the host has hugepages > > > > > (#2) Check if each host cell can provide enough memory of the > > > > > smallest pagesize available on the host for each instance cell > > > > > Otherwise > > > > > (#3) Check if each host cell can provide enough memory for > > > > > each instance cell, ignoring pagesizes > > > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > > host, because the latter will now be aware of hugepages and won't > > > > > consume them. However, there are a couple of issues with this: > > > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > > running on hosts with different pagesizes. This is because we don't > > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > > reusing the same functions previously used for actual hugepage > > > > > checks to check for regular 4k pages > > > > > > > > I think that we should not accept any overcommit. Only instances with > > > > an InstanceNUMATopology associated pass to this part of check. Such > > > > instances want to use features like guest NUMA topology so their > > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > > for performance reason and to avoid any cross memory latency. 
> > > > > > This issue with this is that we had previously designed everything *to* > > > allow overcommit: > > > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > > > This code never worked Stephen, that instead of to please unit tests > > related. I would not recommend to use it as a reference. > > > > > The only time this doesn't apply is if CPU pinning is also in action > > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > > CPU pinning implies a NUMA topology, much to Jay's consternation). As > > > noted below, our previous advice was not to mix hugepage instances and > > > non-hugepage instances, meaning hosts handling non-hugepage instances > > > should not have hugepages (or should mark the memory consumed by them > > > as reserved for host). We have in effect broken previous behaviour in > > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > > run through any of the code above, meaning they're still not > > > > > pagesize aware > > > > > > > > That is an other issue. We report to the resource tracker all the > > > > physical memory (small pages + hugepages allocated). The difficulty is > > > > that we can't just change the virt driver to report only small > > > > pages. Some instances wont be able to get scheduled. We should > > > > basically change the resource tracker so it can take into account the > > > > different kind of page memory. > > > > > > Agreed (likely via move tracking of this resource to placement, I > > > assume). It's a longer term fix though. > > > > > > > But it's not really an issue since instances that use "NUMA features" > > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > > with no-NUMA instances. 
The reason is simple no-NUMA instances do not > > > > have boundaries and break rules of NUMA instances. > > > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > > perfectly fine to have NUMA without CPU pinning, though not the other > > > way around. For example: > > > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > > > > From what I can tell, there are three reasons that an instance will > > > > > > have a NUMA topology: the user explicitly requested one, the user > > > requested CPU pinning and got one implicitly, or the user requested a > > > specific pagesize and, again, got one implicitly. We handle the latter > > > two with the advice given below, but I don't think anyone has ever said > > > we must separate instances that had a user-specified NUMA topology from > > > those that had no NUMA topology. If we're going down this path, we need > > > clear docs. > > Now I remember why we can't support it. When defining guest NUMA > topology (hw:numa_node) the memory is mapped to the assigned host NUMA > nodes meaning that the guest memory can't swap out. The guest memory should still be able to swap out: we do not memlock the pages when we set hw:numa_nodes; we only do that for realtime instances. It is done implicitly for hugepages, but if you taskset/memtune cores/RAM to a host NUMA node, that does not prevent the kernel from paging that memory out to swap space. locked defaults to false in the memory backing https://github.com/openstack/nova/blob/88951ca98e1b286b58aa1ad94f9af40b8260c01f/nova/virt/libvirt/config.py#L2053 and we only set it to true here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4745-L4749 if we take the wantsrealtime branch. > If a non-NUMA > instance starts using memory from host NUMA nodes used by a guest with > NUMA it can result that the guest with NUMA run out of memory and be > killed. That is unrelated to this.
That happens because the OOM killer works per NUMA node, the memory-pressure value for VMs tends to be higher than for other processes, and the kernel will prefer to kill them over other options. An instance live migration, or any other process that adds extra memory pressure, can trigger the same effect. Part of the issue is that the host reserved-memory config option is not per NUMA node, but the OOM killer in the kernel runs per NUMA node. > > > > > The implementation is pretty old and it was a first design from > > scratch, all the situations have not been take into account or been > > documented. If we want create specific behaviors we are going to add > > more complexity on something which is already, and which is not > > completely stable, as an example the patch you have mentioned which > > has been merged last release. > > > > I agree documenting is probably where we should go; don't try to mix > > instances with InstanceNUMATopology and without, Nova uses a different > > way to compute their resources, like don't try to overcommit such > > instances. > > > > We basically recommend to use aggregate for pinning, realtime, > > hugepages, so it looks reasonable to add guest NUMA topology to that > > list. > > > > > Stephen > > > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > > > We can mitigate issue (2) by advising operators to split hosts into > > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > > used to do this. I don't know about Red Hat's docs or upstream).
In > > > > > addition, we did actually called that out in the original spec: > > > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > > issue now. > > > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > > work on a solution that could e.g. let us track hugepages via placement > > > > > and resolve issue (2) too. > > > > > > > > > > Thoughts? > > > > > Stephen > > > > > > > From johnsomor at gmail.com Tue Jan 8 17:00:24 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 8 Jan 2019 09:00:24 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Yes, we do not allow eventlet in Octavia. It leads to a number of conflicts and problems with the overall code base, including the use of taskflow. Is there a reason we need to use the os-ken BGP code as opposed to the exabgp option that was being used before? I remember we looked at those two options back when the other team was developing the l3 option, but I don't remember all of the details of why exabgp was selected. Michael On Mon, Jan 7, 2019 at 1:18 AM Jeff Yang wrote: > > Hi Michael, > I found that you forbid import eventlet in octavia.[1] > I guess the eventlet has a conflict with gunicorn, is that? > But, I need to import eventlet for os-ken that used to implement bgp speaker.[2] > I am studying eventlet and gunicorn deeply. Have you some suggestions to resolve this conflict? 
> > [1] https://review.openstack.org/#/c/462334/ > [2] https://review.openstack.org/#/c/628915/ > > Michael Johnson 于2019年1月5日周六 上午8:02写道: >> >> Hi Jeff, >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. >> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. From johnsomor at gmail.com Tue Jan 8 17:05:52 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 8 Jan 2019 09:05:52 -0800 Subject: Queens octavia error In-Reply-To: References: Message-ID: Hi Ignazio, Please use the [octavia] tag in the subject line as this will alert the octavia team to your message. As the message says, this is a nova failure: {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} I recommend you check the nova logs to identify the root issue in nova. 
If this is related to the security group issue you mentioned on IRC, make sure you create the security group for the Octavia controllers under the account you are running the controllers under. This is the account you specified in your octavia.conf file under the "[service_auth]" section. It is likely you are creating the security group under a different project than your controllers are configured to use. Michael On Mon, Jan 7, 2019 at 7:25 AM Ignazio Cassano wrote: > > Hello All, > I installed octavia on queens with centos 7, but when I create a load balance with the command > openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I got some errors in octavia worker.log: > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server failures[0].reraise() > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in reraise > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server six.reraise(*self._exc_info) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result = task.execute(**arguments) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 192, in execute > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server raise exceptions.ComputeBuildException(fault=fault) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server ComputeBuildException: Failed to build compute instance due to: {u'message': u'Exceeded maximum number of retries. 
Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} > > Anyone could help me, please ? > > Regards > Ignazio From juliaashleykreger at gmail.com Tue Jan 8 17:10:22 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Jan 2019 09:10:22 -0800 Subject: [ironic] Mid-cycle call times In-Reply-To: References: Message-ID: Greetings everyone! It seems we have coalesced around January 21st and 22nd. I have posted a poll[1] with time windows in two hour blocks so we can reach a consensus on when we should meet. Please vote for your available time windows so we can find the best overlap for everyone. Additionally, if there are any topics or items that you feel would be a good use of the time, please feel free to add them to the planning etherpad[2]. Thanks everyone! -Julia [1]: https://doodle.com/poll/i2awf3zvztncixpg [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle On Wed, Jan 2, 2019 at 1:44 PM Julia Kreger wrote: > > Greetings everyone, > > During our ironic team meeting in December, we discussed if we should go ahead and have a "mid-cycle" call in order to try sync up on where we are at during this cycle, and the next steps for us to take as a team. > > With that said, I have created a doodle poll[1] in an attempt to identify some days that might work. Largely the days available on the poll are geared around my availability this month. > > Ideally, I would like to find three days where we can schedule some 2-4 hour blocks of time. I've gone ahead and started an etherpad[2] to get us started on brainstorming. Once we have some ideas, we will be able to form a schedule and attempt to identify the amount of time required. 
> > -Julia > > [1]: https://doodle.com/poll/uqwywaxuxsiu7zde > [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle From flux.adam at gmail.com Tue Jan 8 17:13:09 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Tue, 8 Jan 2019 09:13:09 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Jeff, Eventlet cannot be included in the main Octavia controller side code or added to our main requirements.txt, but that technically doesn't apply to the pre-built amphora images. They use a different requirements system and since it won't actually be used by any of the real Octavia services (os-ken is just some other software that will run independently inside the amphora, right?) it should actually be fine. In fact, I would assume if it is a requirement of os-ken, you wouldn't have to explicitly list it as a requirement anywhere, as it should be pulled into the system by installing that package. At least, that is my early morning take on this, as I haven't really looked too closely at os-ken yet. --Adam Harwell (rm_work) On Mon, Jan 7, 2019, 01:25 Jeff Yang wrote: > Hi Michael, > I found that you forbid import eventlet in octavia.[1] > I guess the eventlet has a conflict with gunicorn, is that? > But, I need to import eventlet for os-ken that used to implement bgp > speaker.[2] > I am studying eventlet and gunicorn deeply. Have you some suggestions > to resolve this conflict? > > [1] https://review.openstack.org/#/c/462334/ > [2] https://review.openstack.org/#/c/628915/ > > Michael Johnson 于2019年1月5日周六 上午8:02写道: > >> Hi Jeff, > > >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. 
>> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang >> wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about >> l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ed at leafe.com Tue Jan 8 17:26:11 2019 From: ed at leafe.com (Ed Leafe) Date: Tue, 8 Jan 2019 11:26:11 -0600 Subject: [Cyborg] IRC meeting In-Reply-To: References: Message-ID: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> On Jan 8, 2019, at 12:31 AM, Li Liu wrote: > > The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) I believe you meant *Wednesday* at 0300 UTC, correct? -- Ed Leafe From brenski at mirantis.com Tue Jan 8 18:21:32 2019 From: brenski at mirantis.com (Boris Renski) Date: Tue, 8 Jan 2019 10:21:32 -0800 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift Message-ID: Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: - We have new look and feel at stackalytics.com - We did away with DriverLog and Member Directory , which were not very actively used or maintained. Those are still available via direct links, but not in the menu on the top - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible via top menu. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback. 
-Boris From sfinucan at redhat.com Tue Jan 8 18:22:24 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:22:24 +0000 Subject: [dev] 'sqlalchemy.exc.NoSuchTableError: migration_tmp' errors due to SQLite 3.26.0 Message-ID: <088ba53338bf68edbd3742c7e145ccf7605df615.camel@redhat.com> Just to note that I'm currently unable to run nova unit tests locally on Fedora 29 without downgrading my sqlite package. The error I'm seeing is: sqlalchemy.exc.NoSuchTableError: migration_tmp The root cause appears to be a change in 3.26.0 which is breaking sqlalchemy-migrate, as noted here: https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1807262 Corey Bryant has proposed a patch for this, linked below, which should resolve it. Any chance an sqlalchemy-migrate core could look at this before I reach the point of not being able to downgrade my sqlite package and run nova unit tests? :) https://review.openstack.org/#/c/623564/5 Stephen From juliaashleykreger at gmail.com Tue Jan 8 18:26:00 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Jan 2019 10:26:00 -0800 Subject: Ironic ibmc driver for Huawei server In-Reply-To: References: Message-ID: Greetings Qianbiao.NG, Welcome to Ironic! The purpose and requirement of Third Party CI is to test that drivers are in working order with the current state of the code in Ironic and to help prevent the community from accidentally breaking an in-tree vendor driver. Vendors do this by providing one or more physical systems in a pool of hardware that is managed by a Zuul v3 or Jenkins installation which installs ironic (typically in a virtual machine), and configures it to perform a deployment upon the physical bare metal node. Upon failure or successful completion of the test, the results are posted back to OpenStack Gerrit.
Ultimately this helps provide the community and the vendor with a level of assurance in what is released by the ironic community. The cinder project has a similar policy and I'll email you directly with the contacts at Huawei that work with the Cinder community, as they would be familiar with many of the aspects of operating third party CI. You can find additional information here on the requirement and the reasoning behind it: https://specs.openstack.org/openstack/ironic-specs/specs/approved/third-party-ci.html We may also be able to put you in touch with some vendors that have recently worked on implementing third-party CI. I'm presently inquiring with others if that will be possible. If you are able to join Internet Relay Chat, our IRC channel (#openstack-ironic) has several individuals who have experience setting up and maintaining third-party CI for ironic. Thanks, -Julia On Tue, Jan 8, 2019 at 8:54 AM xmufive at qq.com wrote: > > Hi julia, > > According to the comment of story, > 1. The spec for huawei ibmc drvier has been post here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. > 2. About the third-party CI part, we provide mocked unittests for our driver's code. Not sure what third-party CI works for in this case. What else we should do?
> > Thanks > Qianbiao.NG From sfinucan at redhat.com Tue Jan 8 18:29:03 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:29:03 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108141755.GA9289@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> <20190108141755.GA9289@canonical> Message-ID: On Tue, 2019-01-08 at 15:17 +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > > We've been looking at a patch that landed some months ago and have > > > > > spotted some issues: > > > > > > > > > > https://review.openstack.org/#/c/532168 > > > > > > > > > > In summary, that patch is intended to make the memory check for > > > > > instances memory pagesize aware. 
The logic it introduces looks > > > > > something like this: > > > > > > > > > > If the instance requests a specific pagesize > > > > > (#1) Check if each host cell can provide enough memory of the > > > > > pagesize requested for each instance cell > > > > > Otherwise > > > > > If the host has hugepages > > > > > (#2) Check if each host cell can provide enough memory of the > > > > > smallest pagesize available on the host for each instance cell > > > > > Otherwise > > > > > (#3) Check if each host cell can provide enough memory for > > > > > each instance cell, ignoring pagesizes > > > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > > host, because the latter will now be aware of hugepages and won't > > > > > consume them. However, there are a couple of issues with this: > > > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > > running on hosts with different pagesizes. This is because we don't > > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > > reusing the same functions previously used for actual hugepage > > > > > checks to check for regular 4k pages > > > > > > > > I think that we should not accept any overcommit. Only instances with > > > > an InstanceNUMATopology associated pass to this part of check. Such > > > > instances want to use features like guest NUMA topology so their > > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > > for performance reason and to avoid any cross memory latency. > > > > > > This issue with this is that we had previously designed everything *to* > > > allow overcommit: > > > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > > > This code never worked Stephen, that instead of to please unit tests > > related. I would not recommend to use it as a reference. 
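For context on what "overcommit" means in the check being debated here, nova scales schedulable small-page RAM by the `ram_allocation_ratio` config option, while hugepages are never overcommitted. The helper below and its numbers are illustrative only (the config option is real; the function is invented for this sketch):

```python
# Simplified illustration of nova's RAM overcommit arithmetic; the
# ram_allocation_ratio option is real, this helper is not nova code.
def memory_limit_mib(total_mib, ram_allocation_ratio=1.0, hugepages=False):
    # Hugepages are never overcommitted, so the allocation ratio
    # applies only to ordinary small-page memory.
    if hugepages:
        return total_mib
    return total_mib * ram_allocation_ratio
```

With a ratio of 1.5, a host with 16 GiB of small-page RAM advertises 24 GiB of schedulable memory, while its hugepage pool is capped at its physical size — which is why routing small-page checks through the hugepage path silently disables overcommit.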
> > > > > The only time this doesn't apply is if CPU pinning is also in action > > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > > CPU pinning implies a NUMA topology, much to Jay's consternation). As > > > noted below, our previous advice was not to mix hugepage instances and > > > non-hugepage instances, meaning hosts handling non-hugepage instances > > > should not have hugepages (or should mark the memory consumed by them > > > as reserved for host). We have in effect broken previous behaviour in > > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > > run through any of the code above, meaning they're still not > > > > > pagesize aware > > > > > > > > That is an other issue. We report to the resource tracker all the > > > > physical memory (small pages + hugepages allocated). The difficulty is > > > > that we can't just change the virt driver to report only small > > > > pages. Some instances wont be able to get scheduled. We should > > > > basically change the resource tracker so it can take into account the > > > > different kind of page memory. > > > > > > Agreed (likely via move tracking of this resource to placement, I > > > assume). It's a longer term fix though. > > > > > > > But it's not really an issue since instances that use "NUMA features" > > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > > > have boundaries and break rules of NUMA instances. > > > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > > perfectly fine to have NUMA without CPU pinning, though not the other > > > way around. 
For example: > > > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > > > > From what I can tell, there are three reasons that an instance will > > > have a NUMA topology: the user explicitly requested one, the user > > > requested CPU pinning and got one implicitly, or the user requested a > > > specific pagesize and, again, got one implicitly. We handle the latter > > > two with the advice given below, but I don't think anyone has ever said > > > we must separate instances that had a user-specified NUMA topology from > > > those that had no NUMA topology. If we're going down this path, we need > > > clear docs. > > Now I remember why we can't support it. When defining guest NUMA > topology (hw:numa_node) the memory is mapped to the assigned host NUMA > nodes meaning that the guest memory can't swap out. If a non-NUMA > instance starts using memory from host NUMA nodes used by a guest with > NUMA it can result that the guest with NUMA run out of memory and be > killed. Based on my minimal test, it seems to work just fine? https://bugs.launchpad.net/nova/+bug/1810977 The instances boot with the patch reverted. Is there something I've missed? > > The implementation is pretty old and it was a first design from > > scratch, all the situations have not been take into account or been > > documented. If we want create specific behaviors we are going to add > > more complexity on something which is already, and which is not > > completely stable, as an example the patch you have mentioned which > > has been merged last release. > > > > I agree documenting is probably where we should go; don't try to mix > > instances with InstanceNUMATopology and without, Nova uses a different > > way to compute their resources, like don't try to overcommit such > > instances. > > > > We basically recommend to use aggregate for pinning, realtime, > > hugepages, so it looks reasonable to add guest NUMA topology to that > > list. 
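The aggregate-based isolation recommended above is typically wired up by pairing host-aggregate metadata with matching scoped flavor extra specs, as consumed by nova's AggregateInstanceExtraSpecsFilter. In this sketch the `hw:` keys are real flavor extra specs, the `pinned` metadata key is only a common operator convention, and `lands_on_aggregate` is an invented stand-in for the filter's matching:

```python
# Sketch of the matching data, not an API call; in practice these are
# applied with `openstack aggregate set` and `openstack flavor set`.
aggregate_metadata = {"pinned": "true"}

flavor_extra_specs = {
    "hw:cpu_policy": "dedicated",          # CPU pinning
    "hw:mem_page_size": "large",           # hugepage-backed memory
    # Scoped key consumed by AggregateInstanceExtraSpecsFilter:
    "aggregate_instance_extra_specs:pinned": "true",
}

def lands_on_aggregate(extra_specs, metadata):
    """Mimic the filter: every scoped extra spec must match metadata."""
    prefix = "aggregate_instance_extra_specs:"
    return all(
        metadata.get(key[len(prefix):]) == value
        for key, value in extra_specs.items()
        if key.startswith(prefix)
    )
```

Hosts lacking the metadata (or flavors lacking the scoped spec) simply never meet, which is the isolation property being recommended for pinned, realtime, hugepage, and NUMA-topology guests.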
> > > > > Stephen > > > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > > > We can mitigate issue (2) by advising operators to split hosts into > > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > > > addition, we did actually called that out in the original spec: > > > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > > issue now. > > > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > > work on a solution that could e.g. let us track hugepages via placement > > > > > and resolve issue (2) too. > > > > > > > > > > Thoughts? 
> > > > > Stephen > > > > > From sfinucan at redhat.com Tue Jan 8 18:38:49 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:38:49 +0000 Subject: [nova] Mempage fun In-Reply-To: <1546937673.17763.2@smtp.office365.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> Message-ID: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> On Tue, 2019-01-08 at 08:54 +0000, Balázs Gibizer wrote: > On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > 2. 
It doesn't fix the issue when non-NUMA instances exist on the same > > host as NUMA instances with hugepages. The non-NUMA instances don't > > run through any of the code above, meaning they're still not > > pagesize aware > > > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). > > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we did actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in > > a non-NUMA aware way too but that's a lot more work and doesn't fix the > > issue now. > > > > As such, my question is this: should be look at fixing issue (1) and > > documenting issue (2), or should we revert the thing wholesale until > > we work on a solution that could e.g. let us track hugepages via > > placement and resolve issue (2) too. > > If you feel that fixing (1) is pretty simple then I suggest to do that > and document the limitation of (2) while we think about a proper > solution. > > gibi I have (1) fixed here: https://review.openstack.org/#/c/629281/ That said, I'm not sure if it's the best thing to do. From what I'm hearing, it seems the advice we should be giving is to not mix instances with/without NUMA topologies, with/without hugepages and with/without CPU pinning. 
We've only documented the latter, as discussed on this related bug by cfriesen: https://bugs.launchpad.net/nova/+bug/1792985 Given that we should be advising folks not to mix these (something I wasn't aware of until now), what does the original patch actually give us? If you're not mixing instances with/without hugepages, then the only use case that would fix is booting an instance with a NUMA topology but no hugepages on a host that had hugepages (because the instance would be limited to CPUs and memory from one NUMA node, but it's conceivable all available memory could be on another NUMA node). That seems like a very esoteric use case that might be better solved by perhaps making the reserved memory configuration option optionally NUMA specific. This would allow us to mark this hugepage memory, which is clearly not intended for consumption by nova (remember: this host only handles non-hugepage instances), as reserved on a per-node basis. I'm not sure how we would map this to placement, though I'm sure it could be figured out. jaypipes is going to have so much fun mapping all this in placement :D Stephen From openstack at nemebean.com Tue Jan 8 19:04:58 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 13:04:58 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> Further update: I dusted off my gdb skills and attached it to the privsep process to try to get more details about exactly what is crashing. It looks like the segfault happens on this line: https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 which is h->cb = cb; h being the conntrack handle and cb being the callback function.
This makes me think the problem isn't the callback itself (even if we assigned a bogus pointer, which we didn't, it shouldn't cause a segfault unless you try to dereference it) but in the handle we pass in. Trying to look at h->cb results in: (gdb) print h->cb Cannot access memory at address 0x800f228 Interestingly, h itself is fine: (gdb) print h $3 = (struct nfct_handle *) 0x800f1e0 It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. -Ben On 1/7/19 12:11 PM, Ben Nemec wrote: > Renamed the thread to be more descriptive. > > Just to update the list on this, it looks like the problem is a segfault > when the netlink_lib module makes a C call. Digging into that code a > bit, it appears there is a callback being used[1]. I've seen some > comments that when you use a callback with a Python thread, the thread > needs to be registered somehow, but this is all uncharted territory for > me. Suggestions gratefully accepted. :-) > > 1: > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 > > > On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >> Hi, >> >> I just found that functional tests in Neutron are failing since today >> or maybe yesterday. See [1] >> I was able to reproduce it locally and it looks that it happens with >> oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. 
>> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> From lbragstad at gmail.com Tue Jan 8 19:18:38 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 8 Jan 2019 13:18:38 -0600 Subject: [dev][keystone][nova][oslo] unified limits + oslo.limit interface questions Message-ID: Hi all, Before the holidays there was a bunch of discussion around unified limits and getting that integrated into nova. One of the last hurdles is smoothing out the interface between nova and the oslo.limit library, which John and Jay were helping out with a bunch. There are a couple of WIP patches proposed that attempt to work through this [0][1][2]. Now that people are starting to recover from the holidays, I wanted to start a thread on what remains for this work. Specifically, what can we do to air out the remaining concerns so that we can release a useable version of oslo.limit for services to consume. Thoughts? [0] https://review.openstack.org/#/c/615180/ John's WIP'd integration patch [1] https://review.openstack.org/#/c/602201/ nova specification [2] https://review.openstack.org/#/c/596520/21 XiYuan's patch to sort out the interface from oslo.limit -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 19:23:12 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 20:23:12 +0100 Subject: Queens octavia error In-Reply-To: References: Message-ID: Hello Michael, thanks for your suggestion. I solved my issue. There was some wrong configuration in octavia.conf file. Security group must belong to the octavia project. Regards Ignazio Il giorno Mar 8 Gen 2019 18:06 Michael Johnson ha scritto: > Hi Ignazio, > > Please use the [octavia] tag in the subject line as this will alert > the octavia team to your message. > > As the message says, this is a nova failure: > > {u'message': u'Exceeded maximum number of retries. 
Exhausted all hosts > available for retrying build failures for instance > 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' > File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", > line 581, in build_instances\n raise > exception.MaxRetriesExceeded(reason=msg)\n', u'created': > u'2019-01-07T15:15:59Z'} > > I recommend you check the nova logs to identify the root issue in nova. > > If this is related to the security group issue you mentioned on IRC, > make sure you create the security group for the Octavia controllers > under the account you are running the controllers under. This is the > account you specified in your octavia.conf file under the > "[service_auth]" section. It is likely you are creating the security > group under a different project than your controllers are configured > to use. > > Michael > > On Mon, Jan 7, 2019 at 7:25 AM Ignazio Cassano > wrote: > > > > Hello All, > > I installed octavia on queens with centos 7, but when I create a load > balance with the command > > openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I > got some errors in octavia worker.log: > > > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > failures[0].reraise() > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in > reraise > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > six.reraise(*self._exc_info) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", > line 53, in _execute_task > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result > = task.execute(**arguments) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", > line 192, in execute > > 2019-01-07 16:16:05.050 85077 
ERROR oslo_messaging.rpc.server raise > exceptions.ComputeBuildException(fault=fault) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > ComputeBuildException: Failed to build compute instance due to: > {u'message': u'Exceeded maximum number of retries. Exhausted all hosts > available for retrying build failures for instance > 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File > "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in > build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', > u'created': u'2019-01-07T15:15:59Z'} > > > > Anyone could help me, please ? > > > > Regards > > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Jan 8 19:26:44 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 8 Jan 2019 13:26:44 -0600 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: <20190108192643.GA25045@sm-workstation> On Tue, Jan 08, 2019 at 10:21:32AM -0800, Boris Renski wrote: > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - > > We have new look and feel at stackalytics.com > - > > We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the menu on the top > - > > BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible via top menu. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback. > > -Boris Really looks nice - thanks Boris! 
Sean From stig.openstack at telfer.org Tue Jan 8 19:44:21 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 8 Jan 2019 19:44:21 +0000 Subject: [scientific-sig] IRC meeting 2100UTC: Lustre, conferences, CFPs Message-ID: Hi All - We have a Scientific SIG meeting later today at 2100 UTC (about an hour’s time). Everyone is welcome. Today I’d like to restart the efforts for better integration of Lustre with OpenStack, and to canvass for people with use cases for this. Plus start the year with the conference calendar. We meet at 2100 UTC in IRC channel #openstack-meeting. The agenda and full details are here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_8th_2019 Cheers Stig From msm at redhat.com Tue Jan 8 19:51:34 2019 From: msm at redhat.com (Michael McCune) Date: Tue, 8 Jan 2019 14:51:34 -0500 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: <20190108192643.GA25045@sm-workstation> References: <20190108192643.GA25045@sm-workstation> Message-ID: On Tue, Jan 8, 2019 at 2:29 PM Sean McGinnis wrote: > Really looks nice - thanks Boris! ++, really responsive too. thanks for the update =) peace o/ From skaplons at redhat.com Tue Jan 8 20:22:27 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 8 Jan 2019 21:22:27 +0100 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> Message-ID: <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> Hi Ben, I was also looking at it today. I’m totally not a C and Oslo.privsep expert but I think that there is some new process spawned here. I put pdb before line https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 where this issue happens.
Then, with "ps aux” I saw: vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 /opt/stack/neutron/.tox/dsvm-functional/bin/python /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper --config-file neutron/tests/etc/neutron.conf --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep --color=auto privsep But then when I continued running the test and it segfaulted, in the journal log I have: Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] Please check the PIDs of those processes. The first one (when the test was „paused” with pdb) has 18368 and the later segfault has 18369. I don’t know if you saw my comment today in launchpad. I was trying to change the method used to start PrivsepDaemon from Method.ROOTWRAP to Method.FORK (in https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) and run the tests as root, and then the tests passed. — Slawek Kaplonski Senior software engineer Red Hat
Trying to look at h->cb results in: > > (gdb) print h->cb > Cannot access memory at address 0x800f228 > > Interestingly, h itself is fine: > > (gdb) print h > $3 = (struct nfct_handle *) 0x800f1e0 > > It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 > > So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. > > -Ben > > On 1/7/19 12:11 PM, Ben Nemec wrote: >> Renamed the thread to be more descriptive. >> Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) >> 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>> Hi, >>> >>> I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] >>> I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. 
>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>> >>> — >>> Slawek Kaplonski >>> Senior software engineer >>> Red Hat >>> From liliueecg at gmail.com Tue Jan 8 21:15:37 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 8 Jan 2019 16:15:37 -0500 Subject: [Cyborg] IRC meeting In-Reply-To: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> References: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> Message-ID: Thank you Ed for the correction. It is *Wednesday* at 0300 UTC :P Regards Li Liu On Tue, Jan 8, 2019 at 12:26 PM Ed Leafe wrote: > On Jan 8, 2019, at 12:31 AM, Li Liu wrote: > > > > The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm > est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) > > I believe you meant *Wednesday* at 0300 UTC, correct? > > > -- Ed Leafe > > > > > > -- Thank you Regards Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue Jan 8 22:00:14 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 8 Jan 2019 16:00:14 -0600 Subject: [cinder] Proposing new Core Members ... Message-ID: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> Team, I would like to propose two people who have been taking a more active role in Cinder reviews as Core Team Members: First, Rajat Dhasmana who has been active in doing reviews over the last couple of releases (http://www.stackalytics.com/?module=cinder-group&user_id=whoami-rajat). He has also made efforts to join our PTG and Forum sessions remotely, has helped to stay on top of bugs and has submitted a number of fixes recently.  I feel he would be a great addition to our team. Also, I would like to propose Yikun Jiang as a core member.  He had big shoes to fill, back-filling TommyLike Hu, and he has risen to the challenge.
Continuing to implement the features that TommyLike had in progress and taking an active role as a reviewer: (http://www.stackalytics.com/?module=cinder-group&user_id=yikunkero) I think that both Rajat and Yikun will be welcome additions to help replace the cores that have recently been removed. If there is no disagreement I plan to add both people to the core reviewer list in a week. Thanks! Jay Bryant (jungleboyj) From fungi at yuggoth.org Tue Jan 8 22:05:23 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 8 Jan 2019 22:05:23 +0000 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: <20190108220522.iczyv2yz5rfg4qci@yuggoth.org> On 2019-01-08 10:21:32 -0800 (-0800), Boris Renski wrote: > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). [...] > Happy to hear comments or feedback. Looks slick! When you say "based on" I guess you mean "forked from?" I don't see those modifications in the repository at https://git.openstack.org/cgit/openstack/stackalytics nor proposed to it through https://review.openstack.org/ so presumably the source code now lives elsewhere. Is Stackalytics still open source, or has it become proprietary? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Tue Jan 8 22:30:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 16:30:03 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> Message-ID: <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: > Hi Ben, > > I was also looking at it today. I’m totally not an C and Oslo.privsep expert but I think that there is some new process spawned here. > I put pdb before line https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 where this issue happen. Then, with "ps aux” I saw: > > vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep > root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 /opt/stack/neutron/.tox/dsvm-functional/bin/python /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper --config-file neutron/tests/etc/neutron.conf --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock > vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep --color=auto privsep > > But then when I continue run test, and it segfaulted, in journal log I have: > > Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] > > Please check pics of those processes. First one (when test was „paused” with pdb) has 18368 and later segfault has 18369. privsep-helper does fork, so I _think_ that's normal. 
https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > I don’t know if You saw my today’s comment in launchpad. I was trying to change method used to start PrivsepDaemon from Method.ROOTWRAP to Method.FORK (in https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) and run test as root, then tests were passed. Yeah, I saw that, but I don't understand it. :-/ The daemon should end up running with the same capabilities in either case. By the time it starts making the C calls the environment should be identical, regardless of which method was used to start the process. > > — > Slawek Kaplonski > Senior software engineer > Red Hat > >> Wiadomość napisana przez Ben Nemec w dniu 08.01.2019, o godz. 20:04: >> >> Further update: I dusted off my gdb skills and attached it to the privsep process to try to get more details about exactly what is crashing. It looks like the segfault happens on this line: >> >> https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 >> >> which is >> >> h->cb = cb; >> >> h being the conntrack handle and cb being the callback function. >> >> This makes me think the problem isn't the callback itself (even if we assigned a bogus pointer, which we didn't, it shouldn't cause a segfault unless you try to dereference it) but in the handle we pass in. Trying to look at h->cb results in: >> >> (gdb) print h->cb >> Cannot access memory at address 0x800f228 >> >> Interestingly, h itself is fine: >> >> (gdb) print h >> $3 = (struct nfct_handle *) 0x800f1e0 >> >> It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. 
It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 >> >> So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. >> >> -Ben >> >> On 1/7/19 12:11 PM, Ben Nemec wrote: >>> Renamed the thread to be more descriptive. >>> Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) >>> 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>>> Hi, >>>> >>>> I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] >>>> I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >>>> >>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>>> >>>> — >>>> Slawek Kaplonski >>>> Senior software engineer >>>> Red Hat >>>> > From sean.mcginnis at gmx.com Tue Jan 8 22:35:36 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 8 Jan 2019 16:35:36 -0600 Subject: [cinder] Proposing new Core Members ... 
In-Reply-To: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> Message-ID: <20190108223535.GA29520@sm-workstation> On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: > Team, > > I would like propose two people who have been taking a more active role in > Cinder reviews as Core Team Members: > > > I think that both Rajat and Yikun will be welcome additions to help replace > the cores that have recently been removed. > +1 from me. Both have been doing a good job giving constructive feedback on reviews and have been spending some time reviewing code other than their own direct interests, so I think they would be welcome additions. Sean From openstack at nemebean.com Wed Jan 9 00:30:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 18:30:19 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> Message-ID: <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> I think I've got it. At least in my local tests, the handle pointer being passed from C -> Python -> C was getting truncated at the Python step because we didn't properly define the type. If the address assigned was larger than would fit in a standard int then we passed what amounted to a bogus pointer back to the C code, which caused the segfault. I have no idea why privsep threading would have exposed this, other than maybe running in threads affected the address space somehow? In any case, https://review.openstack.org/629335 has got these functional tests working for me locally in oslo.privsep 1.31.0. 
It would be great if somebody could try them out and verify that I didn't just find a solution that somehow only works on my system. :-) -Ben On 1/8/19 4:30 PM, Ben Nemec wrote: > > > On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: >> Hi Ben, >> >> I was also looking at it today. I’m totally not an C and Oslo.privsep >> expert but I think that there is some new process spawned here. >> I put pdb before line >> https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 >> where this issue happen. Then, with "ps aux” I saw: >> >> vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep >> root     18368  0.1  0.5 185752 33544 pts/1    Sl+  13:24   0:00 >> /opt/stack/neutron/.tox/dsvm-functional/bin/python >> /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper >> --config-file neutron/tests/etc/neutron.conf --privsep_context >> neutron.privileged.default --privsep_sock_path >> /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock >> vagrant  18555  0.0  0.0  14512  1092 pts/2    S+   13:25   0:00 grep >> --color=auto privsep >> >> But then when I continue run test, and it segfaulted, in journal log I >> have: >> >> Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] >> segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 >> in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] >> >> Please check pics of those processes. First one (when test was >> „paused” with pdb) has 18368 and later segfault has 18369. > > privsep-helper does fork, so I _think_ that's normal. > > https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > >> >> I don’t know if You saw my today’s comment in launchpad. I was trying >> to change method used to start PrivsepDaemon from Method.ROOTWRAP to >> Method.FORK (in >> https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) >> and run test as root, then tests were passed. 
> > Yeah, I saw that, but I don't understand it. :-/ > > The daemon should end up running with the same capabilities in either > case. By the time it starts making the C calls the environment should be > identical, regardless of which method was used to start the process. > >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >>> Wiadomość napisana przez Ben Nemec w dniu >>> 08.01.2019, o godz. 20:04: >>> >>> Further update: I dusted off my gdb skills and attached it to the >>> privsep process to try to get more details about exactly what is >>> crashing. It looks like the segfault happens on this line: >>> >>> https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 >>> >>> >>> which is >>> >>> h->cb = cb; >>> >>> h being the conntrack handle and cb being the callback function. >>> >>> This makes me think the problem isn't the callback itself (even if we >>> assigned a bogus pointer, which we didn't, it shouldn't cause a >>> segfault unless you try to dereference it) but in the handle we pass >>> in. Trying to look at h->cb results in: >>> >>> (gdb) print h->cb >>> Cannot access memory at address 0x800f228 >>> >>> Interestingly, h itself is fine: >>> >>> (gdb) print h >>> $3 = (struct nfct_handle *) 0x800f1e0 >>> >>> It doesn't _look_ to me like the handle should be crossing any thread >>> boundaries or anything, so I'm not sure why it would be a problem. It >>> gets created in the same privileged function that ultimately >>> registers the callback: >>> https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 >>> >>> >>> So still not sure what's going on, but I thought I'd share what I've >>> found before I stop to eat something. >>> >>> -Ben >>> >>> On 1/7/19 12:11 PM, Ben Nemec wrote: >>>> Renamed the thread to be more descriptive. 
>>>> Just to update the list on this, it looks like the problem is a >>>> segfault when the netlink_lib module makes a C call. Digging into >>>> that code a bit, it appears there is a callback being used[1]. I've >>>> seen some comments that when you use a callback with a Python >>>> thread, the thread needs to be registered somehow, but this is all >>>> uncharted territory for me. Suggestions gratefully accepted. :-) >>>> 1: >>>> https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 >>>> On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>>>> Hi, >>>>> >>>>> I just found that functional tests in Neutron are failing since >>>>> today or maybe yesterday. See [1] >>>>> I was able to reproduce it locally and it looks that it happens >>>>> with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >>>>> >>>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>>>> >>>>> — >>>>> Slawek Kaplonski >>>>> Senior software engineer >>>>> Red Hat >>>>> >> > From iwienand at redhat.com Wed Jan 9 06:11:09 2019 From: iwienand at redhat.com (Ian Wienand) Date: Wed, 9 Jan 2019 17:11:09 +1100 Subject: [infra] NetworkManager on infra Fedora 29 and CentOS nodes Message-ID: <20190109061109.GA24618@fedora19.localdomain> Hello, Just a heads-up; with Fedora 29 the legacy networking setup was moved into a separate, not-installed-by-default network-scripts package. This has prompted us to finally move to managing interfaces on our Fedora and CentOS CI hosts with NetworkManager (see [1]) Support for this is enabled with features added in glean 1.13.0 and diskimage-builder 1.19.0. The newly created Fedora 29 nodes [2] will have it enabled, and [3] will switch CentOS nodes shortly. This is tested by our nodepool jobs which build images, upload them into devstack and boot them, and then check the networking [4]. 
I don't really expect any problems, but be aware NetworkManager packages will appear on the CentOS 7 and Fedora base images with these changes. Thanks -i [1] https://bugzilla.redhat.com/show_bug.cgi?id=1643763#c2 [2] https://review.openstack.org/618672 [3] https://review.openstack.org/619960 [4] https://review.openstack.org/618671 From smooney at redhat.com Wed Jan 9 06:11:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 09 Jan 2019 06:11:54 +0000 Subject: [nova] Mempage fun In-Reply-To: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> Message-ID: On Tue, 2019-01-08 at 18:38 +0000, Stephen Finucane wrote: > On Tue, 2019-01-08 at 08:54 +0000, Balázs Gibizer wrote: > > On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > > > We've been looking at a patch that landed some months ago and have > > > spotted some issues: > > > > > > https://review.openstack.org/#/c/532168 > > > > > > In summary, that patch is intended to make the memory check for > > > instances memory pagesize aware. The logic it introduces looks > > > something like this: > > > > > > If the instance requests a specific pagesize > > > (#1) Check if each host cell can provide enough memory of the > > > pagesize requested for each instance cell > > > Otherwise > > > If the host has hugepages > > > (#2) Check if each host cell can provide enough memory of the > > > smallest pagesize available on the host for each instance cell > > > Otherwise > > > (#3) Check if each host cell can provide enough memory for > > > each instance cell, ignoring pagesizes > > > > > > This also has the side-effect of allowing instances with hugepages and > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > host, because the latter will now be aware of hugepages and won't > > > consume them. 
However, there are a couple of issues with this: > > > > > > 1. It breaks overcommit for instances without pagesize request > > > running on hosts with different pagesizes. This is because we don't > > > allow overcommit for hugepages, but case (#2) above means we are now > > > reusing the same functions previously used for actual hugepage > > > checks to check for regular 4k pages > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > run through any of the code above, meaning they're still not > > > pagesize aware > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > We can mitigate issue (2) by advising operators to split hosts into > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > addition, we did actually called that out in the original spec: > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > question why the patch is necessary/acceptable for NUMA instances. For > > > what it's worth, a longer fix would be to start tracking hugepages in > > > a non-NUMA aware way too but that's a lot more work and doesn't fix the > > > issue now. > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > documenting issue (2), or should we revert the thing wholesale until > > > we work on a solution that could e.g. let us track hugepages via > > > placement and resolve issue (2) too. 
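The three-case check quoted above can be sketched in a few lines. This is an illustrative model only, not Nova's implementation: the HostCell class and the unit conventions (pagesize keys in KiB, free memory values in MiB) are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HostCell:
    # free memory in MiB, keyed by pagesize in KiB (4 = regular small pages)
    free_by_pagesize: dict = field(default_factory=dict)

    def pagesizes(self):
        return list(self.free_by_pagesize)

    def has_hugepages(self):
        return any(ps > 4 for ps in self.free_by_pagesize)

    def free_memory(self, pagesize=None):
        if pagesize is None:
            return sum(self.free_by_pagesize.values())
        return self.free_by_pagesize.get(pagesize, 0)


def fits(host_cell, inst_mem_mb, requested_pagesize=None):
    """Mirror the three cases described in the quoted logic."""
    if requested_pagesize is not None:
        # (#1) explicit pagesize request: check that exact pagesize
        return host_cell.free_memory(requested_pagesize) >= inst_mem_mb
    if host_cell.has_hugepages():
        # (#2) no explicit request but host has hugepages: check against
        # the smallest pagesize -- reusing the hugepage path is what
        # silently disables overcommit for regular 4K pages (issue 1)
        smallest = min(host_cell.pagesizes())
        return host_cell.free_memory(smallest) >= inst_mem_mb
    # (#3) pagesize-unaware fallback (overcommit still possible here)
    return host_cell.free_memory() >= inst_mem_mb
```

A cell with 1024 MiB of free 4K pages and 2048 MiB of free 2M hugepages will accept a 2048 MiB instance that explicitly requests 2M pages, but reject the same instance without a pagesize request, because case (#2) only counts the 4K pool without overcommit.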
> > > > If you feel that fixing (1) is pretty simple then I suggest to do that > > and document the limitation of (2) while we think about a proper > > solution. > > > > gibi > > I have (1) fixed here: > > https://review.openstack.org/#/c/629281/ > > That said, I'm not sure if it's the best thing to do. From what I'm > hearing, it seems the advice we should be giving is to not mix > instances with/without NUMA topologies, with/without hugepages and it should be with and without hw:mem_page_size. Guests with that set should not be mixed with guests without it on the same host, and with sahid's patch and your patch this now becomes safe if the guest without hw:mem_page_size has a NUMA topology. Mixing hugepage and non-hugepage guests is fine provided the non-hugepage guest has an implicit or explicit NUMA topology, such as a guest that is using CPU pinning. > with/without CPU pinning. We've only documented the latter, as > discussed on this related bug by cfriesen: > > https://bugs.launchpad.net/nova/+bug/1792985 > > Given that we should be advising folks not to mix these (something I > wasn't aware of until now), what does the original patch actually give > us? If you're not mixing instances with/without hugepages, then the > only use case that would fix is booting an instance with a NUMA > topology but no hugepages on a host that had hugepages (because the > instance would be limited to CPUs and memory from one NUMA nodes, but > it's conceivable all available memory could be on another NUMA node). > That seems like a very esoteric use case that might be better solved by This is not that esoteric. One simple example is an operator who has configured some number of hugepages on the hypervisor and wants to run pinned instances, some of which have hugepages and some of which don't. This works fine today; however, oversubscription of memory in the non-hugepage case is broken, as per the bug. > perhaps making the reserved memory configuration option optionally NUMA > specific. 
Well, I have been asking for that for 2-3 releases. I would like to do that independently of this issue, and I think it will be a requirement if we ever model mempages per NUMA node in placement. > This would allow us to mark this hugepage memory, which is > clearly not intended for consumption by nova (remember: this host only > handles non-hugepage instances) Again, it is safe to mix hugepage instances with non-hugepage instances if hw:mem_page_size is set in the non-hugepage case. But with your scenario in mind, we can already reserve the hugepage memory for host use by setting reserved_huge_pages in the default section of nova.conf: https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_huge_pages > , as reserved on a per-node basis. I'm > not sure how we would map this to placement, though I'm sure it could > be figured out. That is simple: the placement inventory would just have the reserved value set to the value of the reserved_huge_pages config option. > > jaypipes is going to have so much fun mapping all this in placement :D We have discussed this at length before, so placement can already model this quite well if nova created the RPs and inventories for mempages. The main question is whether we can stop modeling memory_mb inventories on the root compute node RP entirely. I personally would like to make all instances NUMA-affined by default, i.e. we would start treating all instances as if hw:numa_nodes=1 was set, and preferably hw:mem_page_size=small. This would significantly simplify our lives in placement, but it has a downside: if you want to create really large instances, they must be multi-NUMA, i.e. if the guest is larger than will fit in a single host NUMA node, it must have hw:numa_nodes>1 to be scheduled. The simple fact is that such an instance is already spanning host NUMA nodes, but we are not telling the guest that. 
By actually telling the guest it has multiple NUMA nodes, we would improve guest performance, but it's a behavior change that not everyone will like. Our current practice of tracking memory and CPUs both per NUMA node and per host is tech debt that we need to clean up at some point, or we live with the fact that NUMA will never be modeled in placement. We already have NUMA affinity for vswitches and PCI/SR-IOV devices, and we will/should have it for vGPUs and pmem in the future. Long term, I think we would only track things per NUMA node, but I know Sylvain has a detailed spec on this which has more context than we can reasonably discuss here. > > Stephen > > From alfredo.deluca at gmail.com Wed Jan 9 07:26:32 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 08:26:32 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. I downloaded your magnum.conf but it's not that different from mine. Not sure why, but the cluster build seems to run... along with heat, but I get that error mentioned earlier. Cheers On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano wrote: > Hi Alfredo, > attached here there is my magnum.conf for the queens release > As you can see my heat sections are empty > When you create your cluster, I suggest to check heat logs and magnum logs > for verifying what is wrong > Ignazio > > > > On Sun, Dec 30, 2018 at 01:31 Alfredo De Luca < > alfredo.deluca at gmail.com> wrote: > >> so. Creating a stack either manually or via the dashboard works fine. The problem >> seems to be when I create a cluster (kubernetes/swarm) that I get that >> error. >> Maybe the magnum conf is not properly set up? >> In the heat section of the magnum.conf I have only >> *[heat_client]* >> *region_name = RegionOne* >> *endpoint_type = internalURL* >> >> Cheers >> >> >> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Yes. Next step is to check with ansible. 
>>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1 . how did you run the last heat template? By dashboard ? >>>> 2. Using openstack command you can check if ansible configured heat >>>> user/domain correctly >>>> >>>> >>>> It seems a problem related to >>>> heat user rights? >>>> >>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. the problem is >>>>> that I installed openstack with ansible-openstack.... so I can't change >>>>> anything unless I re run everything. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check heat user and domani are c onfigured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... 
>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I ll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> HI IGNAZIO >>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Anycase during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. 
>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>> one.... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhengzhenyulixi at gmail.com Wed Jan 9 07:39:49 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Wed, 9 Jan 2019 15:39:49 +0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: Thanks all for the feedback, I have update the spec to be more clear about the scope: https://review.openstack.org/#/c/619161/ On Mon, Jan 7, 2019 at 4:37 PM Zhenyu Zheng wrote: > Thanks alot for the replies, lets wait for some more comments, and I will > update the follow-up spec about this within two days. > > On Sat, Jan 5, 2019 at 7:37 AM melanie witt wrote: > >> On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann >> wrote: >> > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: >> >> I've been working on detach-boot-volume[1] in Stein, we got the initial >> >> design merged and while implementing we have meet some new problems and >> >> now I'm amending the spec to cover these new problems[2]. 
>> > >> > [2] is https://review.openstack.org/#/c/619161/ >> > >> >> >> >> The thing I want to discuss for wider opinion is that in the initial >> >> design, we planned to support detach root volume for only STOPPED and >> >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we >> >> allowed to detach volumes for RESIZED/PAUSED/SOFT_DELETED instances as >> >> well. Should we allow detaching root volume for instances in these >> >> status too? Cases like RESIZE could be complicated for the revert >> resize >> >> action, and it also seems unnecesary. >> > >> > The full set of allowed states for attaching and detaching are here: >> > >> > >> https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 >> > >> > >> https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 >> > >> > Concerning those other states: >> > >> > RESIZED: There might be a case for attaching/detaching volumes based on >> > flavor during a resize, but I'm not sure about the root volume in that >> > case (that really sounds more like rebuild with a new image to me, which >> > is a different blueprint). I'm also not sure how much people know about >> > the ability to do this or what the behavior is on revert if you have >> > changed the volumes while the server is resized. If we consider that >> > when a user reverts a resize, they want to go back to the way things >> > were for the root disk image, then I would think we should not allow >> > changing out the root volume while resized. >> >> Yeah, if someone attaches/detaches a regular volume while the instance >> is in VERIFY_RESIZE state and then reverts the resize, I assume we >> probably don't attempt to change or restore anything with the volume >> attachments to put them back to how they were attached before the >> resize. But as you point out, the situation does seem different >> regarding a root volume. 
If a user changes that while in VERIFY_RESIZE >> and reverts the resize, and we leave the root volume alone, then they >> end up with a different root disk image than they had before the resize. >> Which seems weird. >> >> I agree it seems better not to allow this for now and come back to it >> later if people start asking for it. >> >> > PAUSED: First, I'm not sure how much anyone uses the pause API (or >> > suspend for that matter) although most of the virt drivers implement it. >> > At one point you could attach volumes to suspended servers as well, but >> > because libvirt didn't support it that was removed from the API (yay for >> > non-discoverable backend-specific API behavior changes): >> > >> > https://review.openstack.org/#/c/83505/ >> > >> > Anyway, swapping the root volume on a paused instance seems dangerous to >> > me, so until someone really has a good use case for it, then I think we >> > should avoid that one as well. >> > >> > SOFT_DELETED: I really don't understand the use case for >> > attaching/detaching volumes to/from a (soft) deleted server. If the >> > server is deleted and only hanging around because it hasn't been >> > reclaimed yet, there are really no guarantees that this would work, so >> > again, I would just skip this one for the root volume changes. If the >> > user really wants to play with the volumes attached to a soft deleted >> > server, they should restore it first. >> > >> > So in summary, I think we should just not support any of those other >> > states for attach/detach root volumes and only focus on stopped or >> > shelved instances. >> >> Again, agree, I think we should just not allow the other states for the >> initial implementation and revisit later if it turns out people need >> these. >> >> -melanie >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
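The consensus above — support root-volume detach only for stopped or shelved instances for now — amounts to a simple state whitelist. A hedged sketch follows; the state strings merely echo nova's vm_states naming and the function is illustrative, not Nova's actual API-layer check.

```python
# States the thread agrees to support for root-volume detach initially.
# RESIZED, PAUSED, and SOFT_DELETED are deliberately excluded, per the
# discussion above; they can be revisited later if users ask for them.
ROOT_DETACH_STATES = {"stopped", "shelved", "shelved_offloaded"}


def root_volume_detach_allowed(vm_state: str) -> bool:
    """Return True only for the initially supported instance states."""
    return vm_state.lower() in ROOT_DETACH_STATES
```

Keeping the excluded states out of the whitelist means reverting a resize can never leave an instance with a silently swapped root disk image.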
URL: From ignaziocassano at gmail.com Wed Jan 9 07:47:34 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 08:47:34 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hello Alfredo, I think I could connect on IRC #openstack-containers where magnum experts could help you. Any case the error you reported in previous emails seems to be related to heat wait conditions. Ignazio Il giorno mer 9 gen 2019 alle ore 08:26 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Ignazio. I downloaded your magnum.conf but it\s not that different from > mine. Not sure why but the cluster build seems to run....along with heat > but I get that error mentioned earlier. > > Cheers > > > On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano > wrote: > >> Hi Alfredo, >> attached here there is my magnum.conf for queens release >> As you can see my heat sections are empty >> When you create your cluster, I suggest to check heat logs e magnum logs >> for verifyng what is wrong >> Ignazio >> >> >> >> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> so. Creating a stack either manually or dashboard works fine. The >>> problem seems to be when I create a cluster (kubernetes/swarm) that I got >>> that error. >>> Maybe the magnum conf it's not properly setup? >>> In the heat section of the magnum.conf I have only >>> *[heat_client]* >>> *region_name = RegionOne* >>> *endpoint_type = internalURL* >>> >>> Cheers >>> >>> >>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> Yes. Next step is to check with ansible. >>>> I do think it's some rights somewhere... >>>> I'll check later. Thanks >>>> >>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < >>>> ignaziocassano at gmail.com wrote: >>>> >>>>> Alfredo, >>>>> 1 . how did you run the last heat template? By dashboard ? >>>>> 2. 
Using openstack command you can check if ansible configured heat >>>>> user/domain correctly >>>>> >>>>> >>>>> It seems a problem related to >>>>> heat user rights? >>>>> >>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>>> killed by signal 15 >>>>>> which is last log from a few days ago. >>>>>> >>>>>> While the journal of the heat engine says >>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>>> heat-engine service. >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>>> SAWarning: Unicode type received non-unicode bind param value >>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>>> occurrences) >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> (util.ellipses_string(value),)) >>>>>> >>>>>> >>>>>> I also checked the configuration and it seems to be ok. the problem >>>>>> is that I installed openstack with ansible-openstack.... so I can't change >>>>>> anything unless I re run everything. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Check heat user and domani are c onfigured like at the following: >>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>>> >>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>>> >>>>>>>> On Sun., 23 Dec. 
2018, 9:19 pm Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>>> >>>>>>>>> I ll try asap. Thanks >>>>>>>>> >>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>> >>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>>> heat is working fine? >>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>> >>>>>>>>>>> HI IGNAZIO >>>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>>> creating the master. >>>>>>>>>>> >>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>>> >>>>>>>>>>>> Anycase during deployment you can connect with ssh to the >>>>>>>>>>>> master and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>>> Ignazio >>>>>>>>>>>> >>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>> >>>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my >>>>>>>>>>>>>> answer could help you.... >>>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>> >>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all. 
>>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 08:08:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 09:08:37 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Alfredo, you could make another test searching on internet as simple heat stack example with wait conditions inside for checking if heat wait conditions work fine. Cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 08:21:00 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 09:21:00 +0100 Subject: queens [magnum] patches Message-ID: Hello, last week I talked on #openstack-containers IRC about important patches for magnum reported here: https://review.openstack.org/#/c/577477/ I'd like to know when the above will be backported on queens and if centos7 and ubuntu packages will be upgraded with them. Any roadmap ? I would go on with magnum testing on queens because I am going to upgrade from ocata to pike and from pike to queens. 
At this time I have a production environment on ocata and a testing environment on queens. Best Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Wed Jan 9 08:51:48 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Wed, 9 Jan 2019 16:51:48 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi, Michael & Adam: I only need to confirm that eventlet has no conflict with the amphora-agent, because I just need to use eventlet in the amphora-agent. Michael: 1. os-ken is managed by the OpenStack community now, and neutron-dynamic-routing's default driver is also os-ken. I think we should be consistent with the community for later maintenance. 2. For the current application scenario, exabgp is a bit too heavy. If we use exabgp, we need to manage an extra service and write adaptation code for different Linux distributions. 3. We can more accurately get the BGP speaker's and BGP peers' status and statistics by using os-ken's functions, for example peer_down_handler, peer_up_handler, and neighbor_state_get. I didn't find similar functions in exabgp. 4. Personally, I am more familiar with os-ken. Adam: os-ken is a Python library that implements the BGP protocol. os-ken manages the BGP speaker by starting a green thread, so I need to use eventlet in the amphora-agent code. Extra illustration: Last week, I found that eventlet's monkey_patch prevents gunicorn from working properly. But now I have resolved the problem: we must pass `os=False` when we call eventlet.monkey_patch; if not, the gunicorn master process will never exit. Michael Johnson wrote on Wed, Jan 9, 2019 at 1:00 AM: > Yes, we do not allow eventlet in Octavia. It leads to a number of > conflicts and problems with the overall code base, including the use > of taskflow. 
> Is there a reason we need to use the os-ken BGP code as opposed to the > exabgp option that was being used before? > I remember we looked at those two options back when the other team was > developing the l3 option, but I don't remember all of the details of > why exabgp was selected. > > Michael > > On Mon, Jan 7, 2019 at 1:18 AM Jeff Yang wrote: > > > > Hi Michael, > > I found that you forbid import eventlet in octavia.[1] > > I guess the eventlet has a conflict with gunicorn, is that? > > But, I need to import eventlet for os-ken that used to implement bgp > speaker.[2] > > I am studying eventlet and gunicorn deeply. Have you some > suggestions to resolve this conflict? > > > > [1] https://review.openstack.org/#/c/462334/ > > [2] https://review.openstack.org/#/c/628915/ > > > > Michael Johnson 于2019年1月5日周六 上午8:02写道: > >> > >> Hi Jeff, > >> > >> Unfortunately the team that was working on that code had stopped due > >> to internal reasons. > >> > >> I hope to make the reference active/active blueprint a priority again > >> during the Train cycle. Following that I may be able to look at the L3 > >> distributor option, but I cannot commit to that at this time. > >> > >> If you are interesting in picking up that work, please let me know and > >> we can sync up on that status of the WIP patches, etc. > >> > >> Michael > >> > >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang > wrote: > >> > > >> > Dear Octavia team: > >> > The email aims to ask the development progress about > l3-active-active blueprint. I > >> > noticed that the work in this area has been stagnant for eight months. > >> > https://review.openstack.org/#/q/l3-active-active > >> > I want to know the community's next work plan in this regard. > >> > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Wed Jan 9 08:52:29 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 09 Jan 2019 08:52:29 +0000 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Can we tag this conversation with [heat][magnum] in the subject, by the way? I keep clicking on it to get the context and realising I can help. On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: > Alfredo, you could make another test searching on internet as simple heat stack example with wait conditions inside > for checking if heat wait conditions work fine. > Cheers > From marcin.juszkiewicz at linaro.org Wed Jan 9 08:57:06 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Wed, 9 Jan 2019 09:57:06 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: <20190108140026.p4462df5otnyizm2@yuggoth.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> Message-ID: <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> On 08.01.2019 at 15:00, Jeremy Stanley wrote: >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > [...] > > These days it's probably better to recommend > https://docs.openstack.org/contributors/ since I expect we're about > ready to retire that old wiki page. Then I hope that someone will take care of SEO and redirects. The link I gave was the first result of an "openstack contributing" Google search. From alfredo.deluca at gmail.com Wed Jan 9 09:40:41 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 10:40:41 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Hi Sean. Thanks for that. Do you have any idea about this error? @Ignazio Cassano I will also try on IRC, and I am looking on the internet a lot. Next, I will also try to create a simple stack to see if it works fine. Cheers On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: > can we tag this conversation with [heat][magnum] in the subject by the way. 
> i keep clicking on it to get the context and realising i can help. > > On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: > > Alfredo, you could make another test searching on internet as simple > heat stack example with wait conditions inside > > for checking if heat wait conditions work fine. > > Cheers > > > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 10:08:08 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 11:08:08 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Yes, I presume something goes wrong in heat wait condition authorization how reported by Alfredo. So I suggested to try a simple heat stack with a wait condition. Cheers Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Sean. Thanks for that. > Do you have any idea about this error? @Ignazio Cassano > I will try also on IRC and I am looking on > internet a lot. > Next step also I will try to create a simple stack to see if it works > fine. > Cheers > > On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: > >> can we tag this conversation with [heat][magnum] in the subject by the >> way. >> i keep clicking on it to get the context and realising i can help. >> >> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >> > Alfredo, you could make another test searching on internet as simple >> heat stack example with wait conditions inside >> > for checking if heat wait conditions work fine. >> > Cheers >> > >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 10:21:01 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 11:21:01 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: thanks Ignazio. 
Do you have a quick example for that? pls... Cheers On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano wrote: > Yes, I presume something goes wrong in heat wait condition authorization > how reported by Alfredo. > So I suggested to try a simple heat stack with a wait condition. > Cheers > > Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> Hi Sean. Thanks for that. >> Do you have any idea about this error? @Ignazio Cassano >> I will try also on IRC and I am looking on >> internet a lot. >> Next step also I will try to create a simple stack to see if it works >> fine. >> Cheers >> >> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >> >>> can we tag this conversation with [heat][magnum] in the subject by the >>> way. >>> i keep clicking on it to get the context and realising i can help. >>> >>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>> > Alfredo, you could make another test searching on internet as simple >>> heat stack example with wait conditions inside >>> > for checking if heat wait conditions work fine. >>> > Cheers >>> > >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 10:25:45 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 11:25:45 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Alfredo, attached herer there is an example: substitute the image with your image, flavor with your flavor, key_name with your key and network with your network Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > thanks Ignazio. Do you have a quick example for that? pls... > > Cheers > > > On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano > wrote: > >> Yes, I presume something goes wrong in heat wait condition authorization >> how reported by Alfredo. 
>> So I suggested to try a simple heat stack with a wait condition. >> Cheers >> >> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> Hi Sean. Thanks for that. >>> Do you have any idea about this error? @Ignazio Cassano >>> I will try also on IRC and I am looking on >>> internet a lot. >>> Next step also I will try to create a simple stack to see if it works >>> fine. >>> Cheers >>> >>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>> >>>> can we tag this conversation with [heat][magnum] in the subject by the >>>> way. >>>> i keep clicking on it to get the context and realising i can help. >>>> >>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>> > Alfredo, you could make another test searching on internet as simple >>>> heat stack example with wait conditions inside >>>> > for checking if heat wait conditions work fine. >>>> > Cheers >>>> > >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: wait.yml Type: application/x-yaml Size: 2529 bytes Desc: not available URL: From balazs.gibizer at ericsson.com Wed Jan 9 10:30:57 2019 From: balazs.gibizer at ericsson.com (Balázs Gibizer) Date: Wed, 9 Jan 2019 10:30:57 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1546865551.29530.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> Message-ID: <1547029853.1128.0@smtp.office365.com> On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer wrote: > > >> But, let's chat more about it via a hangout the week after next >> (week >> of January 14 when Matt is back), as suggested in #openstack-nova >> today. We'll be able to have a high-bandwidth discussion then and >> agree on a decision on how to move forward with this. > > Thank you all for the discussion. I agree to have a real-time > discussion about the way forward. > > Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a > hangouts[2]? > > I see the following topics we need to discuss: > * backward compatibility with already existing SRIOV ports having min > bandwidth > * introducing microversion(s) for this feature in Nova > * allowing partial support for this feature in Nova in Stein (E.g.: > only server create/delete but no migrate support). > * step-by-step verification of the really long commit chain in Nova > > I will post a summary of each issue to the ML during this week. Hi, As I promised here is an etherpad[1] for the hangouts discussion, with a short summary for each topic I think we need to discuss. Feel free to comment in there or add new topics you feel important. 
[1] https://etherpad.openstack.org/p/bandwidth-way-forward Cheers, gibi > > From alfredo.deluca at gmail.com Wed Jan 9 10:43:26 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 11:43:26 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: thanks Ignazio. Appreciated On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano wrote: > Alfredo, attached herer there is an example: > > substitute the image with your image, flavor with your flavor, key_name > with your key and network with your network > > Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> thanks Ignazio. Do you have a quick example for that? pls... >> >> Cheers >> >> >> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano >> wrote: >> >>> Yes, I presume something goes wrong in heat wait condition authorization >>> how reported by Alfredo. >>> So I suggested to try a simple heat stack with a wait condition. >>> Cheers >>> >>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> Hi Sean. Thanks for that. >>>> Do you have any idea about this error? @Ignazio Cassano >>>> I will try also on IRC and I am looking on >>>> internet a lot. >>>> Next step also I will try to create a simple stack to see if it works >>>> fine. >>>> Cheers >>>> >>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>>> >>>>> can we tag this conversation with [heat][magnum] in the subject by the >>>>> way. >>>>> i keep clicking on it to get the context and realising i can help. >>>>> >>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>> > Alfredo, you could make another test searching on internet as simple >>>>> heat stack example with wait conditions inside >>>>> > for checking if heat wait conditions work fine. 
>>>>> > Cheers >>>>> > >>>>> >>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan.vondra at ultimum.io Wed Jan 9 11:28:55 2019 From: jan.vondra at ultimum.io (Jan Vondra) Date: Wed, 9 Jan 2019 12:28:55 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz napsal: > > W dniu 08.01.2019 o 11:08, Jan Vondra pisze: > > Dear Kolla team, > > > > during project for one of our customers we have upgraded debian part > > of kolla project using a queens debian repositories > > (http://stretch-queens.debian.net/debian stretch-queens-backports) and > > we would like to share this work with community. > > Thanks for doing that. Is there an option to provide arm64 packages next > time? > It's more of a question for OpenStack Debian Team - namely Thomas Goirand who creates this repo. > > I would like to ask what's the proper process of contributing since > > the patches affects both kolla and kolla-ansible repositories. > > Send patches for review [1] and then we can discuss about changing them. > Remember that we target Stein now. > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > Thank you for help (and others with updated links) - I will upload patches today or tomorrow. > > Also any other comments regarding debian in kolla would be appriciated. > > Love to see someone else caring about Debian in Kolla. I took it over > two years ago, revived and moved to 'stretch'. But skipped support for > binary packages as there were no up-to-date packages available. > > In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' > as it will enter final freeze. Had some discussion with Debian OpenStack > team about providing preliminary Stein packages so support for 'binary' > type of images could be possible. 
I suppose that switch from Queens to Stein would be quite easy since all packages in Queens in Debian are Python 3 only so the most of the work has been already done. From alfredo.deluca at gmail.com Wed Jan 9 11:32:59 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 12:32:59 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: ...more found in logs 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] Domain admin client authentication failed: Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-28f9873a-5627-4f2e-9e19-da6c63753383) So what authentication it need? On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca wrote: > hi Ignazio. > the wait condition failed too on your simple stack. > Any other idea where to look at? > > Cheers > > [image: image.png] > > > > On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca > wrote: > >> thanks Ignazio. Appreciated >> >> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano >> wrote: >> >>> Alfredo, attached herer there is an example: >>> >>> substitute the image with your image, flavor with your flavor, key_name >>> with your key and network with your network >>> >>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> thanks Ignazio. Do you have a quick example for that? pls... >>>> >>>> Cheers >>>> >>>> >>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Yes, I presume something goes wrong in heat wait condition >>>>> authorization how reported by Alfredo. >>>>> So I suggested to try a simple heat stack with a wait condition. >>>>> Cheers >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> Hi Sean. Thanks for that. 
>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>> I will try also on IRC and I am looking >>>>>> on internet a lot. >>>>>> Next step also I will try to create a simple stack to see if it works >>>>>> fine. >>>>>> Cheers >>>>>> >>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>> wrote: >>>>>> >>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>> the way. >>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>> >>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>> simple heat stack example with wait conditions inside >>>>>>> > for checking if heat wait conditions work fine. >>>>>>> > Cheers >>>>>>> > >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 12:18:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 13:18:37 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: My suggestion was only to verify that the problem is not magnum but heat/keystone. Heat stacks generated by magnum use wait conditions. I am not so expert of heat, but I am sure it uses keystone. So you have some issues or in keystone or in heat configuration. Are you able to create instances/volumes on your openstack ? If yes keystone can be ok. Probably Sean could help !!! 
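[Editorial note: for the 401 above — "Domain admin client authentication failed" from heat.engine.clients.keystoneclient — a common culprit is the stack-domain credentials in heat.conf not matching the domain and user that actually exist in keystone. A sketch of the relevant options (option names are from heat's documented configuration; the values are placeholders, not taken from this deployment):

```ini
[DEFAULT]
# Keystone domain heat uses for stack-scoped users, and the
# domain-admin account it authenticates with. A 401 like the
# one above typically means these do not match the domain or
# user created in keystone, or the password is wrong.
stack_user_domain_name = heat_user_domain
stack_domain_admin = heat_domain_admin
stack_domain_admin_password = <password>
```

Verifying the domain and user exist (e.g. `openstack domain show heat_user_domain` and `openstack user show heat_domain_admin --domain heat_user_domain`) and that the password matches is a quick way to confirm or rule this out.]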
Cheers Ignazio Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > ...more found in logs > > 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient > [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] > Domain admin client authentication failed: Unauthorized: The request you > have made requires authentication. (HTTP 401) (Request-ID: > req-28f9873a-5627-4f2e-9e19-da6c63753383) > > So what authentication it need? > > > On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca > wrote: > >> hi Ignazio. >> the wait condition failed too on your simple stack. >> Any other idea where to look at? >> >> Cheers >> >> [image: image.png] >> >> >> >> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca >> wrote: >> >>> thanks Ignazio. Appreciated >>> >>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Alfredo, attached herer there is an example: >>>> >>>> substitute the image with your image, flavor with your flavor, key_name >>>> with your key and network with your network >>>> >>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>> >>>>> Cheers >>>>> >>>>> >>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>> authorization how reported by Alfredo. >>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>> Cheers >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Sean. Thanks for that. >>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>> I will try also on IRC and I am looking >>>>>>> on internet a lot. 
>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>> works fine. >>>>>>> Cheers >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>> wrote: >>>>>>> >>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>> the way. >>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>> >>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>> simple heat stack example with wait conditions inside >>>>>>>> > for checking if heat wait conditions work fine. >>>>>>>> > Cheers >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 12:23:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 13:23:37 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Alfredo, try to source you openstack environment variable file: source admin-openrc Then try to execute the following commands heat stack-list openstack stack list Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > ...more found in logs > > 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient > [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] > Domain admin client authentication failed: Unauthorized: The request you > have made requires authentication. (HTTP 401) (Request-ID: > req-28f9873a-5627-4f2e-9e19-da6c63753383) > > So what authentication it need? > > > On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca > wrote: > >> hi Ignazio. >> the wait condition failed too on your simple stack. >> Any other idea where to look at? 
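[Editorial note: the "simple heat stack with a wait condition" being exercised in this thread can be sketched roughly as below. This is a hypothetical reconstruction, not the scrubbed wait.yml attachment; the parameter names follow Ignazio's advice to substitute your own image, flavor, key_name and network:

```yaml
heat_template_version: 2016-04-08

# Minimal wait-condition test: the server calls back via the
# handle's curl_cli; if heat/keystone auth is broken, the
# wait condition times out and the stack ends in CREATE_FAILED.

parameters:
  image: {type: string}
  flavor: {type: string}
  key_name: {type: string}
  network: {type: string}

resources:
  wait_handle:
    type: OS::Heat::WaitConditionHandle

  wait_condition:
    type: OS::Heat::WaitCondition
    properties:
      handle: {get_resource: wait_handle}
      timeout: 600

  server:
    type: OS::Nova::Server
    properties:
      image: {get_param: image}
      flavor: {get_param: flavor}
      key_name: {get_param: key_name}
      networks:
        - network: {get_param: network}
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/sh
            wc_notify --data-binary '{"status": "SUCCESS"}'
          params:
            wc_notify: {get_attr: [wait_handle, curl_cli]}
```

If a stack built from a template like this fails on the wait condition while plain instances and volumes work, that points at the heat stack-domain/keystone setup rather than at magnum itself.]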
>> >> Cheers >> >> [image: image.png] >> >> >> >> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca >> wrote: >> >>> thanks Ignazio. Appreciated >>> >>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Alfredo, attached herer there is an example: >>>> >>>> substitute the image with your image, flavor with your flavor, key_name >>>> with your key and network with your network >>>> >>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>> >>>>> Cheers >>>>> >>>>> >>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>> authorization how reported by Alfredo. >>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>> Cheers >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Sean. Thanks for that. >>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>> I will try also on IRC and I am looking >>>>>>> on internet a lot. >>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>> works fine. >>>>>>> Cheers >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>> wrote: >>>>>>> >>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>> the way. >>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>> >>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>> simple heat stack example with wait conditions inside >>>>>>>> > for checking if heat wait conditions work fine. 
>>>>>>>> > Cheers >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Wed Jan 9 13:04:56 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Wed, 9 Jan 2019 14:04:56 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: W dniu 09.01.2019 o 12:28, Jan Vondra pisze: > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > napsal: >> >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: >>> Dear Kolla team, >>> >>> during project for one of our customers we have upgraded debian part >>> of kolla project using a queens debian repositories >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and >>> we would like to share this work with community. >> >> Thanks for doing that. Is there an option to provide arm64 packages next >> time? > It's more of a question for OpenStack Debian Team - namely Thomas > Goirand who creates this repo. 'Buster' has Rocky now. I was told by Thomas that Stein will follow. >>> I would like to ask what's the proper process of contributing since >>> the patches affects both kolla and kolla-ansible repositories. >> >> Send patches for review [1] and then we can discuss about changing them. >> Remember that we target Stein now. >> >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing >> > > Thank you for help (and others with updated links) - I will upload > patches today or tomorrow. Thanks. Will review. >>> Also any other comments regarding debian in kolla would be appriciated. >> >> Love to see someone else caring about Debian in Kolla. I took it over >> two years ago, revived and moved to 'stretch'. But skipped support for >> binary packages as there were no up-to-date packages available. 
>> >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' >> as it will enter final freeze. Had some discussion with Debian OpenStack >> team about providing preliminary Stein packages so support for 'binary' >> type of images could be possible. > > I suppose that switch from Queens to Stein would be quite easy since > all packages in Queens in Debian are Python 3 only so the most of the > work has been already done. https://review.openstack.org/#/c/625298/ does most of Python 3 packages bring up for Ubuntu with Stein UCA and for Debian 'buster' (which itself is not yet [1] supported in Kolla). 1. https://review.openstack.org/#/c/612681/ From zigo at debian.org Wed Jan 9 13:05:32 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 9 Jan 2019 14:05:32 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> Hi, On 1/9/19 12:28 PM, Jan Vondra wrote: > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > napsal: >> >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: >>> Dear Kolla team, >>> >>> during project for one of our customers we have upgraded debian part >>> of kolla project using a queens debian repositories >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and >>> we would like to share this work with community. >> >> Thanks for doing that. Is there an option to provide arm64 packages next >> time? >> > > It's more of a question for OpenStack Debian Team - namely Thomas > Goirand who creates this repo. We do not produce arm64 backports for Stretch, because I don't have access to an arm64 instance to build packages. It would also be a lot of manual work, and I'm not sure I would have the time for it. However, I could help you setting that up, it's not very hard. Also, Debian official (ie: Sid, Buster) has arm64 packages, and it will be in Buster. 
>>> I would like to ask what's the proper process of contributing since >>> the patches affects both kolla and kolla-ansible repositories. >> >> Send patches for review [1] and then we can discuss about changing them. >> Remember that we target Stein now. >> >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing >> > > Thank you for help (and others with updated links) - I will upload > patches today or tomorrow. If you want to contribute to the Debian packaging, it's done in the Gitlab instance of Debian: https://salsa.debian.org/openstack-team Contributors are very much welcome! >>> Also any other comments regarding debian in kolla would be appriciated. >> >> Love to see someone else caring about Debian in Kolla. I took it over >> two years ago, revived and moved to 'stretch'. But skipped support for >> binary packages as there were no up-to-date packages available. >> >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' >> as it will enter final freeze. Had some discussion with Debian OpenStack >> team about providing preliminary Stein packages so support for 'binary' >> type of images could be possible. > > I suppose that switch from Queens to Stein would be quite easy since > all packages in Queens in Debian are Python 3 only so the most of the > work has been already done. Why not switching to Rocky right now, which has been the most tested? Cheers, Thomas Goirand (zigo) From brenski at mirantis.com Tue Jan 8 17:10:56 2019 From: brenski at mirantis.com (Boris Renski) Date: Tue, 8 Jan 2019 09:10:56 -0800 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift Message-ID: Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: - We have new look and feel at stackalytics.com - We did away with DriverLog and Member Directory , which were not very actively used or maintained. 
Those are still available via direct links, but not in the menu at the top - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complementary Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 11:10:33 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 12:10:33 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: hi Ignazio. the wait condition failed too on your simple stack. Any other idea where to look at? Cheers [image: image.png] On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca wrote: > thanks Ignazio. Appreciated > > On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano > wrote: > >> Alfredo, attached here there is an example: >> >> substitute the image with your image, flavor with your flavor, key_name >> with your key and network with your network >> >> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> thanks Ignazio. Do you have a quick example for that? pls... >>> >>> Cheers >>> >>> >>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Yes, I presume something goes wrong in heat wait condition >>>> authorization how reported by Alfredo. >>>> So I suggested to try a simple heat stack with a wait condition. >>>> Cheers >>>> >>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Sean. Thanks for that. >>>>> Do you have any idea about this error? @Ignazio Cassano >>>>> I will try also on IRC and I am looking >>>>> on internet a lot. >>>>> Next step also I will try to create a simple stack to see if it works >>>>> fine. 
>>>>> Cheers >>>>> >>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>>>> >>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>> the way. >>>>>> i keep clicking on it to get the context and realising i can help. >>>>>> >>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>> > Alfredo, you could make another test searching on internet as >>>>>> simple heat stack example with wait conditions inside >>>>>> > for checking if heat wait conditions work fine. >>>>>> > Cheers >>>>>> > >>>>>> >>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 40109 bytes Desc: not available URL: From jan.vondra at ultimum.io Wed Jan 9 13:28:38 2019 From: jan.vondra at ultimum.io (Jan Vondra) Date: Wed, 9 Jan 2019 14:28:38 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> References: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> Message-ID: st 9. 1. 2019 v 14:08 odesílatel Thomas Goirand napsal: > > Hi, > > On 1/9/19 12:28 PM, Jan Vondra wrote: > > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > > napsal: > >> > >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: > >>> Dear Kolla team, > >>> > >>> during project for one of our customers we have upgraded debian part > >>> of kolla project using a queens debian repositories > >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and > >>> we would like to share this work with community. > >> > >> Thanks for doing that. Is there an option to provide arm64 packages next > >> time? > >> > > > > It's more of a question for OpenStack Debian Team - namely Thomas > > Goirand who creates this repo. 
> > We do not produce arm64 backports for Stretch, because I don't have > access to an arm64 instance to build packages. It would also be a lot of > manual work, and I'm not sure I would have the time for it. However, I > could help you setting that up, it's not very hard. Also, Debian > official (ie: Sid, Buster) has arm64 packages, and it will be in Buster. > > >>> I would like to ask what's the proper process of contributing since > >>> the patches affects both kolla and kolla-ansible repositories. > >> > >> Send patches for review [1] and then we can discuss about changing them. > >> Remember that we target Stein now. > >> > >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > >> > > > > Thank you for help (and others with updated links) - I will upload > > patches today or tomorrow. > > If you want to contribute to the Debian packaging, it's done in the > Gitlab instance of Debian: > > https://salsa.debian.org/openstack-team > > Contributors are very much welcome! > Well, it's Michals (kevko) responsibility in our company :) > >>> Also any other comments regarding debian in kolla would be appriciated. > >> > >> Love to see someone else caring about Debian in Kolla. I took it over > >> two years ago, revived and moved to 'stretch'. But skipped support for > >> binary packages as there were no up-to-date packages available. > >> > >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' > >> as it will enter final freeze. Had some discussion with Debian OpenStack > >> team about providing preliminary Stein packages so support for 'binary' > >> type of images could be possible. > > > > I suppose that switch from Queens to Stein would be quite easy since > > all packages in Queens in Debian are Python 3 only so the most of the > > work has been already done. > > Why not switching to Rocky right now, which has been the most tested? The main point is that we have Queens deployed for customers and we have to support it. 
Still there is planned upgrade to Rocky in few months and probably to Stein sometime... Jan Vondra From alfredo.deluca at gmail.com Wed Jan 9 13:31:46 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 14:31:46 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. the CLI on stack and so on work just fine. root at aio1:~# os stack list +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ | 7a9a37d4-e2a6-4187-beae-5c5e03d66839 | freddy | f19b0141f7b240ed85f3cb02703a86a5 | CREATE_FAILED | 2019-01-09T11:49:02Z | None | +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ On Wed, Jan 9, 2019 at 1:23 PM Ignazio Cassano wrote: > Alfredo, try to source you openstack environment variable file: > > source admin-openrc > > Then try to execute the following commands > > heat stack-list > > openstack stack list > > > Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> ...more found in logs >> >> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >> Domain admin client authentication failed: Unauthorized: The request you >> have made requires authentication. (HTTP 401) (Request-ID: >> req-28f9873a-5627-4f2e-9e19-da6c63753383) >> >> So what authentication it need? >> >> >> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca >> wrote: >> >>> hi Ignazio. >>> the wait condition failed too on your simple stack. >>> Any other idea where to look at? 
>>> >>> Cheers >>> >>> [image: image.png] >>> >>> >>> >>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> thanks Ignazio. Appreciated >>>> >>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Alfredo, attached herer there is an example: >>>>> >>>>> substitute the image with your image, flavor with your flavor, >>>>> key_name with your key and network with your network >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>> >>>>>> Cheers >>>>>> >>>>>> >>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>> authorization how reported by Alfredo. >>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>> Cheers >>>>>>> >>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Sean. Thanks for that. >>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>> I will try also on IRC and I am >>>>>>>> looking on internet a lot. >>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>> works fine. >>>>>>>> Cheers >>>>>>>> >>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>> wrote: >>>>>>>> >>>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>>> the way. >>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>> >>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>> > for checking if heat wait conditions work fine. 
>>>>>>>>> > Cheers >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Alfredo* >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 13:32:46 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 14:32:46 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: I can create instances and volumes with no issue. it seems to be maybe keystone I think..... On Wed, Jan 9, 2019 at 1:18 PM Ignazio Cassano wrote: > My suggestion was only to verify that the problem is not magnum but > heat/keystone. > Heat stacks generated by magnum use wait conditions. > I am not so expert of heat, but I am sure it uses keystone. > So you have some issues or in keystone or in heat configuration. > Are you able to create instances/volumes on your openstack ? > If yes keystone can be ok. > > Probably Sean could help !!! > > Cheers > Ignazio > > > Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> ...more found in logs >> >> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >> Domain admin client authentication failed: Unauthorized: The request you >> have made requires authentication. (HTTP 401) (Request-ID: >> req-28f9873a-5627-4f2e-9e19-da6c63753383) >> >> So what authentication it need? >> >> >> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca >> wrote: >> >>> hi Ignazio. >>> the wait condition failed too on your simple stack. >>> Any other idea where to look at? >>> >>> Cheers >>> >>> [image: image.png] >>> >>> >>> >>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> thanks Ignazio. 
Appreciated >>>> >>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Alfredo, attached herer there is an example: >>>>> >>>>> substitute the image with your image, flavor with your flavor, >>>>> key_name with your key and network with your network >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>> >>>>>> Cheers >>>>>> >>>>>> >>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>> authorization how reported by Alfredo. >>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>> Cheers >>>>>>> >>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Sean. Thanks for that. >>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>> I will try also on IRC and I am >>>>>>>> looking on internet a lot. >>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>> works fine. >>>>>>>> Cheers >>>>>>>> >>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>> wrote: >>>>>>>> >>>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>>> the way. >>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>> >>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>> > for checking if heat wait conditions work fine. 
>>>>>>>>> > Cheers >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Alfredo* >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ifatafekn at gmail.com Wed Jan 9 13:37:24 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Wed, 9 Jan 2019 15:37:24 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core Message-ID: Hi, I would like to nominate Ivan Kolodyazhny for Vitrage core. Ivan has been contributing to Vitrage for a while now. He has focused on upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, and during this time gained a lot of knowledge and experience with the Vitrage code base. I believe he would make a great addition to our team. Thanks, Ifat. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 13:37:32 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 14:37:32 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: I think you need to verify your /etc/heat/heat.conf against the documentation for the openstack version you are using. Does a simple heat stack without wait conditions work fine? On Wed, 9 Jan 2019 at 14:31, Alfredo De Luca < alfredo.deluca at gmail.com> wrote: > Hi Ignazio. > the CLI on stack and so on work just fine.
> root at aio1:~# os stack list > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > | ID | Stack Name | Project > | Stack Status | Creation Time | Updated Time | > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > | 7a9a37d4-e2a6-4187-beae-5c5e03d66839 | freddy | > f19b0141f7b240ed85f3cb02703a86a5 | CREATE_FAILED | 2019-01-09T11:49:02Z | > None | > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > > > On Wed, Jan 9, 2019 at 1:23 PM Ignazio Cassano > wrote: > >> Alfredo, try to source you openstack environment variable file: >> >> source admin-openrc >> >> Then try to execute the following commands >> >> heat stack-list >> >> openstack stack list >> >> >> Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> ...more found in logs >>> >>> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >>> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >>> Domain admin client authentication failed: Unauthorized: The request you >>> have made requires authentication. (HTTP 401) (Request-ID: >>> req-28f9873a-5627-4f2e-9e19-da6c63753383) >>> >>> So what authentication it need? >>> >>> >>> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> hi Ignazio. >>>> the wait condition failed too on your simple stack. >>>> Any other idea where to look at? >>>> >>>> Cheers >>>> >>>> [image: image.png] >>>> >>>> >>>> >>>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>>> alfredo.deluca at gmail.com> wrote: >>>> >>>>> thanks Ignazio. 
Appreciated >>>>> >>>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Alfredo, attached herer there is an example: >>>>>> >>>>>> substitute the image with your image, flavor with your flavor, >>>>>> key_name with your key and network with your network >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>>> >>>>>>> Cheers >>>>>>> >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>>> authorization how reported by Alfredo. >>>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>>> Cheers >>>>>>>> >>>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Sean. Thanks for that. >>>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>>> I will try also on IRC and I am >>>>>>>>> looking on internet a lot. >>>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>>> works fine. >>>>>>>>> Cheers >>>>>>>>> >>>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> can we tag this conversation with [heat][magnum] in the subject >>>>>>>>>> by the way. >>>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>>> >>>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>>> > for checking if heat wait conditions work fine. 
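Since the attached example was scrubbed from the archive, a minimal HOT template of the kind being suggested — one server plus a wait condition — might look like the sketch below. Every property value is a placeholder to substitute, as Ignazio says, with your own image, flavor, key and network:

```yaml
# Illustrative minimal heat stack exercising a wait condition -- the
# same keystone-backed signalling path that magnum-generated stacks use.
# All property values below are placeholders.
heat_template_version: 2016-04-08

resources:
  wait_handle:
    type: OS::Heat::WaitConditionHandle

  wait_condition:
    type: OS::Heat::WaitCondition
    properties:
      handle: { get_resource: wait_handle }
      count: 1
      timeout: 600

  server:
    type: OS::Nova::Server
    properties:
      image: cirros          # placeholder
      flavor: m1.tiny        # placeholder
      key_name: mykey        # placeholder
      networks:
        - network: private   # placeholder
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/sh
            # signal success back to heat from inside the instance
            wc_notify --data-binary '{"status": "SUCCESS"}'
          params:
            wc_notify: { get_attr: [wait_handle, curl_cli] }

outputs:
  result:
    value: { get_attr: [wait_condition, data] }
```

If this stack also ends in CREATE_FAILED on the wait condition, the problem is in heat/keystone configuration rather than in magnum.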
>>>>>>>>>> > Cheers >>>>>>>>>> > >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> *Alfredo* >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eyalb1 at gmail.com Wed Jan 9 13:49:51 2019 From: eyalb1 at gmail.com (Eyal B) Date: Wed, 9 Jan 2019 15:49:51 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core In-Reply-To: References: Message-ID: +1 On Wed, Jan 9, 2019, 15:42 Ifat Afek Hi, > > > I would like to nominate Ivan Kolodyazhny for Vitrage core. > > Ivan has been contributing to Vitrage for a while now. He has focused on > upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, > and during this time gained a lot of knowledge and experience with Vitrage > code base. I believe he would make a great addition to our team. > > > Thanks, > > Ifat. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed Jan 9 13:52:41 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 9 Jan 2019 13:52:41 +0000 Subject: [Kolla] Queens for debian images In-Reply-To: <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> Message-ID: <20190109135241.a6mfpgupedylgfws@yuggoth.org> On 2019-01-09 09:57:06 +0100 (+0100), Marcin Juszkiewicz wrote: > W dniu 08.01.2019 o 15:00, Jeremy Stanley pisze: > >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > > [...] > > > > These days it's probably better to recommend > > https://docs.openstack.org/contributors/ since I expect we're about > > ready to retire that old wiki page. > > Then I hope that someone will take care of SEO and redirects. 
Link I > gave was first link from "openstack contributing" google search. Yes, I believe Kendall's plan there is to replace all (or most of) the content in that article with a link to the contributor guide. We just wanted to be sure everything it mentions is covered in that newer and more durable document before replacing it. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Wed Jan 9 14:21:46 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 09:21:46 -0500 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: Ignazio Cassano writes: > Hello, > last week I talked on #openstack-containers IRC about important patches for > magnum reported here: > https://review.openstack.org/#/c/577477/ > > I'd like to know when the above will be backported on queens and if centos7 > and ubuntu packages > will be upgraded with them. > Any roadmap ? > I would go on with magnum testing on queens because I am going to upgrade > from ocata to pike and from pike to queens. > > At this time I have aproduction environment on ocata and a testing > environment on queens. > > Best Regards > Ignazio You can submit those backports yourself, either through the gerrit web UI or by manually creating the patches locally using git commands. There are more details on processes and tools for doing this in the stable maintenance section of the project team guide [1]. As far as when those changes might end up in packages, the community doesn't really have much insight into (or influence over) what stable patches are pulled down by the distributors or how they schedule their updates and releases. So I recommend talking to the folks who prepare the distribution(s) you're interested in, after the backport patches are approved. 
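As a concrete sketch of the manual route Doug describes, the snippet below exercises the cherry-pick mechanics in a throwaway repository. Everything here is a stand-in: a real backport works against the project's gerrit remote and finishes with `git review stable/queens` instead of a scratch repo.

```shell
# Demo of the cherry-pick mechanics behind a manual stable-branch
# backport (scratch repo; branch and commit names are made up).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo
echo v1 > feature.txt
git add feature.txt
git commit -qm "initial commit"
git branch stable/queens            # pretend stable branch
echo fix >> feature.txt
git commit -qam "fix bug on master"
fix_sha=$(git rev-parse HEAD)
# the actual backport step: -x records the original commit SHA
# in the backported commit's message, as stable policy expects
git checkout -q stable/queens
git cherry-pick -x "$fix_sha"
git log -1 --format=%B
# a real workflow would now run: git review stable/queens
```

The `-x` flag is the important part for stable maintenance: reviewers can see exactly which master commit a backport came from.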
[1] https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes -- Doug From juliaashleykreger at gmail.com Wed Jan 9 14:47:58 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 9 Jan 2019 06:47:58 -0800 Subject: Ironic ibmc driver for Huawei server In-Reply-To: References: Message-ID: Ironic does not have a deadline for merging specs. We will generally avoid landing large features the closer we get to the end of the cycle. If third party CI is up before the end of the cycle, I suspect it would just be a matter of iterating the driver code through review. You may wish to propose it sooner rather than later, and we can begin to give you feedback from there. -Julia On Tue, Jan 8, 2019 at 11:21 PM xmufive at qq.com wrote: > > Hi Julia, > > When is the deadline for approving specs? I am afraid that the huawei ibmc spec will be put off until the next release. > > Thanks > Qianbiao NG > > > ------------------ Original Message ------------------ > From: "Julia Kreger"; > Sent: Wednesday, 9 January 2019, 02:26 > To: "xmufive at qq.com"; > Cc: "openstack-discuss"; > Subject: Re: Ironic ibmc driver for Huawei server > > Greetings Qianbiao.NG, > > Welcome to Ironic! > > The purpose and requirement of Third Party CI is to test that drivers are > in working order with the current state of the code in Ironic and help > prevent the community from accidentally breaking an in-tree vendor > driver. Vendors do this by providing one or more physical systems in a > pool of hardware that is managed by a Zuul v3 or Jenkins installation > which installs ironic (typically in a virtual machine), and configures > it to perform a deployment upon the physical bare metal node. Upon > failure or successful completion of the test, the results are posted > back to OpenStack Gerrit. > > Ultimately this helps provide the community and the vendor with a > level of assurance in what is released by the ironic community.
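For shape only, the "results posted back to OpenStack Gerrit" step described above is typically wired up as a Zuul v3 pipeline along these lines. This is a sketch, not a working config: the openstack-gerrit connection name and the reporting label are invented for illustration, and a real deployment needs the connection defined in zuul.conf plus job definitions.

```yaml
# Illustrative-only Zuul v3 pipeline for third-party CI: watch the
# upstream gerrit for new patchsets and report the vendor CI verdict
# back as a review comment. "openstack-gerrit" is a hypothetical
# connection name defined elsewhere in the Zuul configuration.
- pipeline:
    name: third-party-check
    description: Run vendor hardware jobs against proposed ironic changes.
    manager: independent
    trigger:
      openstack-gerrit:
        - event: patchset-created
    success:
      openstack-gerrit:
        Verified: 1
    failure:
      openstack-gerrit:
        Verified: -1
```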
The > cinder project has a similar policy and I'll email you directly with > the contacts at Huawei that work with the Cinder community, as they > would be familiar with many of the aspects of operating third party > CI. > > You can find additional information here on the requirement and the > reasoning behind it: > > https://specs.openstack.org/openstack/ironic-specs/specs/approved/third-party-ci.html > > We may also be able to put you in touch with some vendors that have > recently worked on implementing third-party CI. I'm presently > inquiring with others if that will be possible. If you are able to > join Internet Relay Chat, our IRC channel (#openstack-ironic) has > several individual who have experience setting up and maintaining > third-party CI for ironic. > > Thanks, > > -Julia > > On Tue, Jan 8, 2019 at 8:54 AM xmufive at qq.com wrote: > > > > Hi julia, > > > > According to the comment of story, > > 1. The spec for huawei ibmc drvier has been post here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. > > 2. About the third-party CI part, we provide mocked unittests for our driver's code. Not sure what third-party CI works for in this case. What else we should do? > > > > Thanks > > Qianbiao.NG From dh3 at sanger.ac.uk Wed Jan 9 15:13:29 2019 From: dh3 at sanger.ac.uk (Dave Holland) Date: Wed, 9 Jan 2019 15:13:29 +0000 Subject: [cinder] volume encryption performance impact Message-ID: <20190109151329.GA7953@sanger.ac.uk> Hello, I've just started investigating Cinder volume encryption using Queens (RHOSP13) with a Ceph/RBD backend and the performance overhead is... surprising. 
Some naive bonnie++ numbers, comparing a plain vs encrypted volume: plain: write 1400MB/s, read 390MB/s encrypted: write 81MB/s, read 83MB/s The encryption was configured with: openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 Does anyone have a similar setup, and can share their performance figures, or give me an idea of what percentage performance impact I should expect? Alternatively: is AES256 overkill, or, where should I start looking for a misconfiguration or bottleneck? Thanks in advance. Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From Arkady.Kanevsky at dell.com Wed Jan 9 15:20:15 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 9 Jan 2019 15:20:15 +0000 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: References: Message-ID: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> Thanks Boris. Do we still use DriverLog for marketplace driver status updates? Thanks, Arkady From: Boris Renski Sent: Tuesday, January 8, 2019 11:11 AM To: openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; David Stoltenberg Subject: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). 
Brief summary of updates: * We have a new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the menu at the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary. Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Wed Jan 9 15:53:04 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Wed, 9 Jan 2019 17:53:04 +0200 Subject: [cinder] Proposing new Core Members ... In-Reply-To: <20190108223535.GA29520@sm-workstation> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> <20190108223535.GA29520@sm-workstation> Message-ID: +1! Welcome to the team, guys! Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Wed, Jan 9, 2019 at 12:36 AM Sean McGinnis wrote: > On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: > > Team, > > > > I would like to propose two people who have been taking a more active role > in > > Cinder reviews as Core Team Members: > > > > > > > > I think that both Rajat and Yikun will be welcome additions to help > replace > > the cores that have recently been removed. > > > > +1 from me. Both have been doing a good job giving constructive feedback on > reviews and have been spending some time reviewing code other than their > own > direct interests, so I think they would be welcome additions. > > Sean > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From aschultz at redhat.com Wed Jan 9 16:02:57 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 9 Jan 2019 09:02:57 -0700 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: <20190109061109.GA24618@fedora19.localdomain> References: <20190109061109.GA24618@fedora19.localdomain> Message-ID: On Tue, Jan 8, 2019 at 11:15 PM Ian Wienand wrote: > > Hello, > > Just a heads-up; with Fedora 29 the legacy networking setup was moved > into a separate, not-installed-by-default network-scripts package. > This has prompted us to finally move to managing interfaces on our > Fedora and CentOS CI hosts with NetworkManager (see [1]) > > Support for this is enabled with features added in glean 1.13.0 and > diskimage-builder 1.19.0. > > The newly created Fedora 29 nodes [2] will have it enabled, and [3] > will switch CentOS nodes shortly. This is tested by our nodepool jobs > which build images, upload them into devstack and boot them, and then > check the networking [4]. > I don't suppose we could try this with tripleo jobs prior to cutting them all over, could we? We don't use NetworkManager and, in fact, os-net-config doesn't currently support NetworkManager. I don't think it'll cause problems, but I'd like to have some testing prior to cutting them all over. Thanks, -Alex > I don't really expect any problems, but be aware NetworkManager > packages will appear on the CentOS 7 and Fedora base images with these > changes.
> > Thanks > > -i > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1643763#c2 > [2] https://review.openstack.org/618672 > [3] https://review.openstack.org/619960 > [4] https://review.openstack.org/618671 > From mthode at mthode.org Wed Jan 9 16:54:35 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 9 Jan 2019 10:54:35 -0600 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: <20190109165435.jpmcgxmktabjplps@mthode.org> On 19-01-09 15:13:29, Dave Holland wrote: > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. Some naive bonnie++ numbers, comparing a plain vs encrypted > volume: > > plain: write 1400MB/s, read 390MB/s > encrypted: write 81MB/s, read 83MB/s > > The encryption was configured with: > > openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 > > Does anyone have a similar setup, and can share their performance > figures, or give me an idea of what percentage performance impact I > should expect? Alternatively: is AES256 overkill, or, where should I > start looking for a misconfiguration or bottleneck? > I haven't tested yet, but that doesn't sound right; it sounds like it's not using aes-ni (or the AMD equivalent). 256 may be higher than is needed (256-bit aes has some attacks that 128 does not, iirc) but it shouldn't drop perf that much unless it's dropping back to software. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed...
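The software-fallback theory above is cheap to check. A rough sketch, assuming a Linux host with cryptsetup available (the benchmark runs purely in userspace, no volume or root access needed); with front-end control location the host doing the encryption is the compute/hypervisor side:

```shell
# Two quick checks for whether hardware AES is available.

# 1. Does the CPU advertise the AES instruction set at all?
if grep -q -w aes /proc/cpuinfo; then
    echo "CPU advertises AES instructions"
else
    echo "no AES flag found: AES will run in pure software"
fi

# 2. What can the crypto layer sustain for the cipher and key size
#    configured for the volume type (aes-xts, 256 bit)? The '|| true'
#    keeps this a soft check in case cryptsetup is not installed.
cryptsetup benchmark --cipher aes-xts --key-size 256 || true
```

If the benchmark reports aes-xts throughput in the GB/s range while the guest still sees ~80MB/s, the bottleneck is more likely in the qemu/LUKS data path than in missing AES acceleration.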
From balazs.gibizer at ericsson.com Wed Jan 9 16:56:03 2019 From: balazs.gibizer at ericsson.com (Balázs Gibizer) Date: Wed, 9 Jan 2019 16:56:03 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1547029853.1128.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> Message-ID: <1547052955.1128.1@smtp.office365.com> On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer wrote: > > > On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer > wrote: >> >> >>> But, let's chat more about it via a hangout the week after next >>> (week >>> of January 14 when Matt is back), as suggested in #openstack-nova >>> today. We'll be able to have a high-bandwidth discussion then and >>> agree on a decision on how to move forward with this. >> >> Thank you all for the discussion. I agree to have a real-time >> discussion about the way forward. >> >> Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a >> hangout[2]? > It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for the team. So I'm moving the call there. Cheers, gibi [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000 From mark at stackhpc.com Wed Jan 9 17:08:47 2019 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 9 Jan 2019 17:08:47 +0000 Subject: [kayobe] IRC meetings In-Reply-To: References: Message-ID: Thanks to everyone who replied. There was a tie, so let's go with every other Monday at 14:00 UTC. First meeting will be Monday 21st January in #openstack-kayobe.
Mark On Thu, 29 Nov 2018 at 13:07, Mark Goddard wrote: > Hi, > > The community has requested that we start holding regular IRC meetings for > kayobe, and I agree. I suggest we start with meeting every other week, in > the #openstack-kayobe channel. > > I've created a Doodle poll [1] with each hour between 2pm and 6pm UTC > available every weekday. Please respond on the poll with your availability. > > Thanks, > Mark > > [1] https://doodle.com/poll/6di3pddsahg6h66k > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arne.Wiebalck at cern.ch Wed Jan 9 17:45:25 2019 From: Arne.Wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 9 Jan 2019 17:45:25 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: Hi Dave, With the same key length and backend, we’ve done some quick checks at the time, but did not notice any significant performance impact (beyond a slight CPU increase). We did not test beyond the QoS limits we apply, though. Cheers, Arne > On 9 Jan 2019, at 16:13, Dave Holland wrote: > > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. Some naive bonnie++ numbers, comparing a plain vs encrypted > volume: > > plain: write 1400MB/s, read 390MB/s > encrypted: write 81MB/s, read 83MB/s > > The encryption was configured with: > > openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 > > Does anyone have a similar setup, and can share their performance > figures, or give me an idea of what percentage performance impact I > should expect? Alternatively: is AES256 overkill, or, where should I > start looking for a misconfiguration or bottleneck? 
> > Thanks in advance. > > Dave > -- > ** Dave Holland ** Systems Support -- Informatics Systems Group ** > ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** > > > -- > The Wellcome Sanger Institute is operated by Genome Research > Limited, a charity registered in England with number 1021457 and a > company registered in England with number 2742969, whose registered > office is 215 Euston Road, London, NW1 2BE. > -- Arne Wiebalck CERN IT From pierre at stackhpc.com Wed Jan 9 18:21:00 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 9 Jan 2019 18:21:00 +0000 Subject: [blazar] Nominating Tetsuro Nakamura for blazar-core Message-ID: Hello, I would like to nominate Tetsuro Nakamura for membership in the blazar-core team. Tetsuro started contributing to Blazar last summer. He has been contributing great code for integrating Blazar with placement and participating actively in the project. He is also providing good feedback to the rest of the contributors via code review, including on code not related to placement. He would make a great addition to the core team. Unless there are objections, I will add him to the core team in a week's time. Pierre From tbechtold at suse.com Wed Jan 9 19:40:12 2019 From: tbechtold at suse.com (Thomas Bechtold) Date: Wed, 9 Jan 2019 20:40:12 +0100 Subject: [rpm-packaging] Proposing new core member Message-ID: Hi, I would like to nominate Colleen Murphy for rpm-packaging core. Colleen has been active doing very valuable reviews for some time, so I feel she would be a great addition to the team. Please give your +1/-1 in the next few days.
Cheers, Tom From doug at doughellmann.com Wed Jan 9 19:57:09 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 14:57:09 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: Doug Hellmann writes: > Doug Hellmann writes: > >> Today devstack requires each project to explicitly indicate that it can >> be installed under python 3, even when devstack itself is running with >> python 3 enabled. >> >> As part of the python3-first goal, I have proposed a change to devstack >> to modify that behavior [1]. With the change in place, when devstack >> runs with python3 enabled all services are installed under python 3, >> unless explicitly listed as not supporting python 3. >> >> If your project has a devstack plugin or runs integration or functional >> test jobs that use devstack, please test your project with the patch >> (you can submit a trivial change to your project and use Depends-On to >> pull in the devstack change). >> >> [1] https://review.openstack.org/#/c/622415/ >> -- >> Doug >> > > We have had a few +1 votes on the patch above with comments that > indicate at least a couple of projects have taken the time to test and > verify that things won't break for them with the change. > > Are we ready to proceed with merging the change? > > -- > Doug > The patch mentioned above that changes the default version of Python in devstack to 3 by default has merged. If this triggers a failure in your devstack-based jobs, you can use the disable_python3_package function to add your package to the list *not* installed using Python 3 until the fixes are available. 
-- Doug From skaplons at redhat.com Wed Jan 9 20:22:26 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 9 Jan 2019 21:22:26 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: <1A260E8F-8EC1-4701-BC10-759457F174DA@redhat.com> Hi, Just to be sure, does it mean that we don’t need to add USE_PYTHON3=True to run job on python3, right? — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Doug Hellmann w dniu 09.01.2019, o godz. 20:57: > > Doug Hellmann writes: > >> Doug Hellmann writes: >> >>> Today devstack requires each project to explicitly indicate that it can >>> be installed under python 3, even when devstack itself is running with >>> python 3 enabled. >>> >>> As part of the python3-first goal, I have proposed a change to devstack >>> to modify that behavior [1]. With the change in place, when devstack >>> runs with python3 enabled all services are installed under python 3, >>> unless explicitly listed as not supporting python 3. >>> >>> If your project has a devstack plugin or runs integration or functional >>> test jobs that use devstack, please test your project with the patch >>> (you can submit a trivial change to your project and use Depends-On to >>> pull in the devstack change). >>> >>> [1] https://review.openstack.org/#/c/622415/ >>> -- >>> Doug >>> >> >> We have had a few +1 votes on the patch above with comments that >> indicate at least a couple of projects have taken the time to test and >> verify that things won't break for them with the change. >> >> Are we ready to proceed with merging the change? >> >> -- >> Doug >> > > The patch mentioned above that changes the default version of Python in > devstack to 3 by default has merged. 
If this triggers a failure in your > devstack-based jobs, you can use the disable_python3_package function to > add your package to the list *not* installed using Python 3 until the > fixes are available. > > -- > Doug > From ltoscano at redhat.com Wed Jan 9 20:45:47 2019 From: ltoscano at redhat.com (Luigi Toscano) Date: Wed, 09 Jan 2019 21:45:47 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: > Doug Hellmann writes: > > Doug Hellmann writes: > >> Today devstack requires each project to explicitly indicate that it can > >> be installed under python 3, even when devstack itself is running with > >> python 3 enabled. > >> > >> As part of the python3-first goal, I have proposed a change to devstack > >> to modify that behavior [1]. With the change in place, when devstack > >> runs with python3 enabled all services are installed under python 3, > >> unless explicitly listed as not supporting python 3. > >> > >> If your project has a devstack plugin or runs integration or functional > >> test jobs that use devstack, please test your project with the patch > >> (you can submit a trivial change to your project and use Depends-On to > >> pull in the devstack change). > >> > >> [1] https://review.openstack.org/#/c/622415/ > > > > We have had a few +1 votes on the patch above with comments that > > indicate at least a couple of projects have taken the time to test and > > verify that things won't break for them with the change. > > > > Are we ready to proceed with merging the change? > > The patch mentioned above that changes the default version of Python in > devstack to 3 by default has merged. 
If this triggers a failure in your > devstack-based jobs, you can use the disable_python3_package function to > add your package to the list *not* installed using Python 3 until the > fixes are available. Isn't the purpose of the patch to make sure that all services are installed using Python 3 when Python 3 is enabled, but that we still need to set USE_PYTHON3=True? Ciao -- Luigi From doug at doughellmann.com Wed Jan 9 20:55:29 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 15:55:29 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> References: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> Message-ID: Luigi Toscano writes: > On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: >> Doug Hellmann writes: >> > Doug Hellmann writes: >> >> Today devstack requires each project to explicitly indicate that it can >> >> be installed under python 3, even when devstack itself is running with >> >> python 3 enabled. >> >> >> >> As part of the python3-first goal, I have proposed a change to devstack >> >> to modify that behavior [1]. With the change in place, when devstack >> >> runs with python3 enabled all services are installed under python 3, >> >> unless explicitly listed as not supporting python 3. >> >> >> >> If your project has a devstack plugin or runs integration or functional >> >> test jobs that use devstack, please test your project with the patch >> >> (you can submit a trivial change to your project and use Depends-On to >> >> pull in the devstack change). >> >> >> >> [1] https://review.openstack.org/#/c/622415/ >> > >> > We have had a few +1 votes on the patch above with comments that >> > indicate at least a couple of projects have taken the time to test and >> > verify that things won't break for them with the change. >> > >> > Are we ready to proceed with merging the change? 
>> >> The patch mentioned above that changes the default version of Python in >> devstack to 3 by default has merged. If this triggers a failure in your >> devstack-based jobs, you can use the disable_python3_package function to >> add your package to the list *not* installed using Python 3 until the >> fixes are available. > > Isn't the purpose of the patch to make sure that all services are installed > using Python 3 when Python 3 is enabled, but that we still need to set > USE_PYTHON3=True? > > Ciao > -- > Luigi > > > It is still necessary to set USE_PYTHON3=True in the job. The logic enabled by that flag used to *also* require each service to individually enable python 3 testing. Now that is no longer true, and services must explicitly *disable* python 3 if it is not supported. The USE_PYTHON3 flag allows us to have 2 jobs, with devstack running under python 2 and python 3. When we drop python 2 support, we can drop the USE_PYTHON3 flag from devstack and always run under python 3. (We could do that before we drop support for 2, but we would have to modify a lot of job configurations and I'm not sure it buys us much given the amount of effort involved there.) -- Doug From skaplons at redhat.com Wed Jan 9 21:11:29 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 9 Jan 2019 22:11:29 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> Message-ID: <9DF55F17-EE56-4237-874E-70DE05A1722A@redhat.com> Hi, Thx for clarification Doug. — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Doug Hellmann w dniu 09.01.2019, o godz. 
21:55: > > Luigi Toscano writes: > >> On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: >>> Doug Hellmann writes: >>>> Doug Hellmann writes: >>>>> Today devstack requires each project to explicitly indicate that it can >>>>> be installed under python 3, even when devstack itself is running with >>>>> python 3 enabled. >>>>> >>>>> As part of the python3-first goal, I have proposed a change to devstack >>>>> to modify that behavior [1]. With the change in place, when devstack >>>>> runs with python3 enabled all services are installed under python 3, >>>>> unless explicitly listed as not supporting python 3. >>>>> >>>>> If your project has a devstack plugin or runs integration or functional >>>>> test jobs that use devstack, please test your project with the patch >>>>> (you can submit a trivial change to your project and use Depends-On to >>>>> pull in the devstack change). >>>>> >>>>> [1] https://review.openstack.org/#/c/622415/ >>>> >>>> We have had a few +1 votes on the patch above with comments that >>>> indicate at least a couple of projects have taken the time to test and >>>> verify that things won't break for them with the change. >>>> >>>> Are we ready to proceed with merging the change? >>> >>> The patch mentioned above that changes the default version of Python in >>> devstack to 3 by default has merged. If this triggers a failure in your >>> devstack-based jobs, you can use the disable_python3_package function to >>> add your package to the list *not* installed using Python 3 until the >>> fixes are available. >> >> Isn't the purpose of the patch to make sure that all services are installed >> using Python 3 when Python 3 is enabled, but that we still need to set >> USE_PYTHON3=True? >> >> Ciao >> -- >> Luigi >> >> >> > > It is still necessary to set USE_PYTHON3=True in the job. The logic > enabled by that flag used to *also* require each service to individually > enable python 3 testing. 
Now that is no longer true, and services must > explicitly *disable* python 3 if it is not supported. > > The USE_PYTHON3 flag allows us to have 2 jobs, with devstack running > under python 2 and python 3. When we drop python 2 support, we can drop > the USE_PYTHON3 flag from devstack and always run under python 3. (We > could do that before we drop support for 2, but we would have to modify > a lot of job configurations and I'm not sure it buys us much given the > amount of effort involved there.) > > -- > Doug > From dirk at dmllr.de Wed Jan 9 22:03:41 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 9 Jan 2019 23:03:41 +0100 Subject: [rpm-packaging] Proposing new core member In-Reply-To: References: Message-ID: Am Mi., 9. Jan. 2019 um 20:41 Uhr schrieb Thomas Bechtold : > Please give your +1/-1 in the next days. +1, well said, happy to have her increase our (too) small core reviewer team! Greetings, Dirk From miguel at mlavalle.com Wed Jan 9 22:07:50 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Wed, 9 Jan 2019 16:07:50 -0600 Subject: [openstack-dev] [neutron] Cancelling the L3 meeting on January 10th Message-ID: Dear Neutron team, Several members of the L3 sub-team are attending internal company meetings, have medical appointments or have other personal commitments at the time of the weekly meeting on January 10th. As a consequence, we are cancelling this meeting and will resume on the 17th. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Wed Jan 9 23:11:55 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 9 Jan 2019 17:11:55 -0600 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> Message-ID: On 12/20/18 4:41 AM, Herve Beraud wrote: > > > Le jeu. 20 déc. 
2018 à 09:26, Nguyen Hung, Phuong > > a écrit : > > Hi Ben, > > I am apology that in last month we do not have much time maintaining > the code. > > > but if no one's going to use it then I'd rather cut our > > losses than continue pouring time into it. > > I agree, we will wait for the community to decide the need for the > feature. > In the near future, we do not have ability to maintain the code. If > anyone > has interest to continue maintaining the patch, we will support with > document, > reviewing... in our possibility. > > > I can help you to maintain the code if needed. > > Personaly I doesn't need this feature so I agree Ben and Doug point of view. > > We need to measure how many this feature is useful and if it make sense > to support and maintain more code in the future related to this feature > without any usages behind that. We discussed this again in the Oslo meeting this week, and to share with the wider audience here's what I propose: Since the team that initially proposed the feature and that we expected to help maintain it are no longer able to do so, and it's not clear to the Oslo team that there is sufficient demand for a rather complex feature like this, I suggest that we either WIP or abandon the current patch series. Gerrit never forgets, so if at some point there are contributors (new or old) who have a vested interest in the feature we can always resurrect it. If you have any thoughts about this plan please let me know. Otherwise I will act on it sometime in the near-ish future. In the meantime, if anyone is desperate for Oslo work to do here are a few things that have been lingering on my todo list: * We have a unit test in oslo.utils (test_excutils) that is still using mox. That needs to be migrated to mock. * oslo.cookiecutter has a number of things that are out of date (doc layout, lack of reno, coverage job). 
Since it's unlikely we've reached peak Oslo library we should update that so there aren't a bunch of post-creation changes needed like there were with oslo.upgradecheck (and I'm guessing oslo.limit). * The config validator still needs support for dynamic groups, if oslo.config is your thing. * There are 326 bugs open across Oslo projects. Help wanted. :-) Thanks. -Ben From iwienand at redhat.com Wed Jan 9 23:26:24 2019 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 10 Jan 2019 10:26:24 +1100 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: References: <20190109061109.GA24618@fedora19.localdomain> Message-ID: <20190109232624.GB24618@fedora19.localdomain> On Wed, Jan 09, 2019 at 09:02:57AM -0700, Alex Schultz wrote: > Don't suppose we could try this with tripleo jobs prior to cutting > them all over could we? We don't use NetworkManager and infact > os-net-config doesn't currently support NetworkManager. I don't think > it'll cause problems, but I'd like to have some test prior to cutting > them all over. It is possible to stage this in by creating a new NetworkManager enabled node-type. I've proposed that in [1] but it's only useful if you want to then follow-up with setting up testing jobs to use the new node-type. We can then revert and apply the change to regular nodes. By just switching directly in [2], we can quite quickly revert if there should be an issue. We can immediately delete the new image, revert the config change and then worry about fixing it. Staging it is the conservative approach and more work all round but obviously safer; hoping for the best with the escape hatch is probably my preferred option given the low risk. I've WIP'd both reviews so just let us know in there your thoughts. 
Thanks, -i [1] https://review.openstack.org/629680 [2] https://review.openstack.org/619960 From aschultz at redhat.com Wed Jan 9 23:34:08 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 9 Jan 2019 16:34:08 -0700 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: <20190109232624.GB24618@fedora19.localdomain> References: <20190109061109.GA24618@fedora19.localdomain> <20190109232624.GB24618@fedora19.localdomain> Message-ID: On Wed, Jan 9, 2019 at 4:26 PM Ian Wienand wrote: > > On Wed, Jan 09, 2019 at 09:02:57AM -0700, Alex Schultz wrote: > > Don't suppose we could try this with tripleo jobs prior to cutting > > them all over could we? We don't use NetworkManager and infact > > os-net-config doesn't currently support NetworkManager. I don't think > > it'll cause problems, but I'd like to have some test prior to cutting > > them all over. > > It is possible to stage this in by creating a new NetworkManager > enabled node-type. I've proposed that in [1] but it's only useful if > you want to then follow-up with setting up testing jobs to use the new > node-type. We can then revert and apply the change to regular nodes. > > By just switching directly in [2], we can quite quickly revert if > there should be an issue. We can immediately delete the new image, > revert the config change and then worry about fixing it. > > Staging it is the conservative approach and more work all round but > obviously safer; hoping for the best with the escape hatch is probably > my preferred option given the low risk. I've WIP'd both reviews so > just let us know in there your thoughts. > For us to test I think we just need https://review.openstack.org/#/c/629685/ once the node pool change goes in. Then the jobs on that change will be the NetworkManager version. I would really prefer testing this way than possibly having to revert after breaking a bunch of in flight patches. 
I'll defer to others if they think it's OK to just land it and revert as needed. Thanks, -Alex > Thanks, > > -i > > [1] https://review.openstack.org/629680 > [2] https://review.openstack.org/619960 From iwienand at redhat.com Thu Jan 10 00:43:06 2019 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 10 Jan 2019 11:43:06 +1100 Subject: [infra] Updating fedora-latest nodeset to Fedora 29 Message-ID: <20190110004306.GA995@fedora19.localdomain> Hi, Just a heads up that we're soon switching "fedora-latest" nodes from Fedora 28 to Fedora 29 [1] (setting up this switch took a bit longer than usual, see [2]). Presumably if you're using "fedora-latest" you want the latest Fedora, so this should not be unexpected :) But this is the first time we're making this transition with the "-latest" nodeset, so please report any issues. Thanks, -i [1] https://review.openstack.org/618673 [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001530.html From dangtrinhnt at gmail.com Thu Jan 10 01:24:46 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 10 Jan 2019 10:24:46 +0900 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core Message-ID: Hello team, I would like to nominate Thuy Dang for Searchlight core. He has been leading the effort to clarify our vision and working on some blueprints to make Searchlight a multi-cloud application. I believe Thuy will be a great resource for our team. Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From tuantuluong at gmail.com Thu Jan 10 02:07:10 2019 From: tuantuluong at gmail.com (=?UTF-8?B?bMawxqFuZyBo4buvdSB0deG6pW4=?=) Date: Thu, 10 Jan 2019 10:07:10 +0800 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core In-Reply-To: References: Message-ID: +1 from me :) On Thursday, January 10, 2019, Trinh Nguyen wrote: > Hello team, > > I would like to nominate Thuy Dang for > Searchlight core. 
He has been leading the effort to clarify our vision and > working on some blueprints to make Searchlight a multi-cloud application. I > believe Thuy will be a great resource for our team. > > Bests, > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phuongnh at vn.fujitsu.com Thu Jan 10 03:20:16 2019 From: phuongnh at vn.fujitsu.com (Nguyen Hung, Phuong) Date: Thu, 10 Jan 2019 03:20:16 +0000 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> Message-ID: <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Hi Ben, > I suggest that we either WIP or abandon the current > patch series. ... > If you have any thoughts about this plan please let me know. Otherwise I > will act on it sometime in the near-ish future. Thanks for your consideration. I am agree with you, please help me to abandon them because I am not privileged with those patches. Regards, Phuong. -----Original Message----- From: Ben Nemec [mailto:openstack at nemebean.com] Sent: Thursday, January 10, 2019 6:12 AM To: Herve Beraud; Nguyen, Hung Phuong Cc: openstack-discuss at lists.openstack.org Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams On 12/20/18 4:41 AM, Herve Beraud wrote: > > > Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong > > a écrit : > > Hi Ben, > > I am apology that in last month we do not have much time maintaining > the code. > > > but if no one's going to use it then I'd rather cut our > > losses than continue pouring time into it. > > I agree, we will wait for the community to decide the need for the > feature. > In the near future, we do not have ability to maintain the code. If > anyone > has interest to continue maintaining the patch, we will support with > document, > reviewing... in our possibility. 
> > > I can help you to maintain the code if needed. > > Personaly I doesn't need this feature so I agree Ben and Doug point of view. > > We need to measure how many this feature is useful and if it make sense > to support and maintain more code in the future related to this feature > without any usages behind that. We discussed this again in the Oslo meeting this week, and to share with the wider audience here's what I propose: Since the team that initially proposed the feature and that we expected to help maintain it are no longer able to do so, and it's not clear to the Oslo team that there is sufficient demand for a rather complex feature like this, I suggest that we either WIP or abandon the current patch series. Gerrit never forgets, so if at some point there are contributors (new or old) who have a vested interest in the feature we can always resurrect it. If you have any thoughts about this plan please let me know. Otherwise I will act on it sometime in the near-ish future. In the meantime, if anyone is desperate for Oslo work to do here are a few things that have been lingering on my todo list: * We have a unit test in oslo.utils (test_excutils) that is still using mox. That needs to be migrated to mock. * oslo.cookiecutter has a number of things that are out of date (doc layout, lack of reno, coverage job). Since it's unlikely we've reached peak Oslo library we should update that so there aren't a bunch of post-creation changes needed like there were with oslo.upgradecheck (and I'm guessing oslo.limit). * The config validator still needs support for dynamic groups, if oslo.config is your thing. * There are 326 bugs open across Oslo projects. Help wanted. :-) Thanks. 
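On the mox item in the list above: the conversion to unittest.mock is usually mechanical. A hedged sketch with an invented function under test (the real work is in oslo.utils' test_excutils, which this does not reproduce):

```python
import unittest
from unittest import mock


def fetch_upper(client, key):
    """Toy code under test -- a stand-in, not actual oslo.utils code."""
    return client.get(key).upper()


class TestFetchUpper(unittest.TestCase):
    # The mox flavor of this test would stub the collaborator, record
    # expectations, then flip into replay mode:
    #   client = self.mox.CreateMockAnything()
    #   client.get('k').AndReturn('v')
    #   self.mox.ReplayAll()
    #   ...call the code...
    #   self.mox.VerifyAll()
    def test_fetch_upper(self):
        # mock needs no record/replay split: configure, call, assert.
        client = mock.Mock()
        client.get.return_value = 'v'
        self.assertEqual('V', fetch_upper(client, 'k'))
        client.get.assert_called_once_with('k')
```

The record/replay/verify ceremony collapses into plain configuration and explicit assertions, which is most of what these migrations amount to.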
-Ben From melwittt at gmail.com Thu Jan 10 03:51:24 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 9 Jan 2019 19:51:24 -0800 Subject: [nova][dev] spec freeze is today/tomorrow Jan 10 Message-ID: <810a0ef0-a943-bc03-7c24-17ceaa6bd241@gmail.com> Hey all, Spec freeze is today/tomorrow, depending on your time zone. We've been tracking specs that are close to approval here: https://etherpad.openstack.org/p/nova-stein-blueprint-spec-freeze Thanks everyone for jumping in and getting so much review done ahead of s-2! Please take one more look and let's get those final approvals done before spec freeze at EOD Jan 10. Best, -melanie From singh.surya64mnnit at gmail.com Thu Jan 10 04:45:49 2019 From: singh.surya64mnnit at gmail.com (Surya Singh) Date: Thu, 10 Jan 2019 10:15:49 +0530 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hi Boris, Great to see the facelift of Stackalytics. It looks really good. I have a query: contributor names are no longer listed under their company affiliation. Before the facelift, Stackalytics showed the affiliation correctly whether or not I had an entry in https://github.com/openstack/stackalytics/blob/master/etc/default_data.json. I have now pushed a patch for my own entry, https://review.openstack.org/629150, but my colleague Vishal Manchanda is also shown as an independent contributor rather than an NEC contributor, even though his entry is already in etc/default_data.json. It would be great if you could check this. --- Thanks Surya On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote: > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on the stackalytics openstack project). Brief summary > of updates: > > - We have a new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory, which > were not very actively used or maintained.
Those are still available via > direct links, but not in the menu on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible via the top menu. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback. > > -Boris > > From gmann at ghanshyammann.com Thu Jan 10 06:17:46 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 15:17:46 +0900 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > Dear All, > > > > There are many occasions when we want to prioritize some of the patches, > > whether to unblock the gates or to block non-freeze > > patches during RC. > > > > So adding Review-Priority will allow a more precise dashboard. The > > Designate and Cinder projects are already using this[1][2], and after > > discussion with Jeremy I brought this to the ML to interact with those teams > > before landing [3], as there is a possibility that reapplying the priority vote > > following any substantive update to a change could make it more cumbersome > > than it is worth. > > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when This idea looks great and helpful, especially for blockers and cycle-priority patches to get regular review bandwidth from core or active members of that project.
IMO only positive votes are appropriate for this label. -1 is a little confusing, for several reasons: what is the difference between Review-Priority -1 and Code-Review -2? Review-Priority -1 reads as lower priority than 0/unlabelled (explicitly marking a patch as very low priority). After looking at the Cinder dashboard, I learned that -1 is used to block changes for procedural or technical reasons, but that can already be done with -2 on the Code-Review label. Keeping the Review-Priority label only for prioritization makes it clearer, which is just to say: allow only positive votes for this label. Personally, I prefer a single vote set with just +1 to convey that these changes are review priorities, but having multiple positive values as per project need/interest is all fine. -gmann > a patchset is updated, the -1 and +2 carry forward. But for some reason we > can't get the +1 to be sticky. > > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me. > > The one thing I have been slightly concerned with is the process around using > these priority votes. It hasn't been an issue, but I could see a scenario where > one core (in Cinder we have it set up so all cores can use the priority voting) > has set something like a procedural -1, then been pulled away or is absent for > an extended period. Like a Workflow -2, another core cannot override that vote. > So until that person is back to remove the -1, that patch would not be able to > be merged.
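As an aside for readers who have not wired up such a label: this behavior is configured in the project's project.config in Gerrit. The stanza below is a sketch, not Cinder's exact configuration — the value descriptions are invented, and the copy rules are a guess at why -1 and +2 survive new patch sets while +1 does not (copyMinScore/copyMaxScore only carry the lowest and highest possible scores forward):

```ini
# refs/meta/config -> project.config (illustrative values)
[label "Review-Priority"]
    function = NoBlock
    value = -1 Procedural hold
    value = 0 No priority
    value = +1 Priority change
    value = +2 Gate blocker
    # Carry the extreme scores (-1 and +2) to new patch sets;
    # a +1 is neither extreme, so it must be re-applied.
    copyMinScore = true
    copyMaxScore = true
```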
> > Granted, we've lived with this with Workflow -2's for years and it's never been > a major issue, but I think as far as centralizing control, it may make sense to > have a separate smaller group (just the PTL, or PTL and a few "deputies") that > are able to set priorities on patches just to make sure the folks setting it > are the ones that are actively tracking what the priorities are for the > project. > > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean > > From muroi.masahito at lab.ntt.co.jp Thu Jan 10 06:40:26 2019 From: muroi.masahito at lab.ntt.co.jp (Masahito MUROI) Date: Thu, 10 Jan 2019 15:40:26 +0900 Subject: [blazar] Nominating Tetsuro Nakamura for blazar-core In-Reply-To: References: Message-ID: <6357fd01-edee-c287-0269-1f8c51386471@lab.ntt.co.jp> +1 Tetsuro is doing a great contributing to Blazar :) best regards, Masahito On 2019/01/10 3:21, Pierre Riteau wrote: > Hello, > > I would like to nominate Tetsuro Nakamura for membership in the > blazar-core team. > > Tetsuro started contributing to Blazar last summer. He has been > contributing great code for integrating Blazar with placement and > participating actively in the project. He is also providing good > feedback to the rest of the contributors via code review, including on > code not related to placement. He would make a great addition to the > core team. > > Unless there are objections, I will add him to the core team in a week's time. > > Pierre > > > From lujinluo at gmail.com Thu Jan 10 06:47:54 2019 From: lujinluo at gmail.com (Lujin Luo) Date: Wed, 9 Jan 2019 22:47:54 -0800 Subject: [neutron] [upgrade] No meeting on Jan. 10th Message-ID: Hi everyone, Due to some personal reasons, I cannot hold the meeting tomorrow. 
Let's resume next week. Thanks, Lujin From gmann at ghanshyammann.com Thu Jan 10 06:48:47 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 15:48:47 +0900 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <16836853a7f.f7a92ce550692.4268288898180642317@ghanshyammann.com> ---- On Wed, 02 Jan 2019 20:18:40 +0900 Dmitry Tantsur wrote ---- > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 (!) > voting devstack jobs. With each of them having some small probability of a random > failure, it is impossible to land anything without at least one recheck, usually > more. > > The bad news is, we only run the master API tests job, and these tests are changed > more often than the others. We already had a minor stable branch breakage because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just > spotted a missing master multinode job, which is defined but does not run for > some reason :( > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1. Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. +1. We have the same policy in Tempest also[1]. You mean not running CI for unsupported/EM branches on master, right? CI on unsupported/EM branches can keep running as long as the jobs pass or the EM maintainers want to run them. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus > remove them from the gate queue). I have a gut feeling that a change that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to > have N-2 voting.
IMO, running all supported stable branches as voting makes more sense than making the oldest one (N-3, as you mentioned) non-voting. That way the tempest plugin keeps being maintained to run on N-3; otherwise it is likely to break for that branch, especially in the case of feature-discovery-based tests. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. > > The corresponding change is [3], please comment here or there. > > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 > > [1] https://docs.openstack.org/tempest/latest/stable_branch_support_policy.html -gmann From gmann at ghanshyammann.com Thu Jan 10 07:16:28 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 16:16:28 +0900 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> Message-ID: <168369e9149.f7aa3daf51154.4025722229812805971@ghanshyammann.com>
With each of them have some small probability of a > >> random > >> failure, it is impossible to land anything without at least one recheck, > >> usually > >> more. > >> > >> The bad news is, we only run master API tests job, and these tests are > >> changed > >> more often that the other. We already had a minor stable branch breakage > >> because > >> of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And > >> I've just > >> spotted a missing master multinode job, which is defined but does not > >> run for > >> some reason :( Yeah, that is because ironic multinode's parent job "tempest-multinode-full" is restricted to run only on master. It was done that way until we had all multinode zuulv3 things backported till pike which is completed already. I am making this job for pike onwards [1] so that multinode job can be run on stable branches also. > >> > >> Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > >> > >> 1. Do not run CI jobs at all for unsupported branches and branches in extended > >> maintenance. For Ocata this has already been done in [2]. > >> > >> 2. Make jobs running with N-3 (currently Pike) and older non-voting (and > >> thus > >> remove them from the gate queue). I have a gut feeling that a change > >> that breaks > >> N-3 is very likely to break N-2 (currently Queens) as well, so it's > >> enough to > >> have N-2 voting. > >> > >> 3. Make the discovery and the multinode jobs from all stable branches > >> non-voting. These jobs cover the tests that get changed very infrequently (if > >> ever). These are also the jobs with the highest random failure rate. > > > > Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. > > From my experience it's PXE failing or just generic timeout on slow nodes. Note > that they still don't fail too often, it's their total number that makes it > problematic. 
When you have 20 jobs each failing at, say, a 5% rate, it's only about a 36% > chance of passing (unless I cannot do math). > > But to answer your question, yes, we do put work into that. We just never got to > 0% random failures. While making the multinode job run for stable branches, I got consistent failures on the multinode job for Pike and Queens, which runs fine on Rocky. The failures are on migration tests due to a hostname mismatch. I have not debugged the failure yet, but we will be making multinode runnable on stable branches as well. [1] https://review.openstack.org/#/c/610938/ [2] https://review.openstack.org/#/q/topic:tempest-multinode-slow-stable+(status:open+OR+status:merged) -gmann > > > > >> > >> 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > >> proposed above). > >> > >> This should leave us with 20 jobs, but with only 11 of them voting. Which is > >> still a lot, but probably manageable. > >> > >> The corresponding change is [3], please comment here or there. > >> > >> Dmitry > >> > >> [1] https://review.openstack.org/622177 > >> [2] https://review.openstack.org/621537 > >> [3] https://review.openstack.org/627955 > >> > > > > > From gmann at ghanshyammann.com Thu Jan 10 07:33:40 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 16:33:40 +0900 Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints In-Reply-To: References: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> Message-ID: <16836ae526c.f599c98651471.2262588979872936985@ghanshyammann.com> ---- On Thu, 03 Jan 2019 07:29:27 +0900 Jeremy Freudberg wrote ---- > Hey Luigi. 
> > I poked around in Tempest and saw these code bits: > https://github.com/openstack/tempest/blob/master/tempest/lib/common/rest_client.py#L210 > https://github.com/openstack/tempest/blob/f9650269a32800fdcb873ff63f366b7bc914b3d7/tempest/lib/auth.py#L53 > > Here's a patch which takes advantage of those bits to append the > version to the unversioned base URL: > https://review.openstack.org/#/c/628056/ > > Hope it works without regression (I'm a bit worried since Tempest does > its own URL mangling rather than nicely use keystoneauth...) Yeah, that is the code where a service client can tell Tempest which API version the tests need to use. This was a kind of hack we did in Tempest for differently versioned APIs behind the same service endpoint. The other way to test different API versions in Tempest is via catalog_type: you can define different jobs running the same tests against the different versioned endpoints configured in Tempest's catalog_type config option. But if you want to run the tests for both versions in the same job, then api_version is the way to do that. > > On Wed, Jan 2, 2019 at 5:19 AM Luigi Toscano wrote: > > > > Hi all, > > > > I'm working on adding support for APIv2 to the Sahara tempest plugin. > > > > If I understand correctly, there are two main steps: > > > > 1) Make sure that the tempest client works with APIv2 (and doesn't regress with > > APIv1.1). > > > > This mainly means implementing the tempest client for Sahara APIv2, which > > should not be too complicated. > > > > On the other hand, we hit an issue with the v1.1 client in an APIv2 > > environment. 
> > A change associated with API v2 is the usage of an unversioned endpoint for the > > deployment (see https://review.openstack.org/#/c/622330/ , without the /v1.1/$ > > (tenant_id) suffix), which should magically work with both API variants, but it > > seems that the current tempest client fails in this case: > > > > http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544 > > > > Does anyone know if this is an issue with the code of the tempest tests (which > > should maybe have some logic to build the expected endpoint when it's > > unversioned, like saharaclient does) or somewhere else? > > > > > > 2) Fix the tests to support APIv2. > > > > Should I duplicate the tests for APIv1.1 and APIv2? Other projects which > > support different APIs seem to do this. > > But can I freely move the existing tests under a subdirectory > > (sahara_tempest_plugins/tests/api/ -> sahara_tempest_plugins/tests/api/v1/), > > or are there any compatibility concerns? Are the test IDs enough to ensure that > > everything works as before? It depends on the compatibility and state of versions v1.1 and v2. If both are supposed to be compatible, at least feature-wise, then you should not duplicate the tests; instead, run the same set of tests against both versions, either in the same job or in different jobs. We do that for nova, cinder, image etc., where we run the same set of tests against 1. compute v2.0 and v2.1, 2. volume v2 and v3. We test those in different jobs by defining different catalog_type (versioned endpoints). Duplicating the tests has two drawbacks: 1. maintenance, 2. it is easy to lose coverage of a specific version. -gmann > > > > And what about CLI tests currently under sahara_tempest_plugin/tests/cli/ ? > > They support both API versions through a configuration flag. Should they be > > duplicated as well? > > > > > > Ciao > > (and happy new year if you have a new one in your calendar!) 
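The unversioned-endpoint idea discussed in this thread can be sketched roughly as follows. This is a hypothetical illustration of the approach (a client appending the API version it targets to an unversioned catalog endpoint), not the actual Tempest or saharaclient code; the host and port are made up:

```python
# Hypothetical sketch: with an unversioned endpoint in the service
# catalog, each client appends the version path segment it targets,
# so one catalog entry can serve both v1.1 and v2 clients.

def versioned_endpoint(base_url, api_version):
    """Join an unversioned endpoint with an API version segment."""
    return base_url.rstrip('/') + '/' + api_version.strip('/')

print(versioned_endpoint('http://controller:8386/', 'v2'))
# → http://controller:8386/v2
print(versioned_endpoint('http://controller:8386', 'v1.1'))
# → http://controller:8386/v1.1
```

The point of centralizing this join in one helper is that the catalog can stay version-free while each client (or test class) decides which version suffix to use.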
> > -- > > Luigi > > > > > > > > From ghcks1000 at gmail.com Thu Jan 10 07:51:25 2019 From: ghcks1000 at gmail.com (=?utf-8?Q?=EC=9D=B4=ED=98=B8=EC=B0=AC?=) Date: Thu, 10 Jan 2019 16:51:25 +0900 Subject: [dev][Tacker] Implementing Multisite VNFFG Message-ID: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Dear Tacker folks, Hello, I'm interested in implementing multisite VNFFG in the Tacker project. As far as I know, a single Tacker controller can currently manage multiple OpenStack sites (multisite VIM), but it can create a VNFFG only within a single site, so it can't create a VNFFG spanning multiple sites. I think that if multisite VNFFG were possible, Tacker would have more flexibility in managing VNFs and VNFFGs. In current Tacker, the networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct a service chain. So I think extending the current single-site port chaining to multiple sites could be one solution. Is there any development in progress on multisite VNFFG in the Tacker project? If not, I wonder whether Tacker would be interested in this feature. I would like to develop this feature for the Tacker project if I can. Yours sincerely, Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jan 10 08:28:45 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 Jan 2019 09:28:45 +0100 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: Hello Doug, sorry but I am not so expert of gerrit and how community process for patching works. I saw the https://review.openstack.org/#/c/577477/ page but I cannot understand if those patches are approved and backported on stable queens. Please, help me to understand.... For example: I cloned the stable/queens magnum branch, the file magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh is different from the same file I downloaded from cherry-picks, so I presume the patch is not merged in the branch yet. 
I presume the link you sent me ( https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes) is for developers....that's right ? Thanks ans sorry for my poor skill Ignazio Il giorno mer 9 gen 2019 alle ore 15:20 Doug Hellmann ha scritto: > Ignazio Cassano writes: > > > Hello, > > last week I talked on #openstack-containers IRC about important patches > for > > magnum reported here: > > https://review.openstack.org/#/c/577477/ > > > > I'd like to know when the above will be backported on queens and if > centos7 > > and ubuntu packages > > will be upgraded with them. > > Any roadmap ? > > I would go on with magnum testing on queens because I am going to upgrade > > from ocata to pike and from pike to queens. > > > > At this time I have aproduction environment on ocata and a testing > > environment on queens. > > > > Best Regards > > Ignazio > > You can submit those backports yourself, either through the gerrit web > UI or by manually creating the patches locally using git commands. There > are more details on processes and tools for doing this in the stable > maintenance section of the project team guide [1]. > > As far as when those changes might end up in packages, the community > doesn't really have much insight into (or influence over) what stable > patches are pulled down by the distributors or how they schedule their > updates and releases. So I recommend talking to the folks who prepare > the distribution(s) you're interested in, after the backport patches are > approved. > > [1] > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > > -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From artem.goncharov at gmail.com Thu Jan 10 08:45:29 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Thu, 10 Jan 2019 09:45:29 +0100 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hi, I can reproduce the issue - Stackalytics stopped showing my affiliation correctly (user: gtema; my entry in default_data.json is present). Regards, Artem On Thu, Jan 10, 2019 at 5:48 AM Surya Singh wrote: > Hi Boris > > Great to see the new facelift of Stackalytics. It's really good. > > I have a query: contributor names are not being listed with the correct company > affiliation. > Before the facelift, Stackalytics was showing this correctly whether or not I had an > entry in > https://github.com/openstack/stackalytics/blob/master/etc/default_data.json. > I have now pushed a patch for this, > https://review.openstack.org/629150, but another issue is that my > colleague Vishal Manchanda is also showing as an independent contributor > rather than an NEC contributor, even though his entry is already in > etc/default_data.json. > > It would be great if you could check this. > > --- > Thanks > Surya > > > On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote: > >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on the stackalytics openstack project). Brief >> summary of updates: >> >> - >> >> We have a new look and feel at stackalytics.com >> - >> >> We did away with DriverLog >> and Member Directory, which >> were not very actively used or maintained. Those are still available via >> direct links, but not in the menu at the top >> - >> >> BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated >> project commits via a separate subsection accessible via the top menu. Before, >> this was all bunched up in Project Type -> Complimentary >> >> Happy to hear comments or feedback. >> >> -Boris >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ifatafekn at gmail.com Thu Jan 10 09:30:43 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Thu, 10 Jan 2019 11:30:43 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core In-Reply-To: References: Message-ID: Ivan, welcome to the team :-) On Wed, Jan 9, 2019 at 3:50 PM Eyal B wrote: > +1 > > On Wed, Jan 9, 2019, 15:42 Ifat Afek >> Hi, >> >> >> I would like to nominate Ivan Kolodyazhny for Vitrage core. >> >> Ivan has been contributing to Vitrage for a while now. He has focused on >> upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, >> and during this time gained a lot of knowledge and experience with Vitrage >> code base. I believe he would make a great addition to our team. >> >> >> Thanks, >> >> Ifat. >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jpena at redhat.com Thu Jan 10 10:10:31 2019 From: jpena at redhat.com (Javier Pena) Date: Thu, 10 Jan 2019 05:10:31 -0500 (EST) Subject: [rpm-packaging] Proposing new core member In-Reply-To: References: Message-ID: <1523156859.67815456.1547115031073.JavaMail.zimbra@redhat.com> ----- Original Message ----- > Am Mi., 9. Jan. 2019 um 20:41 Uhr schrieb Thomas Bechtold > : > > > Please give your +1/-1 in the next days. > > +1, well said, happy to have her increase our (too) small core reviewer team! > +1, welcome to the core team! Regards, Javier > Greetings, > Dirk > > From jakub.sliva at ultimum.io Thu Jan 10 11:27:06 2019 From: jakub.sliva at ultimum.io (=?UTF-8?B?SmFrdWIgU2zDrXZh?=) Date: Thu, 10 Jan 2019 12:27:06 +0100 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation Message-ID: Hello, our company created a little plugin to Horizon and we would like to share it with the community in a bit more official way. So I created change request (https://review.openstack.org/#/c/619235/) in order to create official repository under project Telemetry. 
However, the PTL recommended that I put this new repository under OpenStack without any project - i.e. make it unofficial. I have also discussed this with the Horizon team during their meeting (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) and now I am a bit stuck because I do not know how to proceed next. Could you please advise me? All opinions are appreciated. Jakub Sliva Ultimum Technologies s.r.o. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic http://ultimum.io From paul.bourke at oracle.com Thu Jan 10 11:38:34 2019 From: paul.bourke at oracle.com (Paul Bourke) Date: Thu, 10 Jan 2019 11:38:34 +0000 Subject: [kolla] Stepping down from core Message-ID: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Hi all, Due to a change of direction for me I'll be stepping down from the Kolla core group. It's been a blast, thanks to everyone I've worked/interacted with over the past few years. Thanks in particular to Eduardo who's done a stellar job of PTL since taking the reins. I hope we'll cross paths again in the future :) All the best! -Paul From dabarren at gmail.com Thu Jan 10 12:07:47 2019 From: dabarren at gmail.com (Eduardo Gonzalez) Date: Thu, 10 Jan 2019 13:07:47 +0100 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: Hi Paul, So sad to see you leaving the team; your work over these years has been critical to making Kolla as great as it is. Thank you for this amazing work. I wish you the best on your new projects and hope our paths cross again some day. Feel free to rejoin the team any time if your job responsibilities allow, or if you have the time, as an independent contributor ;) Again, thank you for your work; I wish you the best in the future. Regards El jue., 10 ene. 
2019 a las 12:43, Paul Bourke () escribió: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) > > All the best! > -Paul > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Thu Jan 10 12:21:56 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 10 Jan 2019 13:21:56 +0100 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: W dniu 10.01.2019 o 12:38, Paul Bourke pisze: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) Sad to see you leaving but such is life. Have fun with whatever else you will be doing. And thanks for all that help I got from you during my Kolla work. From fungi at yuggoth.org Thu Jan 10 12:42:15 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 10 Jan 2019 12:42:15 +0000 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: <20190110124214.bdxiry37z7oymenx@yuggoth.org> On 2019-01-10 09:28:45 +0100 (+0100), Ignazio Cassano wrote: > Hello Doug, sorry but I am not so expert of gerrit and how community > process for patching works. 
The Code and Documentation volume of the OpenStack Contributor Guide has chapters on the Git and Gerrit workflows our community uses: https://docs.openstack.org/contributors/code-and-documentation/ > I saw the https://review.openstack.org/#/c/577477/ page but I cannot > understand if those patches are approved and backported on stable queens. > Please, help me to understand.... Typically, we propose backports under a common Change-Id to the master branch change. Here you can see that backports to stable/rocky and stable/queens were proposed Monday by Bharat Kunwar: https://review.openstack.org/#/q/Ife5558f1db4e581b64cc4a8ffead151f7b405702 The stable/queens backport is well on its way to approval; it's passing CI jobs (the Verified +1 from Zuul) and already has one of the customary two stable branch core reviews (the Code-Review +2 vote from Spyros Trigazis), so I expect it's well on its way to approval. > For example: I cloned the stable/queens magnum branch, the file > magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh > is different from the same file I downloaded from cherry-picks, so > I presume the patch is not merged in the branch yet. The stable/queens backport looks like it still needs some work, as evidenced by the Verified -1 vote from Zuul. It's currently failing CI jobs openstack-tox-pep8 (coding style validation) and magnum-functional-k8s (a Kubernetes functional testsuite for Magnum). The names of those jobs in the Gerrit webUI lead to detailed build logs, which can be used to identify and iterate on solutions to get them passing for that change. > I presume the link you sent me ( > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > ) is for developers....that's right ? It's for anyone in the community who wants to help. "Developer" is just a reference to someone performing an activity, not a qualification. > Thanks ans sorry for my poor skill [...] Please don't apologize. 
Skills are just something we learn, nobody is born knowing any of this. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Jan 10 13:05:57 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 10 Jan 2019 13:05:57 +0000 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation In-Reply-To: References: Message-ID: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote: > our company created a little plugin to Horizon and we would like to > share it with the community in a bit more official way. So I created > change request (https://review.openstack.org/#/c/619235/) in order to > create official repository under project Telemetry. However, PTL > recommended me to put this new repository under OpenStack without any > project - i.e. make it unofficial. > > I have also discussed this with Horizon team during their meeting > (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) > and now I am bit stuck because I do not know how to proceed next. > Could you, please, advise me? It looks like much of this confusion stemmed from recommendation by project-config-core reviewers, unfortunately. We too often see people from official teams in OpenStack request new Git repositories for work their team will be performing, but who forget to also record them in the appropriate governance lists. As a result, if a proposed repository looks closely-related to the work of an existing team (in this case possibly either Horizon or Telemetry) we usually assume this was the case and recommend during the review process that they file a corresponding change to the OpenStack TC's governance repository. 
Given this is an independent group's work for which neither the Horizon nor Telemetry teams have expressed an interest in adopting responsibility, it's perfectly acceptable to have it operate as an unofficial project or to apply for status as another official project team within OpenStack. The main differences between the two options are that contributors to official OpenStack project teams gain the ability to vote in Technical Committee elections, their repositories can publish documentation on the https://docs.openstack.org/ Web site, they're able to reserve space for team-specific discussions and working sessions at OSF Project Teams Gathering meetings (such as the one coming up in Denver immediately following the Open Infrastructure Summit)... but official project teams are also expected to hold team lead elections twice a year, participate in OpenStack release processes, follow up on implementing cycle goals, and otherwise meet the requirements laid out in our https://governance.openstack.org/tc/reference/new-projects-requirements.html document. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Thu Jan 10 13:28:26 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 08:28:26 -0500 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Message-ID: "Nguyen Hung, Phuong" writes: > Hi Ben, > >> I suggest that we either WIP or abandon the current >> patch series. > ... >> If you have any thoughts about this plan please let me know. Otherwise I >> will act on it sometime in the near-ish future. > > Thanks for your consideration. 
I agree with you; please help me abandon them, because I do not have permission on those patches. > > Regards, > Phuong. +1 for abandoning them, at least for now. As Ben points out, gerrit will still have copies. Doug > > -----Original Message----- > From: Ben Nemec [mailto:openstack at nemebean.com] > Sent: Thursday, January 10, 2019 6:12 AM > To: Herve Beraud; Nguyen, Hung Phuong > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams > > > > On 12/20/18 4:41 AM, Herve Beraud wrote: >> >> >> Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong >> > a écrit : >> >> Hi Ben, >> >> I apologize that last month we did not have much time to maintain >> the code. >> >> > but if no one's going to use it then I'd rather cut our >> > losses than continue pouring time into it. >> >> I agree; we will wait for the community to decide the need for the >> feature. >> In the near future, we do not have the ability to maintain the code. If >> anyone >> is interested in continuing to maintain the patch, we will support them with >> documentation, >> reviewing... as much as we can. >> >> >> I can help you to maintain the code if needed. >> >> Personally I don't need this feature, so I agree with Ben's and Doug's point of view. >> >> We need to measure how useful this feature is and whether it makes sense >> to support and maintain more code related to this feature in the future >> without any usage behind it. > We discussed this again in the Oslo meeting this week, and to share with > the wider audience here's what I propose: > > Since the team that initially proposed the feature and that we expected > to help maintain it are no longer able to do so, and it's not clear to > the Oslo team that there is sufficient demand for a rather complex > feature like this, I suggest that we either WIP or abandon the current > patch series. 
Gerrit never forgets, so if at some point there are > contributors (new or old) who have a vested interest in the feature we > can always resurrect it. > > If you have any thoughts about this plan please let me know. Otherwise I > will act on it sometime in the near-ish future. > > In the meantime, if anyone is desperate for Oslo work to do here are a > few things that have been lingering on my todo list: > > * We have a unit test in oslo.utils (test_excutils) that is still using > mox. That needs to be migrated to mock. > * oslo.cookiecutter has a number of things that are out of date (doc > layout, lack of reno, coverage job). Since it's unlikely we've reached > peak Oslo library we should update that so there aren't a bunch of > post-creation changes needed like there were with oslo.upgradecheck (and > I'm guessing oslo.limit). > * The config validator still needs support for dynamic groups, if > oslo.config is your thing. > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > Thanks. > > -Ben > -- Doug From ignaziocassano at gmail.com Thu Jan 10 13:31:44 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 Jan 2019 14:31:44 +0100 Subject: queens [magnum] patches In-Reply-To: <20190110124214.bdxiry37z7oymenx@yuggoth.org> References: <20190110124214.bdxiry37z7oymenx@yuggoth.org> Message-ID: Many thanks, Jeremy. Ignazio Il giorno gio 10 gen 2019 alle ore 13:45 Jeremy Stanley ha scritto: > On 2019-01-10 09:28:45 +0100 (+0100), Ignazio Cassano wrote: > > Hello Doug, sorry but I am not so expert of gerrit and how community > > process for patching works. > > The Code and Documentation volume of the OpenStack Contributor Guide > has chapters on the Git and Gerrit workflows our community uses: > > https://docs.openstack.org/contributors/code-and-documentation/ > > > I saw the https://review.openstack.org/#/c/577477/ page but I cannot > > understand if those patches are approved and backported on stable queens. 
> > Please, help me to understand.... > > Typically, we propose backports under a common Change-Id to the > master branch change. Here you can see that backports to > stable/rocky and stable/queens were proposed Monday by Bharat Kunwar: > > https://review.openstack.org/#/q/Ife5558f1db4e581b64cc4a8ffead151f7b405702 > > The stable/queens backport is well on its way to approval; it's passing > CI jobs (the Verified +1 from Zuul) and already has one of the > customary two stable branch core reviews (the Code-Review +2 vote > from Spyros Trigazis), so I expect it's well on its way to approval. > > > For example: I cloned the stable/queens magnum branch, the file > > > magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh > > is different from the same file I downloaded from cherry-picks, so > > I presume the patch is not merged in the branch yet. > > The stable/queens backport looks like it still needs some work, as > evidenced by the Verified -1 vote from Zuul. It's currently failing > CI jobs openstack-tox-pep8 (coding style validation) and > magnum-functional-k8s (a Kubernetes functional testsuite for > Magnum). The names of those jobs in the Gerrit webUI lead to > detailed build logs, which can be used to identify and iterate on > solutions to get them passing for that change. > > > I presume the link you sent me ( > > > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > > ) is for developers....that's right ? > > It's for anyone in the community who wants to help. "Developer" is > just a reference to someone performing an activity, not a > qualification. > > > Thanks ans sorry for my poor skill > [...] > > Please don't apologize. Skills are just something we learn, nobody > is born knowing any of this. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From doug at doughellmann.com Thu Jan 10 13:32:59 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 08:32:59 -0500 Subject: Review-Priority for Project Repos In-Reply-To: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> Message-ID: Ghanshyam Mann writes: > ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > > Dear All, > > > > > > There are many occasions when we want to prioritize some patches, > > > whether to unblock the gates or to block non-freeze > > > patches during RC. > > > > > > So adding Review-Priority will allow a more precise dashboard. The > > > Designate and Cinder projects are already experiencing this [1][2], and after > > > discussion with Jeremy I brought this to the ML to interact with these teams > > > before landing it [3], as there is a possibility that reapplying the priority vote > > > following any substantive update to a change could make it more cumbersome > > > than it is worth. > > > > With Cinder this is fairly new, but I think it is working well so far. The > > oddity we've run into, that I think you're referring to here, is how those > > votes carry forward with updates. > > > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > This idea looks great and helpful, especially for blockers and cycle priority patches to get regular > review bandwidth from core or active members of that project. > > IMO only positive votes are appropriate for this label. -1 is a little confusing for many reasons; for example, > what is the difference between Review-Priority -1 and Code-Review -2? Review-Priority -1 means > a patch is lower priority than 0/not labelled (explicitly marking a patch as very low priority). 
> > After seeing the Cinder dashboard, I learned that -1 is used to block changes for procedural > or technical reasons. But that can be done with -2 on the Code-Review label. Keeping the Review-Priority label > only for setting priority makes it clearer, which amounts to allowing only positive votes for this label. > Personally, I prefer a single vote value, +1, to convey that these are the changes that are > priorities for review, but having multiple positive vote values per project need/interest is all fine. > > -gmann Given the complexity of our review process already, if this new aspect is going to spread it would be really nice if we could try to agree on a standard way to apply it. Not only would that let someone build a dashboard for cross-project priorities, but it would mean contributors wouldn't need to learn different rules for interacting with each of our teams. Doug From ildiko.vancsa at gmail.com Thu Jan 10 13:41:26 2019 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Thu, 10 Jan 2019 14:41:26 +0100 Subject: [edge] Use cases mapping to MVP architectures - FEEDBACK NEEDED Message-ID: Hi, We are reaching out to you about the use cases for edge cloud infrastructure that the Edge Computing Group is collecting. They are recorded in our wiki [1] and they describe high-level scenarios where an edge cloud infrastructure would be needed. During the second Denver PTG discussions we drafted two MVP architectures that we could build from the current functionality of OpenStack with some slight modifications [2]. These are based on the work of James and his team from Oath. We differentiate between distributed [3] and centralized [4] control plane architecture scenarios. In one of the Berlin Forum sessions we were asked to map the MVP architecture scenarios to the use cases, so I made an initial mapping and now we are looking for feedback. This mapping only means that the listed use case can be implemented using the MVP architecture scenarios. 
It should be noted, that none of the MVP architecture scenarios provide solution for edge cloud infrastructure upgrade or centralized management. Please comment on the wiki or in a reply to this mail in case you have questions or disagree with the initial mapping we put together. Please let us know if you have any questions. Here is the use cases and the mapped architecture scenarios: Mobile service provider 5G/4G virtual RAN deployment and Edge Cloud B2B2X [5] Both distributed [3] and centralized [4] Universal customer premise equipment (uCPE) for Enterprise Network Services[6] Both distributed [3] and centralized [4] Unmanned Aircraft Systems (Drones) [7] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Cloud Storage Gateway - Storage at the Edge [8] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Open Caching - stream/store data at the edge [9] Both distributed [3] and centralized [4] Smart City as Software-Defined closed-loop system [10] The use case is not complete enough to figure out Augmented Reality -- Sony Gaming Network [11] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Analytics/control at the edge [12] The use case is not complete enough to figure out Manage retail chains - chick-fil-a [13] The use case is not complete enough to figure out At this moment chick-fil-a uses a different Kubernetes cluster in every edge location and they manage them using Git [14] Smart Home [15] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Data Collection - Smart cooler/cold chain tracking [16] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event VPN Gateway Service Delivery [17] The use case is not complete enough 
to figure out [1]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases [2]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures [3]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Distributed_Control_Plane_Scenario [4]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Centralized_Control_Plane_Scenario [5]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Mobile_service_provider_5G.2F4G_virtual_RAN_deployment_and_Edge_Cloud_B2B2X. [6]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Universal_customer_premise_equipment_.28uCPE.29_for_Enterprise_Network_Services [7]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Unmanned_Aircraft_Systems_.28Drones.29 [8]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Cloud_Storage_Gateway_-_Storage_at_the_Edge [9]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Open_Caching_-_stream.2Fstore_data_at_the_edge [10]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_City_as_Software-Defined_closed-loop_system [11]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Augmented_Reality_--_Sony_Gaming_Network [12]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Analytics.2Fcontrol_at_the_edge [13]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Manage_retail_chains_-_chick-fil-a [14]: https://schd.ws/hosted_files/kccna18/34/GitOps.pdf [15]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_Home [16]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Data_Collection_-_Smart_cooler.2Fcold_chain_tracking [17]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#VPN_Gateway_Service_Delivery Thanks and Best Regards, Gergely and Ildikó From doug at doughellmann.com Thu Jan 10 13:47:53 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 
08:47:53 -0500 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation In-Reply-To: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> References: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> Message-ID: Jeremy Stanley writes: > On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote: >> our company created a little plugin to Horizon and we would like to >> share it with the community in a bit more official way. So I created >> change request (https://review.openstack.org/#/c/619235/) in order to >> create official repository under project Telemetry. However, PTL >> recommended me to put this new repository under OpenStack without any >> project - i.e. make it unofficial. >> >> I have also discussed this with Horizon team during their meeting >> (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) >> and now I am bit stuck because I do not know how to proceed next. >> Could you, please, advise me? > > It looks like much of this confusion stemmed from recommendation by > project-config-core reviewers, unfortunately. We too often see > people from official teams in OpenStack request new Git repositories > for work their team will be performing, but who forget to also > record them in the appropriate governance lists. As a result, if a > proposed repository looks closely-related to the work of an existing > team (in this case possibly either Horizon or Telemetry) we usually > assume this was the case and recommend during the review process > that they file a corresponding change to the OpenStack TC's > governance repository. Given this is an independent group's work for > which neither the Horizon nor Telemetry teams have expressed an > interest in adopting responsibility, it's perfectly acceptable to > have it operate as an unofficial project or to apply for status as > another official project team within OpenStack. 
> > The main differences between the two options are that contributors > to official OpenStack project teams gain the ability to vote in > Technical Committee elections, their repositories can publish > documentation on the https://docs.openstack.org/ Web site, they're > able to reserve space for team-specific discussions and working > sessions at OSF Project Teams Gathering meetings (such as the one > coming up in Denver immediately following the Open Infrastructure > Summit)... but official project teams are also expected to hold team > lead elections twice a year, participate in OpenStack release > processes, follow up on implementing cycle goals, and otherwise meet > the requirements laid out in our > https://governance.openstack.org/tc/reference/new-projects-requirements.html > document. > -- > Jeremy Stanley Jakub, thank you for starting this thread. As you can see from Jeremy's response, you have a couple of options. You had previously told me you wanted the repository to be "official", and since the existing teams do not want to manage it I think that it is likely that you will want to create a new team for it. However, since that path does introduce some obligations, before you go ahead it would be good to understand what benefits you are seeking by joining an official team. Can you fill in some background for us, so we can offer the best guidance? -- Doug From lyarwood at redhat.com Thu Jan 10 13:56:05 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Thu, 10 Jan 2019 13:56:05 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> On 09-01-19 15:13:29, Dave Holland wrote: > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. 
> Some naive bonnie++ numbers, comparing a plain vs encrypted volume:
>
> plain: write 1400MB/s, read 390MB/s
> encrypted: write 81MB/s, read 83MB/s
>
> The encryption was configured with:
>
> openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256
>
> Does anyone have a similar setup, and can share their performance
> figures, or give me an idea of what percentage performance impact I
> should expect? Alternatively: is AES256 overkill, or, where should I
> start looking for a misconfiguration or bottleneck?

What's the underlying version of QEMU being used here? FWIW I can't recall seeing any performance issues when working on and verifying this downstream with QEMU 2.10.

Cheers,

-- 
Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 455 bytes
Desc: not available
URL: 

From cdent+os at anticdent.org Thu Jan 10 14:14:38 2019
From: cdent+os at anticdent.org (Chris Dent)
Date: Thu, 10 Jan 2019 14:14:38 +0000 (GMT)
Subject: [tc] [all] Please help verify the role of the TC
Message-ID: 

Recently Thierry, with the help of other TC members, wrote down the perceived role of the TC [1]. This was inspired by the work on the "Vision for OpenStack Clouds" [2]. If we think we should have that document to help validate and direct our software development, we should have something similar to validate governance.

Now we need to make sure the document reflects not just how things are but also how they should be. We (the TC) would like feedback from the community on the following general questions (upon which you should feel free to expand as necessary).

* Does the document accurately reflect what you see the TC doing?
* What's in the list that shouldn't be?
* What's not in the list that should be?
* Should something that is listed be done more or less?

Discussions like these are sometimes perceived as pointless navel gazing. That's a fair complaint when they result in nothing changing (if it should). In this case, however, it is fair to say that the composition of the OpenStack community is changing and we _may_ need some adjustments in governance to adapt effectively. We can't know if any changes should be big or little until we talk about them. We have several weeks before the next TC election, so now seems an appropriate time.

Note that the TC was chartered with a mission [3]:

    The Technical Committee (“TC”) is tasked with providing the technical leadership for OpenStack as a whole (all official projects, as defined below). It enforces OpenStack ideals (Openness, Transparency, Commonality, Integration, Quality…), decides on issues affecting multiple projects, forms an ultimate appeals board for technical decisions, and generally has technical oversight over all of OpenStack.

Thanks for your participation and help.

[1] https://governance.openstack.org/tc/reference/role-of-the-tc.html
[2] https://governance.openstack.org/tc/reference/technical-vision.html
[3] https://governance.openstack.org/tc/reference/charter.html#mission

-- 
Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent

From dangtrinhnt at gmail.com Thu Jan 10 14:41:27 2019
From: dangtrinhnt at gmail.com (Trinh Nguyen)
Date: Thu, 10 Jan 2019 23:41:27 +0900
Subject: [Searchlight] We reached Stein-2 milestone
Message-ID: 

Hi team,

Just so you know, we reached the Stein-2 milestone and were able to release Searchlight yesterday :) Yay! I put a document here [1] to summarize what we covered in this release. Hope it gets you excited and helps you understand our vision.
[1] https://www.dangtrinh.com/2019/01/searchlight-at-stein-2-r-14-r-13.html

Bests,

-- 
*Trinh Nguyen* *www.edlab.xyz *
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hberaud at redhat.com Thu Jan 10 15:15:24 2019
From: hberaud at redhat.com (Herve Beraud)
Date: Thu, 10 Jan 2019 16:15:24 +0100
Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams
In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local>
Message-ID: 

Makes sense, so +1

On Thu, Jan 10, 2019, 14:27 Doug Hellmann wrote:

> "Nguyen Hung, Phuong" writes:
>
> > Hi Ben,
> >
> >> I suggest that we either WIP or abandon the current
> >> patch series.
> > ...
> >> If you have any thoughts about this plan please let me know. Otherwise I
> >> will act on it sometime in the near-ish future.
> >
> > Thanks for your consideration. I agree with you; please help me to
> > abandon them because I am not privileged on those patches.
> >
> > Regards,
> > Phuong.
>
> +1 for abandoning them, at least for now. As Ben points out, gerrit will
> still have copies.
>
> Doug
>
> > -----Original Message-----
> > From: Ben Nemec [mailto:openstack at nemebean.com]
> > Sent: Thursday, January 10, 2019 6:12 AM
> > To: Herve Beraud; Nguyen, Hung Phuong
> > Cc: openstack-discuss at lists.openstack.org
> > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams
> >
> > On 12/20/18 4:41 AM, Herve Beraud wrote:
> >>
> >> On Thu, Dec 20, 2018 at 09:26, Nguyen Hung, Phuong wrote:
> >>
> >> Hi Ben,
> >>
> >> I apologize that over the last month we have not had much time to
> >> maintain the code.
> >>
> >> > but if no one's going to use it then I'd rather cut our
> >> > losses than continue pouring time into it.
> >>
> >> I agree, we will wait for the community to decide the need for the
> >> feature.
> >> In the near future, we do not have the ability to maintain the code. If
> >> anyone has an interest in continuing to maintain the patch, we will
> >> support them with documentation, reviewing... as far as we are able.
> >>
> >> I can help you to maintain the code if needed.
> >>
> >> Personally I don't need this feature, so I agree with Ben's and Doug's
> >> point of view.
> >>
> >> We need to measure how useful this feature is and whether it makes sense
> >> to support and maintain more code related to it in the future without
> >> any usage behind it.
> >
> > We discussed this again in the Oslo meeting this week, and to share with
> > the wider audience here's what I propose:
> >
> > Since the team that initially proposed the feature and that we expected
> > to help maintain it are no longer able to do so, and it's not clear to
> > the Oslo team that there is sufficient demand for a rather complex
> > feature like this, I suggest that we either WIP or abandon the current
> > patch series. Gerrit never forgets, so if at some point there are
> > contributors (new or old) who have a vested interest in the feature we
> > can always resurrect it.
> >
> > If you have any thoughts about this plan please let me know. Otherwise I
> > will act on it sometime in the near-ish future.
> >
> > In the meantime, if anyone is desperate for Oslo work to do here are a
> > few things that have been lingering on my todo list:
> >
> > * We have a unit test in oslo.utils (test_excutils) that is still using
> > mox. That needs to be migrated to mock.
> > * oslo.cookiecutter has a number of things that are out of date (doc
> > layout, lack of reno, coverage job). Since it's unlikely we've reached
> > peak Oslo library we should update that so there aren't a bunch of
> > post-creation changes needed like there were with oslo.upgradecheck (and
> > I'm guessing oslo.limit).
> > * The config validator still needs support for dynamic groups, if
> > oslo.config is your thing.
> > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > > > Thanks. > > > > -Ben > > > > -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From msm at redhat.com Thu Jan 10 15:24:53 2019 From: msm at redhat.com (Michael McCune) Date: Thu, 10 Jan 2019 10:24:53 -0500 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: thanks for posting this Chris, i have one minor question On Thu, Jan 10, 2019 at 9:17 AM Chris Dent wrote: > Now we need to make sure the document reflects not just how things > are but also how they should be. We (the TC) would like feedback > from the community on the following general questions (upon which > you should feel free to expand as necessary). where is the best venue for providing feedback? i see these documents are published, should we start threads on the ml (or use this one), or make issues somewhere? peace o/ From cdent+os at anticdent.org Thu Jan 10 15:27:08 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 10 Jan 2019 15:27:08 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: On Thu, 10 Jan 2019, Michael McCune wrote: > On Thu, Jan 10, 2019 at 9:17 AM Chris Dent wrote: >> Now we need to make sure the document reflects not just how things >> are but also how they should be. We (the TC) would like feedback >> from the community on the following general questions (upon which >> you should feel free to expand as necessary). > > where is the best venue for providing feedback? > > i see these documents are published, should we start threads on the ml > (or use this one), or make issues somewhere? Sorry that I wasn't clear about that. Here, on this thread, would be a great place to start. 
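Picking up the first item on Ben's Oslo to-do list above: the mox-to-mock migration it asks for follows a well-worn pattern. A minimal sketch of that pattern (a hypothetical test, not the actual test_excutils code; the mox lines are shown only as comments for comparison):

```python
import unittest
from unittest import mock


def lookup(client, key):
    """Code under test: fetch a value through an injected client object."""
    return client.get(key)


# The old mox style (record / replay / verify), roughly:
#
#   m = mox.Mox()
#   client = m.CreateMockAnything()
#   client.get('answer').AndReturn(42)
#   m.ReplayAll()
#   lookup(client, 'answer')
#   m.VerifyAll()

class TestLookup(unittest.TestCase):
    """The same expectation with unittest.mock: stub first, assert after."""

    def test_lookup(self):
        client = mock.Mock()
        client.get.return_value = 42
        self.assertEqual(lookup(client, 'answer'), 42)
        client.get.assert_called_once_with('answer')
```

The practical difference is that mock carries no replay phase, so expectations become ordinary assertions made after the call rather than a script recorded before it.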
-- 
Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent

From bdobreli at redhat.com Thu Jan 10 15:34:45 2019
From: bdobreli at redhat.com (Bogdan Dobrelya)
Date: Thu, 10 Jan 2019 16:34:45 +0100
Subject: [Edge-computing] Use cases mapping to MVP architectures - FEEDBACK NEEDED
In-Reply-To: References: 
Message-ID: <133ad3bc-bc57-b627-7b43-94f8ac846746@redhat.com>

On 10.01.2019 14:43, Ildiko Vancsa wrote:
> Hi,
>
> We are reaching out to you about the use cases for edge cloud infrastructure that the Edge Computing Group is collecting. They are recorded in our wiki [1] and describe high-level scenarios in which an edge cloud infrastructure would be needed.

Hello. Verifying the mappings created for the "Elementary operations on a site" [18] feature against the distributed glance specification [19], I can see that a vital feature is missing for "Advanced operations on a site": creating an image locally when the parent control plane is not available, and the consequences that follow from that, like the availability of snapshot creation for Nova as well. All that boils down to a) better identifying the underlying requirements and limitations for the CRUD operations available to middle edge sites in the Distributed Control Plane case, and b) the requirement for data replication and conflict-resolution tooling, which arises if we assume we want all CRUD operations to be always available to middle edge sites regardless of the parent edge's control plane state. So that is the missing and important thing to socialise and note in the mappings.

[18] https://wiki.openstack.org/wiki/MappingOfUseCasesFeaturesRequirementsAndUserStories#Elementary_operations_on_one_site
[19] https://review.openstack.org/619638

>
> During the second Denver PTG discussions we drafted two MVP architectures that we could build from the current functionality of OpenStack with some slight modifications [2]. These are based on the work of James and his team from Oath.
> [snip - the remainder of Ildikó's message, quoted in full earlier in this digest]
>
> Thanks and Best Regards,
> Gergely and Ildikó
>
> _______________________________________________
> Edge-computing mailing list
> Edge-computing at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing

-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando

From rob at cleansafecloud.com Thu Jan 10 15:49:14 2019
From: rob at cleansafecloud.com (Robert Donovan)
Date: Thu, 10 Jan 2019 15:49:14 +0000
Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation
Message-ID: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com>

Hello Nova folks,

I spoke to some of you very briefly about this in Berlin (thanks again for your time),
and we were resigned to turning off SMT to fully protect against future CPU cache side-channel attacks, as I know many others have done. However, we have stubbornly done a bit of last-resort research and testing into using vCPU pinning on a per-tenant basis as an alternative, and I'd like to lay it out in more detail for you, to see whether the idea has legs before abandoning it completely.

The idea is to use libvirt's vcpupin ability to ensure that two different tenants never share the same physical CPU core, so they cannot theoretically steal each other's data via an L1 or L2 cache side-channel. The pinning would be optimised to make use of as many logical cores as possible for any given tenant. We would also isolate other key system processes to a separate range of physical cores. After discussions in Berlin, we ran some tests with live migration, as this is key to our maintenance activities and would be a show-stopper if it didn't work. We found that removing any pinning restrictions immediately prior to migration resulted in them being completely reset on the target host, which could then be optimised accordingly post-migration. Unfortunately, there would be a small window of time where we couldn't prevent tenants from sharing a physical core on the target host after a migration, but we think this is an acceptable risk given the nature of these attacks.

Obviously, this approach may not be appropriate in many circumstances, such as if you have many tenants who just run single VMs with one vCPU, or if over-allocation is in use. We have also only looked at KVM and libvirt. I would love to know what people think of this approach, however. Are there any other clear issues that you can think of which we may not have considered? If it seems like a reasonable idea, is it something that could fit into Nova and, if so, where in the architecture is the best place for it to sit?
I know you can currently specify per-instance CPU pinning via flavor parameters, so a similar approach could be taken for this strategy. Alternatively, we can look at implementing it as an external plugin of some kind for use by those with a similar setup.

Many thanks,

Rob

From msm at redhat.com Thu Jan 10 15:49:21 2019
From: msm at redhat.com (Michael McCune)
Date: Thu, 10 Jan 2019 10:49:21 -0500
Subject: [tc] [all] Please help verify the role of the TC
In-Reply-To: References: 
Message-ID: 

On Thu, Jan 10, 2019 at 10:31 AM Chris Dent wrote:
> Sorry that I wasn't clear about that. Here, on this thread, would be
> a great place to start.

no problem, thanks for the clarification =)

peace o/

From jaypipes at gmail.com Thu Jan 10 16:05:43 2019
From: jaypipes at gmail.com (Jay Pipes)
Date: Thu, 10 Jan 2019 11:05:43 -0500
Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation
In-Reply-To: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com>
References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com>
Message-ID: <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com>

On 01/10/2019 10:49 AM, Robert Donovan wrote:
> [snip - Robert's message, quoted in full above]

IMHO, if you're going to go through all the hassle of pinning guest vCPU threads to distinct logical host processors, you might as well just use dedicated CPU resources for everything. As you mention above, you can't have overcommit anyway if you're concerned about this problem. Once you have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU resources to a dedicated host CPU -> guest CPU situation so you might as well just use CPU pinning and deal with all the headaches that brings with it.
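The per-tenant packing Robert describes is easy to prototype in isolation. Below is a toy allocator (not Nova or libvirt code; the function name, the sibling layout, and the greedy strategy are all assumptions of this sketch) that hands out SMT sibling pairs so that no physical core ever hosts vCPUs from two different tenants:

```python
from collections import defaultdict


def plan_pinning(requests, cores):
    """Assign vCPUs to logical CPUs so physical cores are never shared
    between tenants.

    requests: list of (tenant, n_vcpus) tuples.
    cores: list of (sibling0, sibling1) logical-CPU pairs, one pair per
           physical core (on Linux this topology can be read from
           /sys/devices/system/cpu/cpu*/topology/thread_siblings_list).

    Returns {tenant: [logical_cpu, ...]}; raises if capacity runs out.
    """
    free = list(cores)
    pinning = defaultdict(list)
    for tenant, n_vcpus in requests:
        assigned = []
        while len(assigned) < n_vcpus:
            if not free:
                raise RuntimeError("out of physical cores")
            # Both hyperthreads of a core go to this one tenant only.
            assigned.extend(free.pop(0))
        # A surplus sibling thread stays idle rather than being handed
        # to another tenant -- that is the whole point of the scheme.
        pinning[tenant] = assigned[:n_vcpus]
    return dict(pinning)


# Example: 4 physical cores with 2 threads each, three tenants.
cores = [(0, 4), (1, 5), (2, 6), (3, 7)]
plan = plan_pinning([("alice", 3), ("bob", 2), ("carol", 1)], cores)
# plan == {'alice': [0, 4, 1], 'bob': [2, 6], 'carol': [3]}
```

In a real deployment each tenant's list would then be translated into per-guest vcpupin entries; the sketch only illustrates the packing trade-off being debated, namely that odd vCPU counts waste a sibling thread, which is exactly the capacity Jay argues you may as well spend on dedicated CPUs.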
Best, jay From chris.friesen at windriver.com Thu Jan 10 16:08:05 2019 From: chris.friesen at windriver.com (Chris Friesen) Date: Thu, 10 Jan 2019 10:08:05 -0600 Subject: [nova] Mempage fun In-Reply-To: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> Message-ID: <328b78c1-5993-aef1-b279-fb04677b6e98@windriver.com> On 1/8/2019 12:38 PM, Stephen Finucane wrote: > I have (1) fixed here: > > https://review.openstack.org/#/c/629281/ > > That said, I'm not sure if it's the best thing to do. From what I'm > hearing, it seems the advice we should be giving is to not mix > instances with/without NUMA topologies, with/without hugepages and > with/without CPU pinning. We've only documented the latter, as > discussed on this related bug by cfriesen: > > https://bugs.launchpad.net/nova/+bug/1792985 > > Given that we should be advising folks not to mix these (something I > wasn't aware of until now), what does the original patch actually give > us? I think we should look at it from the other direction...what is the ultimate *desired* behaviour? Personally, I'm coming at it from a "small-cloud" perspective where we may only have one or two compute nodes. As such, the host-aggregate solution doesn't really work. I would like to be able to run cpu-pinned and cpu-shared instances on the same node. I would like to run small-page (with overcommit) and huge-page (without overcommit) instances on the same node. I would like to run cpu-shared/small-page instances (which float over the whole host) on the same host as a cpu-pinned/small-page instance (which is pinned to specific NUMA nodes). We have a warning in the docs currently that is specifically for separating CPU-pinned and CPU-shared instances, but we also have a spec that plans to specifically support that case. 
The way the code is currently written we also need to separate NUMA-affined small-page instances from non-NUMA-affined small-page instances, but I think that's a bug, not a sensible design. Chris From grant at absolutedevops.io Thu Jan 10 13:16:58 2019 From: grant at absolutedevops.io (Grant Morley) Date: Thu, 10 Jan 2019 13:16:58 +0000 Subject: Issues setting up a SolidFire node with Cinder Message-ID: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> Hi all, We are in the process of trying to add a SolidFire storage solution to our existing OpenStack setup and seem to have hit a snag with cinder / iscsi. We are trying to create a bootable volume to allow us to launch an instance from it, but we are getting some errors in our cinder-volumes containers that seem to suggest they can't connect to iscsi although the volume seems to create fine on the SolidFire node. The command we are running is: openstack volume create --image $image-id --size 20 --bootable --type solidfire sf-volume-v12 The volume seems to create on SolidFire but I then see these errors in the "cinder-volume.log" https://pastebin.com/LyjLUhfk The volume containers can talk to the iscsi VIP on the SolidFire so I am a bit stuck and wondered if anyone had come across any issues before? Kind Regards, -- Grant Morley Cloud Lead Absolute DevOps Ltd Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP www.absolutedevops.io grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Thu Jan 10 17:12:41 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 11:12:41 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> Message-ID: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> On 1/10/19 12:17 AM, Ghanshyam Mann wrote: > ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > > Dear All, > > > > > > There are many occasions when we want to prioritize some of the patches, > > > whether that is to unblock the gates or to block non-freeze > > > patches during RC. > > > > > > So adding Review-Priority will allow a more precise dashboard. As the > > > Designate and Cinder projects are already experiencing this[1][2], and after > > > discussion with Jeremy, I brought this to the ML to interact with these teams > > > before landing [3], as there is a possibility that reapplying the priority vote > > > following any substantive updates to a change could make it more cumbersome > > > than it is worth. > > > > With Cinder this is fairly new, but I think it is working well so far. The > > oddity we've run into, that I think you're referring to here, is how those > > votes carry forward with updates. > > > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > > This idea looks great and helpful especially for blockers and cycle priority patches to get regular > review bandwidth from Core or Active members of that project. > > IMO only +ve votes are more appropriate for this label. -1 is a little confusing for many reasons, like > what is the difference between Review-Priority -1 and Code-Review -2? Review-Priority -1 means > it is lower priority than 0/not labelled (explicitly setting a patch to very low priority). 
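[Editorial aside: for context on the mechanics being discussed, Review-Priority is a custom Gerrit label defined in a project's `project.config`. A Cinder-style -1/+1/+2 setup might look roughly like the stanza below — illustrative only; the exact value names, descriptions, and copy rules vary per project.]

```ini
[label "Review-Priority"]
    function = NoBlock
    defaultValue = 0
    value = -1 Procedural hold
    value = 0 No priority set
    value = +1 Cycle priority
    value = +2 Gate blocker
    copyAllScoresIfNoCodeChange = true
```

The copy rule is the crux of the thread: whether the vote carries forward on new patch sets determines how cumbersome reapplying it becomes.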
> > After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural > or technical reason. But that can be done by -2 on Code-Review label. Keeping Review-Priority label > only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. > Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes > priority for review but having multiple +ve vote set as per project need/interest is all fine. I don't know if this was the reasoning behind Cinder's system, but I know some people object to procedural -2 because it's a big hammer to essentially say "not right now". It overloads the meaning of the vote in a potentially confusing way that requires explanation every time it's used. At least I hope procedural -2's always include a comment. Whether adding a whole new vote type is a meaningful improvement is another question, but if we're adding the type anyway for prioritization it might make sense to use it to replace procedural -2. Especially if we could make it so any core can change it (apparently not right now), whereas -2 requires the original core to come back and remove it. From singh.surya64mnnit at gmail.com Thu Jan 10 17:31:52 2019 From: singh.surya64mnnit at gmail.com (Surya Singh) Date: Thu, 10 Jan 2019 23:01:52 +0530 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: Hi Paul, Thanks a lot for your long term contribution to make Kolla great project. Sad to see you stepping down. Hope to see you around. All the very best for your new project. --- Thanks Surya On Thu, Jan 10, 2019 at 5:13 PM Paul Bourke wrote: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. 
Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) > > All the best! > -Paul > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Thu Jan 10 17:34:41 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 10 Jan 2019 11:34:41 -0600 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> Message-ID: <5C378231.8010603@openstack.org> > Arkady.Kanevsky at dell.com > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > We do still use DriverLog for the Marketplace drivers listing. We have a cronjob set up to ingest nightly from Stackalytics. We also have the ability to CRUD the listings in the Foundation website CMS. That said, as Boris mentioned, the list is really not used much and I know there is a lot of out of date info there. We're planning to move the marketplace list to yaml in a public repo, similar to what we did for OpenStack Map [1]. Cheers, Jimmy [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. 
Those are still available via direct > links, but not in the menu on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Boris Renski > January 8, 2019 at 11:10 AM > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the menu on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Jan 10 17:36:09 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 11:36:09 -0600 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Message-ID: Thanks for the quick feedback everyone. I've abandoned the patch series, although I did pull out one change that seemed to be a valid bugfix independent of the migrator work: https://review.openstack.org/#/c/607690/ On 1/10/19 9:15 AM, Herve Beraud wrote: > Make sense so +1 > > Le jeu. 10 janv. 2019 14:27, Doug Hellmann > a écrit : > > "Nguyen Hung, Phuong" > writes: > > > Hi Ben, > > > >> I suggest that we either WIP or abandon the current > >> patch series. > > ... > >> If you have any thoughts about this plan please let me know. > Otherwise I > >> will act on it sometime in the near-ish future. > > > > Thanks for your consideration. I am agree with you, please help > me to abandon them because I am not privileged with those patches. > > > > Regards, > > Phuong. > > +1 for abandoning them, at least for now. As Ben points out, gerrit will > still have copies. > > Doug > > > > > -----Original Message----- > > From: Ben Nemec [mailto:openstack at nemebean.com > ] > > Sent: Thursday, January 10, 2019 6:12 AM > > To: Herve Beraud; Nguyen, Hung Phuong > > Cc: openstack-discuss at lists.openstack.org > > > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for > upgrade - coordinate teams > > > > > > > > On 12/20/18 4:41 AM, Herve Beraud wrote: > >> > >> > >> Le jeu. 20 déc. 
2018 à 09:26, Nguyen Hung, Phuong > >> > >> a > écrit : > >> > >>     Hi Ben, > >> > >>     I am apology that in last month we do not have much time > maintaining > >>     the code. > >> > >>      > but if no one's going to use it then I'd rather cut our > >>      > losses than continue pouring time into it. > >> > >>     I agree, we will wait for the community to decide the need > for the > >>     feature. > >>     In the near future, we do not have ability to maintain the > code. If > >>     anyone > >>     has interest to continue maintaining the patch, we will > support with > >>     document, > >>     reviewing... in our possibility. > >> > >> > >> I can help you to maintain the code if needed. > >> > >> Personaly I doesn't need this feature so I agree Ben and Doug > point of view. > >> > >> We need to measure how many this feature is useful and if it > make sense > >> to support and maintain more code in the future related to this > feature > >> without any usages behind that. > > > > We discussed this again in the Oslo meeting this week, and to > share with > > the wider audience here's what I propose: > > > > Since the team that initially proposed the feature and that we > expected > > to help maintain it are no longer able to do so, and it's not > clear to > > the Oslo team that there is sufficient demand for a rather complex > > feature like this, I suggest that we either WIP or abandon the > current > > patch series. Gerrit never forgets, so if at some point there are > > contributors (new or old) who have a vested interest in the > feature we > > can always resurrect it. > > > > If you have any thoughts about this plan please let me know. > Otherwise I > > will act on it sometime in the near-ish future. > > > > In the meantime, if anyone is desperate for Oslo work to do here > are a > > few things that have been lingering on my todo list: > > > > * We have a unit test in oslo.utils (test_excutils) that is still > using > > mox. 
That needs to be migrated to mock. > > * oslo.cookiecutter has a number of things that are out of date (doc > > layout, lack of reno, coverage job). Since it's unlikely we've > reached > > peak Oslo library we should update that so there aren't a bunch of > > post-creation changes needed like there were with > oslo.upgradecheck (and > > I'm guessing oslo.limit). > > * The config validator still needs support for dynamic groups, if > > oslo.config is your thing. > > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > > > Thanks. > > > > -Ben > > > > -- > Doug > From Arkady.Kanevsky at dell.com Thu Jan 10 17:38:23 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Thu, 10 Jan 2019 17:38:23 +0000 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <5C378231.8010603@openstack.org> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> Message-ID: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Thanks Jimmy. Since I am responsible for updating marketplace per release I just need to know what mechanism to use and which file I need to patch. Thanks, Arkady From: Jimmy McArthur Sent: Thursday, January 10, 2019 11:35 AM To: openstack-dev at lists.openstack.org; openstack-discuss at lists.openstack.org Subject: Re: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Arkady.Kanevsky at dell.com January 9, 2019 at 9:20 AM Thanks Boris. Do we still use DriverLog for marketplace driver status updates? We do still use DriverLog for the Marketplace drivers listing. We have a cronjob set up to ingest nightly from Stackalytics. We also have the ability to CRUD the listings in the Foundation website CMS. That said, as Boris mentioned, the list is really not used much and I know there is a lot of out of date info there. We're planning to move the marketplace list to yaml in a public repo, similar to what we did for OpenStack Map [1]. 
Cheers, Jimmy [1] https://git.openstack.org/cgit/openstack/openstack-map/ Thanks, Arkady From: Boris Renski Sent: Tuesday, January 8, 2019 11:11 AM To: openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; David Stoltenberg Subject: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: * We have new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the menu on the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris Boris Renski January 8, 2019 at 11:10 AM Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: * We have new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the menu on the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sfinucan at redhat.com Thu Jan 10 17:56:54 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 10 Jan 2019 17:56:54 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> Message-ID: <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote: > On 01/10/2019 10:49 AM, Robert Donovan wrote: > > Hello Nova folks, > > > > I spoke to some of you very briefly about this in Berlin (thanks > > again for your time), and we were resigned to turning off SMT to > > fully protect against future CPU cache side-channel attacks as I > > know many others have done. However, we have stubbornly done a bit > > of last-resort research and testing into using vCPU pinning on a > > per-tenant basis as an alternative and I’d like to lay it out in > > more detail for you to make sure there are no legs in the idea > > before abandoning it completely. > > > > The idea is to use libvirt’s vcpupin ability to ensure that two > > different tenants never share the same physical CPU core, so they > > cannot theoretically steal each other’s data via an L1 or L2 cache > > side-channel. The pinning would be optimised to make use of as many > > logical cores as possible for any given tenant. We would also > > isolate other key system processes to a separate range of physical > > cores. After discussions in Berlin, we ran some tests with live > > migration, as this is key to our maintenance activities and would > > be a show-stopper if it didn’t work. We found that removing any > > pinning restrictions immediately prior to migration resulted in > > them being completely reset on the target host, which could then be > > optimised accordingly post-migration. 
Unfortunately, there would be > > a small window of time where we couldn’t prevent tenants from > > sharing a physical core on the target host after a migration, but > > we think this is an acceptable risk given the nature of these > > attacks. > > > > Obviously, this approach may not be appropriate in many > > circumstances, such as if you have many tenants who just run single > > VMs with one vCPU, or if over-allocation is in use. We have also > > only looked at KVM and libvirt. I would love to know what people > > think of this approach however. Are there any other clear issues > > that you can think of which we may not have considered? If it seems > > like a reasonable idea, is it something that could fit into Nova > > and, if so, where in the architecture is the best place for it to > > sit? I know you can currently specify per-instance CPU pinning via > > flavor parameters, so a similar approach could be taken for this > > strategy. Alternatively, we can look at implementing it as an > > external plugin of some kind for use by those with a similar setup. > > IMHO, if you're going to go through all the hassle of pinning guest vCPU > threads to distinct logical host processors, you might as well just use > dedicated CPU resources for everything. As you mention above, you can't > have overcommit anyway if you're concerned about this problem. Once you > have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU > resources to a dedicated host CPU -> guest CPU situation so you might as > well just use CPU pinning and deal with all the headaches that brings > with it. Indeed. My initial answer to this was "use CPU thread policies" (specifically, the 'require' policy) to ensure each instance owns its entire core, thinking you were using dedicated/pinned CPUs. 
For shared CPUs, I'm not sure how we could ever do something like you've proposed in a manner that would result in less than the ~20% or so performance degradation I usually see quoted when turning off SMT. Far too much second guessing of the expected performance requirements of the guest would be necessary. Stephen From kbcaulder at gmail.com Thu Jan 10 18:34:40 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Thu, 10 Jan 2019 10:34:40 -0800 Subject: [cinder] db sync error upgrading from pike to queens Message-ID: Hi, I am receiving the following error when performing an offline upgrade of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to openstack-cinder-1:12.0.3-1.el7. # cinder-manage db version 105 # cinder-manage --debug db sync Error during database migration: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes SET shared_targets=%(shared_targets)s'] [parameters: {'shared_targets': 1}] # cinder-manage db version 114 The db version does not upgrade to queens version 117. Any help would be appreciated. Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Thu Jan 10 19:01:27 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Thu, 10 Jan 2019 13:01:27 -0600 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Brandon, I am thinking you are hitting this bug: https://bugs.launchpad.net/cinder/+bug/1806156 I think you can work around it by retrying the migration with the volume service running.  You may, however, want to check with Iain MacDonnell as he has been looking at this for a while. Thanks! Jay On 1/10/2019 12:34 PM, Brandon Caulder wrote: > Hi, > > I am receiving the following error when performing an offline upgrade > of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > openstack-cinder-1:12.0.3-1.el7. 
> > # cinder-manage db version > 105 > > # cinder-manage --debug db sync > Error during database migration: (pymysql.err.OperationalError) (2013, > 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > SET shared_targets=%(shared_targets)s'] [parameters: > {'shared_targets': 1}] > > # cinder-manage db version > 114 > > The db version does not upgrade to queens version 117.  Any help would > be appreciated. > > Thank you From smooney at redhat.com Thu Jan 10 19:02:51 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 10 Jan 2019 19:02:51 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> Message-ID: On Thu, 2019-01-10 at 17:56 +0000, Stephen Finucane wrote: > On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote: > > On 01/10/2019 10:49 AM, Robert Donovan wrote: > > > Hello Nova folks, > > > > > > I spoke to some of you very briefly about this in Berlin (thanks > > > again for your time), and we were resigned to turning off SMT to > > > fully protect against future CPU cache side-channel attacks as I > > > know many others have done. However, we have stubbornly done a bit > > > of last-resort research and testing into using vCPU pinning on a > > > per-tenant basis as an alternative and I’d like to lay it out in > > > more detail for you to make sure there are no legs in the idea > > > before abandoning it completely. > > > > > > The idea is to use libvirt’s vcpupin ability to ensure that two > > > different tenants never share the same physical CPU core, so they > > > cannot theoretically steal each other’s data via an L1 or L2 cache > > > side-channel. 
The pinning would be optimised to make use of as many > > > logical cores as possible for any given tenant. We would also > > > isolate other key system processes to a separate range of physical > > > cores. After discussions in Berlin, we ran some tests with live > > > migration, as this is key to our maintenance activities and would > > > be a show-stopper if it didn’t work. We found that removing any > > > pinning restrictions immediately prior to migration resulted in > > > them being completely reset on the target host, which could then be > > > optimised accordingly post-migration. Unfortunately, there would be > > > a small window of time where we couldn’t prevent tenants from > > > sharing a physical core on the target host after a migration, but > > > we think this is an acceptable risk given the nature of these > > > attacks. > > > > > > Obviously, this approach may not be appropriate in many > > > circumstances, such as if you have many tenants who just run single > > > VMs with one vCPU, or if over-allocation is in use. We have also > > > only looked at KVM and libvirt. I would love to know what people > > > think of this approach however. Are there any other clear issues > > > that you can think of which we may not have considered? If it seems > > > like a reasonable idea, is it something that could fit into Nova > > > and, if so, where in the architecture is the best place for it to > > > sit? I know you can currently specify per-instance CPU pinning via > > > flavor parameters, so a similar approach could be taken for this > > > strategy. Alternatively, we can look at implementing it as an > > > external plugin of some kind for use by those with a similar setup. > > > > IMHO, if you're going to go through all the hassle of pinning guest vCPU > > threads to distinct logical host processors, you might as well just use > > dedicated CPU resources for everything. 
As you mention above, you can't > > have overcommit anyway if you're concerned about this problem. Once you > > have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU > > resources to a dedicated host CPU -> guest CPU situation so you might as > > well just use CPU pinning and deal with all the headaches that brings > > with it. > > Indeed. My initial answer to this was "use CPU thread policies" > (specifically, the 'require' policy) to ensure each instance owns its > entire core, thinking you were using dedicated/pinned CPUs. the isolate policy should address this. the require policy would for an even number of cores and a single numa node. the require policy does not address this if you have multiple numa nodes, e.g. 14 cores spread across 2 numa nodes with require will have one free ht sibling on each numa node when pinned, unless we have a check for that I missed. > For shared > CPUs, I'm not sure how we could ever do something like you've proposed > in a manner that would result in less than the ~20% or so performance > degradation I usually see quoted when turning off SMT. Far too much > second guessing of the expected performance requirements of the guest > would be necessary. for shared cpus the assumption is that, as the guest cores are floating, your victim and payload vm would not remain running on the same core/hyperthread for a protracted period of time. if both are actively using cpu cycles then the kernel scheduler will schedule them to different threads/cores to allow them to execute without contention. Note that I'm not saying there is no risk, but tenant-aware scheduling for shared cpus would effectively mean we would have to stop supporting floating instances entirely and only allow oversubscription between vms from the same tenant, which is both unlikely to ever happen in a cloud environment, as tenant vms typically are not colocated on a single host, and not desirable in all environments. 
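[Editorial aside: the 14-core example above reduces to simple arithmetic — packing vCPUs onto whole SMT sibling pairs strands one thread per NUMA node whenever the per-node vCPU count is odd. A toy model, not Nova code:]

```python
def stranded_threads(vcpus_on_node, smt_width=2):
    """Threads left unpaired on one NUMA node when a guest's vCPUs are
    packed onto whole SMT sibling sets (require-style pinning).
    Toy model of the point discussed above, not Nova code."""
    return (-vcpus_on_node) % smt_width

# A 14-vCPU guest split evenly across 2 NUMA nodes gives 7 vCPUs per
# node: 3 full sibling pairs plus 1 lone vCPU, leaving 1 free sibling
# per node that another tenant's floating CPUs could land on.
```

With an even per-node vCPU count nothing is stranded, which is why the `require` policy closes the gap only for even splits on a single NUMA node.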
> Stephen > > From brenski at mirantis.com Thu Jan 10 19:10:42 2019 From: brenski at mirantis.com (Boris Renski) Date: Thu, 10 Jan 2019 11:10:42 -0800 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hey guys! thanks for the heads up on this. Let us check and fix ASAP. On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov wrote: > Hi, > > I can reproduce the issue - stackalytics stopped showing my affiliation > correctly (user: gtema, entry in default_data.json is present) > > Regards, > Artem > > On Thu, Jan 10, 2019 at 5:48 AM Surya Singh > wrote: > >> Hi Boris >> >> Great to see the new facelift of Stackalytics. It's really good. >> >> I have a query regarding contributors name is not listed as per company >> affiliation. >> Before facelift to stackalytics it was showing correct whether i have >> entry in >> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json >> or not. >> Though now i have pushed the patch for same >> https://review.openstack.org/629150, but another thing is one of my >> colleague Vishal Manchanda name is also showing as independent contributor >> rather than NEC contributor. While his name entry already in >> etc/default_data.json. >> >> Would be great if you check the same. >> >> --- >> Thanks >> Surya >> >> >> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski >> wrote: >> >>> Folks, >>> >>> Happy New Year! We wanted to start the year by giving a facelift to >>> stackalytics.com (based on stackalytics openstack project). Brief >>> summary of updates: >>> >>> - >>> >>> We have new look and feel at stackalytics.com >>> - >>> >>> We did away with DriverLog >>> and Member Directory , which >>> were not very actively used or maintained. Those are still available via >>> direct links, but not in the menu on the top >>> - >>> >>> BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated >>> project commits via a separate subsection accessible via top menu. 
Before >>> this was all bunched up in Project Type -> Complimentary >>> >>> Happy to hear comments or feedback. >>> >>> -Boris >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From iain.macdonnell at oracle.com Thu Jan 10 19:11:36 2019 From: iain.macdonnell at oracle.com (iain MacDonnell) Date: Thu, 10 Jan 2019 11:11:36 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Different issue, I believe (DB sync vs. online migrations) - it just happens that both pertain to shared targets. Brandon, might you have a very large number of rows in your volumes table? Have you been purging soft-deleted rows? ~iain On 1/10/19 11:01 AM, Jay Bryant wrote: > Brandon, > > I am thinking you are hitting this bug: > https://bugs.launchpad.net/cinder/+bug/1806156 > > I think you can work around it by retrying the migration with the volume > service running.  You may, however, want to check with Iain MacDonnell > as he has been looking at this for a while. > > Thanks! > Jay > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: >> Hi, >> >> I am receiving the following error when performing an offline upgrade >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to >> openstack-cinder-1:12.0.3-1.el7. >> >> # cinder-manage db version >> 105 >> >> # cinder-manage --debug db sync >> Error during database migration: (pymysql.err.OperationalError) (2013, >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes >> SET shared_targets=%(shared_targets)s'] [parameters: >> {'shared_targets': 1}] >> >> # cinder-manage db version >> 114 >> >> The db version does not upgrade to queens version 117.  Any help would >> be appreciated. 
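The statement that fails in the trace above — UPDATE volumes SET shared_targets=1 over the entire table — is exactly the kind of table-wide write that times out when the volumes table is large or full of soft-deleted rows. A batched variant keeps each transaction short; the sketch below uses SQLite for illustration only, and cinder's real migration code and schema may differ:

```python
import sqlite3

def set_shared_targets_batched(conn, batch=100):
    """Apply the migration in small batches rather than one big UPDATE,
    so no single statement runs long enough to drop the DB connection."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE volumes SET shared_targets = 1 WHERE id IN "
            "(SELECT id FROM volumes WHERE shared_targets IS NULL LIMIT ?)",
            (batch,))
        conn.commit()  # each batch is its own short transaction
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volumes (id INTEGER PRIMARY KEY, shared_targets INTEGER)")
conn.executemany("INSERT INTO volumes (shared_targets) VALUES (?)",
                 [(None,)] * 424)  # the row count reported later in the thread
print(set_shared_targets_batched(conn))  # 424
```

This mirrors the general idea behind cinder's online data migrations, which chunk such backfills (e.g. via a --max_count argument) instead of rewriting the whole table in one statement.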
>> >> Thank you > From sean.mcginnis at gmx.com Thu Jan 10 19:37:09 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Jan 2019 13:37:09 -0600 Subject: [release] Release countdown for week R-12, Jan 14-18 Message-ID: <20190110193709.GA14554@sm-workstation> Development Focus ----------------- Focus should be on wrapping up any design specs, then moving on to implementation as we head into the last stretch of Stein. General Information ------------------- Stein-2 is the membership freeze for deliverables to be included in the Stein coordinated release. We've reached out to a few folks, but if your project has any new deliverables that have not been released yet, please let us know ASAP if you hope to have them included in Stein. Following the changes we had proposed at the beginning of the release cycle, the release team will be proposing releases for any libraries that have significant changes merged that have not been released. PTLs and release liaisons, please watch for these and give a +1 to acknowledge them. If there is some reason to hold off on a release, let us know that as well. A +1 would be appreciated, but if we do not hear anything at all by the end of the week, we will assume things are OK to proceed. 
Upcoming Deadlines & Dates -------------------------- Individual OpenStack Foundation Board election: Jan 14-18 Non-client library freeze: February 28 -- Sean McGinnis (smcginnis) From sean.mcginnis at gmx.com Thu Jan 10 19:42:28 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Jan 2019 13:42:28 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> Message-ID: <20190110194227.GB14554@sm-workstation> > > I don't know if this was the reasoning behind Cinder's system, but I know > some people object to procedural -2 because it's a big hammer to essentially > say "not right now". It overloads the meaning of the vote in a potentially > confusing way that requires explanation every time it's used. At least I > hope procedural -2's always include a comment. > This was exactly the reasoning. -2 is overloaded, but its primary meaning was/is "we do not want this code change". It just happens that it was also a convenient way to say that with "right now" at the end. The Review-Priority -1 is a clear way to say whether something is held because it can't be merged right now due to procedural or process reasons, versus something that we just don't want at all. From openstack at nemebean.com Thu Jan 10 20:07:41 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 14:07:41 -0600 Subject: [tripleo] OVB 2.0-dev branch merged to master Message-ID: In preparation for importing OVB to Gerrit, the 2.0-dev branch was merged back to master. If the 2.0 changes break you, please use the stable/1.0 branch instead, which was created from the last commit before 2.0-dev was merged. 
-Ben From brenski at mirantis.com Thu Jan 10 20:54:56 2019 From: brenski at mirantis.com (Boris Renski) Date: Thu, 10 Jan 2019 12:54:56 -0800 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Message-ID: I think it would make sense to move driverlog to a separate domain... something like driverlog.openstack.org or something On Thu, Jan 10, 2019 at 9:45 AM wrote: > Thanks Jimmy. > > Since I am responsible for updating marketplace per release I just need to > know what mechanism to use and which file I need to patch. > > Thanks, > > Arkady > > > > *From:* Jimmy McArthur > *Sent:* Thursday, January 10, 2019 11:35 AM > *To:* openstack-dev at lists.openstack.org; > openstack-discuss at lists.openstack.org > *Subject:* Re: [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > > > Arkady.Kanevsky at dell.com > > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > > We do still use DriverLog for the Marketplace drivers listing. We have a > cronjob set up to ingest nightly from Stackalytics. We also have the > ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I know > there is a lot of out of date info there. We're planning to move the > marketplace list to yaml in a public repo, similar to what we did for > OpenStack Map [1]. 
> > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ > Thanks, > Arkady > > > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; > David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > Folks, > > > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the menu on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > Boris Renski > > January 8, 2019 at 11:10 AM > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the menu on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kbcaulder at gmail.com Thu Jan 10 21:32:27 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Thu, 10 Jan 2019 13:32:27 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Hi Iain, There are 424 rows in volumes, which drops down to 185 after running cinder-manage db purge 1. Restarting the volume service after the package upgrade and running sync again does not remediate the problem. Running db sync a second time does bump the version up to 117, but the following appears in volume.log... http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ Thanks On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell wrote: > > Different issue, I believe (DB sync vs. online migrations) - it just > happens that both pertain to shared targets. > > Brandon, might you have a very large number of rows in your volumes > table? Have you been purging soft-deleted rows? > > ~iain > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > Brandon, > > > > I am thinking you are hitting this bug: > > https://bugs.launchpad.net/cinder/+bug/1806156 > > > > I think you can work around it by retrying the migration with the volume > > service running. You may, however, want to check with Iain MacDonnell > > as he has been looking at this for a while. > > > > Thanks! > > Jay > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > >> Hi, > >> > >> I am receiving the following error when performing an offline upgrade > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > >> openstack-cinder-1:12.0.3-1.el7. 
> >> > >> # cinder-manage db version > >> 105 > >> > >> # cinder-manage --debug db sync > >> Error during database migration: (pymysql.err.OperationalError) (2013, > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > >> SET shared_targets=%(shared_targets)s'] [parameters: > >> {'shared_targets': 1}] > >> > >> # cinder-manage db version > >> 114 > >> > >> The db version does not upgrade to queens version 117. Any help would > >> be appreciated. > >> > >> Thank you > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Thu Jan 10 22:02:35 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 10 Jan 2019 22:02:35 +0000 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o Message-ID: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> For the Stein release there is now a new section for SIGs on the documentation home page: https://docs.openstack.org/stein/ Currently only the self-healing SIG has a link but if other SIGs have links to add, it won't feel so lonely ;-) From jimmy at openstack.org Thu Jan 10 17:42:40 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 10 Jan 2019 11:42:40 -0600 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Message-ID: <5C378410.6050603@openstack.org> Absolutely. When we get there, I'll send an announcement to the MLs and ping you :) I don't currently have a timeline, but given the Stackalytics changes, this might speed it up a bit. > Arkady.Kanevsky at dell.com > January 10, 2019 at 11:38 AM > > Thanks Jimmy. > > Since I am responsible for updating marketplace per release I just > need to know what mechanism to use and which file I need to patch. 
> > Thanks, > > Arkady > > *From:*Jimmy McArthur > *Sent:* Thursday, January 10, 2019 11:35 AM > *To:* openstack-dev at lists.openstack.org; > openstack-discuss at lists.openstack.org > *Subject:* Re: [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > > > Arkady.Kanevsky at dell.com > > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > > We do still use DriverLog for the Marketplace drivers listing. We > have a cronjob set up to ingest nightly from Stackalytics. We also > have the ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I > know there is a lot of out of date info there. We're planning to move > the marketplace list to yaml in a public repo, similar to what we did > for OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > *From:*Boris Renski > > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org > ; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift > to stackalytics.com (based on > stackalytics openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member > Directory , which were > not very actively used or maintained. Those are still > available via direct links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection > accessible at the top nav. Before this was all bunched up in > Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. 
> > -Boris > > Boris Renski > > January 8, 2019 at 11:10 AM > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift > to stackalytics.com (based on > stackalytics openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member > Directory , which were > not very actively used or maintained. Those are still > available via direct links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection > accessible at the top nav. Before this was all bunched up in > Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Jimmy McArthur > January 10, 2019 at 11:34 AM > >> Arkady.Kanevsky at dell.com >> January 9, 2019 at 9:20 AM >> >> Thanks Boris. >> >> Do we still use DriverLog for marketplace driver status updates? >> > We do still use DriverLog for the Marketplace drivers listing. We > have a cronjob set up to ingest nightly from Stackalytics. We also > have the ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I > know there is a lot of out of date info there. We're planning to move > the marketplace list to yaml in a public repo, similar to what we did > for OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ >> >> Thanks, >> >> Arkady >> >> *From:* Boris Renski >> *Sent:* Tuesday, January 8, 2019 11:11 AM >> *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman >> Narkaytis; David Stoltenberg >> *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift >> >> [EXTERNAL EMAIL] >> >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on stackalytics >> openstack project). 
Brief summary of updates: >> >> * We have new look and feel at stackalytics.com >> >> * We did away with DriverLog >> and Member Directory >> , which were not very >> actively used or maintained. Those are still available via direct >> links, but not in the men on the top >> * BIGGEST CHANGE: You can now track some of the CNCF and >> Unaffiliated project commits via a separate subsection accessible >> at the top nav. Before this was all bunched up in Project Type -> >> Complimentary >> >> Happy to hear comments or feedback or answer questions. >> >> -Boris >> >> Boris Renski >> January 8, 2019 at 11:10 AM >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on stackalytics >> openstack project). Brief summary of updates: >> >> * We have new look and feel at stackalytics.com >> >> * We did away with DriverLog >> and Member Directory >> , which were not very >> actively used or maintained. Those are still available via direct >> links, but not in the men on the top >> * BIGGEST CHANGE: You can now track some of the CNCF and >> Unaffiliated project commits via a separate subsection accessible >> at the top nav. Before this was all bunched up in Project Type -> >> Complimentary >> >> Happy to hear comments or feedback or answer questions. >> >> -Boris > > Arkady.Kanevsky at dell.com > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > > Thanks, > > Arkady > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). 
Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Boris Renski > January 8, 2019 at 11:10 AM > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tony at bakeyournoodle.com Thu Jan 10 22:54:43 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Fri, 11 Jan 2019 09:54:43 +1100 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o In-Reply-To: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> References: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> Message-ID: <20190110225442.GI28232@thor.bakeyournoodle.com> On Thu, Jan 10, 2019 at 10:02:35PM +0000, Adam Spiers wrote: > For the Stein release there is now a new section for SIGs on the > documentation home page: > > https://docs.openstack.org/stein/ > > Currently only the self-healing SIG has a link but if other SIGs > have links to add, it won't feel so lonely ;-) Hi Adam, Silly question but how would I add the Extended Maintenance SIG there? We really only have https://docs.openstack.org/project-team-guide/stable-branches.html to link to but you'd feel less lonely ;P Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From smooney at redhat.com Thu Jan 10 23:03:06 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 10 Jan 2019 23:03:06 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190110194227.GB14554@sm-workstation> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> Message-ID: <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: > > > > I don't know if this was the reasoning behind Cinder's system, but I know > > some people object to procedural -2 because it's a big hammer to essentially > > say "not right now". 
It overloads the meaning of the vote in a potentially > > confusing way that requires explanation every time it's used. At least I > > hope procedural -2's always include a comment. > > > > This was exactly the reasoning. -2 is overloaded, but its primary meaning > was/is "we do not want this code change". It just happens that it was also a > convenient way to say that with "right now" at the end. > > The Review-Priority -1 is a clear way to say whether something is held because > it can't be merged right now due to procedural or process reasons, versus > something that we just don't want at all. For what it's worth, my understanding of why a procedural -2 is more correct is that the change cannot be merged because it has not met the procedural requirements to be considered for this release. Having received several over the years, I have never seen one carry any more malice or weight than the zuul pep8 job complaining about the line length of my code: with either a procedural -2 or a Verified -1 from zuul, my code is equally unmergeable. The prime example is a patch that requires a spec that has not been approved. While most cores will not approve a change when another core has left a -1, mistakes happen, and the -2 emphasises the point that even if the code is perfect, under the project's processes the change should not be actively reproposed until the issue raised by the -2 has been addressed. In the case of a procedural -2, that typically means the spec merging or the master branch opening for the next cycle. I agree that procedural -2s can seem harsh at first glance, but I have also never seen one left without a comment explaining why it was left. The issue with a procedural -1 is that I can just resubmit the patch several times and it can get lost in the comments. We recently introduced a new review-priority label; if we really wanted to disambiguate from normal -2s we could have an explicit label for it, but I personally would prefer to keep procedural -2s. 
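For concreteness, a project-defined label like Review-Priority can be queried in Gerrit searches and dashboards the same way as the built-in labels, which is what makes held versus prioritized changes easy to surface. A sketch (the -1/+1/+2 values follow the Cinder setup discussed in this thread; the exact label name and values depend on each project's Gerrit configuration):

```
# changes flagged as review priorities for the team
status:open project:openstack/cinder label:Review-Priority>=1

# changes procedurally held ("not right now")
status:open project:openstack/cinder label:Review-Priority=-1
```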
anyway that just my two cents. > > From aspiers at suse.com Thu Jan 10 23:50:11 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 10 Jan 2019 23:50:11 +0000 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o In-Reply-To: <20190110225442.GI28232@thor.bakeyournoodle.com> References: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> <20190110225442.GI28232@thor.bakeyournoodle.com> Message-ID: <20190110235010.3ozo6hgxbgrvoqxx@pacific.linksys.moosehall> Tony Breeds wrote: >On Thu, Jan 10, 2019 at 10:02:35PM +0000, Adam Spiers wrote: >>For the Stein release there is now a new section for SIGs on the >>documentation home page: >> >> https://docs.openstack.org/stein/ >> >>Currently only the self-healing SIG has a link but if other SIGs >>have links to add, it won't feel so lonely ;-) > >Hi Adam, > Silly question but how would I added the Extended Maintenance SIG >there? Yeah sorry, it was more silly that I didn't think to explain that :-) >We really only have >https://docs.openstack.org/project-team-guide/stable-branches.html to >link to but you;d feel less lonely ;P Indeed we would ;-) You can just submit a simple change to www/stein/index.html in openstack-manuals, and then run tox to check the render locally. Here's the self-healing SIG addition for you to copy from: https://review.openstack.org/#/c/628054/2/www/stein/index.html From jungleboyj at gmail.com Fri Jan 11 00:10:43 2019 From: jungleboyj at gmail.com (Jay S. Bryant) Date: Thu, 10 Jan 2019 18:10:43 -0600 Subject: Issues setting up a SolidFire node with Cinder In-Reply-To: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> Message-ID: <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> Grant, So, the copy is failing because it can't find the volume to copy the image into. I would check the host and container for any iSCSI errors as well as the backend.  
It appears that something is going wrong when attempting to temporarily attach the volume to write the image into it. Jay On 1/10/2019 7:16 AM, Grant Morley wrote: > > Hi all, > > We are in the process of trying to add a SolidFire storage solution to > our existing OpenStack setup and seem to have hit a snag with cinder / > iscsi. > > We are trying to create a bootable volume to allow us to launch an > instance from it, but we are getting some errors in our cinder-volumes > containers that seem to suggest they can't connect to iscsi although > the volume seems to create fine on the SolidFire node. > > The command we are running is: > > openstack volume create --image $image-id --size 20 --bootable --type > solidfire sf-volume-v12 > > The volume seems to create on SolidFire but I then see these errors in > the "cinder-volume.log" > > https://pastebin.com/LyjLUhfk > > The volume containers can talk to the iscsi VIP on the SolidFire so I > am a bit stuck and wondered if anyone had come across any issues before? > > Kind Regards, > > > -- > Grant Morley > Cloud Lead > Absolute DevOps Ltd > Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP > www.absolutedevops.io > grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... URL: From duc.openstack at gmail.com Fri Jan 11 00:56:31 2019 From: duc.openstack at gmail.com (Duc Truong) Date: Thu, 10 Jan 2019 16:56:31 -0800 Subject: [senlin] Meeting today at 0530 UTC Message-ID: Everyone, Our regular Senlin meetings are resuming today. This is an even week so the meeting will be happening on Friday at 530 UTC. Regards, Duc From ekcs.openstack at gmail.com Fri Jan 11 01:40:39 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Thu, 10 Jan 2019 17:40:39 -0800 Subject: [congress][infra] override-checkout problem Message-ID: The congress-tempest-plugin zuul jobs against stable branches appear to be working incorrectly. 
Tests that should fail on stable/rocky (and indeed fails when triggered by congress patch [1]) are passing when triggered by congress-tempest-plugin patch [2]. I'd assume it's some kind of zuul misconfiguration in congress-tempest-plugin [3], but I've so far failed to figure out what's wrong. Particularly strange is that the job-output appears to show it checking out the right thing [4]. Any thoughts or suggestions? Thanks so much! [1] https://review.openstack.org/#/c/629070/ http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/testr_results.html.gz The two failing z3 tests should indeed fail because the feature was not available in rocky. The tests were introduced because for some reason they pass in the job triggered by a patch in congress-tempest-plugin. [2] https://review.openstack.org/#/c/618951/ http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/testr_results.html.gz [3] https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yaml#L4 [4] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 shows congress is checked out to the correct commit at the top of the stable/rocky branch. From adriant at catalyst.net.nz Fri Jan 11 06:18:15 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Fri, 11 Jan 2019 19:18:15 +1300 Subject: [tc][all] Project deletion community goal for Train cycle Message-ID: Hello OpenStackers! As discussed at the Berlin Summit, one of the proposed community goals was project deletion and resource clean-up. Essentially the problem here is that for almost any company that is running OpenStack we run into the issue of how to delete a project and all the resources associated with that project. What we need is an OpenStack wide solution that every project supports which allows operators of OpenStack to delete everything related to a given project. 
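A toy sketch of why this needs cross-service coordination: resources must be purged in dependency order, with the project record removed last. The resource types and dependency graph below are purely hypothetical illustrations, not any real OpenStack API.

```python
# Toy model of dependency-ordered project cleanup. Resource names and the
# dependency graph are illustrative only -- a real implementation would
# discover resources owned by the project through each service's API.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# resource type -> resource types that must already be gone before it can be
# deleted (e.g. a network cannot be removed while its ports still exist)
DELETE_AFTER = {
    "server": [],
    "port": ["server"],
    "volume": ["server"],
    "network": ["port"],
    "project": ["volume", "network"],
}

def cleanup_order(deps=DELETE_AFTER):
    """Return a deletion order in which every dependency is respected."""
    return list(TopologicalSorter(deps).static_order())

def delete_project(project_id):
    for rtype in cleanup_order():
        # placeholder for a per-service "purge all <rtype> of <project>" call
        print(f"purging {rtype} resources of project {project_id}")

delete_project("demo")  # servers purged first, the project record last
```

The point of the sketch is that no single service can compute this ordering alone, which is why the goal needs every project to expose (at minimum) "list and delete everything owned by project X".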
Before we can choose this as a goal, we need to define what the actual proposed solution is, and what each service is either implementing or contributing to. I've started an Etherpad here: https://etherpad.openstack.org/p/community-goal-project-deletion Please add to it if I've missed anything about the problem description, or to flesh out the proposed solutions, but try to mostly keep any discussion here on the mailing list, so that the Etherpad can hopefully be more of a summary of where the discussions have led. This is mostly a starting point, and I expect there to be a lot of opinions and probably some push back from doing anything too big. That said, this is a major issue in OpenStack, and something we really do need because OpenStack is too big and too complicated for this not to exist in a smart cross-project manner. Let's solve this the best we can! Cheers, Adrian Turjak From eandersson at blizzard.com Fri Jan 11 08:13:41 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Jan 2019 08:13:41 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com>, <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> Message-ID: <052175D3-F10F-4777-9C93-D2551F801720@blizzard.com> This has worked great for Designate as most of the reviewers have limited time, it has helped us focus on core issues and get critical patches out a lot faster than we otherwise would. 
Sent from my iPhone > On Jan 10, 2019, at 9:14 AM, Ben Nemec wrote: > > > >> On 1/10/19 12:17 AM, Ghanshyam Mann wrote: >> ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- >> > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: >> > > Dear All, >> > > >> > > There are many occasion when we want to priorities some of the patches >> > > whether it is related to unblock the gates or blocking the non freeze >> > > patches during RC. >> > > >> > > So adding the Review-Priority will allow more precise dashboard. As >> > > Designate and Cinder projects already experiencing this[1][2] and after >> > > discussion with Jeremy brought this to ML to interact with these team >> > > before landing [3], as there is possibility that reapply the priority vote >> > > following any substantive updates to change could make it more cumbersome >> > > than it is worth. >> > >> > With Cinder this is fairly new, but I think it is working well so far. The >> > oddity we've run into, that I think you're referring to here, is how those >> > votes carry forward with updates. >> > >> > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when >> This idea looks great and helpful especially for blockers and cycle priority patches to get regular >> review bandwidth from Core or Active members of that project. >> IMO only +ve votes are more appropriate for this label. -1 is little confusing for many reasons like >> what is the difference between Review-Priority -1 and Code-Review -2 ? Review-Priority -1 means, >> it is less priority than 0/not labelled (explicitly setting any patch very less priority). >> After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural >> or technical reason. But that can be done by -2 on Code-Review label. Keeping Review-Priority label >> only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. 
>> Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes >> priority for review but having multiple +ve vote set as per project need/interest is all fine. > > I don't know if this was the reasoning behind Cinder's system, but I know some people object to procedural -2 because it's a big hammer to essentially say "not right now". It overloads the meaning of the vote in a potentially confusing way that requires explanation every time it's used. At least I hope procedural -2's always include a comment. > > Whether adding a whole new vote type is a meaningful improvement is another question, but if we're adding the type anyway for prioritization it might make sense to use it to replace procedural -2. Especially if we could make it so any core can change it (apparently not right now), whereas -2 requires the original core to come back and remove it. > From dharmendra.kushwaha at india.nec.com Fri Jan 11 11:31:16 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Fri, 11 Jan 2019 11:31:16 +0000 Subject: [dev][Tacker] Implementing Multisite VNFFG In-Reply-To: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> References: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Message-ID: Dear Lee, Good point & Thanks for the proposal. Currently no ongoing activity on that. And That will be great help if you lead this feature enhancement. Feel free to join Tacker weekly meeting with some initial drafts. Thanks & Regards Dharmendra Kushwaha From: 이호찬 [mailto:ghcks1000 at gmail.com] Sent: 10 January 2019 13:21 To: openstack-discuss at lists.openstack.org Subject: [dev][Tacker] Implementing Multisite VNFFG Dear Tacker folks, Hello, I'm interested in implementing multisite VNFFG in Tacker project. As far as I know, current single Tacker controller can manage multiple Openstack sites (Multisite VIM), but it can create VNFFG in only singlesite, so it can't create VNFFG across multisite. 
I think if multisite VNFFG is possible, Tacker can have more flexibility in managing VNFs and VNFFGs. In the current Tacker, the networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct a service chain. So, I think extending the current single-site port chaining to multisite can be one solution. Is there any development process for multisite VNFFG in the Tacker project? Otherwise, I wonder whether Tacker is interested in this feature. I want to develop this feature for the Tacker project if I can. Yours sincerely, Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Jan 11 12:57:39 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 11 Jan 2019 21:57:39 +0900 Subject: [congress][infra] override-checkout problem In-Reply-To: References: Message-ID: <1683cfd4d5c.ec4bd8195854.3368279581961683040@ghanshyammann.com> Hi Eric, This seems to be the same issue happening on the congress-tempest-plugin gate, where 'congress-devstack-py35-api-mysql-queens' is failing [1]: python-congressclient could not be installed, and the openstack client throws an error for the congress command. The issue is that stable branch jobs on congress-tempest-plugin check out the master version of every repo instead of what is mentioned in the override-checkout var. If you look at congress's rocky patch, congress is checked out at the rocky version [2], but the congress-tempest-plugin patch's rocky job checks out the master version of congress instead of the rocky version [3]. That is why your test fails as expected on the congress patch but passes on congress-tempest-plugin. The root cause is that the override-checkout var does not work on legacy jobs (it is a zuulv3-only job var, if I am not wrong); you need to use BRANCH_OVERRIDE for legacy jobs. amotoki, akhil and I tried a lot of other workarounds while debugging, but in the end we simply noticed that the congress jobs are legacy jobs using override-checkout :). 
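The distinction gmann describes can be sketched roughly as follows. The job names are illustrative, and the shell lines show the common devstack-gate idiom for legacy jobs, so treat the exact wiring as an assumption to verify against the actual job definitions:

```yaml
# Native Zuul v3 jobs can pin required-project branches directly:
- job:
    name: congress-devstack-api-mysql-rocky
    parent: congress-devstack-api-mysql
    override-checkout: stable/rocky

# Legacy (devstack-gate) jobs ignore override-checkout; the branch has to be
# threaded through the job's run playbook instead, typically as:
#
#   export BRANCH_OVERRIDE=stable/rocky
#   if [ "$BRANCH_OVERRIDE" != "default" ]; then
#       export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
#   fi
```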
I have submitted a testing patch with BRANCH_OVERRIDE for the congress-tempest-plugin queens job [4]. It seems to be working fine, and I can make those patches more formal for merging. Another thing I was discussing with Akhil: the new tests for the builtins feature need another feature flag (different from congressz3.enabled), as that z3 feature exists in Stein onwards only. [1] https://review.openstack.org/#/c/618951/ [2] http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/pip2-freeze.txt.gz [3] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/pip2-freeze.txt.gz [4] https://review.openstack.org/#/q/topic:fix-stable-branch-testing+(status:open+OR+status:merged) -gmann ---- On Fri, 11 Jan 2019 10:40:39 +0900 Eric K wrote ---- > The congress-tempest-plugin zuul jobs against stable branches appear > to be working incorrectly. Tests that should fail on stable/rocky (and > indeed fails when triggered by congress patch [1]) are passing when > triggered by congress-tempest-plugin patch [2]. > > I'd assume it's some kind of zuul misconfiguration in > congress-tempest-plugin [3], but I've so far failed to figure out > what's wrong. Particularly strange is that the job-output appears to > show it checking out the right thing [4]. > > Any thoughts or suggestions? Thanks so much! > > [1] > https://review.openstack.org/#/c/629070/ > http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/testr_results.html.gz > The two failing z3 tests should indeed fail because the feature was > not available in rocky. The tests were introduced because for some > reason they pass in the job triggered by a patch in > congress-tempest-plugin. 
> > [2] > https://review.openstack.org/#/c/618951/ > http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/testr_results.html.gz > > [3] https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yaml#L4 > > [4] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 > shows congress is checked out to the correct commit at the top of the > stable/rocky branch. > > From alfredo.deluca at gmail.com Fri Jan 11 14:01:10 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Fri, 11 Jan 2019 15:01:10 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. So...on horizon I changed the project name from *admin* to *service* and that error disappeared even tho now I have a different erro with network..... is service the project where you run the vm on Magnum? Cheers On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano wrote: > Hi Alfredo, > attached here there is my magnum.conf for queens release > As you can see my heat sections are empty > When you create your cluster, I suggest to check heat logs e magnum logs > for verifyng what is wrong > Ignazio > > > > Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> so. Creating a stack either manually or dashboard works fine. The problem >> seems to be when I create a cluster (kubernetes/swarm) that I got that >> error. >> Maybe the magnum conf it's not properly setup? >> In the heat section of the magnum.conf I have only >> *[heat_client]* >> *region_name = RegionOne* >> *endpoint_type = internalURL* >> >> Cheers >> >> >> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Yes. Next step is to check with ansible. >>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 
2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1 . how did you run the last heat template? By dashboard ? >>>> 2. Using openstack command you can check if ansible configured heat >>>> user/domain correctly >>>> >>>> >>>> It seems a problem related to >>>> heat user rights? >>>> >>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. the problem is >>>>> that I installed openstack with ansible-openstack.... so I can't change >>>>> anything unless I re run everything. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check heat user and domani are c onfigured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 
2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I ll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> HI IGNAZIO >>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Anycase during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>> one.... 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From lajos.katona at ericsson.com Fri Jan 11 14:19:26 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Fri, 11 Jan 2019 14:19:26 +0000 Subject: [L2-Gateway] Message-ID: Hi, I have a question regarding networking-l2gw, specifically l2gw-connection. We have an issue where the hw switch configured by networking-l2gw is slow, so when the l2gw-connection is created the API returns successfully, but the dataplane configuration is not yet ready. Do you think that adding state field to the connection is feasible somehow? By checking the vtep schema (http://www.openvswitch.org/support/dist-docs/vtep.5.html) no such information is available on vtep level. Thanks in advance for the help. Regarads Lajos From rico.lin.guanyu at gmail.com Fri Jan 11 14:26:32 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 11 Jan 2019 22:26:32 +0800 Subject: [all] New Automatic SIG (continue discussion) Message-ID: Dear all To continue the discussion of whether we should have new SIG for autoscaling. I think we already got enough time for this ML [1], and it's time to jump to the next step. 
As we got a lot of positive feedback in ML [1], creating a new SIG and doing some initial work is clearly the next action. Here are some things we can start right now: settling on the name of the SIG, its definition, and its mission. Here's my draft plan: create a SIG named `Automatic SIG`, with the initial mission of improving automatic scaling with (but not limited to) OpenStack. As we discussed in the forum [2], producing scenario tests and documents will be the actions for the initial mission. I assume we will start from scenarios that already provide some basic tests and documents, which we can adapt very soon and use to build up the SIG. The long-term mission of this SIG is to make sure we provide good documentation and test coverage for most automatic functionality. I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can provide more value if more needs arise in the future, just like the example Adam raised of `self-optimizing` from people who are using Watcher [3]. Let me know if you have any concerns about this name. And to clarify, there will definitely be some cross-SIG work between this new SIG and the Self-Healing SIG (there are some common requirements shared by self-healing and autoscaling features). We also need to make sure we do not duplicate the Self-Healing SIG's work. As a start, let's focus only on the autoscaling scenario, and make sure we're doing it right before we move to multiple cases. If there are no objections, I will create the new SIG before next weekend and plan a short session at the Denver Summit and PTG. 
[1] http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000284.html [2] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback [3] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000813.html -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas at lrasc.fr Fri Jan 11 14:28:12 2019 From: nicolas at lrasc.fr (nicolas at lrasc.fr) Date: Fri, 11 Jan 2019 15:28:12 +0100 Subject: [dev][Tacker] Implementing Multisite VNFFG In-Reply-To: References: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Message-ID: <5869a6ccf31f156b7e1dec1ef8969558@lrasc.fr> Hi all, First, I am not involved in tacker or in networking-sfc, so I can be wrong. Just to be sure by 'multiple VIM' you mean multi domains, multi autonomous systems, multi OpenStack/VNFinfra sites that are all different? When it comes to VNFFG over multiple VIM, I think a question is: what does the networking-sfc driver already support? Some other questions: 1. In a single VIM situation, does the networking-sfc driver support VNFFG (or port chaining) over multiple different IP subnets? 2. Does networking-sfc support both IPv4 and IPv6? 3. What routing/steering protocol (NSH, SRv6) does networking-sfc support? 4. How healthy (or up to date) is the development of networking-sfc? I think modifying tacker (or modifying any other VNF Orchestrator that is plugged to an OpenStack VIM with networking-sfc driver installed) alone is not enough. Maybe VNFFG over multiple VIM needs an SDN controller, maybe it needs new feature in the networking-sfc driver or new features in neutron... --- Nicolas On 2019-01-11 12:31, Dharmendra Kushwaha wrote: > Dear Lee, > > Good point & Thanks for the proposal. > > Currently no ongoing activity on that. And That will be great help if you lead this feature enhancement. > > Feel free to join Tacker weekly meeting with some initial drafts. 
> > Thanks & Regards > > Dharmendra Kushwaha > > FROM: 이호찬 [mailto:ghcks1000 at gmail.com] > SENT: 10 January 2019 13:21 > TO: openstack-discuss at lists.openstack.org > SUBJECT: [dev][Tacker] Implementing Multisite VNFFG > > Dear Tacker folks, > > Hello, I'm interested in implementing multisite VNFFG in Tacker project. > > As far as I know, current single Tacker controller can manage multiple Openstack sites (Multisite VIM), but it can create VNFFG in only singlesite, so it can't create VNFFG across multisite. I think if multisite VNFFG is possible, tacker can have more flexibility in managing VNF and VNFFG. > > In the current tacker, networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct service chain. So, I think extending current port chaining in singleiste to multisite can be one solution. > > Is there development process about multisite VNFFG in tacker project? Otherwise, I wonder that tacker is interested in this feature. I want to develop this feature for Tacker project if I can. > > Yours sincerely, > > Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dh3 at sanger.ac.uk Fri Jan 11 14:32:10 2019 From: dh3 at sanger.ac.uk (Dave Holland) Date: Fri, 11 Jan 2019 14:32:10 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> References: <20190109151329.GA7953@sanger.ac.uk> <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> Message-ID: <20190111143210.GE7953@sanger.ac.uk> Thanks Lee, Arne, Thomas for replying. On Thu, Jan 10, 2019 at 01:56:05PM +0000, Lee Yarwood wrote: > What's the underlying version of QEMU being used here? It's qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64 > FWIW I can't recall seeing any performance issues when working on and > verifying this downstream with QEMU 2.10. 
I had wondered about https://bugzilla.redhat.com/1500334 (LUKS driver buffer size) which fits the symptoms, but the fix apparently went in to qemu-kvm-rhev-2.10.0-11.el7 so shouldn't be affecting us. I have a case open with RH Support now and I am keeping my fingers crossed. We will be redeploying this system again shortly with the latest Queens/RHOSP13 package versions, so should end up with qemu-kvm-rhev-2.12.0-18.el7_6.1.x86_64 and I will re-test then. Cheers, Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From pabelanger at redhat.com Fri Jan 11 14:46:23 2019 From: pabelanger at redhat.com (Paul Belanger) Date: Fri, 11 Jan 2019 09:46:23 -0500 Subject: [infra] Updating fedora-latest nodeset to Fedora 29 In-Reply-To: <20190110004306.GA995@fedora19.localdomain> References: <20190110004306.GA995@fedora19.localdomain> Message-ID: <20190111144623.GA29154@localhost.localdomain> On Thu, Jan 10, 2019 at 11:43:06AM +1100, Ian Wienand wrote: > Hi, > > Just a heads up that we're soon switching "fedora-latest" nodes from > Fedora 28 to Fedora 29 [1] (setting up this switch took a bit longer > than usual, see [2]). Presumably if you're using "fedora-latest" you > want the latest Fedora, so this should not be unexpected :) But this > is the first time we're making this transition with the "-latest" > nodeset, so please report any issues. > Great work, just looked at fedora-latest job for windmill and no failures. Thanks! 
Paul

From ignaziocassano at gmail.com Fri Jan 11 14:51:53 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Fri, 11 Jan 2019 15:51:53 +0100
Subject: openstack stack fails
In-Reply-To: References: Message-ID:

Hi Alfredo, I am using the admin project.
If you run the simple heat stack I sent you from the service project, does it work?

Il giorno ven 11 gen 2019 alle ore 15:01 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:

> Hi Ignazio. So... on horizon I changed the project name from *admin* to
> *service* and that error disappeared, even though now I have a different
> error with network.....
> Is service the project where you run the VMs on Magnum?
>
> Cheers
>
> On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano wrote:
>
>> Hi Alfredo,
>> attached is my magnum.conf for the queens release.
>> As you can see, my heat sections are empty.
>> When you create your cluster, I suggest checking the heat logs and magnum
>> logs to verify what is wrong.
>> Ignazio
>>
>> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>
>>> so. Creating a stack either manually or from the dashboard works fine. The
>>> problem seems to be that when I create a cluster (kubernetes/swarm) I get
>>> that error.
>>> Maybe the magnum conf isn't properly set up?
>>> In the heat section of the magnum.conf I have only
>>> *[heat_client]*
>>> *region_name = RegionOne*
>>> *endpoint_type = internalURL*
>>>
>>> Cheers
>>>
>>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < alfredo.deluca at gmail.com> wrote:
>>>
>>>> Yes. Next step is to check with ansible.
>>>> I do think it's some rights somewhere...
>>>> I'll check later. Thanks
>>>>
>>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < ignaziocassano at gmail.com wrote:
>>>>
>>>>> Alfredo,
>>>>> 1. how did you run the last heat template? By dashboard?
>>>>> 2.
Using the openstack command you can check whether ansible configured the heat
>>>>> user/domain correctly.
>>>>>
>>>>> It seems like a problem related to
>>>>> heat user rights?
>>>>>
>>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>>>>
>>>>>> Hi Ignazio. The engine log doesn't say anything... except
>>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202
>>>>>> killed by signal 15
>>>>>> which is the last log from a few days ago.
>>>>>>
>>>>>> While the journal of the heat engine says
>>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started
>>>>>> heat-engine service.
>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]:
>>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226:
>>>>>> SAWarning: Unicode type received non-unicode bind param value
>>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10
>>>>>> occurrences)
>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]:
>>>>>> (util.ellipses_string(value),))
>>>>>>
>>>>>> I also checked the configuration and it seems to be ok. The problem
>>>>>> is that I installed openstack with openstack-ansible.... so I can't change
>>>>>> anything unless I re-run everything.
>>>>>>
>>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < ignaziocassano at gmail.com> wrote:
>>>>>>
>>>>>>> Check that the heat user and domain are configured like the following:
>>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html
>>>>>>>
>>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>>>>>>
>>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error...
>>>>>>>> Authorization failed. Not sure why. I am a bit stuck.
>>>>>>>>
>>>>>>>> On Sun., 23 Dec.
2018, 9:19 pm Alfredo De Luca < alfredo.deluca at gmail.com wrote:
>>>>>>>>
>>>>>>>>> I'll try asap. Thanks
>>>>>>>>>
>>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < ignaziocassano at gmail.com wrote:
>>>>>>>>>
>>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if
>>>>>>>>>> heat is working fine?
>>>>>>>>>> Ignazio
>>>>>>>>>>
>>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>>>>>>>>>
>>>>>>>>>>> HI IGNAZIO
>>>>>>>>>>> The problem is that it doesn't go that far... It fails before even
>>>>>>>>>>> creating the master.
>>>>>>>>>>>
>>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < ignaziocassano at gmail.com wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In any case, during deployment you can connect with ssh to the
>>>>>>>>>>>> master and tail the /var/log/cloud-init output for checking.
>>>>>>>>>>>> Ignazio
>>>>>>>>>>>>
>>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>>>>>>>>>>>
>>>>>>>>>>>>> Ciao Ignazio
>>>>>>>>>>>>> What do you mean with master? You mean the k8s master?
>>>>>>>>>>>>> I guess everything is fine... but I'll double check.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < ignaziocassano at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my
>>>>>>>>>>>>>> answer could help you....
>>>>>>>>>>>>>> Can your master speak with the keystone public endpoint port
>>>>>>>>>>>>>> (5000)?
>>>>>>>>>>>>>> Ignazio
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all.
>>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dprince at redhat.com Fri Jan 11 15:04:06 2019 From: dprince at redhat.com (Dan Prince) Date: Fri, 11 Jan 2019 10:04:06 -0500 Subject: [TripleO] flattening breakages Message-ID: <299e2464dbbe0c73335aedf86f7f206bd5a58a3c.camel@redhat.com> I noticed a few breakages [1][2] today with the flattening effort in the codebase. Specifically we are missing some of the 'monitoring_subscription' sections in the flattened files. We apparently have no CI on these ATM so please be careful in reviewing patches in this regard until (and if) we can add CI on this feature. I fear this type of restructuring is going to break subtle things and highlight what we don't have CI on. Some of the 3rd party vendor integration worries me in that we've got no upstream way of testing this stuff ATM. 
[1] https://review.openstack.org/#/c/630280/ (Ironic) [2] https://review.openstack.org/#/c/630281/ (Aodh) From saphi070 at gmail.com Fri Jan 11 15:16:37 2019 From: saphi070 at gmail.com (Sa Pham) Date: Fri, 11 Jan 2019 22:16:37 +0700 Subject: [all] New Automatic SIG (continue discussion) In-Reply-To: References: Message-ID: +1 from me. On Fri, Jan 11, 2019 at 9:32 PM Rico Lin wrote: > Dear all > > To continue the discussion of whether we should have new SIG for > autoscaling. > > I think we already got enough time for this ML [1], and it's time to jump > to the next step. > As we got a lot of positive feedbacks from ML [1], I think it's definitely > considered an action to create a new SIG, do some init works, and finally > Here are some things that we can start right now, to come out with the > name of SIG, the definition and mission. > > Here's my draft plan: > To create a SIG name `Automatic SIG`, with given initial mission to improve > automatic scaling with (but not limited to) OpenStack. As we discussed in > forum [2], to have scenario tests and documents will be considered as > actions for the initial mission. I gonna assume we will start from > scenarios which already provide some basic tests and documents which we can > adapt very soon and use them to build a SIG environment. And the long-term > mission of this SIG is to make sure we provide good documentation and test > coverage for most automatic functionality. > > I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can > provide more value if there are more needs in the future. Just like the > example which Adam raised `self-optimizing` from people who are > using watcher [3]. > Let me know if you got any concerns about this name. > And to clarify, there will definitely some cross SIG co-work between this > new SIG and Self-Healing SIG (there're some common requirements even across > self-healing and autoscaling features.). 
We also need to make sure we do > not provide any duplicated work against self-healing SIG. > As a start, let's only focus on autoscaling scenario, and make sure we're > doing it right before we move to multiple cases. > > If no objection, I will create the new SIG before next weekend and plan a > short schedule in Denver summit and PTG. > > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000284.html > > [2] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback > [3] > http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000813.html > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- Sa Pham Dang Cloud RnD Team - VCCloud Phone/Telegram: 0986.849.582 Skype: great_bn -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Fri Jan 11 15:16:50 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 11 Jan 2019 16:16:50 +0100 Subject: Issues setting up a SolidFire node with Cinder In-Reply-To: <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> Message-ID: <20190111151650.phmxm22rzigmmgo5@localhost> On 10/01, Jay S. Bryant wrote: > Grant, > > So, the copy is failing because it can't find the volume to copy the image > into. > > I would check the host and container for any iSCSI errors as well as the > backend.  It appears that something is going wrong when attempting to > temporarily attach the volume to write the image into it. > > Jay Hi, I've also seen this error when the initiator name in /etc/iscsi/initiatorname.iscsi inside the container does not match the one in use by the iscsid initiator daemon. This can happen because the initiator name was changed after the daemon started or because it is not shared between the container and the host. 
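That mismatch can be checked mechanically: compare the name in the container's /etc/iscsi/initiatorname.iscsi with the one the host's iscsid is actually using (visible, for instance, in `iscsiadm -m session -P 1` output). A rough pure-Python sketch of the comparison — the file layout assumed is the standard open-iscsi one, and the helper names are mine, not an official tool:

```python
def parse_initiator_name(text):
    """Extract the IQN from initiatorname.iscsi content.

    The file normally looks like:
        ## DO NOT EDIT OR REMOVE THIS FILE!
        InitiatorName=iqn.1994-05.com.redhat:c6f7d4c7e2a
    """
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("InitiatorName="):
            return line.split("=", 1)[1].strip()
    return None


def names_match(container_file_text, daemon_reported_name):
    """True only if the container's file and the daemon agree on the IQN."""
    name = parse_initiator_name(container_file_text)
    return name is not None and name == daemon_reported_name
```

If `names_match()` returns False for your deployment, restarting iscsid (or sharing the file into the container) before retrying the attach is the usual remedy.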
I've also seen this happen (though this is not the case here) on VM migrations
when the driver has a bug and doesn't return the right connection information
(returns the first one).

I would recommend setting the log level to debug to see additional info from
OS-Brick.

I've debugged these types of issues many times, and if it's not a production
env I usually go with:

- Setting a breakpoint in the OS-Brick code: I stop at the right place and
  check the state of the system and how the volume has been exported and
  mapped in the backend.

- Installing cinderlib in a virtualenv (with --system-site-packages) on the
  cinder node, then using cinderlib to create a volume and debug an attach
  operation, same as in the previous step, like this:

  * Prepare the env:

    $ virtualenv --system-site-packages venv
    $ source venv/bin/activate
    (venv) $ pip install cinderlib

  * Run the python interpreter:

    (venv) $ python

  * Initialize cinderlib to store volumes in ./cl3.sqlite:

    >>> import cinderlib as cl
    >>> db_connection = 'sqlite:///cl3.sqlite'
    >>> persistence_config = {'storage': 'db', 'connection': db_connection}
    >>> cl.setup(persistence_config=persistence_config,
    ...          disable_logs=False, debug=True)

  * Set up the backend. You'll have to use your own configuration here:

    >>> sf = cl.Backend(
    ...     volume_backend_name='solidfire',
    ...     volume_driver='cinder.volume.drivers.solidfire.SolidFireDriver',
    ...     san_ip='192.168.1.4',
    ...     san_login='admin',
    ...     san_password='admin_password',
    ...     sf_allow_template_caching=False)

  * Create a 1GB empty volume:

    >>> vol = sf.create_volume(1)

  * Debug the attachment:

    >>> import pdb
    >>> pdb.run('att = vol.attach()')

- If it's a container, I usually execute a bash terminal interactively,
  pip install cinderlib, and do the debugging as in the step above.

Cheers,
Gorka.
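For repeated use, Gorka's interactive session can be collapsed into one small script. This is only a sketch of those steps: the backend values are the placeholders from his example (not real credentials), and `debug_attach()` needs cinderlib installed plus a reachable SolidFire, so it is defined but deliberately not executed at import time.

```python
def solidfire_config(san_ip, san_login, san_password):
    # Option names/values taken from the interactive session above;
    # replace the credentials with your own.
    return {
        "volume_backend_name": "solidfire",
        "volume_driver": "cinder.volume.drivers.solidfire.SolidFireDriver",
        "san_ip": san_ip,
        "san_login": san_login,
        "san_password": san_password,
        "sf_allow_template_caching": False,
    }


def debug_attach(backend_kwargs, db_path="cl3.sqlite"):
    """Create a 1GB volume and drop into pdb around the attach.

    Requires `pip install cinderlib` and network access to the backend,
    which is why the imports live inside the function.
    """
    import pdb
    import cinderlib as cl

    cl.setup(persistence_config={"storage": "db",
                                 "connection": "sqlite:///" + db_path},
             disable_logs=False, debug=True)
    backend = cl.Backend(**backend_kwargs)
    vol = backend.create_volume(1)                  # 1GB empty volume
    # Pass vol explicitly: pdb.run() evaluates in the given namespace.
    pdb.run("att = vol.attach()", {"vol": vol})
    return vol
```

Calling `debug_attach(solidfire_config("192.168.1.4", "admin", "admin_password"))` then reproduces the walkthrough in a single step.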
> > On 1/10/2019 7:16 AM, Grant Morley wrote: > > > > Hi all, > > > > We are in the process of trying to add a SolidFire storage solution to > > our existing OpenStack setup and seem to have hit a snag with cinder / > > iscsi. > > > > We are trying to create a bootable volume to allow us to launch an > > instance from it, but we are getting some errors in our cinder-volumes > > containers that seem to suggest they can't connect to iscsi although the > > volume seems to create fine on the SolidFire node. > > > > The command we are running is: > > > > openstack volume create --image $image-id --size 20 --bootable --type > > solidfire sf-volume-v12 > > > > The volume seems to create on SolidFire but I then see these errors in > > the "cinder-volume.log" > > > > https://pastebin.com/LyjLUhfk > > > > The volume containers can talk to the iscsi VIP on the SolidFire so I am > > a bit stuck and wondered if anyone had come across any issues before? > > > > Kind Regards, > > > > > > -- > > Grant Morley > > Cloud Lead > > Absolute DevOps Ltd > > Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP > > www.absolutedevops.io > > grant at absolutedevops.io 0845 874 0580 From geguileo at redhat.com Fri Jan 11 15:23:18 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 11 Jan 2019 16:23:18 +0100 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: <20190111152318.ztuwirfgypehdfp6@localhost> On 10/01, Brandon Caulder wrote: > Hi Iain, > > There are 424 rows in volumes which drops down to 185 after running > cinder-manage db purge 1. Restarting the volume service after package > upgrade and running sync again does not remediate the problem, although > running db sync a second time does bump the version up to 117, the > following appears in the volume.log... 
> > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > Hi, If I understand correctly the steps were: - Run DB sync --> Fail - Run DB purge - Restart volume services - See the log error - Run DB sync --> version proceeds to 117 If that is the case, could you restart the services again now that the migration has been moved to version 117? If the cinder-volume service is able to restart please run the online data migrations with the service running. Cheers, Gorka. > Thanks > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell > wrote: > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > happens that both pertain to shared targets. > > > > Brandon, might you have a very large number of rows in your volumes > > table? Have you been purging soft-deleted rows? > > > > ~iain > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > Brandon, > > > > > > I am thinking you are hitting this bug: > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > I think you can work around it by retrying the migration with the volume > > > service running. You may, however, want to check with Iain MacDonnell > > > as he has been looking at this for a while. > > > > > > Thanks! > > > Jay > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > >> Hi, > > >> > > >> I am receiving the following error when performing an offline upgrade > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > >> openstack-cinder-1:12.0.3-1.el7. 
> > >> > > >> # cinder-manage db version > > >> 105 > > >> > > >> # cinder-manage --debug db sync > > >> Error during database migration: (pymysql.err.OperationalError) (2013, > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > >> {'shared_targets': 1}] > > >> > > >> # cinder-manage db version > > >> 114 > > >> > > >> The db version does not upgrade to queens version 117. Any help would > > >> be appreciated. > > >> > > >> Thank you > > > > > > > From alfredo.deluca at gmail.com Fri Jan 11 15:38:55 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Fri, 11 Jan 2019 16:38:55 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: nope. I created another one and I got this error... Create_Failed: Resource CREATE failed: ValueError: resources.my_instance: nics are required after microversion 2.36 On Fri, Jan 11, 2019 at 3:52 PM Ignazio Cassano wrote: > Hi Alfredo, I am using admin project. > If your run the simple heat stack I sent you from service projects, it > works ? > > > Il giorno ven 11 gen 2019 alle ore 15:01 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> Hi Ignazio. So...on horizon I changed the project name from *admin* to >> *service* and that error disappeared even tho now I have a different >> erro with network..... >> is service the project where you run the vm on Magnum? >> >> Cheers >> >> >> >> On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano >> wrote: >> >>> Hi Alfredo, >>> attached here there is my magnum.conf for queens release >>> As you can see my heat sections are empty >>> When you create your cluster, I suggest to check heat logs e magnum >>> logs for verifyng what is wrong >>> Ignazio >>> >>> >>> >>> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> so. Creating a stack either manually or dashboard works fine. 
The >>>> problem seems to be when I create a cluster (kubernetes/swarm) that I got >>>> that error. >>>> Maybe the magnum conf it's not properly setup? >>>> In the heat section of the magnum.conf I have only >>>> *[heat_client]* >>>> *region_name = RegionOne* >>>> *endpoint_type = internalURL* >>>> >>>> Cheers >>>> >>>> >>>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >>>> alfredo.deluca at gmail.com> wrote: >>>> >>>>> Yes. Next step is to check with ansible. >>>>> I do think it's some rights somewhere... >>>>> I'll check later. Thanks >>>>> >>>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < >>>>> ignaziocassano at gmail.com wrote: >>>>> >>>>>> Alfredo, >>>>>> 1 . how did you run the last heat template? By dashboard ? >>>>>> 2. Using openstack command you can check if ansible configured heat >>>>>> user/domain correctly >>>>>> >>>>>> >>>>>> It seems a problem related to >>>>>> heat user rights? >>>>>> >>>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child >>>>>>> 4202 killed by signal 15 >>>>>>> which is last log from a few days ago. >>>>>>> >>>>>>> While the journal of the heat engine says >>>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>>>> heat-engine service. >>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>>>> SAWarning: Unicode type received non-unicode bind param value >>>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>>>> occurrences) >>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>>> (util.ellipses_string(value),)) >>>>>>> >>>>>>> >>>>>>> I also checked the configuration and it seems to be ok. 
the problem >>>>>>> is that I installed openstack with ansible-openstack.... so I can't change >>>>>>> anything unless I re run everything. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Check heat user and domani are c onfigured like at the following: >>>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>>>> >>>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>>>> >>>>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>>>> >>>>>>>>>> I ll try asap. Thanks >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>>>> heat is working fine? >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> HI IGNAZIO >>>>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>>>> creating the master. >>>>>>>>>>>> >>>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Anycase during deployment you can connect with ssh to the >>>>>>>>>>>>> master and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>>>> What do you mean with master? you mean k8s master? 
>>>>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my >>>>>>>>>>>>>>> answer could help you.... >>>>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Jan 11 15:44:13 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 11 Jan 2019 15:44:13 +0000 (GMT) Subject: [placement] update 19-01 Message-ID: HTML: https://anticdent.org/placement-update-19-01.html Hello! Here's placement update 19-01. Not a ton to report this week, so this will mostly be updating the lists provided last week. 
# Most Important As mentioned last week, there will be a meeting next week to discuss what is left before we can pull the trigger on [deleting the placement code from nova](https://review.openstack.org/618215). Wednesday is looking like a good day, perhaps at 1700UTC, but we'll need to confirm that on Monday when more people are around. Feel free to respond on this thread if that won't work for you (and suggest an alternative). Since deleting the code is dependent on deployment tooling being able to handle extracted placement (and upgrades to it), reviewing that work is important (see below). # What's Changed * It was nova's spec freeze this week, so a lot of effort was spent getting some specs reviewed and merged. That's reflected in the shorter specs section, below. * Placement had a release and was published to [pypi](https://pypi.org/project/openstack-placement/). This was a good excuse to write (yet another) blog post on [how easy it is to play with](https://anticdent.org/placement-from-pypi.html). # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 14. -1. * [In progress placement bugs](https://goo.gl/vzGGDQ) 16. +1 # Specs With spec freeze this week, this will be the last time we'll see this section until near the end of this cycle. Only one of the specs listed last week merged (placement for counting quota). * Account for host agg allocation ratio in placement (Still in rocky/) * Add subtree filter for GET /resource_providers * Resource provider - request group mapping in allocation candidate * VMware: place instances on resource pool (still in rocky/) * Standardize CPU resource tracking * Allow overcommit of dedicated CPU (Has an alternative which changes allocations to a float) * Modelling passthrough devices for report to placement * Nova Cyborg interaction specification. 
* supporting virtual NVDIMM devices * Proposes NUMA topology with RPs * Count quota based on resource class * Adds spec for instance live resize * Provider config YAML file * Resource modeling in cyborg. * Support filtering of allocation_candidates by forbidden aggregates * support virtual persistent memory # Main Themes ## Making Nested Useful I've been saying for a few weeks that "progress continues on gpu-reshaping for libvirt and xen" but it looks like the work at: * is actually stalled. Anyone have some insight on the status of that work? Also making use of nested is bandwidth-resource-provider: * There's a [review guide](http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001129.html) for those patches. Eric's in the process of doing lots of cleanups to how often the ProviderTree in the resource tracker is checked against placement, and a variety of other "let's make this more right" changes in the same neighborhood: * Stack at: ## Extraction Besides the meeting mentioned above, I've refactored the extraction etherpad to make a [new version](https://etherpad.openstack.org/p/placement-extract-stein-5) that has less noise in it so the required actions are a bit more clear. The tasks remain much the same as mentioned last week: the reshaper work mentioned above and the work to get deployment tools operating with an extracted placement: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) Loci's change to have an extracted placement has merged. Kolla has a patch to [include the upgrade script](https://review.openstack.org/#/q/topic:upgrade-placement). It raises the question of how or if the `mysql-migrate-db.sh` should be distributed. Should it maybe end up in the pypi distribution? 
(The rest of this section is duplicated from last week.) Documentation tuneups: * Release-notes: This is blocked until we refactor the release notes to reflect _now_ better. * The main remaining task here is participating in [openstack-manuals](https://docs.openstack.org/doc-contrib-guide/doc-index.html), to that end: * A stack of changes to nova to remove placement from the install docs. * Install docs in placement. I wrote to the [mailing list](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001379.html) asking for input on making sure these things are close to correct, especially with regard to distro-specific things like package names. * Change to openstack-manuals to assert that placement is publishing install docs. Depends on the above. * There is a patch to [delete placement](https://review.openstack.org/#/c/618215/) from nova that we've put an administrative -2 on while we determine where things are (see about the meeting above). * There's a pending patch to support [online data migrations](https://review.openstack.org/#/c/624942/). This is important to make sure that fixup commands like `create_incomplete_consumers` can be safely removed from nova and implemented in placement. # Other There are still 13 [open changes](https://review.openstack.org/#/q/project:openstack/placement+status:open) in placement itself. Most of the time critical work is happening elsewhere (notably the deployment tool changes listed above). Of those placement changes, the [database-related](https://review.openstack.org/#/q/owner:nakamura.tetsuro%2540lab.ntt.co.jp+status:open+project:openstack/placement) ones from Tetsuro are the most important. 
Outside of placement: * Neutron minimum bandwidth implementation * zun: Use placement for unified resource management * WIP: add Placement aggregates tests (in tempest) * blazar: Consider the number of reservation inventory * Add placement client for basic GET operations (to tempest) # End If anyone has submitted, or is planning to, a proposal for summit that is placement-related, it would be great to hear about it. I had thought about doing a resilient placement in kubernetes with cockroachdb for the edge sort of thing, but then realized my motivations were suspect and I have enough to do otherwise. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From colleen at gazlene.net Fri Jan 11 15:44:39 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Jan 2019 16:44:39 +0100 Subject: [dev][keystone] Keystone Team Update - Week of 7 January 2019 Message-ID: <1547221479.1146713.1631988432.52724E11@webmail.messagingengine.com> # Keystone Team Update - Week of 7 January 2019 Happy new year! We are ramping back up following the holidays. ## News ### Cross-Project Limits Followup We are trying to close in on a stable API for limits and want to restart the discussion on what the other projects need from it. Please chime in on the thread or the linked reviews[1]. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001518.html ## Open Specs Stein specs: https://bit.ly/2Pi6dGj Ongoing specs: https://bit.ly/2OyDLTh Spec freeze is today but we have two open specs for still open for Stein, we will need to decide whether to push them or grant exceptions for them, keeping in mind there is not much time left for implementation at this point. ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 27 changes this week. ## Changes that need Attention Search query: https://bit.ly/2RLApdA There are 103 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. 
Lance's patch bomb of doom still needs more review attention. ## Bugs Since the last report we opened 7 new bugs and closed 10. Bugs opened (7) Bug #1810485 (keystone:Medium) opened by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810485 Bug #1810983 (keystone:Medium) opened by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810983 Bug #1809779 (keystone:Undecided) opened by Yang Youseok https://bugs.launchpad.net/keystone/+bug/1809779 Bug #1810393 (keystone:Undecided) opened by wangxiyuan https://bugs.launchpad.net/keystone/+bug/1810393 Bug #1810278 (keystonemiddleware:Undecided) opened by Yang Youseok https://bugs.launchpad.net/keystonemiddleware/+bug/1810278 Bug #1810761 (keystonemiddleware:Undecided) opened by Hugo Kou https://bugs.launchpad.net/keystonemiddleware/+bug/1810761 Bug #1811351 (python-keystoneclient:Undecided) opened by Colleen Murphy https://bugs.launchpad.net/python-keystoneclient/+bug/1811351 Bugs closed (2) Bug #1809779 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1809779 Bug #1810761 (keystonemiddleware:Undecided) https://bugs.launchpad.net/keystonemiddleware/+bug/1810761 Bugs fixed (8) Bug #1805403 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1805403 Bug #1810485 (keystone:Medium) fixed by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810485 Bug #1810983 (keystone:Medium) fixed by no one https://bugs.launchpad.net/keystone/+bug/1810983 Bug #1786594 (keystone:Low) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1786594 Bug #1793374 (keystone:Low) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1793374 Bug #1810393 (keystone:Undecided) fixed by wangxiyuan https://bugs.launchpad.net/keystone/+bug/1810393 Bug #1809101 (keystonemiddleware:Undecided) fixed by leehom https://bugs.launchpad.net/keystonemiddleware/+bug/1809101 Bug #1807184 (oslo.policy:Medium) fixed by Brian Rosmaita https://bugs.launchpad.net/oslo.policy/+bug/1807184 ## Milestone 
Outlook

https://releases.openstack.org/stein/schedule.html

Spec freeze is today. The feature proposal freeze is at the end of this month, with feature freeze just five weeks after.

## Help with this newsletter

Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter

Dashboard generated using gerrit-dash-creator and https://gist.github.com/lbragstad/9b0477289177743d1ebfc276d1697b67

From aspiers at suse.com Fri Jan 11 16:14:17 2019
From: aspiers at suse.com (Adam Spiers)
Date: Fri, 11 Jan 2019 16:14:17 +0000
Subject: [all][meta-sig] New Automatic SIG (continue discussion)
In-Reply-To: 
References: 
Message-ID: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall>

Rico Lin wrote:
>Dear all
>
>To continue the discussion of whether we should have new SIG for
>autoscaling.
>
>I think we already got enough time for this ML [1], and it's time to jump
>to the next step.
>As we got a lot of positive feedbacks from ML [1], I think it's definitely
>considered an action to create a new SIG, do some init works, and finally
>Here are some things that we can start right now, to come out with the name
>of SIG, the definition and mission.
>
>Here's my draft plan:
>To create a SIG name `Automatic SIG`, with given initial mission to improve
>automatic scaling with (but not limited to) OpenStack. As we discussed in
>forum [2], to have scenario tests and documents will be considered as
>actions for the initial mission. I gonna assume we will start from
>scenarios which already provide some basic tests and documents which we can
>adapt very soon and use them to build a SIG environment. And the long-term
>mission of this SIG is to make sure we provide good documentation and test
>coverage for most automatic functionality.
>
>I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can
>provide more value if there are more needs in the future.
>Just like the example which Adam raised `self-optimizing` from people who
>are using watcher [3].
>Let me know if you got any concerns about this name.

I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound quite
right to me, because it's not clear what is being automated. For example
from the outside people might think it was a SIG about CI, or about
automated testing, or both - or even some kind of automatic creation of
new SIGs ;-)

Here are some alternative suggestions:

- Optimization SIG
- Self-optimization SIG
- Auto-optimization SIG
- Adaptive Cloud SIG
- Self-adaption SIG
- Auto-adaption SIG
- Auto-configuration SIG

although I'm not sure these are a huge improvement on "Autoscaling SIG" -
maybe some are too broad, or too vague. It depends on how likely it is
that the scope will go beyond just auto-scaling. Of course you could also
just stick with the original idea of "Auto-scaling" :-)

>And to clarify, there will definitely some cross SIG co-work between this
>new SIG and Self-Healing SIG (there're some common requirements even across
>self-healing and autoscaling features.). We also need to make sure we do
>not provide any duplicated work against self-healing SIG.
>As a start, let's only focus on autoscaling scenario, and make sure we're
>doing it right before we move to multiple cases.

Sounds good!

>If no objection, I will create the new SIG before next weekend and plan a
>short schedule in Denver summit and PTG.

Thanks for driving this!

From geguileo at redhat.com Fri Jan 11 16:16:45 2019
From: geguileo at redhat.com (Gorka Eguileor)
Date: Fri, 11 Jan 2019 17:16:45 +0100
Subject: [Openstack-discuss][Cinder] Fail to mount the volume on the target node
In-Reply-To: 
References: 
Message-ID: <20190111161645.6lljpxjf66jgsbby@localhost>

On 31/12, Minjun Hong wrote:
> Hi.
> After I installed Cinder, I have had a problem which I cannot make
> instances with volume storage.
> I created an instance on Horizon and it always has failed.
> Actually, I'm using Xen as the hypervisor and there was not any special log > about Nova. > But, in the Xen's log (/var/log/xen/bootloader.5.log), I saw the hypervisor > cannot find the volume which is provided by Cinder: > > Traceback (most recent call last): > > File "/usr/lib/xen/bin/pygrub", line 929, in > > raise RuntimeError, "Unable to find partition containing kernel" > > RuntimeError: Unable to find partition containing kernel > > > And, I also found something noticeable in the Cinder's log > ('/var/log/cinder/cinder-volume.log' on the Block storage node): > > 2018-12-31 04:08:11.189 12380 INFO cinder.volume.manager > > [req-93eb0ad3-6c6c-4842-851f-435e15d8639b bb1e571e4d64462bac80654b153a88c3 > > 96ad10a59d114042b8f1ee82c438649a - default default] Attaching volume > > 4c21b8f1-ff07-4916-9692-e74759635978 to instance > > bea7dca6-fb04-4791-bac9-3ad560280cc3 at mountpoint /dev/xvda on host None. > > > It seems that Cinder cannot receive information of the target node ('on > host None' above) so, I think it can cause the problem that Cinder fails to > provide the volume due to lack of the host information. > Since I could not find any other logs except that log, I need more hints. > Please give me some help > > Thanks! > Regard, Hi, The "on host None" message looks like Nova is either not sending the "host" key in the connector information or is sending it set to '' or None. You'd need to see the logs in DEBUG level to know which it is. And that is strange, because the "host" key is set by os-brick when Nova calls the "get_connector_properties": props['host'] = host if host else socket.gethostname() So even if Nova doesn't have the "host" config option set, os-brick should get the hostname of the node. But from Cinder's perspective I don't think that's necessarily a problem. How was the volume created? Because that looks like a problem with the contents of the volume, as it is not complaining about not being able to map/export it or attach it. 
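The host fallback described above can be sketched as follows. This is a simplified model of the behaviour, not the real get_connector_properties(), which builds a much larger dict of connector properties:

```python
import socket

def connector_host(host=None):
    # Simplified sketch: os-brick uses the host value Nova passes in,
    # and falls back to the node's hostname when it is None or ''.
    return host if host else socket.gethostname()

# An explicitly configured host wins:
print(connector_host("compute-01"))
# With no host configured, the local hostname is used:
print(connector_host(None) == socket.gethostname())
```

So a "host None" in Cinder's logs suggests the connector information never went through this fallback, or the value was dropped afterwards.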
Cheers,
Gorka.

From emilien at redhat.com Fri Jan 11 16:20:48 2019
From: emilien at redhat.com (Emilien Macchi)
Date: Fri, 11 Jan 2019 11:20:48 -0500
Subject: [RHEL8-OSP15] Container Runtimes integration - Status report #7
Message-ID: 

Welcome to the seventh status report on the progress we are making integrating Container Runtimes into Red Hat OpenStack Platform, version 15.
You can read the previous report here:
http://post-office.corp.redhat.com/archives/container-teams/2018-December/msg00090.html
Our efforts are tracked here: https://trello.com/b/S8TmOU0u/tripleo-podman

TL;DR
===========================================
- Some OSP folks will meet in Brno next week to work together on RHEL8/OSP15. See [1].
- We have replaced the Docker healthchecks with systemd timers when Podman is deployed. We are now figuring out the next steps [2].
- Slow progress on the Python-based uploader (using tar-split + buildah), slowed by bugs.
- We are waiting for podman 1.0 so we can build / test / ship it in TripleO CI.

Context reminder
===========================================
The OpenStack team is preparing the 15th version of Red Hat OpenStack Platform, which will run on RHEL8.
We are working together to support the future Container Runtimes that replace Docker.
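The timer-based healthchecks mentioned in the TL;DR can be sketched roughly like this. The unit names, container name, and healthcheck command below are illustrative placeholders, not the units TripleO actually generates:

```ini
# example_healthcheck.service -- a oneshot service that runs the
# container's healthcheck command via podman exec.
[Unit]
Description=Healthcheck for the example container
Requisite=example_container.service

[Service]
Type=oneshot
ExecStart=/usr/bin/podman exec example_container /bin/healthcheck
```

```ini
# example_healthcheck.timer -- fires the service periodically, playing
# the role that Docker's built-in HEALTHCHECK interval used to play.
[Unit]
Description=Periodic healthcheck timer for the example container

[Timer]
OnActiveSec=120
OnUnitActiveSec=60

[Install]
WantedBy=timers.target
```

The check result then lands in the service's exit status and the journal, where monitoring can pick it up.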
Done
===========================================
- Implemented Podman healthchecks with SystemD timers: https://review.openstack.org/#/c/620372/
- Renamed SystemD services controlling Podman containers to not conflict with baremetal services https://review.openstack.org/#/c/623241/
- podman issues (reported by us) closed:
  - pull: error setting new rlimits: operation not permitted https://github.com/containers/libpod/issues/2123
  - New podman version introduce new issue with selinux and relabelling: relabel failed "/run/netns": operation not supported https://github.com/containers/libpod/issues/2034
  - container create failed: container_linux.go:336: starting container process caused "setup user: permission denied" https://github.com/containers/libpod/issues/1980
  - "podman inspect --type image --format exists " reports a not-friendly error when image doesn't exist in local storage https://github.com/containers/libpod/issues/1845
  - container create failed: container_linux.go:336: starting container process caused "process_linux.go:293: applying cgroup configuration for process caused open /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus: no such file or directory" https://github.com/containers/libpod/issues/1841
- paunch/runner: test if image exists before running inspect https://review.openstack.org/#/c/619313/
- Fixing a bunch of issues with docker-puppet.py to reduce the chances of race conditions.
- A lot of SELinux work, to make everything work in Enforcing mode.
- tar-split packaging is done, and will be consumed in TripleO for the Python image uploader.

In progress
===========================================
- Still investigating standard_init_linux.go:203: exec user process caused \"no such file or directory\" [5]. This one is nasty and painful. It involves concurrency and we are evaluating solutions, but we'll probably end up reducing the default multi-processing of podman commands from 6 to 3 by default.
- Investigating ways to gate new versions of Podman + dependencies: https://review.rdoproject.org/r/#/c/17960/
- Investigating how to consume systemd timers in sensu (healthchecks) [2]
- Investigating and prototyping a pattern to safely spawn a container from a container with systemd https://review.openstack.org/#/c/620062
- Investigating how we can prune Docker data when upgrading from Docker to Podman https://review.openstack.org/#/c/620405/
- Using the new "podman image exist" in Paunch https://review.openstack.org/#/c/619313/
- Still implementing a Python-based container uploader (using tar-split and buildah) - this method will be the default later: https://review.openstack.org/#/c/616018/
- Testing future Podman 1.0 in TripleO [3]
- Helping the Skydive team to migrate to Podman [4]

Blocked
===========================================
Podman 1.0 contains a lot of fixes that we need (from libpod and vendored as well).

Any comment or feedback is welcome, thanks for reading!

[1] https://docs.google.com/document/d/18-1M1eSnlls6j2Op2TxyvyuqoOksxmwHOhqaD6B8FQY/edit
[2] https://trello.com/c/g6bi5DQF/4-healthchecks
[3] https://trello.com/c/2tXNLJUN/58-test-podman-10
[4] https://trello.com/c/tW935FGe/56-migrate-ansible-skydive-to-podman
[5] https://github.com/containers/libpod/issues/1844
--
Emilien Macchi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From openstack at nemebean.com Fri Jan 11 16:24:42 2019
From: openstack at nemebean.com (Ben Nemec)
Date: Fri, 11 Jan 2019 10:24:42 -0600
Subject: [all][meta-sig] New Automatic SIG (continue discussion)
In-Reply-To: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall>
References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall>
Message-ID: <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com>

On 1/11/19 10:14 AM, Adam Spiers wrote:
> Rico Lin wrote:
>> Dear all
>>
>> To continue the discussion of whether we should have new SIG for
>> autoscaling.
>> I think we already got enough time for this ML  [1], and it's time to >> jump to the next step. As we got a lot of positive feedbacks from ML >> [1], I think it's definitely considered an action to create a new SIG, >> do some init works, and finally Here are some things that we can start >> right now, to come out with the name of SIG, the definition and mission. >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given >> initial mission to improve automatic scaling with (but not limited to) >> OpenStack. As we discussed in forum [2], to have scenario tests and >> documents will be considered as actions for the initial mission. I >> gonna assume we will start from scenarios which already provide some >> basic tests and documents which we can adapt very soon and use them to >> build a SIG environment. And the long-term mission of this SIG is to >> make sure we provide good documentation and test coverage for most >> automatic functionality. >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we >> can provide more value if there are more needs in the future. Just >> like the example which Adam raised `self-optimizing` from people who >> are using watcher [3]. Let me know if you got any concerns about this >> name. > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > quite right to me, because it's not clear what is being automated. For > example from the outside people might think it was a SIG about CI, or > about automated testing, or both - or even some kind of automatic > creation of new SIGs ;-) > Here are some alternative suggestions: > - Optimization SIG > - Self-optimization SIG > - Auto-optimization SIG > - Adaptive Cloud SIG > - Self-adaption SIG > - Auto-adaption SIG > - Auto-configuration SIG > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > - maybe some are too broad, or too vague.  It depends on how likely it > is that the scope will go beyond just auto-scaling.  
> Of course you could
> also just stick with the original idea of "Auto-scaling" :-)

I'm inclined to argue that limiting the scope of this SIG is actually a
feature, not a bug. Better to have a tightly focused SIG that has very
specific, achievable goals than to try to boil the ocean by solving all
of the auto* problems in OpenStack. We all know how "one SIG to rule
them all" ends. ;-)

>> And to clarify, there will definitely some cross SIG co-work between
>> this new SIG and Self-Healing SIG (there're some common requirements
>> even across self-healing and autoscaling features.). We also need to
>> make sure we do not provide any duplicated work against self-healing
>> SIG. As a start, let's only focus on autoscaling scenario, and make
>> sure we're doing it right before we move to multiple cases.
>
> Sounds good!

>> If no objection, I will create the new SIG before next weekend and
>> plan a short schedule in Denver summit and PTG.
>
> Thanks for driving this!

From kbcaulder at gmail.com Fri Jan 11 16:25:55 2019
From: kbcaulder at gmail.com (Brandon Caulder)
Date: Fri, 11 Jan 2019 08:25:55 -0800
Subject: [cinder] db sync error upgrading from pike to queens
In-Reply-To: <20190111152318.ztuwirfgypehdfp6@localhost>
References: <20190111152318.ztuwirfgypehdfp6@localhost>
Message-ID: 

Hi,

The steps were...

- purge
- shutdown cinder-scheduler, cinder-api
- upgrade software
- restart cinder-volume
- sync (upgrade fails and stops at v114)
- sync again (db upgrades to v117)
- restart cinder-volume
- stacktrace observed in volume.log

Thanks

On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor wrote:
> On 10/01, Brandon Caulder wrote:
> > Hi Iain,
> >
> > There are 424 rows in volumes which drops down to 185 after running
> > cinder-manage db purge 1. Restarting the volume service after package
> > upgrade and running sync again does not remediate the problem, although
> > running db sync a second time does bump the version up to 117, the
> > following appears in the volume.log...
> > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > Hi, > > If I understand correctly the steps were: > > - Run DB sync --> Fail > - Run DB purge > - Restart volume services > - See the log error > - Run DB sync --> version proceeds to 117 > > If that is the case, could you restart the services again now that the > migration has been moved to version 117? > > If the cinder-volume service is able to restart please run the online > data migrations with the service running. > > Cheers, > Gorka. > > > > Thanks > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > iain.macdonnell at oracle.com> > > wrote: > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > > happens that both pertain to shared targets. > > > > > > Brandon, might you have a very large number of rows in your volumes > > > table? Have you been purging soft-deleted rows? > > > > > > ~iain > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > Brandon, > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > I think you can work around it by retrying the migration with the > volume > > > > service running. You may, however, want to check with Iain > MacDonnell > > > > as he has been looking at this for a while. > > > > > > > > Thanks! > > > > Jay > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > >> Hi, > > > >> > > > >> I am receiving the following error when performing an offline > upgrade > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > >> openstack-cinder-1:12.0.3-1.el7. 
> > > >> > > > >> # cinder-manage db version > > > >> 105 > > > >> > > > >> # cinder-manage --debug db sync > > > >> Error during database migration: (pymysql.err.OperationalError) > (2013, > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > volumes > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > >> {'shared_targets': 1}] > > > >> > > > >> # cinder-manage db version > > > >> 114 > > > >> > > > >> The db version does not upgrade to queens version 117. Any help > would > > > >> be appreciated. > > > >> > > > >> Thank you > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Fri Jan 11 16:26:08 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Fri, 11 Jan 2019 11:26:08 -0500 Subject: [ops] OpenStack Operators Meetup, March 2019 Message-ID: Dear All The OpenStack Ops Meetups team is pleased to announce preliminary details for the next Ops Meetup. The event will be held March 7th and 8th in Berlin, Germany and is being hosted by Deutsche Telekom(DT). We thank them for their kind offer to host this event. The exact venue has not yet been decided but DT has two similar facilities both reserved at present and they will be working out shortly which one works better for them. DT's proposal is here https://etherpad.openstack.org/p/ops-meetup-venue-discuss-1st-2019-berlin The meetups team will be sharing the planning docs for the technical agenda in the next few weeks. So far, there has been interest expressed in having a research track at this meetup alongside the general track. Please let us know ASAP if that is of interest. Looking forward to seeing operators in Berlin! Chris Morgan (on behalf of the ops meetups team) -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From emilien at redhat.com Fri Jan 11 16:32:21 2019
From: emilien at redhat.com (Emilien Macchi)
Date: Fri, 11 Jan 2019 11:32:21 -0500
Subject: [RHEL8-OSP15] Container Runtimes integration - Status report #7
In-Reply-To: 
References: 
Message-ID: 

I didn't mean to send that to this list, but whatever. Nothing is confidential in that email, except a few links that nobody cares about.
What I realized, though, is that it's time to communicate this effort in public, which was impossible for me until now because of RHEL8. For the next edition, I will send it to this list so anyone interested in podman can take a look. I'm also available for any questions if needed.
Thanks & sorry for the noise.
Emilien

On Fri, Jan 11, 2019 at 11:20 AM Emilien Macchi wrote:
> Welcome to the seventh status report about the progress we make to
> Container Runtimes into Red Hat OpenStack Platform, version 15.
> You can read the previous report here:
>
> http://post-office.corp.redhat.com/archives/container-teams/2018-December/msg00090.html
> Our efforts are tracked here: https://trello.com/b/S8TmOU0u/tripleo-podman
>
>
> TL;DR
> ===========================================
> - Some OSP folks will meet in Brno next week, to work together on
> RHEL8/OSP15. See [1].
> - We have replaced the Docker Healthchecks by SystemD timers when Podman
> is deployed. Now figuring out the next steps [2].
> - Slow progress on the Python-based uploader (using tar-split + buildah),
> slowed by bugs.
> - We are waiting for podman 1.0 so we can build / test / ship it in
> TripleO CI.
>
> Context reminder
> ===========================================
> The OpenStack team is preparing the 15th version of Red Hat OpenStack
> Platform that will work on RHEL8.
> We are working together to support the future Container Runtimes which
> replace Docker.
> > Done > =========================================== > - Implemented Podman healthchecks with SystemD timers: > https://review.openstack.org/#/c/620372/ > - Renamed SystemD services controlling Podman containers to not conflict > with baremetal services https://review.openstack.org/#/c/623241/ > - podman issues (reported by us) closed: > - pull: error setting new rlimits: operation not permitted > https://github.com/containers/libpod/issues/2123 > - New podman version introduce new issue with selinux and relabelling: > relabel failed "/run/netns": operation not supported > https://github.com/containers/libpod/issues/2034 > - container create failed: container_linux.go:336: starting container > process caused "setup user: permission denied" > https://github.com/containers/libpod/issues/1980 > - "podman inspect --type image --format exists " reports a > not-friendly error when image doesn't exist in local storage > https://github.com/containers/libpod/issues/1845 > - container create failed: container_linux.go:336: starting container > process caused "process_linux.go:293: applying cgroup configuration for > process caused open /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus: no > such file or directory" https://github.com/containers/libpod/issues/1841 > - paunch/runner: test if image exists before running inspect > https://review.openstack.org/#/c/619313/ > - Fixing a bunch of issues with docker-puppet.py to reduce chances of race > conditions. > - A lot of SElinux work, to make everything working in Enforced mode. > - tar-split packaging is done, and will be consumed in TripleO for the > python image uplaoded > > In progress > =========================================== > - Still investigating standard_init_linux.go:203: exec user process caused > \"no such file or directory\" [5]. This one is nasty and painful. 
It > involves concurrency and we are evaluating solutions, but we'll probably > end up reduce the default multi-processing of podman commands from 6 to 3 > by default. > - Investigating ways to gate new versions of Podman + dependencies: > https://review.rdoproject.org/r/#/c/17960/ > - Investigating how to consume systemd timers in sensu (healtchecks) [2] > - Investigating and prototyping a pattern to safely spawn a container from > a container with systemd https://review.openstack.org/#/c/620062 > - Investigating how we can prune Docker data when upgrading from Docker to > Podman https://review.openstack.org/#/c/620405/ > - Using the new "podman image exist" in Paunch > https://review.openstack.org/#/c/619313/ > - Still implementing a Python-based container uploader (using tar-split > and buildah) - this method will be the default later: > https://review.openstack.org/#/c/616018/ > - Testing future Podman 1.0 in TripleO [3] > - Help the Skydive team to migrate to Podman [4] > > Blocked > =========================================== > Podman 1.0 contains a lot of fixes that we need (from libpod and vendored > as well). > > Any comment or feedback is welcome, thanks for reading! > > [1] > https://docs.google.com/document/d/18-1M1eSnlls6j2Op2TxyvyuqoOksxmwHOhqaD6B8FQY/edit > [2] https://trello.com/c/g6bi5DQF/4-healthchecks > [3] https://trello.com/c/2tXNLJUN/58-test-podman-10 > [4] https://trello.com/c/tW935FGe/56-migrate-ansible-skydive-to-podman > [5] https://github.com/containers/libpod/issues/1844 > -- > Emilien Macchi > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From sfinucan at redhat.com Fri Jan 11 16:40:29 2019
From: sfinucan at redhat.com (Stephen Finucane)
Date: Fri, 11 Jan 2019 16:40:29 +0000
Subject: [nova] Retiring gantt, python-ganttclient projects
Message-ID: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com>

Hey,

These projects are mega old, don't appear to have been official projects, and should have been retired a long time ago. This serves as a heads up on the off-chance someone has managed to do something with them.

Stephen

From jungleboyj at gmail.com Fri Jan 11 16:45:56 2019
From: jungleboyj at gmail.com (Jay Bryant)
Date: Fri, 11 Jan 2019 10:45:56 -0600
Subject: Issues setting up a SolidFire node with Cinder
In-Reply-To: <5738052c-10c5-10db-e63c-7aee351db87c@absolutedevops.io>
References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> <2b527e54-1cfc-e2c3-31ea-b3d64225a9cb@gmail.com> <5738052c-10c5-10db-e63c-7aee351db87c@absolutedevops.io>
Message-ID: 

Grant,

Ah, if you are using a different VLAN for your storage traffic then that is likely the cause of the problem. Good luck getting the networking issue resolved.

Jay

On 1/11/2019 9:12 AM, Grant Morley wrote:
>
> Jay,
>
> Thanks for that info. It appears that the cinder-volume service can
> speak to the SolidFire over the network but for some reason it can't
> actually access it over iSCSI. I think it might be something to do
> with how we are tagging / untagging VLANs.
>
> Thank you for your help, I think I am heading in the right direction now!
>
> Kind Regards,
>
> Grant
>
> On 11/01/2019 14:44, Jay Bryant wrote:
>>
>> Grant,
>>
>> Doing the boot from volume is actually quite different than attaching
>> a volume to an instance.
>>
>> In the case that you are doing the boot from volume (assuming that
>> your glance storage is not in the Solidfire) the volume is created
>> and attached to where the cinder-volume service is running. Then the
>> image is written into the volume.
>> >> Have you verified that the host and container that is running >> cinder-volume is able to access the Solidfire backend? >> >> Jay >> >> On 1/11/2019 4:45 AM, Grant Morley wrote: >>> >>> Hi Jay, >>> >>> Thanks for the tip there. I am still having some trouble with it >>> which is really annoying. The strange thing is, I can launch a >>> volume and attach it to an instance absolutely fine. The only issue >>> I am having is literally creating this bootable volume. >>> >>> I assume creating a volume and attaching it to an instance is >>> exactly the same as creating a bootable volume minus the Nova part? >>> >>> I would just expect nothing to work if nothing could speak to the >>> SolidFire. >>> >>> Would it make a difference if the current image that is being copied >>> over to the bootable volume is in a ceph cluster? I know glance >>> should deal with it but I am wondering if the copy of the image is >>> the actual issue? >>> >>> Thanks again, >>> >>> On 11/01/2019 00:10, Jay S. Bryant wrote: >>>> >>>> Grant, >>>> >>>> So, the copy is failing because it can't find the volume to copy >>>> the image into. >>>> >>>> I would check the host and container for any iSCSI errors as well >>>> as the backend.  It appears that something is going wrong when >>>> attempting to temporarily attach the volume to write the image into it. >>>> >>>> Jay >>>> >>>> On 1/10/2019 7:16 AM, Grant Morley wrote: >>>>> >>>>> Hi all, >>>>> >>>>> We are in the process of trying to add a SolidFire storage >>>>> solution to our existing OpenStack setup and seem to have hit a >>>>> snag with cinder / iscsi. >>>>> >>>>> We are trying to create a bootable volume to allow us to launch an >>>>> instance from it, but we are getting some errors in our >>>>> cinder-volumes containers that seem to suggest they can't connect >>>>> to iscsi although the volume seems to create fine on the SolidFire >>>>> node. 
>>>>>
>>>>> The command we are running is:
>>>>>
>>>>> openstack volume create --image $image-id --size 20 --bootable
>>>>> --type solidfire sf-volume-v12
>>>>>
>>>>> The volume seems to create on SolidFire but I then see these
>>>>> errors in the "cinder-volume.log"
>>>>>
>>>>> https://pastebin.com/LyjLUhfk
>>>>>
>>>>> The volume containers can talk to the iscsi VIP on the SolidFire
>>>>> so I am a bit stuck and wondered if anyone had come across any
>>>>> issues before?
>>>>>
>>>>> Kind Regards,
>>>>>
>>>>>
>>>>> --
>>>>> Grant Morley
>>>>> Cloud Lead
>>>>> Absolute DevOps Ltd
>>>>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
>>>>> www.absolutedevops.io
>>>>> grant at absolutedevops.io 0845 874
>>>>> 0580
>>> --
>>> Grant Morley
>>> Cloud Lead
>>> Absolute DevOps Ltd
>>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
>>> www.absolutedevops.io
>>> grant at absolutedevops.io 0845 874 0580
> --
> Grant Morley
> Cloud Lead
> Absolute DevOps Ltd
> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP
> www.absolutedevops.io
> grant at absolutedevops.io 0845 874 0580
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dirk at dmllr.de Fri Jan 11 17:11:03 2019
From: dirk at dmllr.de (Dirk Müller)
Date: Fri, 11 Jan 2019 18:11:03 +0100
Subject: [self-healing-sig] best practices for haproxy health checking
Message-ID: 

Hi,

Does anyone have a good pointer for good healthchecks to be used by the frontend API haproxy loadbalancer?

In one case that I am looking at right now, the entry haproxy loadbalancer was not able to detect that a particular backend was not responding to API requests, so it flipped up and down repeatedly, causing intermittent spurious 503 errors.

The backend was able to respond to connections and to basic HTTP GET requests (e.g. / or even /v3 as path), but when it got a "real" query it hung.
the reason for that was, as it turned out, the configured caching backend memcached on that machine being locked up (due to some other bug). I wonder if there is a better way to check if a backend is "working" and what the best practices around this are. A potential thought I had was to do the backend check via some other healthcheck specific port that runs a custom daemon that does more sophisticated checks like checking for system wide errors (like memcache, database, rabbitmq) being unavailable on that node, and hence not accepting any api traffic until that is being resolved. Any pointers to read upon / best practices appreciated. Thanks, Dirk From duc.openstack at gmail.com Fri Jan 11 17:14:03 2019 From: duc.openstack at gmail.com (Duc Truong) Date: Fri, 11 Jan 2019 09:14:03 -0800 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: +1 on limiting the scope to autoscaling at first. I prefer the name autoscaling since the mission is to improve automatic scaling. If the mission is changed later, we can change the name of the SIG to reflect that. On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: > > > > On 1/11/19 10:14 AM, Adam Spiers wrote: > > Rico Lin wrote: > >> Dear all > >> > >> To continue the discussion of whether we should have new SIG for > >> autoscaling. > >> I think we already got enough time for this ML [1], and it's time to > >> jump to the next step. As we got a lot of positive feedbacks from ML > >> [1], I think it's definitely considered an action to create a new SIG, > >> do some init works, and finally Here are some things that we can start > >> right now, to come out with the name of SIG, the definition and mission. 
> >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given > >> initial mission to improve automatic scaling with (but not limited to) > >> OpenStack. As we discussed in forum [2], to have scenario tests and > >> documents will be considered as actions for the initial mission. I > >> gonna assume we will start from scenarios which already provide some > >> basic tests and documents which we can adapt very soon and use them to > >> build a SIG environment. And the long-term mission of this SIG is to > >> make sure we provide good documentation and test coverage for most > >> automatic functionality. > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we > >> can provide more value if there are more needs in the future. Just > >> like the example which Adam raised `self-optimizing` from people who > >> are using watcher [3]. Let me know if you got any concerns about this > >> name. > > > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > > quite right to me, because it's not clear what is being automated. For > > example from the outside people might think it was a SIG about CI, or > > about automated testing, or both - or even some kind of automatic > > creation of new SIGs ;-) > > Here are some alternative suggestions: > > - Optimization SIG > > - Self-optimization SIG > > - Auto-optimization SIG > > - Adaptive Cloud SIG > > - Self-adaption SIG > > - Auto-adaption SIG > > - Auto-configuration SIG > > > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > > - maybe some are too broad, or too vague. It depends on how likely it > > is that the scope will go beyond just auto-scaling. Of course you could > > also just stick with the original idea of "Auto-scaling" :-) > > I'm inclined to argue that limiting the scope of this SIG is actually a > feature, not a bug. 
Better to have a tightly focused SIG that has very > specific, achievable goals than to try to boil the ocean by solving all > of the auto* problems in OpenStack. We all know how "one SIG to rule > them all" ends. ;-) > > >> And to clarify, there will definitely some cross SIG co-work between > >> this new SIG and Self-Healing SIG (there're some common requirements > >> even across self-healing and autoscaling features.). We also need to > >> make sure we do not provide any duplicated work against self-healing > >> SIG. As a start, let's only focus on autoscaling scenario, and make > >> sure we're doing it right before we move to multiple cases. > > > > Sounds good! > >> If no objection, I will create the new SIG before next weekend and > >> plan a short schedule in Denver summit and PTG. > > > > Thanks for driving this! > From rfolco at redhat.com Fri Jan 11 17:27:01 2019 From: rfolco at redhat.com (Rafael Folco) Date: Fri, 11 Jan 2019 15:27:01 -0200 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 Message-ID: Greetings, The TripleO CI team has just completed Sprint 24 (Dec 20 thru Jan 09). The following is a summary of completed work during this sprint cycle: - Created Zuul configuration and changed repository scripts for the new Fedora 28 promotion pipeline, including container build jobs. - Replaced multinode scenarios (1-4) with standalone scenarios (1-4) jobs across TripleO projects. Also fixed missing services for standalone scenario jobs. A few changes are still “in-flight” and are close to merging. - Tempest is now successfully running on Fedora 28 standalone jobs. - Improved the reproducer solution using upstream Zuul containers by moving code to an ansible role in rdo-infra/ansible-role-tripleo-ci-reproducer and automating the libvirt setup. - Created a new OVB workflow without te-broker, moved the OVB repo from github to gerrit, and did a PoC with the new reproducer and OVB jobs. The te-broker is no longer part of the OVB workflow. 
The planned work for the next sprint [1] is: - Apply changes for the Fedora 28 promotion pipeline in the production environment to start collecting logs for the container build job. - Complete the transition from multinode scenarios (1-4) to standalone jobs across all TripleO projects. - Improve the new Zuul container reproducer by automating nodepool config for libvirt. - Enable CI on OVB under ovb’s new git repo in the openstack namespace. - Refactor the upstream zuul job configuration to consolidate file parameters into one repo, openstack-infra/tripleo-ci. - Begin to move the RDO-Phase2 Baremetal jobs to upstream tripleo. The Ruck and Rover for this sprint are Arx Cruz (arxcruz) and Sorin Sbarnea (ssbarnea). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whoever has the ‘|ruck’ suffix on their nick. Notes are recorded on etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-4 [2] https://review.rdoproject.org/etherpad/p/ruckrover-sprint25 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Fri Jan 11 17:31:34 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 11:31:34 -0600 Subject: [self-healing-sig] best practices for haproxy health checking In-Reply-To: References: Message-ID: On 1/11/19 11:11 AM, Dirk Müller wrote: > Hi, > > Does anyone have a good pointer for good healthchecks to be used by > the frontend api haproxy loadbalancer? > > in one case that I am looking at right now, the entry haproxy > loadbalancer was not able > to detect a particular backend being not responding to api requests, > so it flipped up and down repeatedly, causing intermittend spurious > 503 errors. > > The backend was able to respond to connections and to basic HTTP GET > requests (e.g. / or even /v3 as path), but when it got a "real" query > it hung. 
the reason for that was, as it turned out, > the configured caching backend memcached on that machine being locked > up (due to some other bug). > > I wonder if there is a better way to check if a backend is "working" > and what the best practices around this are. A potential thought I had > was to do the backend check via some other healthcheck specific port > that runs a custom daemon that does more sophisticated checks like > checking for system wide errors (like memcache, database, rabbitmq) > being unavailable on that node, and hence not accepting any api > traffic until that is being resolved. A very similar thing has been proposed: https://review.openstack.org/#/c/531456/ It also came up as a possible community goal for Train: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html But to my knowledge no one has stepped forward to drive the work. It seems to be something people generally agree we need, but nobody has time to do. :-( > > Any pointers to read upon / best practices appreciated. > > Thanks, > Dirk > From openstack at nemebean.com Fri Jan 11 17:34:10 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 11:34:10 -0600 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 In-Reply-To: References: Message-ID: On 1/11/19 11:27 AM, Rafael Folco wrote: > moved OVB repo from github to gerrit This is news to me. ;-) The work to do the gerrit import is still underway, but should be done soon: https://review.openstack.org/#/c/620613/ I have a bit more cleanup to do in the github repo and then we can proceed. 
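The dedicated health-check daemon Dirk describes in the thread above can be sketched in a few lines. This is purely an illustrative sketch, not the implementation from the linked review: the port number, endpoint, and backend list are hypothetical, and a real deployment would check more than bare TCP connectivity.

```python
# Sketch of a "dedicated health-check port" daemon: a tiny HTTP endpoint
# that haproxy can poll, returning 503 unless every local dependency
# (memcached, database, rabbitmq) accepts TCP connections.
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of dependencies this API node needs before it
# should receive traffic (names and ports are illustrative).
BACKENDS = {
    "memcached": ("127.0.0.1", 11211),
    "mysql": ("127.0.0.1", 3306),
    "rabbitmq": ("127.0.0.1", 5672),
}

def check_tcp(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def health_status(backends=BACKENDS):
    """Run all checks; return (http_status, {name: bool})."""
    results = {name: check_tcp(h, p) for name, (h, p) in backends.items()}
    status = 200 if all(results.values()) else 503
    return status, results

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, results = health_status()
        body = "\n".join(
            f"{name}: {'ok' if up else 'FAIL'}" for name, up in results.items()
        )
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(body.encode())

# To actually serve (haproxy would then poll this port):
#   HTTPServer(("0.0.0.0", 8778), HealthHandler).serve_forever()
```

haproxy could then be pointed at this port with something like `option httpchk` plus `check port 8778` on each `server` line, so an API process that is up but has lost its memcached stops flapping between up and down instead of returning intermittent 503s.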
-Ben From mrhillsman at gmail.com Fri Jan 11 17:56:16 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Fri, 11 Jan 2019 11:56:16 -0600 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: +1 SIGs should have limited scope - shared interest in a particular area - even if that area is something broad like security the mission and work should be specific which could lead to working groups, additional SIGs, projects, etc so I want to be careful how I word it but yes limited scope is the ideal way to start a SIG imo. On Fri, Jan 11, 2019 at 11:14 AM Duc Truong wrote: > +1 on limiting the scope to autoscaling at first. I prefer the name > autoscaling since the mission is to improve automatic scaling. If the > mission is changed later, we can change the name of the SIG to reflect > that. > > On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: > > > > > > > > On 1/11/19 10:14 AM, Adam Spiers wrote: > > > Rico Lin wrote: > > >> Dear all > > >> > > >> To continue the discussion of whether we should have new SIG for > > >> autoscaling. > > >> I think we already got enough time for this ML [1], and it's time to > > >> jump to the next step. As we got a lot of positive feedbacks from ML > > >> [1], I think it's definitely considered an action to create a new SIG, > > >> do some init works, and finally Here are some things that we can start > > >> right now, to come out with the name of SIG, the definition and > mission. > > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given > > >> initial mission to improve automatic scaling with (but not limited to) > > >> OpenStack. As we discussed in forum [2], to have scenario tests and > > >> documents will be considered as actions for the initial mission. 
I > > >> gonna assume we will start from scenarios which already provide some > > >> basic tests and documents which we can adapt very soon and use them to > > >> build a SIG environment. And the long-term mission of this SIG is to > > >> make sure we provide good documentation and test coverage for most > > >> automatic functionality. > > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we > > >> can provide more value if there are more needs in the future. Just > > >> like the example which Adam raised `self-optimizing` from people who > > >> are using watcher [3]. Let me know if you got any concerns about this > > >> name. > > > > > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > > > quite right to me, because it's not clear what is being automated. For > > > example from the outside people might think it was a SIG about CI, or > > > about automated testing, or both - or even some kind of automatic > > > creation of new SIGs ;-) > > > Here are some alternative suggestions: > > > - Optimization SIG > > > - Self-optimization SIG > > > - Auto-optimization SIG > > > - Adaptive Cloud SIG > > > - Self-adaption SIG > > > - Auto-adaption SIG > > > - Auto-configuration SIG > > > > > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > > > - maybe some are too broad, or too vague. It depends on how likely it > > > is that the scope will go beyond just auto-scaling. Of course you > could > > > also just stick with the original idea of "Auto-scaling" :-) > > > > I'm inclined to argue that limiting the scope of this SIG is actually a > > feature, not a bug. Better to have a tightly focused SIG that has very > > specific, achievable goals than to try to boil the ocean by solving all > > of the auto* problems in OpenStack. We all know how "one SIG to rule > > them all" ends. 
;-) > > > > >> And to clarify, there will definitely some cross SIG co-work between > > >> this new SIG and Self-Healing SIG (there're some common requirements > > >> even across self-healing and autoscaling features.). We also need to > > >> make sure we do not provide any duplicated work against self-healing > > >> SIG. As a start, let's only focus on autoscaling scenario, and make > > >> sure we're doing it right before we move to multiple cases. > > > > > > Sounds good! > > >> If no objection, I will create the new SIG before next weekend and > > >> plan a short schedule in Denver summit and PTG. > > > > > > Thanks for driving this! > > > > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 11 18:26:43 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 11 Jan 2019 11:26:43 -0700 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 In-Reply-To: References: Message-ID: On Fri, Jan 11, 2019 at 10:38 AM Ben Nemec wrote: > > > On 1/11/19 11:27 AM, Rafael Folco wrote: > > moved OVB repo from github to gerrit > Same, that should be in plan for this sprint. Just a misunderstanding that I should have caught. My current understanding is that OVB is in the process of moving to the openstack namespace, and Sagi is prepping CI for it. Thanks Ben! > > This is news to me. ;-) > > The work to do the gerrit import is still underway, but should be done > soon: https://review.openstack.org/#/c/620613/ > > I have a bit more cleanup to do in the github repo and then we can proceed. > > -Ben > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aspiers at suse.com Fri Jan 11 18:51:56 2019 From: aspiers at suse.com (Adam Spiers) Date: Fri, 11 Jan 2019 18:51:56 +0000 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> Fine by me - sounds like we have a consensus for autoscaling then? Melvin Hillsman wrote: >+1 SIGs should have limited scope - shared interest in a particular area - >even if that area is something broad like security the mission and work >should be specific which could lead to working groups, additional SIGs, >projects, etc so I want to be careful how I word it but yes limited scope >is the ideal way to start a SIG imo. > >On Fri, Jan 11, 2019 at 11:14 AM Duc Truong wrote: > >> +1 on limiting the scope to autoscaling at first. I prefer the name >> autoscaling since the mission is to improve automatic scaling. If the >> mission is changed later, we can change the name of the SIG to reflect >> that. >> >> On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: >> > >> > >> > >> > On 1/11/19 10:14 AM, Adam Spiers wrote: >> > > Rico Lin wrote: >> > >> Dear all >> > >> >> > >> To continue the discussion of whether we should have new SIG for >> > >> autoscaling. >> > >> I think we already got enough time for this ML [1], and it's time to >> > >> jump to the next step. As we got a lot of positive feedbacks from ML >> > >> [1], I think it's definitely considered an action to create a new SIG, >> > >> do some init works, and finally Here are some things that we can start >> > >> right now, to come out with the name of SIG, the definition and >> mission. >> > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given >> > >> initial mission to improve automatic scaling with (but not limited to) >> > >> OpenStack. 
As we discussed in forum [2], to have scenario tests and >> > >> documents will be considered as actions for the initial mission. I >> > >> gonna assume we will start from scenarios which already provide some >> > >> basic tests and documents which we can adapt very soon and use them to >> > >> build a SIG environment. And the long-term mission of this SIG is to >> > >> make sure we provide good documentation and test coverage for most >> > >> automatic functionality. >> > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we >> > >> can provide more value if there are more needs in the future. Just >> > >> like the example which Adam raised `self-optimizing` from people who >> > >> are using watcher [3]. Let me know if you got any concerns about this >> > >> name. >> > > >> > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound >> > > quite right to me, because it's not clear what is being automated. For >> > > example from the outside people might think it was a SIG about CI, or >> > > about automated testing, or both - or even some kind of automatic >> > > creation of new SIGs ;-) >> > > Here are some alternative suggestions: >> > > - Optimization SIG >> > > - Self-optimization SIG >> > > - Auto-optimization SIG >> > > - Adaptive Cloud SIG >> > > - Self-adaption SIG >> > > - Auto-adaption SIG >> > > - Auto-configuration SIG >> > > >> > > although I'm not sure these are a huge improvement on "Autoscaling SIG" >> > > - maybe some are too broad, or too vague. It depends on how likely it >> > > is that the scope will go beyond just auto-scaling. Of course you >> could >> > > also just stick with the original idea of "Auto-scaling" :-) >> > >> > I'm inclined to argue that limiting the scope of this SIG is actually a >> > feature, not a bug. Better to have a tightly focused SIG that has very >> > specific, achievable goals than to try to boil the ocean by solving all >> > of the auto* problems in OpenStack. 
We all know how "one SIG to rule >> > them all" ends. ;-) >> > >> > >> And to clarify, there will definitely some cross SIG co-work between >> > >> this new SIG and Self-Healing SIG (there're some common requirements >> > >> even across self-healing and autoscaling features.). We also need to >> > >> make sure we do not provide any duplicated work against self-healing >> > >> SIG. As a start, let's only focus on autoscaling scenario, and make >> > >> sure we're doing it right before we move to multiple cases. >> > > >> > > Sounds good! >> > >> If no objection, I will create the new SIG before next weekend and >> > >> plan a short schedule in Denver summit and PTG. >> > > >> > > Thanks for driving this! >> > >> >> > >-- >Kind regards, > >Melvin Hillsman >mrhillsman at gmail.com >mobile: (832) 264-2646 From openstack at nemebean.com Fri Jan 11 21:39:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 15:39:04 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> Message-ID: <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> On 1/10/19 5:03 PM, Sean Mooney wrote: > On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: >>> >>> I don't know if this was the reasoning behind Cinder's system, but I know >>> some people object to procedural -2 because it's a big hammer to essentially >>> say "not right now". It overloads the meaning of the vote in a potentially >>> confusing way that requires explanation every time it's used. At least I >>> hope procedural -2's always include a comment. >>> >> >> This was exactly the reasoning. -2 is overloaded, but its primary meaning >> was/is "we do not want this code change". 
It just happens that it was also a >> convenient way to say that with "right now" at the end. >> >> The Review-Priority -1 is a clear way to say whether something is held because >> it can't be merged right now due to procedural or process reasons, versus >> something that we just don't want at all. > for what its worth my understanding of why a procdural -2 is more correct is that this change > cannot be merged because it has not met the procedual requirement to be considerd for this release. > haveing received several over the years i have never seen it to carry any malaise > or weight then the zuul pep8 job complianing about the line lenght of my code. > with either a procedural -2 or a verify -1 from zuul my code is equally un mergeable. > > the prime example being a patch that requires a spec that has not been approved. > while most cores will not approve chage when other cores have left a -1 mistakes happen > and the -2 does emphasise the point that even if the code is perfect under the porject > processes this change should not be acitvly reporposed until the issue raised by the -2 > has been addressed. In the case of a procedual -2 that typically means the spec is merge > or the master branch opens for the next cycle. > > i agree that procedural -2's can seam harsh at first glance but i have also never seen one > left without a comment explaining why it was left. the issue with a procedural -1 is i can > jsut resubmit the patch several times and it can get lost in the comments. I don't think that's a problem with this new field. It sounds like priority -1 carries over from PS to PS. > > we recently intoduced a new review priority lable > if we really wanted to disabiguate form normal -2s then we coudl have an explcitly lable for it > but i personally would prefer to keep procedural -2s. 
To be clear, I have both used and received procedural -2's as well and they don't particularly bother me, but I can see where if you were someone who was new to the community or just a part-time contributor not as familiar with our processes it might be an unpleasant experience to see that -2 show up. As I said, I don't know that I would advocate for this on the basis of replacing procedural -2 alone, but if we're adding the category anyway I mildly prefer using it for procedural blockers in the future. From smooney at redhat.com Fri Jan 11 22:09:44 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 11 Jan 2019 22:09:44 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> Message-ID: On Fri, 2019-01-11 at 15:39 -0600, Ben Nemec wrote: > > On 1/10/19 5:03 PM, Sean Mooney wrote: > > On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: > > > > > > > > I don't know if this was the reasoning behind Cinder's system, but I know > > > > some people object to procedural -2 because it's a big hammer to essentially > > > > say "not right now". It overloads the meaning of the vote in a potentially > > > > confusing way that requires explanation every time it's used. At least I > > > > hope procedural -2's always include a comment. > > > > > > > > > > This was exactly the reasoning. -2 is overloaded, but its primary meaning > > > was/is "we do not want this code change". It just happens that it was also a > > > convenient way to say that with "right now" at the end. 
> > > > > > The Review-Priority -1 is a clear way to say whether something is held because > > > it can't be merged right now due to procedural or process reasons, versus > > > something that we just don't want at all. > > > > for what its worth my understanding of why a procdural -2 is more correct is that this change > > cannot be merged because it has not met the procedual requirement to be considerd for this release. > > haveing received several over the years i have never seen it to carry any malaise > > or weight then the zuul pep8 job complianing about the line lenght of my code. > > with either a procedural -2 or a verify -1 from zuul my code is equally un mergeable. > > > > the prime example being a patch that requires a spec that has not been approved. > > while most cores will not approve chage when other cores have left a -1 mistakes happen > > and the -2 does emphasise the point that even if the code is perfect under the porject > > processes this change should not be acitvly reporposed until the issue raised by the -2 > > has been addressed. In the case of a procedual -2 that typically means the spec is merge > > or the master branch opens for the next cycle. > > > > i agree that procedural -2's can seam harsh at first glance but i have also never seen one > > left without a comment explaining why it was left. the issue with a procedural -1 is i can > > jsut resubmit the patch several times and it can get lost in the comments. > > I don't think that's a problem with this new field. It sounds like > priority -1 carries over from PS to PS. > > > > > we recently intoduced a new review priority lable > > if we really wanted to disabiguate form normal -2s then we coudl have an explcitly lable for it > > but i personally would prefer to keep procedural -2s. 
> > To be clear, I have both used and received procedural -2's as well and > they don't particularly bother me, but I can see where if you were > someone who was new to the community or just a part-time contributor not > as familiar with our processes it might be an unpleasant experience to > see that -2 show up. As I said, I don't know that I would advocate for > this on the basis of replacing procedural -2 alone, but if we're adding > the category anyway I mildly prefer using it for procedural blockers in > the future. i think i partially misunderstood the proposal. i had parsed it as replacing a procedural code review -2 with a code review -1 rather than a review priority -1. if all projects adopt review priority going forward then that might make sense; for those that don't, i think a code review -2 still makes sense. From rico.lin.guanyu at gmail.com Sat Jan 12 00:36:32 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Sat, 12 Jan 2019 08:36:32 +0800 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> Message-ID: Adam Spiers wrote on Sat, Jan 12, 2019 at 2:59 AM: > Fine by me - sounds like we have a consensus for autoscaling then? I think “Autoscaling SIG” gets the majority vote. Let’s give it a few more days for people in different time zones. > > Melvin Hillsman wrote: > >+1 SIGs should have limited scope - shared interest in a particular area - > >even if that area is something broad like security the mission and work > >should be specific which could lead to working groups, additional SIGs, > >projects, etc so I want to be careful how I word it but yes limited scope > >is the ideal way to start a SIG imo. 
> > > >On Fri, Jan 11, 2019 at 11:14 AM Duc Truong > wrote: > > > >> +1 on limiting the scope to autoscaling at first. I prefer the name > >> autoscaling since the mission is to improve automatic scaling. If the > >> mission is changed later, we can change the name of the SIG to reflect > >> that. > >> > >> On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec > wrote: > >> > > >> > > >> > > >> > On 1/11/19 10:14 AM, Adam Spiers wrote: > >> > > Rico Lin wrote: > >> > >> Dear all > >> > >> > >> > >> To continue the discussion of whether we should have new SIG for > >> > >> autoscaling. > >> > >> I think we already got enough time for this ML [1], and it's time > to > >> > >> jump to the next step. As we got a lot of positive feedbacks from > ML > >> > >> [1], I think it's definitely considered an action to create a new > SIG, > >> > >> do some init works, and finally Here are some things that we can > start > >> > >> right now, to come out with the name of SIG, the definition and > >> mission. > >> > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with > given > >> > >> initial mission to improve automatic scaling with (but not limited > to) > >> > >> OpenStack. As we discussed in forum [2], to have scenario tests and > >> > >> documents will be considered as actions for the initial mission. I > >> > >> gonna assume we will start from scenarios which already provide > some > >> > >> basic tests and documents which we can adapt very soon and use > them to > >> > >> build a SIG environment. And the long-term mission of this SIG is > to > >> > >> make sure we provide good documentation and test coverage for most > >> > >> automatic functionality. > >> > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make > sure we > >> > >> can provide more value if there are more needs in the future. Just > >> > >> like the example which Adam raised `self-optimizing` from people > who > >> > >> are using watcher [3]. 
Let me know if you got any concerns about > this > >> > >> name. > >> > > > >> > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > >> > > quite right to me, because it's not clear what is being automated. > For > >> > > example from the outside people might think it was a SIG about CI, > or > >> > > about automated testing, or both - or even some kind of automatic > >> > > creation of new SIGs ;-) > >> > > Here are some alternative suggestions: > >> > > - Optimization SIG > >> > > - Self-optimization SIG > >> > > - Auto-optimization SIG > >> > > - Adaptive Cloud SIG > >> > > - Self-adaption SIG > >> > > - Auto-adaption SIG > >> > > - Auto-configuration SIG > >> > > > >> > > although I'm not sure these are a huge improvement on "Autoscaling > SIG" > >> > > - maybe some are too broad, or too vague. It depends on how likely > it > >> > > is that the scope will go beyond just auto-scaling. Of course you > >> could > >> > > also just stick with the original idea of "Auto-scaling" :-) > >> > > >> > I'm inclined to argue that limiting the scope of this SIG is actually > a > >> > feature, not a bug. Better to have a tightly focused SIG that has very > >> > specific, achievable goals than to try to boil the ocean by solving > all > >> > of the auto* problems in OpenStack. We all know how "one SIG to rule > >> > them all" ends. ;-) > >> > > >> > >> And to clarify, there will definitely some cross SIG co-work > between > >> > >> this new SIG and Self-Healing SIG (there're some common > requirements > >> > >> even across self-healing and autoscaling features.). We also need > to > >> > >> make sure we do not provide any duplicated work against > self-healing > >> > >> SIG. As a start, let's only focus on autoscaling scenario, and make > >> > >> sure we're doing it right before we move to multiple cases. > >> > > > >> > > Sounds good! 
> >> > >> If no objection, I will create the new SIG before next weekend and > >> > >> plan a short schedule in Denver summit and PTG. > >> > > > >> > > Thanks for driving this! > >> > > >> > >> > > > >-- > >Kind regards, > > > >Melvin Hillsman > >mrhillsman at gmail.com > >mobile: (832) 264-2646 > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekcs.openstack at gmail.com Sat Jan 12 02:52:57 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Fri, 11 Jan 2019 18:52:57 -0800 Subject: [congress][infra] override-checkout problem Message-ID: Hi Ghanshyam, On 1/11/19, 4:57 AM, "Ghanshyam Mann" wrote: >Hi Eric, > >This seems the same issue happening on congress-tempest-plugin gate where >'congress-devstack-py35-api-mysql-queens' is failing [1]. >python-congressclient was >not able to install and openstack client trow error for congress command. > >The issue is stable branch jobs on congress-tempest-plugin does checkout >the master version for all repo >instead of what mentioned in override-checkout var. > >If you see congress's rocky patch, congress is checkout out with rocky >version[2] but >congress-tempest-plugin patch's rocky job checkout the master version of >congress instead of rocky version [3]. >That is why your test expectedly fail on congress patch but pass on >congress-tempest-plugin. > >Root cause is that override-checkout var does not work on the legacy job >(it is only zuulv3 job var, if I am not wrong), >you need to use BRANCH_OVERRIDE for legacy jobs. Myself, amotoki and >akhil was trying lot other workarounds >to debug the root cause but at the end we just notice that congress jobs >are legacy jobs and using override-checkout :). Gosh thanks so much for the investigation. Yes it's a legacy-dsvm job. So sorry for the run around! I'm thinking of taking the opportunity to migrate to devstack-tempest job. 
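The distinction gmann describes comes down to where the repo checkout happens. A hypothetical side-by-side sketch (job and branch names are illustrative, not the exact congress job definitions):

```yaml
# Native zuulv3 job: Zuul itself checks out the required repos, so the
# override-checkout attribute is honored.
- job:
    name: congress-devstack-api-mysql-queens
    parent: devstack-tempest
    override-checkout: stable/queens

# Legacy devstack-gate jobs do their own checkout inside the job's
# shell wrapper and ignore override-checkout; the equivalent knob is
# an environment variable exported in the run script, roughly:
#   export BRANCH_OVERRIDE=stable/queens
#   if [ "$BRANCH_OVERRIDE" != "default" ]; then
#       export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
#   fi
```

This is why the legacy jobs silently tested master: the zuulv3-only variable was set but nothing in the legacy wrapper ever read it.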
I've taken a first stab here: https://review.openstack.org/#/c/630414/ > >I have submitted the testing patch with BRANCH_OVERRIDE for >congress-tempest-plugin queens job[4]. >Which seems working fine, I can make those patches more formal for merge. And thanks so much for putting together those patches using BRANCH_OVERRIDE! Merging sounds good unless we can quickly migrate To non-legacy jobs. Realistically it'll probably end up take a while to get everything migrated and working. > > >Another thing I was discussing with Akhil that new tests of builins >feature need another feature flag >(different than congressz3.enabled) as that feature of z3 is in stein >onwards only. Yup. I was going to do that but wanted to first figure out why it wasn't failing on tempest plugin. I've now submitted a patch to do that. > > >[1] https://review.openstack.org/#/c/618951/ >[2] >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 >474d7/logs/pip2-freeze.txt.gz >[3] >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/logs/pip2-freeze.txt.gz >[4] >https://review.openstack.org/#/q/topic:fix-stable-branch-testing+(status:o >pen+OR+status:merged) > >-gmann > > ---- On Fri, 11 Jan 2019 10:40:39 +0900 Eric K > wrote ---- > > The congress-tempest-plugin zuul jobs against stable branches appear > > to be working incorrectly. Tests that should fail on stable/rocky (and > > indeed fails when triggered by congress patch [1]) are passing when > > triggered by congress-tempest-plugin patch [2]. > > > > I'd assume it's some kind of zuul misconfiguration in > > congress-tempest-plugin [3], but I've so far failed to figure out > > what's wrong. Particularly strange is that the job-output appears to > > show it checking out the right thing [4]. > > > > Any thoughts or suggestions? Thanks so much! 
> > > > [1] > > https://review.openstack.org/#/c/629070/ > > >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 >474d7/logs/testr_results.html.gz > > The two failing z3 tests should indeed fail because the feature was > > not available in rocky. The tests were introduced because for some > > reason they pass in the job triggered by a patch in > > congress-tempest-plugin. > > > > [2] > > https://review.openstack.org/#/c/618951/ > > >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/logs/testr_results.html.gz > > > > [3] >https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yam >l#L4 > > > > [4] >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 > > shows congress is checked out to the correct commit at the top of the > > stable/rocky branch. > > > > > > > From raniaadouni at gmail.com Sun Jan 13 10:16:23 2019 From: raniaadouni at gmail.com (Rania Adouni) Date: Sun, 13 Jan 2019 11:16:23 +0100 Subject: [openstack-ZUN] Message-ID: Hi everyone, I was trying to deploy wordpress on zun using heat; this is the template I used: "https://pastebin.com/0PGtWSVw" . The stack now creates successfully and the mysql container is running, but the wordpress container always ends up stopped. When I try to start it and access the container with " openstack appcontainer exec --interactive rho-1-container apache2-foreground " I get this output : ********************************* connected to container "rho-1-container" type ~. to disconnect AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.16.0.3. Set the 'ServerName' directive globally to suppress this message AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.16.0.3. 
Set the 'ServerName' directive globally to suppress this message [Sun Jan 13 10:07:24.463058 2019] [mpm_prefork:notice] [pid 77] AH00163: Apache/2.4.25 (Debian) PHP/7.2.14 configured -- resuming normal operations [Sun Jan 13 10:07:24.463196 2019] [core:notice] [pid 77] AH00094: Command line: 'apache2 -D FOREGROUND' ***************************************************** and then the status of wordpress image back stopped !!!! the logs of wordpress image can be found here : https://pastebin.com/CitXk6zN thanks for any help !! -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Sun Jan 13 18:11:29 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Sun, 13 Jan 2019 12:11:29 -0600 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: Hi Derek, Yes, these rules would need to be added inside the router namespace when it is created and it seems to me it is a workable solution. I will raise this work in the next L3 sub-team meeting, so we keep an eye on the patches / progress you make Regards Miguel On Mon, Jan 7, 2019 at 11:54 AM Derek Higgins wrote: > On Mon, 7 Jan 2019 at 17:08, Clark Boylan wrote: > > > > On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > > > Thanks for bringing this up Derek! > > > Comments below. > > > > > > On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins > wrote: > > > > > > > > Hi All, > > > > > > > > Shortly before the holidays CI jobs moved from xenial to bionic, for > > > > Ironic this meant a bunch failures[1], all have now been dealt with, > > > > with the exception of the UEFI job. It turns out that during this job > > > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > > > order to track these tftp connections we have been making use of the > > > > fact that nf_conntrack_helper has been enabled by default. 
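The explicit replacement for that default, as the thread goes on to discuss, is a per-helper rule added inside the router namespace. A rough sketch of what such rules could look like — illustrative only: the namespace name is hypothetical and the commands are returned as strings rather than executed — relying on the iptables `CT` target's `--helper` option:

```python
def tftp_helper_rules(router_namespace):
    """Build explicit conntrack-helper rules for TFTP (UDP/69), replacing
    the old nf_conntrack_helper=1 auto-assignment.  Sketch only: the
    commands are returned as strings, not executed, and the namespace
    name is a made-up example."""
    rule = "-t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp"
    return [
        f"ip netns exec {router_namespace} iptables {rule}",
        f"ip netns exec {router_namespace} ip6tables {rule}",
    ]

for cmd in tftp_helper_rules("qrouter-1234"):
    print(cmd)
```

With rules of this shape in place, TFTP connections get the helper assigned explicitly, without re-enabling the global nf_conntrack_helper sysctl.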
In newer > > > > kernel versions[2] this is no longer the case and I'm now trying to > > > > figure out the best way to deal with the new behaviour. I've put > > > > together some possible solutions along with some details on why they > > > > are not ideal and would appreciate some opinions > > > > > > The git commit message suggests that users should explicitly put in > rules such > > > that the traffic is matched. I feel like the kernel change ends up > > > being a behavior > > > change in this case. > > > > > > I think the reasonable path forward is to have a configuration > > > parameter that the > > > l3 agent can use to determine to set the netfilter connection tracker > helper. > > > > > > Doing so, allows us to raise this behavior change to operators > minimizing the > > > need of them having to troubleshoot it in production, and gives them a > choice > > > in the direction that they wish to take. > > > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to > cover this. Basically you should explicitly enable specific helpers when > you need them rather than relying on the auto helper rules. > > Thanks, I forgot to point out the option of adding these rules, If I > understand it correctly they would need to be added inside the router > namespace when neutron creates it, somebody from neutron might be able > to indicate if this is a workable solution. > > > > > Maybe even avoid the configuration option entirely if ironic and neutron > can set the required helper for tftp when tftp is used? > > > > > > > > [trim] > > > > > > > [more trimming] > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Sun Jan 13 21:32:43 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sun, 13 Jan 2019 16:32:43 -0500 Subject: [openstack-ZUN] In-Reply-To: References: Message-ID: Hi Rania, It seems I can reproduce the error by using your template (with modification of the private/public network name). 
The problem is resolved after I switched to the "mysql:5.7" image: http://paste.openstack.org/compare/742277/742276/ . It might relate to this issue: https://github.com/docker-library/wordpress/issues/313 . If it still doesn't work after switching the image, give another try by opening the mysql port in the security groups. For example: https://github.com/hongbin/heat-templates/commit/848d4cce49e85e0fff4b06c35c71de43532389f2 . Let me know if it still doesn't work. Best regards, Hongbin On Sun, Jan 13, 2019 at 11:23 AM Rania Adouni wrote: > hi everyone , > > I was trying to deploy wordpress -zun by using heat , this is the template > I used "https://pastebin.com/0PGtWSVw" . > now the stack create successfully the mysql image running but the > wordpress image alwayes stopped and when I try to started and access to the > container " openstack appcontainer exec --interactive rho-1-container > apache2-foreground " > i get this output : > ********************************* > connected to container "rho-1-container" > type ~. to disconnect > AH00558: apache2: Could not reliably determine the server's fully > qualified domain name, using 172.16.0.3. Set the 'ServerName' directive > globally to suppress this message > AH00558: apache2: Could not reliably determine the server's fully > qualified domain name, using 172.16.0.3. Set the 'ServerName' directive > globally to suppress this message > [Sun Jan 13 10:07:24.463058 2019] [mpm_prefork:notice] [pid 77] AH00163: > Apache/2.4.25 (Debian) PHP/7.2.14 configured -- resuming normal operations > [Sun Jan 13 10:07:24.463196 2019] [core:notice] [pid 77] AH00094: Command > line: 'apache2 -D FOREGROUND' > ***************************************************** > and then the status of wordpress image back stopped !!!! > the logs of wordpress image can be found here : > https://pastebin.com/CitXk6zN > > thanks for any help !! > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From emilien at redhat.com Sun Jan 13 22:45:00 2019 From: emilien at redhat.com (Emilien Macchi) Date: Sun, 13 Jan 2019 17:45:00 -0500 Subject: [tripleo] TripleO Stein milestone 2 released ! Message-ID: We just released Stein Milestone 2 for TripleO, thanks all for your work: https://launchpad.net/tripleo/+milestone/stein-2 If your blueprint is done, please mark it as "Implemented" Or move it to stein-3. Bugs in progress will be moved to stein-3 automatically. By the end of the week, I'll move them myself otherwise but please do it if you can. I'll provide interesting stats at the end of Stein, where we compare numbers of fixed bugs and implemented blueprints over the cycles. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jan 14 02:05:48 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 14 Jan 2019 11:05:48 +0900 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core In-Reply-To: References: Message-ID: Hi, Welcome to the core team, Thuy Dang :) Bests, On Thu, Jan 10, 2019 at 11:07 AM lương hữu tuấn wrote: > +1 from me :) > > On Thursday, January 10, 2019, Trinh Nguyen wrote: > >> Hello team, >> >> I would like to nominate Thuy Dang for >> Searchlight core. He has been leading the effort to clarify our vision and >> working on some blueprints to make Searchlight a multi-cloud application. I >> believe Thuy will be a great resource for our team. >> >> Bests, >> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jan 14 02:53:06 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 14 Jan 2019 11:53:06 +0900 Subject: [Searchlight] Team meeting cancelled today Message-ID: Hi team, I will help to coordinate an upstream training webinar at 1400 today [1] so will have to cancel the team meeting. 
If you guys want to discuss something, please let me know, I will be on the IRC channel. [1] https://www.meetup.com/VietOpenStack/events/257860457/ Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From lajos.katona at ericsson.com Mon Jan 14 08:59:53 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 14 Jan 2019 08:59:53 +0000 Subject: [L2-Gateway] l2gw-connection status In-Reply-To: References: Message-ID: Hi, Sorry, missing subject.... On 2019. 01. 11. 15:19, Lajos Katona wrote: > Hi, > > I have a question regarding networking-l2gw, specifically l2gw-connection. > We have an issue where the hw switch configured by networking-l2gw is > slow, so when the l2gw-connection is created the API returns > successfully, but the dataplane configuration is not yet ready. > Do you think that adding state field to the connection is feasible somehow? > By checking the vtep schema > (http://www.openvswitch.org/support/dist-docs/vtep.5.html) no such > information is available on vtep level. > > Thanks in advance for the help. > > Regarads > Lajos From sbauza at redhat.com Mon Jan 14 11:19:46 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 14 Jan 2019 12:19:46 +0100 Subject: [nova] Retiring gantt, python-ganttclient projects In-Reply-To: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> References: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> Message-ID: On Fri, Jan 11, 2019 at 5:44 PM Stephen Finucane wrote: > Hey, > > These projects are mega old, don't appear to have been official > projects, and should have been retired a long time ago. This is serves > as a heads up on the off-chance someone has managed to do something > with them. > > All good with me. It even could be confusing for people want to know about placement and scheduler. Do you need me for retiring the repos ? 
-Sylvain Stephen > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Jan 14 11:52:58 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 14 Jan 2019 11:52:58 +0000 Subject: [nova] Retiring gantt, python-ganttclient projects In-Reply-To: References: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> Message-ID: <69c83bb74b7341414156cfe48b7e64368d11b9bd.camel@redhat.com> On Mon, 2019-01-14 at 12:19 +0100, Sylvain Bauza wrote: > On Fri, Jan 11, 2019 at 5:44 PM Stephen Finucane > wrote: > > Hey, > > > > > > > > These projects are mega old, don't appear to have been official > > > > projects, and should have been retired a long time ago. This is > > serves > > > > as a heads up on the off-chance someone has managed to do something > > > > with them. > > > > > > All good with me. It even could be confusing for people want to know > about placement and scheduler. > Do you need me for retiring the repos ? > -Sylvain Indeed. Reviews are here: * https://review.openstack.org/630154 * https://review.openstack.org/630138 Looks like it has to be you or John Garbutt to push them and close them out, as you're the only still active cores I can see. Thanks, Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Mon Jan 14 12:34:19 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 14 Jan 2019 13:34:19 +0100 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: <20190111152318.ztuwirfgypehdfp6@localhost> Message-ID: <20190114123419.mqblajjrvzduo4f6@localhost> On 11/01, Brandon Caulder wrote: > Hi, > > The steps were... > - purge > - shutdown cinder-scheduler, cinder-api > - upgrade software > - restart cinder-volume Hi, You should not restart cinder volume services before doing the DB sync, otherwise the Cinder service is likely to fail. 
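The ordering constraint Gorka states — the schema sync must complete before any Cinder service is restarted — can be made explicit as a check on a planned step list. A toy validator (illustrative only; the step names mirror the ones quoted in this thread):

```python
def check_upgrade_order(steps):
    """Reject any upgrade plan that restarts a Cinder service before
    'db sync' has run.  Toy validator for the ordering rule discussed
    in this thread; step names are illustrative."""
    synced = False
    for step in steps:
        if step == "db sync":
            synced = True
        elif step.startswith("restart") and not synced:
            raise ValueError(f"{step!r} scheduled before 'db sync'")

# The sequence reported in the thread (restart precedes sync — fails):
reported = ["purge", "shutdown services", "upgrade software",
            "restart cinder-volume", "db sync"]
# The recommended ordering (sync precedes restart — passes):
recommended = ["purge", "shutdown services", "upgrade software",
               "db sync", "restart cinder-volume"]
check_upgrade_order(recommended)
```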
> - sync (upgrade fails and stops at v114) > - sync again (db upgrades to v117) > - restart cinder-volume > - stacktrace observed in volume.log > At this point this could be a DB issue: https://bugs.mysql.com/bug.php?id=67926 https://jira.mariadb.org/browse/MDEV-10558 Cheers, Gorka. > Thanks > > On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor wrote: > > > On 10/01, Brandon Caulder wrote: > > > Hi Iain, > > > > > > There are 424 rows in volumes which drops down to 185 after running > > > cinder-manage db purge 1. Restarting the volume service after package > > > upgrade and running sync again does not remediate the problem, although > > > running db sync a second time does bump the version up to 117, the > > > following appears in the volume.log... > > > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > > > > Hi, > > > > If I understand correctly the steps were: > > > > - Run DB sync --> Fail > > - Run DB purge > > - Restart volume services > > - See the log error > > - Run DB sync --> version proceeds to 117 > > > > If that is the case, could you restart the services again now that the > > migration has been moved to version 117? > > > > If the cinder-volume service is able to restart please run the online > > data migrations with the service running. > > > > Cheers, > > Gorka. > > > > > > > Thanks > > > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > > iain.macdonnell at oracle.com> > > > wrote: > > > > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > > > happens that both pertain to shared targets. > > > > > > > > Brandon, might you have a very large number of rows in your volumes > > > > table? Have you been purging soft-deleted rows? 
> > > > > > > > ~iain > > > > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > > Brandon, > > > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > > > > I think you can work around it by retrying the migration with the > > volume > > > > > service running. You may, however, want to check with Iain > > MacDonnell > > > > > as he has been looking at this for a while. > > > > > > > > > > Thanks! > > > > > Jay > > > > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > > >> Hi, > > > > >> > > > > >> I am receiving the following error when performing an offline > > upgrade > > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > > >> openstack-cinder-1:12.0.3-1.el7. > > > > >> > > > > >> # cinder-manage db version > > > > >> 105 > > > > >> > > > > >> # cinder-manage --debug db sync > > > > >> Error during database migration: (pymysql.err.OperationalError) > > (2013, > > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > > volumes > > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > > >> {'shared_targets': 1}] > > > > >> > > > > >> # cinder-manage db version > > > > >> 114 > > > > >> > > > > >> The db version does not upgrade to queens version 117. Any help > > would > > > > >> be appreciated. 
> > > > >> > > > > >> Thank you > > > > > > > > > > > > > > > From amotoki at gmail.com Mon Jan 14 12:36:42 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Mon, 14 Jan 2019 21:36:42 +0900 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> Message-ID: The similar failure happens in neutron-fwaas. This blocks several patches in neutron-fwaas including policy-in-code support. https://bugs.launchpad.net/neutron/+bug/1811506 Most failures are fixed by applying Ben's neutron fix https://review.openstack.org/#/c/629335/ [1], but we still have one failure in neutron_fwaas.tests.functional.privileged.test_utils.InNamespaceTest.test_in_namespace [2]. This failure is caused by oslo.privsep 1.31.0 too. This does not happen with 1.30.1. Any help would be appreciated. [1] neutron-fwaas change https://review.openstack.org/#/c/630451/ [2] http://logs.openstack.org/51/630451/2/check/legacy-neutron-fwaas-dsvm-functional/05b9131/logs/testr_results.html.gz -- Akihiro Motoki (irc: amotoki) 2019年1月9日(水) 9:32 Ben Nemec : > I think I've got it. At least in my local tests, the handle pointer > being passed from C -> Python -> C was getting truncated at the Python > step because we didn't properly define the type. If the address assigned > was larger than would fit in a standard int then we passed what amounted > to a bogus pointer back to the C code, which caused the segfault. > > I have no idea why privsep threading would have exposed this, other than > maybe running in threads affected the address space somehow? 
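Ben's diagnosis — a pointer truncated because the return type was never declared — is a classic ctypes pitfall: a foreign function's undeclared `restype` defaults to a C `int`, so any address that does not fit in 32 bits comes back mangled. A self-contained illustration (the address below is made up for demonstration; the real fix is declaring `restype = ctypes.c_void_p` on the foreign function):

```python
import ctypes

# A 64-bit user-space address of the kind libnetfilter_conntrack returns
# (made-up value, chosen to be larger than a 32-bit int can hold).
addr = 0x7F8E1800EF32

# What survives when ctypes' default restype (a C int) is left in place:
truncated = ctypes.c_int(addr & 0xFFFFFFFF).value  # low 32 bits only

# What survives when the return type is declared as a pointer:
preserved = ctypes.c_void_p(addr).value

print(hex(truncated))   # not the original address
print(hex(preserved))   # the original address, intact
# Applied to a real foreign function, the fix is one line, e.g.:
#     lib.nfct_open.restype = ctypes.c_void_p
```

Dereferencing the truncated value on the C side is exactly the "Cannot access memory at address 0x800f228"-style bogus pointer seen in gdb earlier in this thread.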
> > In any case, https://review.openstack.org/629335 has got these > functional tests working for me locally in oslo.privsep 1.31.0. It would > be great if somebody could try them out and verify that I didn't just > find a solution that somehow only works on my system. :-) > > -Ben > > On 1/8/19 4:30 PM, Ben Nemec wrote: > > > > > > On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: > >> Hi Ben, > >> > >> I was also looking at it today. I’m totally not an C and Oslo.privsep > >> expert but I think that there is some new process spawned here. > >> I put pdb before line > >> > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 > >> where this issue happen. Then, with "ps aux” I saw: > >> > >> vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep > >> root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 > >> /opt/stack/neutron/.tox/dsvm-functional/bin/python > >> /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper > >> --config-file neutron/tests/etc/neutron.conf --privsep_context > >> neutron.privileged.default --privsep_sock_path > >> /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock > >> vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep > >> --color=auto privsep > >> > >> But then when I continue run test, and it segfaulted, in journal log I > >> have: > >> > >> Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] > >> segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 > >> in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] > >> > >> Please check pics of those processes. First one (when test was > >> „paused” with pdb) has 18368 and later segfault has 18369. > > > > privsep-helper does fork, so I _think_ that's normal. > > > > > https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > > > > >> > >> I don’t know if You saw my today’s comment in launchpad. 
I was trying > >> to change method used to start PrivsepDaemon from Method.ROOTWRAP to > >> Method.FORK (in > >> > https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) > > >> and run test as root, then tests were passed. > > > > Yeah, I saw that, but I don't understand it. :-/ > > > > The daemon should end up running with the same capabilities in either > > case. By the time it starts making the C calls the environment should be > > identical, regardless of which method was used to start the process. > > > >> > >> — > >> Slawek Kaplonski > >> Senior software engineer > >> Red Hat > >> > >>> Wiadomość napisana przez Ben Nemec w dniu > >>> 08.01.2019, o godz. 20:04: > >>> > >>> Further update: I dusted off my gdb skills and attached it to the > >>> privsep process to try to get more details about exactly what is > >>> crashing. It looks like the segfault happens on this line: > >>> > >>> > https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 > >>> > >>> > >>> which is > >>> > >>> h->cb = cb; > >>> > >>> h being the conntrack handle and cb being the callback function. > >>> > >>> This makes me think the problem isn't the callback itself (even if we > >>> assigned a bogus pointer, which we didn't, it shouldn't cause a > >>> segfault unless you try to dereference it) but in the handle we pass > >>> in. Trying to look at h->cb results in: > >>> > >>> (gdb) print h->cb > >>> Cannot access memory at address 0x800f228 > >>> > >>> Interestingly, h itself is fine: > >>> > >>> (gdb) print h > >>> $3 = (struct nfct_handle *) 0x800f1e0 > >>> > >>> It doesn't _look_ to me like the handle should be crossing any thread > >>> boundaries or anything, so I'm not sure why it would be a problem. 
It > >>> gets created in the same privileged function that ultimately > >>> registers the callback: > >>> > https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 > >>> > >>> > >>> So still not sure what's going on, but I thought I'd share what I've > >>> found before I stop to eat something. > >>> > >>> -Ben > >>> > >>> On 1/7/19 12:11 PM, Ben Nemec wrote: > >>>> Renamed the thread to be more descriptive. > >>>> Just to update the list on this, it looks like the problem is a > >>>> segfault when the netlink_lib module makes a C call. Digging into > >>>> that code a bit, it appears there is a callback being used[1]. I've > >>>> seen some comments that when you use a callback with a Python > >>>> thread, the thread needs to be registered somehow, but this is all > >>>> uncharted territory for me. Suggestions gratefully accepted. :-) > >>>> 1: > >>>> > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 > >>>> On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: > >>>>> Hi, > >>>>> > >>>>> I just found that functional tests in Neutron are failing since > >>>>> today or maybe yesterday. See [1] > >>>>> I was able to reproduce it locally and it looks that it happens > >>>>> with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. > >>>>> > >>>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 > >>>>> > >>>>> — > >>>>> Slawek Kaplonski > >>>>> Senior software engineer > >>>>> Red Hat > >>>>> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vdrok at mirantis.com Mon Jan 14 12:47:24 2019 From: vdrok at mirantis.com (Vladyslav Drok) Date: Mon, 14 Jan 2019 14:47:24 +0200 Subject: [ironic] stepping down from core Message-ID: Hello folks, As you might have noticed my contribution to ironic has dropped to almost 0 during the last three months. 
My current job responsibilities have changed, and I can't dedicate the time needed to be a core reviewer at this time. I will still be around and at some point might get back to contributing to ironic, but for now I'd like to request to be removed from the core reviewers group. Thank you, Vlad -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Jan 14 13:01:36 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 14 Jan 2019 05:01:36 -0800 Subject: [ironic] stepping down from core In-Reply-To: References: Message-ID: Greetings Vlad, This was saddening to read and totally understandable. Thank you for your excellent work, and we will see you around. :) -Julia On Mon, Jan 14, 2019 at 4:52 AM Vladyslav Drok wrote: > Hello folks, > > As you might have noticed my contribution to ironic has dropped to almost > 0 during the last three months. My current job responsibilities have > changed, and I can't dedicate the time needed to be a core reviewer at this > time. I will still be around and at some point might get back to > contributing to ironic, but for now I'd like to request to be removed from > the core reviewers group. > > Thank you, > Vlad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Mon Jan 14 13:12:49 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 14 Jan 2019 14:12:49 +0100 Subject: [ironic] stepping down from core In-Reply-To: References: Message-ID: <808ed1c5-958a-79b7-a183-166813c1eb40@redhat.com> Hi Vlad, Sigh, our small Slavic lobby becomes smaller with every cycle :) Good luck with your new challenges! Dmitry On 1/14/19 1:47 PM, Vladyslav Drok wrote: > Hello folks, > > As you might have noticed my contribution to ironic has dropped to almost 0 > during the last three months. My current job responsibilities have changed, and > I can't dedicate the time needed to be a core reviewer at this time. 
I will > still be around and at some point might get back to contributing to ironic, but > for now I'd like to request to be removed from the core reviewers group. > > Thank you, > Vlad From aschultz at redhat.com Mon Jan 14 14:51:07 2019 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 14 Jan 2019 07:51:07 -0700 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: References: <20190109061109.GA24618@fedora19.localdomain> <20190109232624.GB24618@fedora19.localdomain> Message-ID: On Wed, Jan 9, 2019 at 4:34 PM Alex Schultz wrote: > > On Wed, Jan 9, 2019 at 4:26 PM Ian Wienand wrote: > > > > On Wed, Jan 09, 2019 at 09:02:57AM -0700, Alex Schultz wrote: > > > Don't suppose we could try this with tripleo jobs prior to cutting > > > them all over could we? We don't use NetworkManager and infact > > > os-net-config doesn't currently support NetworkManager. I don't think > > > it'll cause problems, but I'd like to have some test prior to cutting > > > them all over. > > > > It is possible to stage this in by creating a new NetworkManager > > enabled node-type. I've proposed that in [1] but it's only useful if > > you want to then follow-up with setting up testing jobs to use the new > > node-type. We can then revert and apply the change to regular nodes. > > > > By just switching directly in [2], we can quite quickly revert if > > there should be an issue. We can immediately delete the new image, > > revert the config change and then worry about fixing it. > > > > Staging it is the conservative approach and more work all round but > > obviously safer; hoping for the best with the escape hatch is probably > > my preferred option given the low risk. I've WIP'd both reviews so > > just let us know in there your thoughts. > > > > For us to test I think we just need > https://review.openstack.org/#/c/629685/ once the node pool change > goes in. Then the jobs on that change will be the NetworkManager > version. 
I would really prefer testing this way than possibly having > to revert after breaking a bunch of in flight patches. I'll defer to > others if they think it's OK to just land it and revert as needed. > Looks like it should be ok based on the test results from https://review.openstack.org/#/c/629685/. Thanks, -Alex > Thanks, > -Alex > > > Thanks, > > > > -i > > > > [1] https://review.openstack.org/629680 > > [2] https://review.openstack.org/619960 From bodenvmw at gmail.com Mon Jan 14 14:54:37 2019 From: bodenvmw at gmail.com (Boden Russell) Date: Mon, 14 Jan 2019 07:54:37 -0700 Subject: [dev][neutron] Bug summary week of Jan 7 Message-ID: <4ab78e5c-4250-c038-4d24-9d7d70864954@gmail.com> Below is a summary of the neutron bugs that came in last week (Jan 7th). Any bugs with a preceding "(*)" are still under investigation. Looks like njohnston is this week's bug deputy. Gate Failures - https://bugs.launchpad.net/neutron/+bug/1811515 - https://bugs.launchpad.net/neutron/+bug/1811506 - https://bugs.launchpad.net/neutron/+bug/1811126 L3 / DHCP - https://bugs.launchpad.net/neutron/+bug/1811639 - https://bugs.launchpad.net/neutron/+bug/1811213 OVS-FW - https://bugs.launchpad.net/neutron/+bug/1811405 API - (*) https://bugs.launchpad.net/neutron/+bug/1811390 RFE / Enhancements - https://bugs.launchpad.net/neutron/+bug/1811352 - https://bugs.launchpad.net/neutron/+bug/1811166 - https://bugs.launchpad.net/neutron/+bug/1810905 Docs - https://bugs.launchpad.net/neutron/+bug/1811238 Tempest - https://bugs.launchpad.net/neutron/+bug/1810963 Xen - https://bugs.launchpad.net/neutron/+bug/1810764 From alfredo.deluca at gmail.com Mon Jan 14 16:21:41 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Mon, 14 Jan 2019 17:21:41 +0100 Subject: Training Message-ID: Hi all. I am looking for openstack training both basic and advanced here in Italy or Europe. Possibly instructor led on site. Any suggestions? 
-- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike_mp at zzzcomputing.com Mon Jan 14 16:23:01 2019 From: mike_mp at zzzcomputing.com (Mike Bayer) Date: Mon, 14 Jan 2019 11:23:01 -0500 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: <20190114123419.mqblajjrvzduo4f6@localhost> References: <20190111152318.ztuwirfgypehdfp6@localhost> <20190114123419.mqblajjrvzduo4f6@localhost> Message-ID: On Mon, Jan 14, 2019 at 7:38 AM Gorka Eguileor wrote: > > On 11/01, Brandon Caulder wrote: > > Hi, > > > > The steps were... > > - purge > > - shutdown cinder-scheduler, cinder-api > > - upgrade software > > - restart cinder-volume > > Hi, > > You should not restart cinder volume services before doing the DB sync, > otherwise the Cinder service is likely to fail. > > > - sync (upgrade fails and stops at v114) > > - sync again (db upgrades to v117) > > - restart cinder-volume > > - stacktrace observed in volume.log > > > > At this point this could be a DB issue: > > https://bugs.mysql.com/bug.php?id=67926 > https://jira.mariadb.org/browse/MDEV-10558 that's a scary issue, can the reporter please list what MySQL / MariaDB version is running and if this is Galera/HA or single node? > > Cheers, > Gorka. > > > Thanks > > > > On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor wrote: > > > > > On 10/01, Brandon Caulder wrote: > > > > Hi Iain, > > > > > > > > There are 424 rows in volumes which drops down to 185 after running > > > > cinder-manage db purge 1. Restarting the volume service after package > > > > upgrade and running sync again does not remediate the problem, although > > > > running db sync a second time does bump the version up to 117, the > > > > following appears in the volume.log... 
> > > > > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > > > > > > > Hi, > > > > > > If I understand correctly the steps were: > > > > > > - Run DB sync --> Fail > > > - Run DB purge > > > - Restart volume services > > > - See the log error > > > - Run DB sync --> version proceeds to 117 > > > > > > If that is the case, could you restart the services again now that the > > > migration has been moved to version 117? > > > > > > If the cinder-volume service is able to restart please run the online > > > data migrations with the service running. > > > > > > Cheers, > > > Gorka. > > > > > > > > > > Thanks > > > > > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > > > iain.macdonnell at oracle.com> > > > > wrote: > > > > > > > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > > > > happens that both pertain to shared targets. > > > > > > > > > > Brandon, might you have a very large number of rows in your volumes > > > > > table? Have you been purging soft-deleted rows? > > > > > > > > > > ~iain > > > > > > > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > > > Brandon, > > > > > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > > > > > > > I think you can work around it by retrying the migration with the > > > volume > > > > > > service running. You may, however, want to check with Iain > > > MacDonnell > > > > > > as he has been looking at this for a while. > > > > > > > > > > > > Thanks! 
> > > > > > Jay > > > > > > > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > > > >> Hi, > > > > > >> > > > > > >> I am receiving the following error when performing an offline > > > upgrade > > > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > > > >> openstack-cinder-1:12.0.3-1.el7. > > > > > >> > > > > > >> # cinder-manage db version > > > > > >> 105 > > > > > >> > > > > > >> # cinder-manage --debug db sync > > > > > >> Error during database migration: (pymysql.err.OperationalError) > > > (2013, > > > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > > > volumes > > > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > > > >> {'shared_targets': 1}] > > > > > >> > > > > > >> # cinder-manage db version > > > > > >> 114 > > > > > >> > > > > > >> The db version does not upgrade to queens version 117. Any help > > > would > > > > > >> be appreciated. > > > > > >> > > > > > >> Thank you > > > > > > > > > > > > > > > > > > > > From alfredo.deluca at gmail.com Mon Jan 14 16:36:28 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Mon, 14 Jan 2019 17:36:28 +0100 Subject: Training Message-ID: Hi all. I am looking for openstack training both basic and advanced here in Italy or Europe. Possibly instructor led on site. Any suggestions? -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From kbcaulder at gmail.com Mon Jan 14 16:43:56 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Mon, 14 Jan 2019 08:43:56 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: <20190111152318.ztuwirfgypehdfp6@localhost> <20190114123419.mqblajjrvzduo4f6@localhost> Message-ID: Hi, We are running a 5 node galera cluster with haproxy in front. Haproxy is installed on the same node as cinder and its configuration sends all reads and writes to the first node. 
From what I can tell the galera and mariadb rpms did not come from the RDO repository.

/etc/cinder/cinder.conf:

[database]
connection = mysql+pymysql://cinder:xxxxxxxx at 127.0.0.1/cinder

/etc/haproxy/haproxy.conf:

listen galera 127.0.0.1:3306
  maxconn 10000
  mode tcp
  option tcpka
  option tcplog
  option mysql-check user haproxy
  server db1 10.252.173.54:3306 check maxconn 10000
  server db2 10.252.173.55:3306 check backup maxconn 10000
  server db3 10.252.173.56:3306 check backup maxconn 10000
  server db4 10.252.173.57:3306 check backup maxconn 10000
  server db5 10.252.173.58:3306 check backup maxconn 10000

Name        : haproxy
Version     : 1.5.18
Release     : 7.el7
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:09:01 PM GMT
Group       : System Environment/Daemons
Size        : 2689838
License     : GPLv2+
Signature   : RSA/SHA256, Wed 25 Apr 2018 11:04:31 AM GMT, Key ID 24c6a8a7f4a80eb5
Source RPM  : haproxy-1.5.18-7.el7.src.rpm
Build Date  : Wed 11 Apr 2018 04:28:42 AM GMT
Build Host  : x86-01.bsys.centos.org
Relocations : (not relocatable)
Packager    : CentOS BuildSystem
Vendor      : CentOS
URL         : http://www.haproxy.org/
Summary     : TCP/HTTP proxy and load balancer for high availability environments

Name        : galera
Version     : 25.3.20
Release     : 1.rhel7.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:07:52 PM GMT
Group       : System Environment/Libraries
Size        : 36383325
License     : GPL-2.0
Signature   : DSA/SHA1, Tue 02 May 2017 04:20:52 PM GMT, Key ID cbcb082a1bb943db
Source RPM  : galera-25.3.20-1.rhel7.el7.centos.src.rpm
Build Date  : Thu 27 Apr 2017 12:58:55 PM GMT
Build Host  : centos70-x86-64
Relocations : (not relocatable)
Packager    : Codership Oy
Vendor      : Codership Oy
URL         : http://www.codership.com/
Summary     : Galera: a synchronous multi-master wsrep provider (replication engine)

Name        : MariaDB-server
Version     : 10.3.2
Release     : 1.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:08:11 PM GMT
Group       : Applications/Databases
Size        : 511538370
License     : GPLv2
Signature   : DSA/SHA1, Sat 07 Oct
2017 05:51:08 PM GMT, Key ID cbcb082a1bb943db Source RPM : MariaDB-server-10.3.2-1.el7.centos.src.rpm Build Date : Fri 06 Oct 2017 01:51:16 PM GMT Build Host : centos70-x86-64 Relocations : (not relocatable) Vendor : MariaDB Foundation URL : http://mariadb.org Summary : MariaDB: a very fast and robust SQL database server Thanks On Mon, Jan 14, 2019 at 8:23 AM Mike Bayer wrote: > On Mon, Jan 14, 2019 at 7:38 AM Gorka Eguileor > wrote: > > > > On 11/01, Brandon Caulder wrote: > > > Hi, > > > > > > The steps were... > > > - purge > > > - shutdown cinder-scheduler, cinder-api > > > - upgrade software > > > - restart cinder-volume > > > > Hi, > > > > You should not restart cinder volume services before doing the DB sync, > > otherwise the Cinder service is likely to fail. > > > > > - sync (upgrade fails and stops at v114) > > > - sync again (db upgrades to v117) > > > - restart cinder-volume > > > - stacktrace observed in volume.log > > > > > > > At this point this could be a DB issue: > > > > https://bugs.mysql.com/bug.php?id=67926 > > https://jira.mariadb.org/browse/MDEV-10558 > > that's a scary issue, can the reporter please list what MySQL / > MariaDB version is running and if this is Galera/HA or single node? > > > > > > Cheers, > > Gorka. > > > > > Thanks > > > > > > On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor > wrote: > > > > > > > On 10/01, Brandon Caulder wrote: > > > > > Hi Iain, > > > > > > > > > > There are 424 rows in volumes which drops down to 185 after running > > > > > cinder-manage db purge 1. Restarting the volume service after > package > > > > > upgrade and running sync again does not remediate the problem, > although > > > > > running db sync a second time does bump the version up to 117, the > > > > > following appears in the volume.log... 
> > > > > > > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > > > > > > > > > > Hi, > > > > > > > > If I understand correctly the steps were: > > > > > > > > - Run DB sync --> Fail > > > > - Run DB purge > > > > - Restart volume services > > > > - See the log error > > > > - Run DB sync --> version proceeds to 117 > > > > > > > > If that is the case, could you restart the services again now that > the > > > > migration has been moved to version 117? > > > > > > > > If the cinder-volume service is able to restart please run the online > > > > data migrations with the service running. > > > > > > > > Cheers, > > > > Gorka. > > > > > > > > > > > > > Thanks > > > > > > > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > > > > iain.macdonnell at oracle.com> > > > > > wrote: > > > > > > > > > > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it > just > > > > > > happens that both pertain to shared targets. > > > > > > > > > > > > Brandon, might you have a very large number of rows in your > volumes > > > > > > table? Have you been purging soft-deleted rows? > > > > > > > > > > > > ~iain > > > > > > > > > > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > > > > Brandon, > > > > > > > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > > > > > > > > > > I think you can work around it by retrying the migration with > the > > > > volume > > > > > > > service running. You may, however, want to check with Iain > > > > MacDonnell > > > > > > > as he has been looking at this for a while. > > > > > > > > > > > > > > Thanks! 
> > > > > > > Jay > > > > > > > > > > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > > > > >> Hi, > > > > > > >> > > > > > > >> I am receiving the following error when performing an offline > > > > upgrade > > > > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > > > > >> openstack-cinder-1:12.0.3-1.el7. > > > > > > >> > > > > > > >> # cinder-manage db version > > > > > > >> 105 > > > > > > >> > > > > > > >> # cinder-manage --debug db sync > > > > > > >> Error during database migration: > (pymysql.err.OperationalError) > > > > (2013, > > > > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > > > > volumes > > > > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > > > > >> {'shared_targets': 1}] > > > > > > >> > > > > > > >> # cinder-manage db version > > > > > > >> 114 > > > > > > >> > > > > > > >> The db version does not upgrade to queens version 117. Any > help > > > > would > > > > > > >> be appreciated. > > > > > > >> > > > > > > >> Thank you > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Mon Jan 14 16:45:03 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 14 Jan 2019 11:45:03 -0500 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: <72ed6b5b-a28b-8fa2-dfce-fcf31ccc40a6@gmail.com> On 1/7/19 12:42 PM, Julia Kreger wrote: > On Mon, Jan 7, 2019 at 9:11 AM Clark Boylan wrote: >> >> On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > [trim] >>> >>> Doing so, allows us to raise this behavior change to operators minimizing the >>> need of them having to troubleshoot it in production, and gives them a choice >>> in the direction that they wish to take. >> >> https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. 
Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. >> >> Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? >> > Great link Clark, thanks! > > It could be viable to ask operators to explicitly set their security > groups for tftp to be passed. > > I guess we actually have multiple cases where there are issues and the > only non-impacted case is when the ironic conductor host is directly > attached to the flat network the machine is booting from. In the case > of a flat network, it doesn't seem viable for us to change rules > ad-hoc since we would need to be able to signal that the helper is > needed, but it does seem viable to say "make sure connectivity works x > way". Where as with multitenant networking, we use dedicated networks, > so conceivably it is just a static security group setting that an > operator can keep in place. Explicit static rules like that seem less > secure to me without conntrack helpers. :( > > Does anyone in Neutron land have any thoughts? I am from Neutron land, sorry for the slow reply. First, I'm trying to get in contact with someone that knows more about nf_conntrack_helper than me, I'll follow-up here or in the patch. In neutron, as in most projects, the goal is to have things configured so admins don't need to set any extra options, so we've typically done things like set sysctl values to make sure we don't get tripped-up by such issues. Mostly these settings have been in the L3 code, so are done in namespaces and have limited "impact" on the system hypervisor on the compute node. Since this is security group related it is different, since that isn't done in a namespace - we add a rule for related/established connections in the "root" namespace, for example in the iptables_hybrid case. 
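As an aside, the "explicitly enable specific helpers" approach from the link Clark shared would look roughly like this for ironic's TFTP traffic. This is a sketch only, using the standard netfilter helper and target names; real rules would need to be scoped to the right interfaces and addresses:

```shell
# Leave automatic helper assignment disabled (the new kernel default):
sysctl -w net.netfilter.nf_conntrack_helper=0

# Load the TFTP helper and attach it explicitly, only to the traffic
# that needs it, via the raw table's CT target:
modprobe nf_conntrack_tftp
iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp

# The usual RELATED,ESTABLISHED rule then matches the data transfer:
iptables -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```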
For that reason it's not obvious to me that setting this sysctl is bad - it's not in the VM itself, and the packets aren't going to the hypervisor, so is there any impact we need to worry about besides just having it loaded? The other option would be to add more rules when SG rules are added that are associated with a protocol that has a helper. IMO that's not a great solution as there is no way for the user to control what filters (like IP addresses) are allowed, for example a SIP helper IP address. Hopefully I'm understanding things correctly. Thanks, -Brian From balazs.gibizer at ericsson.com Mon Jan 14 17:16:23 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 14 Jan 2019 17:16:23 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1547052955.1128.1@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> Message-ID: <1547486170.17957.0@smtp.office365.com> On Wed, Jan 9, 2019 at 5:56 PM, Balázs Gibizer wrote: > > > On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer > wrote: >> >> >> On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer >> wrote: >>> >>> >>>> But, let's chat more about it via a hangout the week after next >>>> (week >>>> of January 14 when Matt is back), as suggested in >>>> #openstack-nova >>>> today. We'll be able to have a high-bandwidth discussion then >>>> and >>>> agree on a decision on how to move forward with this. >>> >>> Thank you all for the discussion. I agree to have a real-time >>> discussion about the way forward. >>> >>> Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a >>> hangouts[2]? >> > > It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for > the team. So I'm moving the call there. 
Sorry to change it again. I hope this is the final time. Friday 18th of Jan, 17:00 UTC [2]. The discussion etherpad is updated with a bit more info [3]. Cheers, gibi [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190118T170000 [3] https://etherpad.openstack.org/p/bandwidth-way-forward > > Cheers, > gibi > > [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI > [2] > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000 > > From cdent+os at anticdent.org Mon Jan 14 17:56:09 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 14 Jan 2019 17:56:09 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: On Thu, 10 Jan 2019, Chris Dent wrote: > Now we need to make sure the document reflects not just how things > are but also how they should be. We (the TC) would like feedback > from the community on the following general questions (upon which > you should feel free to expand as necessary). > > * Does the document accurately reflect what you see the TC doing? > * What's in the list that shouldn't be? > * What's not in the list that should be? > * Should something that is listed be done more or less? Since nobody else is feeling inspired to respond, I'll respond to myself but whereas the original message was written with my TC hat, this one is not. Perhaps the TC is not doing what it should be doing, nor is it constituted to enable that. Since there is a significant and friction creating division of power and leadership between the TC and PTLs, what would it be like if we required half or more of the TC be elected from PTLs? Then the "providing the technical leadership" aspect of the TC mission [3] would be vested with the people who also have some responsibility for executing on that leadership. That would be like something we had before, but now there are many more PTLs. 
> [3] https://governance.openstack.org/tc/reference/charter.html#mission -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Mon Jan 14 18:01:47 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 14 Jan 2019 18:01:47 +0000 (GMT) Subject: [nova] [placement] [packaging] placement extraction check in meeting Message-ID: As discussed in the recent pupdate [1] there will be a meeting this Wednesday at 1700 UTC to discuss the current state of the placement extraction and get some idea on the critical items that need to be addressed to feel comfy. If you're interested in this topic, meet near that time in the #openstack-placement IRC channel and someone will produce links for a hangout, etherpad, whatever is required. Thanks. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001666.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From fungi at yuggoth.org Mon Jan 14 18:13:15 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 14 Jan 2019 18:13:15 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: <20190114181315.6uocyvuq6nqdoohw@yuggoth.org> On 2019-01-14 17:56:09 +0000 (+0000), Chris Dent wrote: [...] > Since there is a significant and friction creating division of power > and leadership between the TC and PTLs, what would it be like if we > required half or more of the TC be elected from PTLs? Then the > "providing the technical leadership" aspect of the TC mission [3] > would be vested with the people who also have some responsibility for > executing on that leadership. [...] As someone who was both a PTL and TC member for a while, I think it's a lot to juggle (especially since I expect none of us has just our leadership duties on our respective piles of responsibilities). 
While doing both, I felt like I wasn't able to carve out enough time to give either group of constituents the attention and representation it was due. I know we have a few PTLs on the TC now, and I respect their monumental effort but personally don't know how they manage to stay sane. Kudos! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jimmy at openstack.org Mon Jan 14 18:15:17 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Mon, 14 Jan 2019 12:15:17 -0600 Subject: Training In-Reply-To: References: Message-ID: <5C3CD1B5.7080803@openstack.org> Hi Alfredo, A good place to start would be here: https://www.openstack.org/marketplace/training/ If you have additional questions, please don't hesitate. Cheers, Jimmy > Alfredo De Luca > January 14, 2019 at 10:21 AM > Hi all. > I am looking for openstack training both basic and advanced here in > Italy or Europe. Possibly instructor led on site. > > Any suggestions? > > > -- > /*Alfredo*/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvanwinkle at salesforce.com Mon Jan 14 19:12:09 2019 From: mvanwinkle at salesforce.com (Matt Van Winkle) Date: Mon, 14 Jan 2019 13:12:09 -0600 Subject: UC Election - Looking for Election Officials Message-ID: Hey Stackers, We are getting ready for the Winter UC election and we need to have at least two Election Officials. I was wondering if you would like to help us on that process. You can find all the details of the election at *https://governance.openstack.org/uc/reference/uc-election-feb2019.html *. I do want to point out to those who are new that Election Officials are unable to run in the election itself but can of course vote. 
The election dates will be: January 21 - February 03, 05:59 UTC: Open candidacy for UC positions February 04 - February 10, 11:59 UTC: UC elections (voting) Please, reach out to any of the current UC members or simply reply to this email if you can help us in this community process. Thanks, OpenStack User Committee Amy, Leong, Matt, Melvin, and Joseph -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Mon Jan 14 20:16:51 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 14 Jan 2019 14:16:51 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: <20190114201650.GA6655@sm-workstation> > > Since there is a significant and friction creating division of power > and leadership between the TC and PTLs, what would it be like if we > required half or more of the TC be elected from PTLs? Then the > "providing the technical leadership" aspect of the TC mission [3] > would be vested with the people who also have some responsibility for > executing on that leadership. > > That would be like something we had before, but now there are many > more PTLs. > I could see having something where to be an eligible candidate, one would either need to be a current or a past PTL. There's definitely value in bringing that experience and perspective to the TC. That could complicate the election process quite a bit. But it's an idea interesting enough that I think we should discuss it further. Sean From fungi at yuggoth.org Mon Jan 14 20:29:33 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 14 Jan 2019 20:29:33 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190114201650.GA6655@sm-workstation> References: <20190114201650.GA6655@sm-workstation> Message-ID: <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> On 2019-01-14 14:16:51 -0600 (-0600), Sean McGinnis wrote: [...] 
> I could see having something where to be an eligible candidate, > one would either need to be a current or a past PTL. There's > definitely value in bringing that experience and perspective to > the TC. [...] In principle, our charter doesn't even require candidates for TC seats be recognized contributors to any project, only that they be OSF individual members in good standing. In practice, the most common way people get the visibility required to be elected to the TC is to have already held another leadership role in the community, frequently by serving as a PTL. While I haven't actually done the math, I would wager more than half the people ever elected to the TC have also been PTLs at some point, so I personally have doubts that making it a policy would actually change anything. A quick skim of the current sitting TC members, I believe 12 out of 13 are either presently or were formerly also PTLs. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jungleboyj at gmail.com Mon Jan 14 21:32:13 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 14 Jan 2019 15:32:13 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190114201650.GA6655@sm-workstation> References: <20190114201650.GA6655@sm-workstation> Message-ID: <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> On 1/14/2019 2:16 PM, Sean McGinnis wrote: >> Since there is a significant and friction creating division of power >> and leadership between the TC and PTLs, what would it be like if we >> required half or more of the TC be elected from PTLs? Then the >> "providing the technical leadership" aspect of the TC mission [3] >> would be vested with the people who also have some responsibility for >> executing on that leadership. >> >> That would be like something we had before, but now there are many >> more PTLs. 
> I could see having something where to be an eligible candidate, one would > either need to be a current or a past PTL. There's definitely value in bringing > that experience and perspective to the TC. > > That could complicate the election process quite a bit. But it's an idea > interesting enough that I think we should discuss it further. > > Sean I think having a requirement of having been a PTL could be reasonable.  Given the fact that I feel I have a hard enough time doing all I want to do as PTL, I wouldn't want to add TC at the same time.  Would want to feel like I was doing it at a time where I could give it the appropriate attention. Jay From Kevin.Fox at pnnl.gov Mon Jan 14 21:53:00 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Mon, 14 Jan 2019 21:53:00 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> References: <20190114201650.GA6655@sm-workstation>, <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> Message-ID: <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Been chewing on this thread for a while.... I think I should advocate the other direction. I think most folks will come from PTL as it's an easier path to get votes. So I don't see that as underrepresented. Getting some diversity of ideas from outside of those from PTL's is probably a good idea for the overall health of OpenStack. What about Users that have never been PTL's? Not developers? Thanks, Kevin ________________________________________ From: Jay Bryant [jungleboyj at gmail.com] Sent: Monday, January 14, 2019 1:32 PM To: openstack-discuss at lists.openstack.org Subject: Re: [tc] [all] Please help verify the role of the TC On 1/14/2019 2:16 PM, Sean McGinnis wrote: >> Since there is a significant and friction creating division of power >> and leadership between the TC and PTLs, what would it be like if we >> required half or more of the TC be elected from PTLs? 
Then the >> "providing the technical leadership" aspect of the TC mission [3] >> would be vested with the people who also have some responsibility for >> executing on that leadership. >> >> That would be like something we had before, but now there are many >> more PTLs. >> > I could see having something where to be an eligible candidate, one would > either need to be a current or a past PTL. There's definitely value in bringing > that experience and perspective to the TC. > > That could complicate the election process quite a bit. But it's an idea > interesting enough that I think we should discuss it further. > > Sean I think have a requirement of having been a PTL could be reasonable. Given the fact that I feel I have a hard enough time doing all I want to do as PTL, I wouldn't want to add TC at the same time. Would want to feel like I was doing it at a time where I could give it the appropriate attention. Jay From fungi at yuggoth.org Mon Jan 14 22:00:57 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 14 Jan 2019 22:00:57 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: <20190114220057.c5vvhnuditpdbpjy@yuggoth.org> On 2019-01-14 21:53:00 +0000 (+0000), Fox, Kevin M wrote: [...] > Getting some diversity of ideas from outside of those from PTL's > is probably a good idea for the overall health of OpenStack. What > about Users that have never been PTL's? Not developers? [...] It's an interesting suggestion. I'm curious how you'd see user representatives on the OpenStack TC as differing from the OpenStack UC: https://governance.openstack.org/uc/ Those are the individuals who users in our community have chosen to represent their interests. Do you feel they're chosen poorly? 
Or simply lack influence over/a voice in topics for which the TC members are asked to provide policy and guidance? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From kennelson11 at gmail.com Mon Jan 14 22:12:39 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 14 Jan 2019 14:12:39 -0800 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> Message-ID: Hello :) On Mon, Jan 14, 2019 at 12:30 PM Jeremy Stanley wrote: > On 2019-01-14 14:16:51 -0600 (-0600), Sean McGinnis wrote: > [...] > > I could see having something where to be an eligible candidate, > > one would either need to be a current or a past PTL. There's > > definitely value in bringing that experience and perspective to > > the TC. > [...] > > In principle, our charter doesn't even require candidates for TC > seats be recognized contributors to any project, only that they be > OSF individual members in good standing. In practice, the most > common way people get the visibility required to be elected to the > TC is to have already held another leadership role in the community, > frequently by serving as a PTL. There are many other leadership positions that we would exclude if we started restricting TC candidates to only current/past PTLs. SIG Chairs (formerly WG leads) for example. Some sort of leadership role prior to running is important, I agree. But we would never have had people like Colleen Murphy or Chris Dent who were and are, respectively, excellent members of the TC. 
> While I haven't actually done the > math, I would wager more than half the people ever elected to the TC > have also been PTLs at some point, so I personally have doubts that > making it a policy would actually change anything. > > A quick skim of the current sitting TC members, I believe 12 out of > 13 are either presently or were formerly also PTLs. > -- > Jeremy Stanley > -Kendall (diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Mon Jan 14 22:33:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 14 Jan 2019 22:33:52 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> Message-ID: <20190114223352.nesaip34taee3tsg@yuggoth.org> On 2019-01-14 14:12:39 -0800 (-0800), Kendall Nelson wrote: > Hello :) > > On Mon, Jan 14, 2019 at 12:30 PM Jeremy Stanley wrote: > > > On 2019-01-14 14:16:51 -0600 (-0600), Sean McGinnis wrote: > > [...] > > > I could see having something where to be an eligible > > > candidate, one would either need to be a current or a past > > > PTL. There's definitely value in bringing that experience and > > > perspective to the TC. > > [...] > > > > In principle, our charter doesn't even require candidates for TC > > seats be recognized contributors to any project, only that they > > be OSF individual members in good standing. In practice, the > > most common way people get the visibility required to be elected > > to the TC is to have already held another leadership role in the > > community, frequently by serving as a PTL. > > There are many other leadership positions that we would exclude if > we started restricting TC candidates to only current/past PTLs. > SIG Chairs (formerly WG leads) for example. > > Some sort of leadership role prior to running is important, I > agree. 
But we would never have had people like Colleen Murphy or > Chris Dent who were and are, respectively, excellent members of > the TC. Absolutely. I happen to think that the representatives we've had on the TC who weren't previously PTLs have provided valuable insight. > > While I haven't actually done the math, I would wager more than > > half the people ever elected to the TC have also been PTLs at > > some point, so I personally have doubts that making it a policy > > would actually change anything. [...] And here I was more responding to Chris's original question, "what would it be like if we required half or more of the TC be elected from PTLs?" What I meant to say is that I believe half or more of the TC are already (and have always been) elected from current and prior PTLs, so mandating that wouldn't change anything for the better. Certainly if Sean's follow-up suggestion of requiring *all* TC candidates to have PTL experience were entertained, it would exclude the sorts of valuable contributions we've had from notable non-PTL members on the TC and I think that would be a significant loss. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jungleboyj at gmail.com Mon Jan 14 22:44:07 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 14 Jan 2019 16:44:07 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190114223352.nesaip34taee3tsg@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> <20190114223352.nesaip34taee3tsg@yuggoth.org> Message-ID: <610355ec-f136-145c-f050-b172f710479f@gmail.com> On 1/14/2019 4:33 PM, Jeremy Stanley wrote: > On 2019-01-14 14:12:39 -0800 (-0800), Kendall Nelson wrote: >> Hello :) >> >> On Mon, Jan 14, 2019 at 12:30 PM Jeremy Stanley wrote: >> >>> On 2019-01-14 14:16:51 -0600 (-0600), Sean McGinnis wrote: >>> [...] >>>> I could see having something where to be an eligible >>>> candidate, one would either need to be a current or a past >>>> PTL. There's definitely value in bringing that experience and >>>> perspective to the TC. >>> [...] >>> >>> In principle, our charter doesn't even require candidates for TC >>> seats be recognized contributors to any project, only that they >>> be OSF individual members in good standing. In practice, the >>> most common way people get the visibility required to be elected >>> to the TC is to have already held another leadership role in the >>> community, frequently by serving as a PTL. >> There are many other leadership positions that we would exclude if >> we started restricting TC candidates to only current/past PTLs. >> SIG Chairs (formerly WG leads) for example. >> >> Some sort of leadership role prior to running is important, I >> agree. But we would never have had people like Colleen Murphy or >> Chris Dent who were and are, respectively, excellent members of >> the TC. > Absolutely. I happen to think that the representatives we've had on > the TC who weren't previously PTLs have provided valuable insight. 
> >>> While I haven't actually done the math, I would wager more than >>> half the people ever elected to the TC have also been PTLs at >>> some point, so I personally have doubts that making it a policy >>> would actually change anything. > [...] > > And here I was more responding to Chris's original question, "what > would it be like if we required half or more of the TC be elected > from PTLs?" What I meant to say is that I believe half or more of > the TC are already (and have always been) elected from current and > prior PTLs, so mandating that wouldn't change anything for the > better. > > Certainly if Sean's follow-up suggestion of requiring *all* TC > candidates to have PTL experience were entertained, it would exclude > the sorts of valuable contributions we've had from notable non-PTL > members on the TC and I think that would be a significant loss. Agreed, in my response to Sean's note hadn't meant that all TC members would need to have been PTLs but that some proportion having technical leadership experience in OpenStack would be good.  Sorry for the confusion. From openstack at nemebean.com Mon Jan 14 23:10:22 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 14 Jan 2019 17:10:22 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> Message-ID: <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> I tried to set up a test environment for this, but I'm having some issues. My local environment is defaulting to python 3, while the gate job appears to have been running under python 2. 
I'm not sure why it's doing that since the tox env definition doesn't specify python 3 (maybe something to do with https://review.openstack.org/#/c/622415/ ?), but either way I keep running into import issues. I'll take another look tomorrow, but in the meantime I'm afraid I haven't made any meaningful progress. :-( On 1/14/19 6:36 AM, Akihiro Motoki wrote: > The similar failure happens in neutron-fwaas. This blocks several > patches in neutron-fwaas including policy-in-code support. > https://bugs.launchpad.net/neutron/+bug/1811506 > > Most failures are fixed by applying Ben's neutron fix > https://review.openstack.org/#/c/629335/ [1], > but we still have one failure > in neutron_fwaas.tests.functional.privileged.test_utils.InNamespaceTest.test_in_namespace > [2]. > This failure is caused by oslo.privsep 1.31.0 too. This does not happen > with 1.30.1. > Any help would be appreciated. > > [1] neutron-fwaas change https://review.openstack.org/#/c/630451/ > [2] > http://logs.openstack.org/51/630451/2/check/legacy-neutron-fwaas-dsvm-functional/05b9131/logs/testr_results.html.gz > > -- > Akihiro Motoki (irc: amotoki) > > > 2019年1月9日(水) 9:32 Ben Nemec >: > > I think I've got it. At least in my local tests, the handle pointer > being passed from C -> Python -> C was getting truncated at the Python > step because we didn't properly define the type. If the address > assigned > was larger than would fit in a standard int then we passed what > amounted > to a bogus pointer back to the C code, which caused the segfault. > > I have no idea why privsep threading would have exposed this, other > than > maybe running in threads affected the address space somehow? > > In any case, https://review.openstack.org/629335 has got these > functional tests working for me locally in oslo.privsep 1.31.0. It > would > be great if somebody could try them out and verify that I didn't just > find a solution that somehow only works on my system. 
:-) > > -Ben > > On 1/8/19 4:30 PM, Ben Nemec wrote: > > > > > > On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: > >> Hi Ben, > >> > >> I was also looking at it today. I’m totally not an C and > Oslo.privsep > >> expert but I think that there is some new process spawned here. > >> I put pdb before line > >> > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 > > >> where this issue happen. Then, with "ps aux” I saw: > >> > >> vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep > >> root     18368  0.1  0.5 185752 33544 pts/1    Sl+  13:24   0:00 > >> /opt/stack/neutron/.tox/dsvm-functional/bin/python > >> /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper > >> --config-file neutron/tests/etc/neutron.conf --privsep_context > >> neutron.privileged.default --privsep_sock_path > >> /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock > >> vagrant  18555  0.0  0.0  14512  1092 pts/2    S+   13:25   0:00 > grep > >> --color=auto privsep > >> > >> But then when I continue run test, and it segfaulted, in journal > log I > >> have: > >> > >> Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] > >> segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 > error 4 > >> in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] > >> > >> Please check pics of those processes. First one (when test was > >> „paused” with pdb) has 18368 and later segfault has 18369. > > > > privsep-helper does fork, so I _think_ that's normal. > > > > > https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > > > > > >> > >> I don’t know if You saw my today’s comment in launchpad. I was > trying > >> to change method used to start PrivsepDaemon from > Method.ROOTWRAP to > >> Method.FORK (in > >> > https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) > > >> and run test as root, then tests were passed. 
> > > > Yeah, I saw that, but I don't understand it. :-/ > > > > The daemon should end up running with the same capabilities in > either > > case. By the time it starts making the C calls the environment > should be > > identical, regardless of which method was used to start the process. > > > >> > >> — > >> Slawek Kaplonski > >> Senior software engineer > >> Red Hat > >> > >>> Wiadomość napisana przez Ben Nemec > w dniu > >>> 08.01.2019, o godz. 20:04: > >>> > >>> Further update: I dusted off my gdb skills and attached it to the > >>> privsep process to try to get more details about exactly what is > >>> crashing. It looks like the segfault happens on this line: > >>> > >>> > https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 > > >>> > >>> > >>> which is > >>> > >>> h->cb = cb; > >>> > >>> h being the conntrack handle and cb being the callback function. > >>> > >>> This makes me think the problem isn't the callback itself (even > if we > >>> assigned a bogus pointer, which we didn't, it shouldn't cause a > >>> segfault unless you try to dereference it) but in the handle we > pass > >>> in. Trying to look at h->cb results in: > >>> > >>> (gdb) print h->cb > >>> Cannot access memory at address 0x800f228 > >>> > >>> Interestingly, h itself is fine: > >>> > >>> (gdb) print h > >>> $3 = (struct nfct_handle *) 0x800f1e0 > >>> > >>> It doesn't _look_ to me like the handle should be crossing any > thread > >>> boundaries or anything, so I'm not sure why it would be a > problem. It > >>> gets created in the same privileged function that ultimately > >>> registers the callback: > >>> > https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 > > >>> > >>> > >>> So still not sure what's going on, but I thought I'd share what > I've > >>> found before I stop to eat something. 
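The truncation Ben describes — a 64-bit handle squeezed through Python's default C-int return type — is a classic ctypes pitfall, independent of privsep. A minimal sketch (plain libc, not the actual neutron/oslo.privsep code) of the type declarations that prevent it:

```python
import ctypes

# Load the C library symbols already linked into the running process.
libc = ctypes.CDLL(None)

# Without an explicit restype, ctypes assumes every foreign function
# returns a plain C int, so the top 32 bits of a 64-bit pointer returned
# by malloc() can be silently dropped -- the "bogus pointer" failure mode
# described above. Declaring the types avoids the truncation:
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

handle = libc.malloc(64)
assert handle is not None  # a full-width, usable pointer
libc.free(handle)
```

Without the `restype` line, the truncation only bites when the allocation happens to land above 4 GiB in the address space, which would explain why the failure can appear or vanish with seemingly unrelated changes to the process, such as threading.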
> >>> > >>> -Ben > >>> > >>> On 1/7/19 12:11 PM, Ben Nemec wrote: > >>>> Renamed the thread to be more descriptive. > >>>> Just to update the list on this, it looks like the problem is a > >>>> segfault when the netlink_lib module makes a C call. Digging into > >>>> that code a bit, it appears there is a callback being used[1]. > I've > >>>> seen some comments that when you use a callback with a Python > >>>> thread, the thread needs to be registered somehow, but this is > all > >>>> uncharted territory for me. Suggestions gratefully accepted. :-) > >>>> 1: > >>>> > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 > > >>>> On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: > >>>>> Hi, > >>>>> > >>>>> I just found that functional tests in Neutron are failing since > >>>>> today or maybe yesterday. See [1] > >>>>> I was able to reproduce it locally and it looks that it happens > >>>>> with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are > fine. > >>>>> > >>>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 > >>>>> > >>>>> — > >>>>> Slawek Kaplonski > >>>>> Senior software engineer > >>>>> Red Hat > >>>>> > >> > > > From doug at doughellmann.com Mon Jan 14 23:18:34 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 14 Jan 2019 18:18:34 -0500 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: "Fox, Kevin M" writes: > Been chewing on this thread for a while.... I think I should advocate > the other direction. > > I think most folks will come from PTL as its an easier path to get > votes. So I don't see that as underrepresented. 
> > Getting some diversity of ideas from outside of those from PTL's is > probably a good idea for the overall health of OpenStack. What about > Users that have never been PTL's? Not developers? > > Thanks, > Kevin It is already possible for any Foundation member to stand for election to the TC. They do not need to be a "developer" (or even "contributor") according to the election rules. I don't think it's especially likely that someone who isn't a contributor is going to do well in the election for more practical reasons, though. -- Doug From mriedemos at gmail.com Tue Jan 15 00:04:06 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 14 Jan 2019 18:04:06 -0600 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) Message-ID: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> I have a fix proposed for a pretty old bug (1675791 [1]). This originally came up because of a scenario where an admin shelves a server and then the owner of the shelved server cannot unshelve it since they do not have access to the shelve snapshot image. The same is true for normal snapshot and backup operations though, see this proposed spec for Stein [2]. It also came up during the cross-cell resize spec review [3] since that solution depends on snapshot to get the root disk from one cell to another. In a nutshell, when creating a snapshot now, the compute API will check if the project creating the snapshot is the same as the project owner of the server. If not, the image is created with visibility=shared and the project owner of the instance is granted member access to the image, which allows them to GET the image directly via the ID, but not list it by default (the tenant user has to accept the pending membership for that). I have tested this out in devstack today and everything seems to work well. 
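As a point of reference, the membership flow Matt describes maps onto existing python-openstackclient commands; a rough sketch with placeholder IDs (illustrating the intended behavior, not the nova code itself):

```shell
# Admin side: the snapshot image is created with shared visibility and
# the server owner's project is added as an image member.
openstack image set --shared $IMAGE_ID
openstack image add project $IMAGE_ID $OWNER_PROJECT_ID

# Owner side: the image is reachable by ID straight away...
openstack image show $IMAGE_ID
# ...but only appears in "openstack image list" once the pending
# membership is accepted:
openstack image set --accept $IMAGE_ID
```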
I am posting this to (a) raise awareness of the bug and proposed fix since it is sort of a behavior change in the createImage/createBackup/shelve APIs and (b) to make sure the glance team is aware and acknowledges this is an OK thing to do, i.e. are there any kind of unforeseen side effects of automatically granting image membership like this (I would think not since the owner of the instance has access to the root disk of the server anyway - it is their data). Also note that some really crusty legacy code in most of the in-tree virt drivers had to be removed (some virt drivers would change the image visibility back to private during the actual data upload to glance) which could mean out of tree drivers have the same issue. [1] https://bugs.launchpad.net/nova/+bug/1675791 [2] https://review.openstack.org/#/c/616843/ [3] https://review.openstack.org/#/c/616037/3/specs/stein/approved/cross-cell-resize.rst at 233 -- Thanks, Matt From ildiko.vancsa at gmail.com Tue Jan 15 02:12:35 2019 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Mon, 14 Jan 2019 19:12:35 -0700 Subject: [edge] Zero Touch Provisioning In-Reply-To: <04d66a74-5560-45c5-d61f-fd38277b467d@gmail.com> References: <3a1ca5f4-0c3a-ae32-3e0b-b346cf653f65@gmail.com> <59f29ce7-aa3e-7f68-4e53-2623d35cf778@gmail.com> <04d66a74-5560-45c5-d61f-fd38277b467d@gmail.com> Message-ID: <9F3678F9-C049-47B6-81C5-98B1BBF2B1E1@gmail.com> Hi, Apologies, I’ve just got to this thread. I don’t remember discussing this topic in details with the Edge group yet. ZTP came up as a desire, but we didn’t get to the point of going into details on what it means until now. I think it would be great to capture the discussion on this thread as well as further aspects for consideration on the Edge Computing Group wiki and use it as a central place for StarlingX and relevant OpenStack projects such as Cyborg to benefit from as input for their work. 
I added it to the agenda for the Edge group weekly call[1] for tomorrow to bring it up as a discussion point and can keep it there if we have people around interested in discussing it further. Thanks, Ildikó [1] https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings > On 2018. Dec 21., at 6:25, Jay Pipes wrote: > > On 12/21/2018 01:40 AM, Zhipeng Huang wrote: >> ICYMI: https://blog.ipspace.net/2018/12/zero-touch-provisioning-with-patrick_20.html > > From that article: > > "ZTP can be used internally connecting to an internal provisioning server, and it can be used externally connecting to an external provisioning server. Some commercial products use ZTP in connection with a vendor-controlled cloud-based provisioning server." > > Funny, that's almost exactly what I said in my original response on this thread :) > >> Shall we collect items of OCP/OpenStack components that could composite a ZTP stack somewhere ? > > Sure, just make sure we're being explicit about the things we're discussing. If you want to focus exclusively on *network device* provisioning, that's fine, but that should be explicitly stated. > > Network device provisioning is an important but ultimately tiny part of infrastructure provisioning. Unfortunately, most of the telco and network OEM community (understandably) only refer to network device provisioning when they talk about "ZTP". > > Also, the less this can be pidgeon-holed into the amorphous "edge" category, the better, IMHO ;) > > Best, > -jay > >> On Thu, Dec 20, 2018 at 10:51 PM Curtis > wrote: >> On Thu, Dec 20, 2018 at 9:35 AM Jay Pipes > > wrote: >> On 12/20/2018 09:33 AM, Curtis wrote: >> > No, it doesn't do inventory management. It's like a small >> base OS for >> > network switches, they come with it, boot up into it, and you >> can use it >> > to install other OSes. At least that's how it's used with the >> whitebox >> > switches I have. 
With ONIE it'd install the OS, then the >> initial OS >> > could register with some kind of inventory system. >> What about for non-network devices -- i.e. general compute >> hardware? >> After all, edge is more than just the network switches :) >> I'm not sure; that's a good question. I was just using it as an >> example of a piece of ZTP related technology that has been adopted >> by vendors who build physical devices and include them by default, >> though of course, only a subset of network switches. :) >> If ONIE was a good example, and it could not be used for general >> compute hardware, then perhaps something like it could be built >> using lessons learned by ONIE. I dunno. :) >> Thanks, >> Curits >> Best, >> -jay >> -- Blog: serverascode.com >> -- >> Zhipeng (Howard) Huang >> Principle Engineer >> IT Standard & Patent/IT Product Line >> Huawei Technologies Co,. Ltd >> Email: huangzhipeng at huawei.com >> Office: Huawei Industrial Base, Longgang, Shenzhen > From zhipengh512 at gmail.com Tue Jan 15 02:24:21 2019 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Tue, 15 Jan 2019 10:24:21 +0800 Subject: [edge] Zero Touch Provisioning In-Reply-To: <9F3678F9-C049-47B6-81C5-98B1BBF2B1E1@gmail.com> References: <3a1ca5f4-0c3a-ae32-3e0b-b346cf653f65@gmail.com> <59f29ce7-aa3e-7f68-4e53-2623d35cf778@gmail.com> <04d66a74-5560-45c5-d61f-fd38277b467d@gmail.com> <9F3678F9-C049-47B6-81C5-98B1BBF2B1E1@gmail.com> Message-ID: Thanks Ildiko ! :) On Tue, Jan 15, 2019 at 10:12 AM Ildiko Vancsa wrote: > Hi, > > Apologies, I’ve just got to this thread. > > I don’t remember discussing this topic in details with the Edge group yet. > ZTP came up as a desire, but we didn’t get to the point of going into > details on what it means until now. 
> > I think it would be great to capture the discussion on this thread as well > as further aspects for consideration on the Edge Computing Group wiki and > use it as a central place for StarlingX and relevant OpenStack projects > such as Cyborg to benefit from as input for their work. > > I added it to the agenda for the Edge group weekly call[1] for tomorrow to > bring it up as a discussion point and can keep it there if we have people > around interested in discussing it further. > > Thanks, > Ildikó > > [1] https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings > > > > On 2018. Dec 21., at 6:25, Jay Pipes wrote: > > > > On 12/21/2018 01:40 AM, Zhipeng Huang wrote: > >> ICYMI: > https://blog.ipspace.net/2018/12/zero-touch-provisioning-with-patrick_20.html > > > > From that article: > > > > "ZTP can be used internally connecting to an internal provisioning > server, and it can be used externally connecting to an external > provisioning server. Some commercial products use ZTP in connection with a > vendor-controlled cloud-based provisioning server." > > > > Funny, that's almost exactly what I said in my original response on this > thread :) > > > >> Shall we collect items of OCP/OpenStack components that could composite > a ZTP stack somewhere ? > > > > Sure, just make sure we're being explicit about the things we're > discussing. If you want to focus exclusively on *network device* > provisioning, that's fine, but that should be explicitly stated. > > > > Network device provisioning is an important but ultimately tiny part of > infrastructure provisioning. Unfortunately, most of the telco and network > OEM community (understandably) only refer to network device provisioning > when they talk about "ZTP". 
> > > > Also, the less this can be pidgeon-holed into the amorphous "edge" > category, the better, IMHO ;) > > > > Best, > > -jay > > > >> On Thu, Dec 20, 2018 at 10:51 PM Curtis > wrote: > >> On Thu, Dec 20, 2018 at 9:35 AM Jay Pipes >> > wrote: > >> On 12/20/2018 09:33 AM, Curtis wrote: > >> > No, it doesn't do inventory management. It's like a small > >> base OS for > >> > network switches, they come with it, boot up into it, and you > >> can use it > >> > to install other OSes. At least that's how it's used with the > >> whitebox > >> > switches I have. With ONIE it'd install the OS, then the > >> initial OS > >> > could register with some kind of inventory system. > >> What about for non-network devices -- i.e. general compute > >> hardware? > >> After all, edge is more than just the network switches :) > >> I'm not sure; that's a good question. I was just using it as an > >> example of a piece of ZTP related technology that has been adopted > >> by vendors who build physical devices and include them by default, > >> though of course, only a subset of network switches. :) > >> If ONIE was a good example, and it could not be used for general > >> compute hardware, then perhaps something like it could be built > >> using lessons learned by ONIE. I dunno. :) > >> Thanks, > >> Curits > >> Best, > >> -jay > >> -- Blog: serverascode.com > >> -- > >> Zhipeng (Howard) Huang > >> Principle Engineer > >> IT Standard & Patent/IT Product Line > >> Huawei Technologies Co,. Ltd > >> Email: huangzhipeng at huawei.com > >> Office: Huawei Industrial Base, Longgang, Shenzhen > > > > -- Zhipeng (Howard) Huang Principle Engineer IT Standard & Patent/IT Product Line Huawei Technologies Co,. Ltd Email: huangzhipeng at huawei.com Office: Huawei Industrial Base, Longgang, Shenzhen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From liliueecg at gmail.com Tue Jan 15 02:35:49 2019 From: liliueecg at gmail.com (Li Liu) Date: Mon, 14 Jan 2019 21:35:49 -0500 Subject: [Cyborg][IRC] Message-ID: The IRC meeting will be held Wednesday at 0300 UTC, which is 10:00 pm EST (Tuesday) / 7:00 pm PST (Tuesday) / 11 am Beijing time (Wednesday) This week's Agenda: 1. Track status of working items here: https://storyboard.openstack.org/#!/story/2004248 2. Consider merging Sundar's patches into the pilot branch. -- Thank you Regards Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From chkumar246 at gmail.com Tue Jan 15 06:50:06 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Tue, 15 Jan 2019 12:20:06 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update VI - Jan 15, 2019 Message-ID: Hello, Here is the sixth update (Jan 08 to Jan 15, 2019) on the collaboration on the os_tempest[1] role between the TripleO and OpenStack-Ansible projects. Things that got merged: os_tempest * Better tempest blacklist management - https://review.openstack.org/621605 * Move role data in share/ansible/roles - https://review.openstack.org/629127 * Use the inventory to enable/disable services by default - https://review.openstack.org/628979 * Remove tempest_image_dir_owner var - https://review.openstack.org/629419 * Automatically select the correct tempest plugins - https://review.openstack.org/629499 Tripleo * os_tempest tripleo integration blueprint created: https://blueprints.launchpad.net/tripleo/+spec/os-tempest-tripleo Summary: the os_tempest role got the following features: * support for handling the tempest skiplist, recording the reason why each test was added to the skip list.

* Handling enabling/disabling of services and tempest plugins using the inventory * The tempest_image_dir_owner var got removed In Progress: os_tempest: * Run smoke and dashboard tests by default - https://review.openstack.org/587617 * Fix tempest workspace path - https://review.openstack.org/628182 * [WIP] let's find proper heat tests - https://review.openstack.org/630695 * Add support for aarch64 images - https://review.openstack.org/620032 * Configuration drives don't appear to work on aarch64+kvm - https://review.openstack.org/626592 python-tempestconf * Create functional-tests role - https://review.openstack.org/626539 * Enable manila plugin in devstack - https://review.openstack.org/625191 * Set refstack-client-*-tempestconf voting again - https://review.openstack.org/628919 Updates on running os_tempest in CI * Devstack - https://review.openstack.org/627482 * TripleO - https://review.openstack.org/#/q/topic:tripleoostempest+(status:open+OR+status:merged) Summary: We are finally able to execute the os_tempest role; some vars are still missing from the playbook, but we should soon be running tempest tests as well. Tripleo os_tempest spec review: https://review.openstack.org/630654 In the upcoming week, we will be working on enabling heat tempest tests in OSA jobs and on running tempest tests from os_tempest in the TripleO/DevStack CI. Thanks to arxcruz (for skiplist test management) and odyssey4me (for enabling openstack services using the inventory). Here is the 5th update [2]. Have queries? Feel free to ping us on the #tripleo or #openstack-ansible channel. Links: [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest [2.]
http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001471.html Thanks, Chandan Kumar From ignaziocassano at gmail.com Tue Jan 15 07:16:23 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 15 Jan 2019 08:16:23 +0100 Subject: [heat][octavia][barbican] TLS-terminated HTTPS load balancer Message-ID: Hello everyone, I found a lot of examples for creating a TLS-terminated HTTPS load balancer on the command line using octavia and barbican. I did not find any example of doing it with heat. Could anyone post an example, please? Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Jan 15 07:31:16 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 15 Jan 2019 16:31:16 +0900 Subject: [congress][infra] override-checkout problem In-Reply-To: References: Message-ID: <168506bed0d.e293a53266619.6872264478276926942@ghanshyammann.com> ---- On Sat, 12 Jan 2019 11:52:57 +0900 Eric K wrote ---- > Hi Ghanshyam, > > On 1/11/19, 4:57 AM, "Ghanshyam Mann" wrote: > > >Hi Eric, > > > >This seems to be the same issue happening on the congress-tempest-plugin gate, where > >'congress-devstack-py35-api-mysql-queens' is failing [1]. > >python-congressclient could > >not be installed, and the openstack client throws an error for congress commands. > > > >The issue is that stable branch jobs on congress-tempest-plugin check out > >the master version of every repo > >instead of the version mentioned in the override-checkout var. > > > >If you look at congress's rocky patch, congress is checked out with the rocky > >version[2], but > >the congress-tempest-plugin patch's rocky job checks out the master version of > >congress instead of the rocky version [3]. > >That is why your test fails, as expected, on the congress patch but passes on > >congress-tempest-plugin. > > > >The root cause is that the override-checkout var does not work on legacy jobs > >(it is a zuulv3-only job var, if I am not wrong); > >you need to use BRANCH_OVERRIDE for legacy jobs.
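The distinction gmann describes can be made concrete: `override-checkout` is an attribute of native Zuul v3 job definitions, while legacy devstack-gate jobs take the branch from the `BRANCH_OVERRIDE` variable. A rough sketch (job names are illustrative, not the actual congress-tempest-plugin config):

```yaml
# Native Zuul v3 job: override-checkout is honored for required projects.
- job:
    name: congress-devstack-api-mysql-queens
    parent: congress-devstack-api-mysql
    override-checkout: stable/queens

# Legacy job: override-checkout is ignored; the branch has to be handed
# to devstack-gate instead, e.g. in the job's run script:
#   export BRANCH_OVERRIDE=stable/queens
#   if [ "$BRANCH_OVERRIDE" != "default" ]; then
#       export OVERRIDE_ZUUL_BRANCH=$BRANCH_OVERRIDE
#   fi
```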
amotoki, akhil, and I were trying > >a lot of other workarounds > >to debug the root cause, but in the end we just noticed that the congress jobs > >are legacy jobs using override-checkout :). > Gosh, thanks so much for the investigation. Yes, it's a legacy-dsvm job. So > sorry for the run-around! > I'm thinking of taking the opportunity to migrate to the devstack-tempest job. > I've taken a first stab here: https://review.openstack.org/#/c/630414/ > > > > >I have submitted the testing patch with BRANCH_OVERRIDE for the > >congress-tempest-plugin queens job[4], > >which seems to be working fine; I can make those patches more formal for merging. > And thanks so much for putting together those patches using > BRANCH_OVERRIDE! Merging sounds good unless we can quickly migrate Sounds like a good plan. I have updated the patch to unblock the gate for now for the queens py35 job - https://review.openstack.org/#/c/630173/4 And for the other fixes (the other stable branch jobs) we can wait for the zuulv3 migration work. -gmann > > To non-legacy jobs. Realistically it'll probably end up taking a while to > get everything migrated and working. > > > > > >Another thing I was discussing with Akhil is that the new tests of the builtins > >feature need another feature flag > >(different from congressz3.enabled), as that z3 feature is available in stein > >onwards only. > Yup. I was going to do that but wanted to first figure out why it wasn't > failing on the tempest plugin. > I've now submitted a patch to do that.
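Circling back to Ignazio's Heat question above: I haven't tested this, but a minimal sketch of a TLS-terminated listener in a Heat template might look like the following, assuming the PKCS12 certificate bundle is already stored in Barbican and its secret href is passed in as a parameter (and that Octavia's service account has been granted read access to that secret):

```yaml
heat_template_version: rocky

parameters:
  vip_subnet:
    type: string
  tls_container_ref:
    type: string
    description: Barbican secret href of the pkcs12 bundle (assumed pre-created)

resources:
  lb:
    type: OS::Octavia::LoadBalancer
    properties:
      vip_subnet: {get_param: vip_subnet}

  listener:
    type: OS::Octavia::Listener
    properties:
      loadbalancer: {get_resource: lb}
      protocol: TERMINATED_HTTPS
      protocol_port: 443
      default_tls_container_ref: {get_param: tls_container_ref}

  pool:
    type: OS::Octavia::Pool
    properties:
      listener: {get_resource: listener}
      protocol: HTTP
      lb_algorithm: ROUND_ROBIN
```

The listener's `default_tls_container_ref` property should be the Heat-side counterpart of the `--default-tls-container-ref` argument seen in the CLI examples.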
> > > > > >[1] https://review.openstack.org/#/c/618951/ > >[2] > >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 > >474d7/logs/pip2-freeze.txt.gz > >[3] > >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro > >cky/23c0214/logs/pip2-freeze.txt.gz > >[4] > >https://review.openstack.org/#/q/topic:fix-stable-branch-testing+(status:o > >pen+OR+status:merged) > > > >-gmann > > > > ---- On Fri, 11 Jan 2019 10:40:39 +0900 Eric K > > wrote ---- > > > The congress-tempest-plugin zuul jobs against stable branches appear > > > to be working incorrectly. Tests that should fail on stable/rocky (and > > > indeed fails when triggered by congress patch [1]) are passing when > > > triggered by congress-tempest-plugin patch [2]. > > > > > > I'd assume it's some kind of zuul misconfiguration in > > > congress-tempest-plugin [3], but I've so far failed to figure out > > > what's wrong. Particularly strange is that the job-output appears to > > > show it checking out the right thing [4]. > > > > > > Any thoughts or suggestions? Thanks so much! > > > > > > [1] > > > https://review.openstack.org/#/c/629070/ > > > > >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 > >474d7/logs/testr_results.html.gz > > > The two failing z3 tests should indeed fail because the feature was > > > not available in rocky. The tests were introduced because for some > > > reason they pass in the job triggered by a patch in > > > congress-tempest-plugin. 
> > > > > > [2] > > > https://review.openstack.org/#/c/618951/ > > > > >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro > >cky/23c0214/logs/testr_results.html.gz > > > > > > [3] > >https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yam > >l#L4 > > > > > > [4] > >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro > >cky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 > > > shows congress is checked out to the correct commit at the top of the > > > stable/rocky branch. > > > > > > > > > > > > > > > > > > From iwienand at redhat.com Tue Jan 15 07:48:46 2019 From: iwienand at redhat.com (Ian Wienand) Date: Tue, 15 Jan 2019 18:48:46 +1100 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: References: <20190109061109.GA24618@fedora19.localdomain> <20190109232624.GB24618@fedora19.localdomain> Message-ID: <20190115074846.GA18699@fedora19.localdomain> On Mon, Jan 14, 2019 at 07:51:07AM -0700, Alex Schultz wrote: > Looks like it should be ok based on the test results from > https://review.openstack.org/#/c/629685/. Thanks; just to confirm this is now the default so CentOS and Fedora images now have NetworkManager in the base image. You can report any problems here or in #openstack-infra. 
Cheers, -i From gmann at ghanshyammann.com Tue Jan 15 09:26:49 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 15 Jan 2019 18:26:49 +0900 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190114223352.nesaip34taee3tsg@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <20190114202933.4tcpf6swaq3irrt2@yuggoth.org> <20190114223352.nesaip34taee3tsg@yuggoth.org> Message-ID: <16850d5b740.125f4d1a670822.5106331694503084438@ghanshyammann.com> ---- On Tue, 15 Jan 2019 07:33:52 +0900 Jeremy Stanley wrote ---- > On 2019-01-14 14:12:39 -0800 (-0800), Kendall Nelson wrote: > > Hello :) > > > > On Mon, Jan 14, 2019 at 12:30 PM Jeremy Stanley wrote: > > > > > On 2019-01-14 14:16:51 -0600 (-0600), Sean McGinnis wrote: > > > [...] > > > > I could see having something where to be an eligible > > > > candidate, one would either need to be a current or a past > > > > PTL. There's definitely value in bringing that experience and > > > > perspective to the TC. > > > [...] > > > > > > In principle, our charter doesn't even require candidates for TC > > > seats be recognized contributors to any project, only that they > > > be OSF individual members in good standing. In practice, the > > > most common way people get the visibility required to be elected > > > to the TC is to have already held another leadership role in the > > > community, frequently by serving as a PTL. > > > > There are many other leadership positions that we would exclude if > > we started restricting TC candidates to only current/past PTLs. > > SIG Chairs (formerly WG leads) for example. > > > > Some sort of leadership role prior to running is important, I > > agree. But we would never have had people like Colleen Murphy or > > Chris Dent who were and are, respectively, excellent members of > > the TC. > > Absolutely. I happen to think that the representatives we've had on > the TC who weren't previously PTLs have provided valuable insight. 
> > > > While I haven't actually done the math, I would wager more than > > > half the people ever elected to the TC have also been PTLs at > > > some point, so I personally have doubts that making it a policy > > > would actually change anything. > [...] > > And here I was more responding to Chris's original question, "what > would it be like if we required half or more of the TC be elected > from PTLs?" What I meant to say is that I believe half or more of > the TC are already (and have always been) elected from current and > prior PTLs, so mandating that wouldn't change anything for the > better. > > Certainly if Sean's follow-up suggestion of requiring *all* TC > candidates to have PTL experience were entertained, it would exclude > the sorts of valuable contributions we've had from notable non-PTL > members on the TC and I think that would be a significant loss. There is no doubt that leadership experience is valuable for TC candidates, but IMO being a PTL is not the only way to judge it: it can come from corporate experience, from other past or current open source leadership, or from a core member role. Many core members are excellent leaders who have long led particular areas of a large project; Nova and Neutron are good examples of projects divided into sub-teams with sub-leaders. As PTL you have to do a certain amount of management activity, and not every core wants to serve as PTL, yet they can be good leaders and a strong fit for the TC. -gmann > -- > Jeremy Stanley > From thierry at openstack.org Tue Jan 15 09:49:41 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 15 Jan 2019 10:49:41 +0100 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: Chris Dent wrote: > [...] > Since there is a significant and friction creating division of power > and leadership between the TC and PTLs, I'm not sure I follow you there... 
the division of power is between keeping an eye on the big picture and caring for OpenStack as a whole (TC) vs. rubber-hits-the-road, being responsible for a specific set of deliverables (PTL). The same individuals can care for both concerns, but those are different tasks... I think the division is clear. The only friction I've observed recently is when it comes to driving cross-project work -- an area that TC members and affected PTLs care about. We need more people driving that type of work, and as we've said in other threads, TC members (as well as other respected members of our community) are in a good position to help drive that work. > [...] what would it be like if we > required half or more of the TC be elected from PTLs? Then the > "providing the technical leadership" aspect of the TC mission [3] > would be vested with the people who also have some responsibility for > executing on that leadership. > > That would be like something we had before, but now there are many > more PTLs. I don't think PTLs have any difficulty getting elected when they run, so I'm not sure a provision that the TC must have reserved seats for PTLs would have a significant impact, beyond complicating the election process. In terms of increasing TC efficiency... As others said, being a PTL for a large project is already a lot of work and that leaves little time to do "TC work". And if the goal is to get more TC members to drive cross-project work, the main reason TC members don't drive (more) cross-project work is generally that they don't have enough bandwidth to do so. Mandating more PTLs to be TC members is unlikely to result in a TC membership with more available cross-project work bandwidth... I agree that it is important to have representation of classic developer teams on the TC, but I feel like today's TC membership is a good mix between horizontal teams and vertical teams, including deployment concerns and adjacent communities' perspectives. 
We should definitely continue to encourage TC candidacies from vertical/classic project teams, but I don't think that should be reduced to PTLs, and I don't think that should be reserved seats. -- Thierry Carrez (ttx) From cdent+os at anticdent.org Tue Jan 15 11:01:09 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Tue, 15 Jan 2019 11:01:09 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> References: <20190114201650.GA6655@sm-workstation>, <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: On Mon, 14 Jan 2019, Fox, Kevin M wrote: > Been chewing on this thread for a while.... I think I should advocate the other direction. I'm not sure where to rejoin this thread, so picking here as it provides a reasonable entry point. First: thanks to everyone who has joined in, I honestly do feel that as annoying as these discussions can be, they often reveal something useful. Second, things went a bit sideways from the point I was trying to reach. I wasn't trying to say that PTLs are the obvious and experienced choice for TC leadership, nor that they were best placed to represent the community. I hope that my own behavior over the past few years has made it clear that I very definitely do not feel that way. However, as most respondents on this thread have pointed out, both TC members and PTLs are described as being over-tasked. What I'm trying to tease out or ask is: Are they over-tasked because they are working on too many things (or at least trying to sort through the too many things); a situation that results from _no unified technical leadership for the community_. My initial assertion was that the TC is insufficiently involved in defining and performing technical leadership. 
Then I implied that the TC cannot do anything like actionable and unified technical leadership because they have little to no real executive power and what power they do have (for example, trying to make openstack-wide goals) is in conflict (because of the limits of time and space) with the goals that PTLs (and others) are trying to enact. Thus: What if the TC and PTLs were the same thing? Would it become more obvious that there's too much in play to make progress in a unified direction (on the thing called OpenStack), leading us to choose less to do, and choose more consistency and actionable leadership? And would it enable some power to execute on that leadership? Those are questions, not assertions. > Getting some diversity of ideas from outside of those from PTL's > is probably a good idea for the overall health of OpenStack. What > about Users that have never been PTL's? Not developers? So, to summarize: While I agree we need a diversity of ideas, I don't think we lack for ideas, nor have we ever. What we lack is a small enough set of ideas to act on them with significant enough progress to make a real difference. How can we make the list small and (to bring this back to the TC role) empower the TC to execute on that list? And, to be complete, should we? And, to be extra really complete, I'm not sure if we should or not, which is why I'm asking. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From rob at cleansafecloud.com Tue Jan 15 11:02:49 2019 From: rob at cleansafecloud.com (Robert Donovan) Date: Tue, 15 Jan 2019 11:02:49 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> Message-ID: <38BFDC6F-B4C1-4CF7-83C4-E4FF02362491@cleansafecloud.com> Thanks, these are all very interesting points, particularly the notes on the nature of floating guest cores as this does indeed go some way towards mitigation. I can certainly see that you could quite easily, without intent, make the situation worse if you stopped supporting floating instances and there was some issue with the pinning algorithm that was used. We want to go a step further in terms of mitigation, ideally without turning off SMT and taking that ~20% performance hit, but accept that our methods are specific to our company’s size, services and infrastructure. We’ll endeavour to share any experiences we have that may be useful to the wider community if we do proceed in any sort of implementation around this. Thanks, Rob > On 10 Jan 2019, at 19:02, Sean Mooney wrote: > > On Thu, 2019-01-10 at 17:56 +0000, Stephen Finucane wrote: >> On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote: >>> On 01/10/2019 10:49 AM, Robert Donovan wrote: >>>> Hello Nova folks, >>>> >>>> I spoke to some of you very briefly about this in Berlin (thanks >>>> again for your time), and we were resigned to turning off SMT to >>>> fully protect against future CPU cache side-channel attacks as I >>>> know many others have done. 
However, we have stubbornly done a bit >>>> of last-resort research and testing into using vCPU pinning on a >>>> per-tenant basis as an alternative and I’d like to lay it out in >>>> more detail for you to make sure there are no legs in the idea >>>> before abandoning it completely. >>>> >>>> The idea is to use libvirt’s vcpupin ability to ensure that two >>>> different tenants never share the same physical CPU core, so they >>>> cannot theoretically steal each other’s data via an L1 or L2 cache >>>> side-channel. The pinning would be optimised to make use of as many >>>> logical cores as possible for any given tenant. We would also >>>> isolate other key system processes to a separate range of physical >>>> cores. After discussions in Berlin, we ran some tests with live >>>> migration, as this is key to our maintenance activities and would >>>> be a show-stopper if it didn’t work. We found that removing any >>>> pinning restrictions immediately prior to migration resulted in >>>> them being completely reset on the target host, which could then be >>>> optimised accordingly post-migration. Unfortunately, there would be >>>> a small window of time where we couldn’t prevent tenants from >>>> sharing a physical core on the target host after a migration, but >>>> we think this is an acceptable risk given the nature of these >>>> attacks. >>>> >>>> Obviously, this approach may not be appropriate in many >>>> circumstances, such as if you have many tenants who just run single >>>> VMs with one vCPU, or if over-allocation is in use. We have also >>>> only looked at KVM and libvirt. I would love to know what people >>>> think of this approach however. Are there any other clear issues >>>> that you can think of which we may not have considered? If it seems >>>> like a reasonable idea, is it something that could fit into Nova >>>> and, if so, where in the architecture is the best place for it to >>>> sit? 
I know you can currently specify per-instance CPU pinning via >>>> flavor parameters, so a similar approach could be taken for this >>>> strategy. Alternatively, we can look at implementing it as an >>>> external plugin of some kind for use by those with a similar setup. >>> >>> IMHO, if you're going to go through all the hassle of pinning guest vCPU >>> threads to distinct logical host processors, you might as well just use >>> dedicated CPU resources for everything. As you mention above, you can't >>> have overcommit anyway if you're concerned about this problem. Once you >>> have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU >>> resources to a dedicated host CPU -> guest CPU situation so you might as >>> well just use CPU pinning and deal with all the headaches that brings >>> with it. >> >> Indeed. My initial answer to this was "use CPU thread policies" >> (specifically, the 'require' policy) to ensure each instance owns its >> entire core, thinking you were using dedicated/pinned CPUs. > the isolate policy should address this. > the require policy would work for an even number of cores and a single NUMA node, > but it does not address this if you have multiple NUMA nodes > > e.g. 14 cores spread across 2 NUMA nodes with require will have one free > HT sibling on each NUMA node when pinned, unless we have a check for that I missed. >> For shared >> CPUs, I'm not sure how we could ever do something like you've proposed >> in a manner that would result in less than the ~20% or so performance >> degradation I usually see quoted when turning off SMT. Far too much >> second guessing of the expected performance requirements of the guest >> would be necessary. > for shared CPUs the assumption is that, as the guest cores are floating, > your victim and payload VMs would not remain running on the same core/hyperthread > for a protracted period of time. 
if both are actively using CPU cycles then the > kernel scheduler will schedule them onto different threads/cores to allow them to > execute without contention. Note that I'm not saying there is no risk, but > tenant-aware scheduling for shared CPUs effectively means we would have to stop supporting > floating instances entirely and only allow oversubscription to happen between VMs from > the same tenant, which is, first, unlikely to ever happen in a cloud environment, as > tenant VMs typically are not colocated on a single host, and second, not desirable in all > environments. >> Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Tue Jan 15 12:49:44 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 15 Jan 2019 07:49:44 -0500 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> Message-ID: Ben Nemec writes: > I tried to set up a test environment for this, but I'm having some > issues. My local environment is defaulting to python 3, while the gate > job appears to have been running under python 2. I'm not sure why it's > doing that since the tox env definition doesn't specify python 3 (maybe > something to do with https://review.openstack.org/#/c/622415/ ?), but > either way I keep running into import issues. > > I'll take another look tomorrow, but in the meantime I'm afraid I > haven't made any meaningful progress. :-( If no version is specified in the tox.ini then tox defaults to the version of python used to install it. 
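Doug's point can be made explicit in the environment definition itself. A minimal, hypothetical tox.ini fragment (the env name and command are illustrative, not taken from the neutron tree) pinning the interpreter so the env no longer depends on which Python was used to install tox:

```ini
# Hypothetical fragment: pin the interpreter explicitly so this env
# runs under the intended Python regardless of how tox was installed.
[testenv:functional]
basepython = python2.7
commands = stestr run {posargs}
```

With basepython set, tox either uses that interpreter or fails the env when it is missing (or skips it if skip_missing_interpreters is enabled), instead of silently inheriting the interpreter running tox.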
-- Doug From openstack at fried.cc Tue Jan 15 13:40:01 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 15 Jan 2019 07:40:01 -0600 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> Message-ID: <18811cdc-5bba-29e9-90d0-a1f7e1bfc06d@fried.cc> On 1/14/2019 6:04 PM, Matt Riedemann wrote: > I have a fix proposed The proposed fix is here: https://review.openstack.org/#/c/630769/ > for a pretty old bug (1675791 [1]). This > originally came up because of a scenario where an admin shelves a server > and then the owner of the shelved server cannot unshelve it since they > do not have access to the shelve snapshot image. > > The same is true for normal snapshot and backup operations though, see > this proposed spec for Stein [2]. > > It also came up during the cross-cell resize spec review [3] since that > solution depends on snapshot to get the root disk from one cell to another. > > In a nutshell, when creating a snapshot now, the compute API will check > if the project creating the snapshot is the same as the project owner of > the server. If not, the image is created with visibility=shared and the > project owner of the instance is granted member access to the image, > which allows them to GET the image directly via the ID, but not list it > by default (the tenant user has to accept the pending membership for > that). I have tested this out in devstack today and everything seems to > work well. > > I am posting this to (a) raise awareness of the bug and proposed fix > since it is sort of a behavior change in the > createImage/createBackup/shelve APIs and (b) to make sure the glance > team is aware and acknowledges this is an OK thing to do, i.e. 
are there > any kind of unforeseen side effects of automatically granting image > membership like this (I would think not since the owner of the instance > has access to the root disk of the server anyway - it is their data). > > Also note that some really crusty legacy code in most of the in-tree > virt drivers had to be removed (some virt drivers would change the image > visibility back to private during the actual data upload to glance) > which could mean out of tree drivers have the same issue. > > [1] https://bugs.launchpad.net/nova/+bug/1675791 > [2] https://review.openstack.org/#/c/616843/ > [3] > https://review.openstack.org/#/c/616037/3/specs/stein/approved/cross-cell-resize.rst at 233 > > From rob at cleansafecloud.com Tue Jan 15 13:57:33 2019 From: rob at cleansafecloud.com (Robert Donovan) Date: Tue, 15 Jan 2019 13:57:33 +0000 Subject: [horizon][keystone][dev] Cross-domain administrators and context-switching Message-ID: <5BA26B4C-E187-4F6C-BC72-9CCE05C96B76@cleansafecloud.com> Hello, We run a cloud service with multiple domains (one per tenant) and offer services on top which can, amongst other things, involve administrators creating instances, snapshots etc. on behalf of those tenants. My understanding is that, in order to achieve this with Horizon, we currently have to create a separate admin user in each domain with a role that allows those abilities. The administrator then needs to log into that domain using the new user to perform the required actions. Firstly, is that assumption correct? Or is it possible to use the same user credentials across domain boundaries? Secondly, have there ever been discussions around the “Set Domain Context” function having a wider effect to scope the whole dashboard to that particular domain, including the project panels? Are there potential issues with this as a proposal? 
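On the first question: at the keystone level a single user can hold role assignments on multiple domains and request a token scoped to whichever domain they are operating on, so separate per-domain accounts are not required by the Identity v3 API itself (whether Horizon exposes this is a separate matter, as the reply below discusses). A sketch of the Identity v3 token request body, with all names as illustrative placeholders:

```python
# Sketch: building the Identity v3 auth request body for a
# domain-scoped token (POST /v3/auth/tokens). The same user
# credentials are reused; only the "scope" changes per tenant
# domain. All names below are illustrative placeholders.
def domain_scoped_auth_body(username, password, user_domain, scope_domain):
    """Return the JSON body requesting a token scoped to scope_domain."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": user_domain},  # where the user is defined
                        "password": password,
                    }
                },
            },
            # Domain scope: requires a role assignment on that domain.
            "scope": {"domain": {"name": scope_domain}},
        }
    }

# The same credentials, scoped to two different tenant domains:
body_a = domain_scoped_auth_body("central-admin", "secret", "Default", "tenant-a")
body_b = domain_scoped_auth_body("central-admin", "secret", "Default", "tenant-b")
```

Each body would be sent to keystone's POST /v3/auth/tokens endpoint; the credentials are identical and only the scope differs.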
Many thanks, Rob From mihalis68 at gmail.com Tue Jan 15 14:22:22 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 15 Jan 2019 09:22:22 -0500 Subject: [ops] last weeks meetups team minutes Message-ID: openstack ops meetups team meeting minutes from last week: Meeting ended Tue Jan 8 15:53:43 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 10:53 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-08-15.01.html 10:53 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-08-15.01.txt 10:53 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-08-15.01.log.html This week's meeting will be in approximately 40 minutes as I write. Our main priority now is to get the planning for the Berlin meetup underway Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Tue Jan 15 14:44:55 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 15 Jan 2019 08:44:55 -0600 Subject: [horizon][keystone][dev] Cross-domain administrators and context-switching In-Reply-To: <5BA26B4C-E187-4F6C-BC72-9CCE05C96B76@cleansafecloud.com> References: <5BA26B4C-E187-4F6C-BC72-9CCE05C96B76@cleansafecloud.com> Message-ID: On Tue, Jan 15, 2019 at 7:59 AM Robert Donovan wrote: > Hello, > > We run a cloud service with multiple domains (one per tenant) and offer > services on top which can, amongst other things, involve administrators > creating instances, snapshots etc. on behalf to those tenants. My > understanding is that, in order to achieve this with Horizon, we currently > have to create a separate admin user in each domain with a role that allows > those abilities. The administrator then needs to log into that domain using > the new user to perform the required actions. > > Firstly, is that assumption correct? 
Or is it possible use the same user > credentials across domain boundaries? > I'm not sure why separate users would be needed in this case, but I could be missing something from the horizon side. Does this not work today with Horizon? Or are you using the CLIs to perform these actions? > > Secondly, have there ever been discussions around the “Set Domain Context” > function having a wider effect to scope the whole dashboard to that > particular domain, including the project panels? Are there potential issues > with this as a proposal? > Reading this as someone who works on keystone, this sounds like getting a new token in keystone scoped to a different domain you have authorization on via a role assignment. Based on a quick search though, there appears to be a few gaps remaining in horizon for domain support [0][1]. [0] https://bugs.launchpad.net/horizon/+bug/1600195 [1] https://bugs.launchpad.net/horizon/+bug/1706879 > Many thanks, > Rob > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Jan 15 15:11:23 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 15 Jan 2019 09:11:23 -0600 Subject: [releases] Additions to releases-core Message-ID: <20190115151123.GA1297@sm-workstation> I'm very happy to announce I have added Kendall Nelson and Jean-Philippe Evrard to the releases-core group. These two have been doing release request reviews and asking questions to learn about the release process, and we feel they are ready for more. Initially we will just be looking at +2's and saving approvals for one of the existing core members until everyone has more confidence. I wouldn't expect this +2-only period to last long though. We will also need to work out reallocation of review days. Right now, the current cores have taken different days for them to be the point person for processing reviews. 
With the expanded team, I think we should take a look at those days again and get the work divided up amongst us. Thanks Kendall and JP for getting involved and helping with this important piece of the process, and welcome to the releases-core team! Sean From fungi at yuggoth.org Tue Jan 15 15:30:41 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 15 Jan 2019 15:30:41 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: <20190115153041.mnrwbp6uekaucygq@yuggoth.org> On 2019-01-15 11:01:09 +0000 (+0000), Chris Dent wrote: [...] > Then I implied that the TC cannot do anything like actionable and > unified technical leadership because they have little to no real > executive power and what power they do have (for example, trying to > make openstack-wide goals) is in conflict (because of the limits of > time and space) with the goals that PTLs (and others) are trying to > enact. [...] Maybe I'm reading between the lines too much, but are you thinking that PTLs have any more executive power than TC members? At least my experience as a former PTL and discussions I've had with other PTLs suggest that the position is more to do with surfacing information about what the team is working on and helping coordinate efforts, not deciding what the team will work on. 
PTLs (and TC members, and anyone in the community for that matter) can direct where they spend their own time, and can also suggest to others where time might be better spent, but other than the ability to prevent work from being accepted (for example, by removing core reviewers who review non-priority changes) there's not really much "executive power" wielded by a PTL to decide on a project's direction, only influence (usually influence gained by seeking consensus and not attempting to assert team direction by fiat decree). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From Kuirong.Chen at infortrend.com Tue Jan 15 03:40:21 2019 From: Kuirong.Chen at infortrend.com (Kuirong Chen) Date: Tue, 15 Jan 2019 03:40:21 +0000 Subject: [Infra][Cinder]Testing Infortrend CI setup Message-ID: Hi, I would like to ask some questions. The first is about CI system setup. My CI system is set up following https://docs.openstack.org/infra/system-config/third_party.html, and it is triggered with the Jenkins Gerrit trigger like below: [cid:image001.png at 01D4ACC3.C2A2B3F0] (with the URL set as the guide says). This is my test CI result on https://review.openstack.org/#/c/630871/: [cid:image002.png at 01D4ACC4.77C23210] Under History, my test project is not hyperlinked text like the other CI tests. Why is it different? The second question: does the InfortrendCI account have voting permissions on the cinder repo? On https://review.openstack.org/#/c/523659/ I find the Infortrend CI account shows under History but not under workflow, like below: [cid:image003.png at 01D4ACC7.17F81270] Does that mean I need to get voting permissions? Thank you. KuiRong Software Design Dept.II Ext. 7077 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 49033 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.png Type: image/png Size: 98326 bytes Desc: image002.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 107805 bytes Desc: image003.png URL: From qi.ni at intel.com Tue Jan 15 09:10:57 2019 From: qi.ni at intel.com (Ni, Qi) Date: Tue, 15 Jan 2019 09:10:57 +0000 Subject: [neutron] Patch fails to build after rechecking. Message-ID: <32C216DF431DC842B4FB087109584B420473B0BD@shsmsx102.ccr.corp.intel.com> Hello, I've been working on a neutron patch, https://review.openstack.org/#/c/626109/11, and it keeps failing to build. I'm sure my code is bug-free, and I found that Liu Yulong's patch, https://review.openstack.org/#/c/627285/, hits a similar condition: he has rechecked many times and it fails every time, although his code is nearly identical to the previous version except for a line of comment. I'm not sure whether there's a bug in the check system; please take a look at our issue, thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gergely.csatari at nokia.com Tue Jan 15 09:53:00 2019 From: gergely.csatari at nokia.com (Csatari, Gergely (Nokia - HU/Budapest)) Date: Tue, 15 Jan 2019 09:53:00 +0000 Subject: [Edge-computing] Use cases mapping to MVP architectures - FEEDBACK NEEDED In-Reply-To: <741FC607-65D8-4345-AC6A-42FE7ECAD2BA@openstack.org> References: <133ad3bc-bc57-b627-7b43-94f8ac846746@redhat.com> <741FC607-65D8-4345-AC6A-42FE7ECAD2BA@openstack.org> Message-ID: Hi, Thanks for the comment. Do you see any other use for the capability to create images on an edge site besides creating snapshots? 
Thanks, Gerg0 From: Ildiko Vancsa Sent: Tuesday, January 15, 2019 2:59 AM To: Csatari, Gergely (Nokia - HU/Budapest) Subject: Fwd: [Edge-computing] Use cases mapping to MVP architectures - FEEDBACK NEEDED Begin forwarded message: From: Bogdan Dobrelya > Subject: Re: [Edge-computing] Use cases mapping to MVP architectures - FEEDBACK NEEDED Date: 2019. January 10. 8:34:45 GMT-7 To: edge-computing at lists.openstack.org, "openstack-discuss at lists.openstack.org" > On 10.01.2019 14:43, Ildiko Vancsa wrote: Hi, We are reaching out to you about the use cases for edge cloud infrastructure that the Edge Computing Group is working to collect. They are recorded in our wiki [1] and they describe high-level scenarios in which an edge cloud infrastructure would be needed. Hello. Verifying the mappings created for the "Elementary operations on a site" [18] feature against the distributed glance specification [19], I can see a vital feature is missing for "Advanced operations on a site": creating an image locally when the parent control plane is not available, and the consequences that follow from that, like the availability of snapshot creation for Nova. All that boils down to a) better identifying the underlying requirements/limitations for the CRUD operations available to middle edge sites in the Distributed Control Plane case, and b) the requirement for data replication and conflict-resolution tooling, which arises if we assume we want all CRUD operations to be always available for middle edge sites regardless of the parent edge's control plane state. That is the missing and important thing to have socialised and noted for the mappings. [18] https://wiki.openstack.org/wiki/MappingOfUseCasesFeaturesRequirementsAndUserStories#Elementary_operations_on_one_site [19] https://review.openstack.org/619638 During the second Denver PTG discussions we drafted two MVP architectures that we could build from the current functionality of OpenStack with some slight modifications [2]. These are based on the work of James and his team from Oath. We differentiate between distributed [3] and centralized [4] control plane architecture scenarios. 
These are based on the work of James and his team from Oath. We differentiate between a distributed [3] and a centralized [4] control plane architecture scenarios. In one of the Berlin Forum sessions we were asked to map the MVP architecture scenarios to the use cases so I made an initial mapping and now we are looking for feedback. This mapping only means, that the listed use case can be implemented using the MVP architecture scenarios. It should be noted, that none of the MVP architecture scenarios provide solution for edge cloud infrastructure upgrade or centralized management. Please comment on the wiki or in a reply to this mail in case you have questions or disagree with the initial mapping we put together. Please let us know if you have any questions. Here is the use cases and the mapped architecture scenarios: Mobile service provider 5G/4G virtual RAN deployment and Edge Cloud B2B2X [5] Both distributed [3] and centralized [4] Universal customer premise equipment (uCPE) for Enterprise Network Services[6] Both distributed [3] and centralized [4] Unmanned Aircraft Systems (Drones) [7] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Cloud Storage Gateway - Storage at the Edge [8] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Open Caching - stream/store data at the edge [9] Both distributed [3] and centralized [4] Smart City as Software-Defined closed-loop system [10] The use case is not complete enough to figure out Augmented Reality -- Sony Gaming Network [11] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Analytics/control at the edge [12] The use case is not complete enough to figure out Manage retail chains - chick-fil-a [13] The use case is not complete enough to figure out At this moment chick-fil-a uses a different Kubernetes cluster 
in every edge location and they manage them using Git [14] Smart Home [15] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event Data Collection - Smart cooler/cold chain tracking [16] None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event VPN Gateway Service Delivery [17] The use case is not complete enough to figure out [1]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases [2]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures [3]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Distributed_Control_Plane_Scenario [4]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Centralized_Control_Plane_Scenario [5]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Mobile_service_provider_5G.2F4G_virtual_RAN_deployment_and_Edge_Cloud_B2B2X. [6]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Universal_customer_premise_equipment_.28uCPE.29_for_Enterprise_Network_Services [7]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Unmanned_Aircraft_Systems_.28Drones.29 [8]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Cloud_Storage_Gateway_-_Storage_at_the_Edge [9]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Open_Caching_-_stream.2Fstore_data_at_the_edge [10]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_City_as_Software-Defined_closed-loop_system [11]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Augmented_Reality_--_Sony_Gaming_Network [12]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Analytics.2Fcontrol_at_the_edge [13]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Manage_retail_chains_-_chick-fil-a [14]: https://schd.ws/hosted_files/kccna18/34/GitOps.pdf [15]: 
https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_Home
[16]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Data_Collection_-_Smart_cooler.2Fcold_chain_tracking
[17]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#VPN_Gateway_Service_Delivery

Thanks and Best Regards, Gergely and Ildikó _______________________________________________ Edge-computing mailing list Edge-computing at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing -- Best regards, Bogdan Dobrelya, IRC: #bogdando -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Tue Jan 15 16:39:29 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 15 Jan 2019 17:39:29 +0100 Subject: Training In-Reply-To: <5C3CD1B5.7080803@openstack.org> References: <5C3CD1B5.7080803@openstack.org> Message-ID: Thank you all. Appreciated On Mon, Jan 14, 2019 at 7:15 PM Jimmy McArthur wrote: > Hi Alfredo, > > A good place to start would be here: > https://www.openstack.org/marketplace/training/ > > If you have additional questions, please don't hesitate. > > Cheers, > Jimmy > > > Alfredo De Luca > January 14, 2019 at 10:21 AM > Hi all. > I am looking for openstack training both basic and advanced here in Italy > or Europe. Possibly instructor led on site. > > Any suggestions? > > > -- > *Alfredo* > > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Kevin.Fox at pnnl.gov Tue Jan 15 16:52:46 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 15 Jan 2019 16:52:46 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation>, <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov> Ah. Thanks for the clarification. You raise some interesting questions. Let's explore a bit. Like you, I have no idea what the right solution is; just thinking out loud. "What if the TC and PTLs were the same thing?" One risk of making them the same is what I was talking about before. But what about a one-way association rather than both ways? Something like "All PTL's have a seat on the TC and are required to attend meetings if possible"? That would allow the PTL's to have a voice in the TC, to know what's going on at the greater level, and more easily feed back such info to the projects. It also would not block non-PTL's from having a voice too if elected. It might be easier to make decisions that affect all the projects. Would something like that have the effect you were thinking? You mention that there might not be time for PTL's to do both things. Is there a scope of what a PTL does somewhere we could look at? Maybe some of the scope could be moved to a different role to enable the TC stuff? A co-PTL or something? Thanks, Kevin ________________________________________ From: Chris Dent [cdent+os at anticdent.org] Sent: Tuesday, January 15, 2019 3:01 AM To: openstack-discuss at lists.openstack.org Subject: RE: [tc] [all] Please help verify the role of the TC On Mon, 14 Jan 2019, Fox, Kevin M wrote: > Been chewing on this thread for a while.... I think I should advocate the other direction. I'm not sure where to rejoin this thread, so picking here as it provides a reasonable entry point.
First: thanks to everyone who has joined in, I honestly do feel that as annoying as these discussions can be, they often reveal something useful. Second, things went a bit sideways from the point I was trying to reach. I wasn't trying to say that PTLs are the obvious and experienced choice for TC leadership, nor that they were best placed to represent the community. I hope that my own behavior over the past few years has made it clear that I very definitely do not feel that way. However, as most respondents on this thread have pointed out, both TC members and PTLs are described as being over-tasked. What I'm trying to tease out or ask is: Are they over-tasked because they are working on too many things (or at least trying to sort through the too many things); a situation that results from _no unified technical leadership for the community_. My initial assertion was that the TC is insufficiently involved in defining and performing technical leadership. Then I implied that the TC cannot do anything like actionable and unified technical leadership because they have little to no real executive power and what power they do have (for example, trying to make openstack-wide goals) is in conflict (because of the limits of time and space) with the goals that PTLs (and others) are trying to enact. Thus: What if the TC and PTLs were the same thing? Would it become more obvious that there's too much in play to make progress in a unified direction (on the thing called OpenStack), leading us to choose less to do, and choose more consistency and actionable leadership? And would it enable some power to execute on that leadership. Those are questions, not assertions. > Getting some diversity of ideas from outside of those from PTL's > is probably a good idea for the overall health of OpenStack. What > about Users that have never been PTL's? Not developers? So, to summarize: While I agree we need a diversity of ideas, I don't think we lack for ideas, nor have we ever. 
What we lack is a small enough set of ideas to act on them with significant enough progress to make a real difference. How can we make the list small and (to bring this back to the TC role) empower the TC to execute on that list? And, to be complete, should we? And, to be extra really complete, I'm not sure if we should or not, which is why I'm asking. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From aspiers at suse.com Tue Jan 15 17:01:02 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 15 Jan 2019 18:01:02 +0100 Subject: [self-healing-sig][api-sig][train] best practices for haproxy health checking In-Reply-To: References: Message-ID: <20190115170102.tweycrlylfn5rie7@arabian.linksys.moosehall> Ben Nemec wrote: >On 1/11/19 11:11 AM, Dirk Müller wrote: >>Does anyone have a good pointer for good healthchecks to be used by >>the frontend api haproxy loadbalancer? Great question, thanks ;-) This is exactly the kind of discussion I believe is worth encouraging within the self-healing SIG context. >>in one case that I am looking at right now, the entry haproxy >>loadbalancer was not able >>to detect that a particular backend was not responding to api requests, >>so it flipped up and down repeatedly, causing intermittent spurious >>503 errors. >> >>The backend was able to respond to connections and to basic HTTP GET >>requests (e.g. / or even /v3 as path), but when it got a "real" query >>it hung. The reason for that was, as it turned out, >>the configured caching backend memcached on that machine being locked >>up (due to some other bug). >> >>I wonder if there is a better way to check if a backend is "working" >>and what the best practices around this are.
A potential thought I had >>was to do the backend check via some other healthcheck specific port >>that runs a custom daemon that does more sophisticated checks like >>checking for system wide errors (like memcache, database, rabbitmq) >>being unavailable on that node, and hence not accepting any api >>traffic until that is being resolved. > >A very similar thing has been proposed: >https://review.openstack.org/#/c/531456/ This is definitely relevant, although it's a slightly different approach to the same problem, where the backend API service itself would perform checks internally, rather than relying on something external to it evaluating its health. IMHO the former makes slightly more sense, because the API service knows exactly what its dependencies are and can easily check the health of things like a database connection. Having said that, of course there is also benefit to black-box monitoring. >It also came up as a possible community goal for Train: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html Right. Here's the story: https://storyboard.openstack.org/#!/story/2001439 IIRC, the latest consensus reached in Denver included the following points: - We should initially do the simplest thing which could possibly work. - Each API should only perform shallow health checks on its dependencies (e.g. nova-api shouldn't perform extensive functional checks on other nova services), but deeper health checks on its internals are fine (e.g. that it can reach the database / message queue / memcached). Then we can use Vitrage for root cause analysis. I would like to suggest one immediate concrete action we should take on this particular haproxy scenario, which is to submit a corresponding use case to the self-healing SIG doc repo. This should help share any existing best practices (or gaps thereof) across the whole community, as a starting point which anyone is welcome to jump on board. 
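To make the dedicated-healthcheck-daemon idea from this thread a bit more concrete, here is a purely illustrative sketch, not an existing OpenStack implementation; the check names, ports, and layout are all assumptions:

```python
# Purely illustrative sketch of the "separate healthcheck daemon" idea
# from this thread -- not an existing implementation. haproxy would poll
# a dedicated port and the daemon would run cheap, shallow checks
# against local dependencies (memcached, database, rabbitmq, ...).
import socket


def memcached_alive(host="127.0.0.1", port=11211, timeout=1.0):
    """Shallow check: can we open a TCP connection to memcached?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def _passes(check):
    """A check passes only if it returns truthy without raising."""
    try:
        return bool(check())
    except Exception:
        return False


def run_checks(checks):
    """Run named check callables; return (http_status, failed_names).

    haproxy only cares about the status code: 200 keeps the backend in
    rotation, 503 pulls it out before real API requests start hanging.
    """
    failed = [name for name, check in checks.items() if not _passes(check)]
    return (200 if not failed else 503), failed
```

The haproxy side would then poll this endpoint with something along the lines of `option httpchk GET /healthcheck` on the backend, so a locked-up memcached takes the node out of rotation instead of producing intermittent 503s on real traffic.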
I'm happy to do this, or since I happen to be in the same office as Dirk for the rest of this week, maybe we can even co-author it together :-) >But to my knowledge no one has stepped forward to drive the work. It >seems to be something people generally agree we need, but nobody has >time to do. :-( I'm actually very enthusiastic about the idea of taking this on myself, but cannot promise anything until I've had the relevant conversations with my employer this week ... From smooney at redhat.com Tue Jan 15 17:04:02 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 15 Jan 2019 17:04:02 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov> References: <20190114201650.GA6655@sm-workstation> , <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov> Message-ID: On Tue, 2019-01-15 at 16:52 +0000, Fox, Kevin M wrote: > Ah. Thanks for the clarification. You raise some interesting questions. > > Lets explore a bit. Like you, I have no idea what the right solution is. just thinking out loud. > > "What if the TC and PTLs were the same thing?" One risk of making them the same is what > I was talking about before. But what about one way association rather then both ways? > something like "All PTL's have a seat on the TC and are required to attend meetings if > possible"? That would allow the PTL's to have a voice in the TC, to know what's going on at > the greater level and more easily feed back such info to the projects? It also would not > block non ptl's from having a voice too if elected. It might be easier to make decisions that effect > all the projects? > > Would something like that have the effect you were thinking? > > You mention that there might not be time for PTL's to do both things. Is there a scope of > what a PTL does somewhere we could look at? 
Maybe some of the scope could be moved to > a different role to enable the TC stuff? A co-PTL or something? A minor comment on this point: PTLs are already typically playing 3 roles, that of the PTL, that of a core reviewer, and often that of an individual contributor. I do like the idea of PTLs having a voice on the TC, but it may be asking a lot for them to take on a 4th role in parallel to the 3 they already have. So rather than mandate that a PTL must attend if they can, they could be given an optional seat with the ability to delegate it to another core member. Similar to project/release liaisons, each project could have a TC liaison that defaults to the PTL or another core team member that they nominate to take their place? > > Thanks, > Kevin > ________________________________________ > From: Chris Dent [cdent+os at anticdent.org] > Sent: Tuesday, January 15, 2019 3:01 AM > To: openstack-discuss at lists.openstack.org > Subject: RE: [tc] [all] Please help verify the role of the TC > > On Mon, 14 Jan 2019, Fox, Kevin M wrote: > > > Been chewing on this thread for a while.... I think I should advocate the other direction. > > I'm not sure where to rejoin this thread, so picking here as it > provides a reasonable entry point. First: thanks to everyone who has > joined in, I honestly do feel that as annoying as these discussions > can be, they often reveal something useful. > > Second, things went a bit sideways from the point I was trying to > reach. I wasn't trying to say that PTLs are the obvious and > experienced choice for TC leadership, nor that they were best placed > to represent the community. I hope that my own behavior over the > past few years has made it clear that I very definitely do not feel > that way. > > However, as most respondents on this thread have pointed out, both > TC members and PTLs are described as being over-tasked.
[trim]
> > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent tw: @anticdent From openstack at nemebean.com Tue Jan 15 17:16:48 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Jan 2019 11:16:48 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> Message-ID: On 1/15/19 6:49 AM, Doug Hellmann wrote: > Ben Nemec writes: > >> I tried to set up a test environment for this, but I'm having some >> issues. My local environment is defaulting to python 3, while the gate >> job appears to have been running under python 2. I'm not sure why it's >> doing that since the tox env definition doesn't specify python 3 (maybe >> something to do with https://review.openstack.org/#/c/622415/ ?), but >> either way I keep running into import issues. >> >> I'll take another look tomorrow, but in the meantime I'm afraid I >> haven't made any meaningful progress. :-( > > If no version is specified in the tox.ini then tox defaults to the > version of python used to install it. > Ah, good to know. I think I installed tox as just "tox" instead of "python-tox", which means I got the py3 version. Unfortunately I'm still having trouble running the failing test (and not for the expected reason ;-). The daemon is failing to start with: ImportError: No module named tests.functional.utils I'm not seeing any log output from the daemon either for some reason so it's hard to debug. There must be some difference between this and the neutron test environment because in neutron I was getting daemon log output in /opt/stack/logs. I'll keep poking at it but I'm open to suggestions. 
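For what it's worth, the env can be made independent of which interpreter installed tox by pinning it explicitly; a hypothetical tox.ini fragment (the env name is illustrative, not neutron's actual config):

```ini
# Hypothetical tox.ini fragment -- pin the interpreter so the env no
# longer depends on whether tox itself was installed under py2 or py3.
[testenv:functional]
basepython = python2.7
```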
From Kevin.Fox at pnnl.gov Tue Jan 15 17:17:49 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 15 Jan 2019 17:17:49 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: , Message-ID: <1A3C52DFCD06494D8528644858247BF01C280D98@EX10MBOX03.pnnl.gov> Friction between PTL's/projects and cross project work is not a new thing. It has been a major issue for OpenStack almost since inception. There were some hurdles I hit when trying to make any progress on cross project work that ultimately caused me to give up. (I'll make some generalizations here... not saying everyone has these issues) * PTL's (and the whole project really) tend only to look at their corner of OpenStack and not so much how their project fits into the greater picture. * PTL's have been good about discussing cross project issues when directly raised, but push back on the idea that their project should be part of the solution. The other project is always the better place to solve it, or a new project should solve it. That needs to stop being the go-to answer. * PTL's don't go out and actively look where their project does not play so well in the greater OpenStack. They focus on solving their own problems. This leads to architectures that work for a project but fail for the overall OpenStack. * The way code gets into a project mostly involves getting enough capital (reviews) on a given project to get attention when needed. But cross project work doesn't get you enough capital in enough specific projects to ever really progress. There needs to be a cross project review score or something that says: this dev is working on cross project goals and submitting work of that type; give attention to their PRs because their work is important too! PTL's could prioritize this kind of work. So, I kind of like the idea that PTL's need to focus more on the whole, not just their own project, so that they can help lead in the right direction.
Right now each project is going in its own directions without much thought to the whole. I'm not trying to blame the folks I've interacted with in the PTL role. They have by and large been very helpful. But the Role of PTL, as well as the TC, have not enabled significant progress on the cross project issues due to these gaps. Maybe part of the solution is tweaking the responsibilities of the PTL's to focus more on cross project issues. Thanks, Kevin ________________________________________ From: Thierry Carrez [thierry at openstack.org] Sent: Tuesday, January 15, 2019 1:49 AM To: openstack-discuss at lists.openstack.org Subject: Re: [tc] [all] Please help verify the role of the TC Chris Dent wrote: > [...] > Since there is a significant and friction creating division of power > and leadership between the TC and PTLs, I'm not sure I follow you there... the division of power is between keeping an eye on the big picture and caring for OpenStack as a whole (TC) vs. rubber-hits-the-road, being responsible for a specific set of deliverables (PTL). The same individuals can care for both concerns, but those are different tasks... I think the division is clear. The only friction I've observed recently is when it comes to driving cross-project work -- an area that TC members and affected PTLs care about. We need more people driving that type of work, and as we've said in other threads, TC members (as well as other respected members of our community) are in a good position to help drive that work. > [...] what would it be like if we > required half or more of the TC be elected from PTLs? Then the > "providing the technical leadership" aspect of the TC mission [3] > would be vested with the people who also have some responsibility for > executing on that leadership. > > That would be like something we had before, but now there are many > more PTLs. 
I don't think PTLs have any difficulty getting elected when they run, so I'm not sure a provision that the TC must have reserved seats for PTLs would have a significant impact, beyond complicating the election process. In terms of increasing TC efficiency... As others said, being a PTL for a large project is already a lot of work and that leaves little to do "TC work". And if the goal is to get more TC members to drive cross-project work, the main reason TC members don't drive (more) cross-project work is generally that they don't have enough bandwidth to do so. Mandating more PTLs to be TC members is unlikely to result in a TC membership with more available cross-project work bandwidth... I agree that it is important to have representation of classic developer teams on the TC, but I feel like today's TC membership is a good mix between horizontal teams and vertical teams, including deployment concerns and adjacent communities perspective. We should definitely continue to encourage TC candidacies from vertical/classic project teams, but I don't think that should be reduced to PTLs, and I don't think that should be reserved seats. -- Thierry Carrez (ttx) From Kevin.Fox at pnnl.gov Tue Jan 15 17:32:49 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 15 Jan 2019 17:32:49 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> , <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C280E0B@EX10MBOX03.pnnl.gov> Yeah. I think what I was getting at was that project leadership would be able to gather info about how they fit in the greater OpenStack via the TC, get a voice in how cross project issues could be solved, and feed back that information and prioritize cross project work on their project. 
If it's optional, then that might not happen properly. Delegating that authority might be a good solution, as you suggest. But then delegation of review prioritization and other such things might need to go along with it at the same time to be effective? The idea is for there to be a flow of leadership from the TC to the project for some of the cross project work, but if that flow can't happen, then nothing really changed. Not so sure a delegate can have enough power to really be effective that way? Thoughts? Thanks, Kevin ________________________________________ From: Sean Mooney [smooney at redhat.com] Sent: Tuesday, January 15, 2019 9:04 AM To: Fox, Kevin M; Chris Dent; openstack-discuss at lists.openstack.org Subject: Re: [tc] [all] Please help verify the role of the TC On Tue, 2019-01-15 at 16:52 +0000, Fox, Kevin M wrote: > Ah. Thanks for the clarification. You raise some interesting questions. > > Lets explore a bit. Like you, I have no idea what the right solution is. just thinking out loud. > > "What if the TC and PTLs were the same thing?" One risk of making them the same is what > I was talking about before. But what about one way association rather then both ways? > something like "All PTL's have a seat on the TC and are required to attend meetings if > possible"? That would allow the PTL's to have a voice in the TC, to know what's going on at > the greater level and more easily feed back such info to the projects? It also would not > block non ptl's from having a voice too if elected. It might be easier to make decisions that effect > all the projects? > > Would something like that have the effect you were thinking? > > You mention that there might not be time for PTL's to do both things. Is there a scope of > what a PTL does somewhere we could look at? Maybe some of the scope could be moved to > a different role to enable the TC stuff? A co-PTL or something?
[trim]
> > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent tw: @anticdent From openstack at nemebean.com Tue Jan 15 17:51:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Jan 2019 11:51:19 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> Message-ID: <292b70c6-677e-4f6b-7b65-7062c2875d9f@nemebean.com> On 1/15/19 11:16 AM, Ben Nemec wrote: > > > On 1/15/19 6:49 AM, Doug Hellmann wrote: >> Ben Nemec writes: >> >>> I tried to set up a test environment for this, but I'm having some >>> issues. My local environment is defaulting to python 3, while the gate >>> job appears to have been running under python 2. I'm not sure why it's >>> doing that since the tox env definition doesn't specify python 3 (maybe >>> something to do with https://review.openstack.org/#/c/622415/ ?), but >>> either way I keep running into import issues. >>> >>> I'll take another look tomorrow, but in the meantime I'm afraid I >>> haven't made any meaningful progress. :-( >> >> If no version is specified in the tox.ini then tox defaults to the >> version of python used to install it. >> > > Ah, good to know. I think I installed tox as just "tox" instead of > "python-tox", which means I got the py3 version. > > Unfortunately I'm still having trouble running the failing test (and not > for the expected reason ;-). The daemon is failing to start with: > > ImportError: No module named tests.functional.utils > > I'm not seeing any log output from the daemon either for some reason so > it's hard to debug. 
There must be some difference between this and the > neutron test environment because in neutron I was getting daemon log > output in /opt/stack/logs. Figured this part out. tox.ini wasn't inheriting some values in the same way as neutron. Fix proposed in https://review.openstack.org/#/c/631035/ Now hopefully I can make progress on the rest of it. > > I'll keep poking at it but I'm open to suggestions. > From cristian.calin at orange.com Tue Jan 15 19:44:57 2019 From: cristian.calin at orange.com (cristian.calin at orange.com) Date: Tue, 15 Jan 2019 19:44:57 +0000 Subject: [neutron][qos][dpdk] is QoS supported with DPDK Message-ID: <16302_1547581498_5C3E383A_16302_220_4_a25f772f8d5c4aab8b7e680bcf7e79d4@orange.com> Hi all, Is QoS supported when DPDK is leveraged on the openvswitch? From what I can tell the two are incompatible but I'm hoping someone has some information otherwise. Thanks, Cristian Calin _________________________________________________________________________________________________________________________ Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration, Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci. This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From juliaashleykreger at gmail.com Tue Jan 15 20:00:58 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 15 Jan 2019 12:00:58 -0800 Subject: [ironic] Mid-cycle call times In-Reply-To: References: Message-ID: Greetings everyone, It seems the most popular times are January 21st and 22nd between 2 PM and 6 PM UTC. Please add any topics for discussion to the etherpad[1] as soon as possible. I will propose a schedule and agenda in the next day or two. -Julia [1]: https://etherpad.openstack.org/p/ironic-stein-midcycle On Tue, Jan 8, 2019 at 9:10 AM Julia Kreger wrote: > > Greetings everyone! > > It seems we have coalesced around January 21st and 22nd. I have posted > a poll[1] with time windows in two hour blocks so we can reach a > consensus on when we should meet. > > Please vote for your available time windows so we can find the best > overlap for everyone. Additionally, if there are any topics or items > that you feel would be a good use of the time, please feel free to add > them to the planning etherpad[2]. [trim] From stig.openstack at telfer.org Tue Jan 15 20:58:52 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 15 Jan 2019 20:58:52 +0000 Subject: [scientific-sig] IRC meeting Wednesday 1100 UTC: 2FA with FreeIPA, OpenInfra Days London, ISC 2019 Message-ID: <95C77676-5EA4-4D33-A2C0-B0D2A95A9504@telfer.org> Hi All - We have a Scientific SIG IRC meeting coming up on Wednesday 16th at 1100 UTC in channel #openstack-meeting. Everyone is welcome. Full agenda and details are here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_16th_2019 We’ve got quite a bit to cover this week. Our headline item is a presentation on experiences configuring 2FA for OpenStack using FreeIPA. We also have some exciting events coming up, and discussion on a possible Scientific OpenStack BoF at the International Supercomputer Conference in Frankfurt in June. 
Cheers, Stig -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 15 21:05:06 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 15 Jan 2019 21:05:06 +0000 Subject: [neutron][qos][dpdk] is QoS supported with DPDK In-Reply-To: <16302_1547581498_5C3E383A_16302_220_4_a25f772f8d5c4aab8b7e680bcf7e79d4@orange.com> References: <16302_1547581498_5C3E383A_16302_220_4_a25f772f8d5c4aab8b7e680bcf7e79d4@orange.com> Message-ID: <2ed69f493b71e610054088c57d8694eac1026443.camel@redhat.com> On Tue, 2019-01-15 at 19:44 +0000, cristian.calin at orange.com wrote: > Hi all, > > Is QoS supported when DPDK is leveraged on the openvswitch? Max rate and max burst rate are supported, I believe, in both ingress and egress using HTB. In ovs-dpdk there is no support for minimum bandwidth QoS at the dataplane level; this is also true of kernel OVS. Hardware-offloaded OVS could in theory enforce minimum bandwidth guarantees if the NIC supports it. The minimum bandwidth support in Neutron based on placement reservations, however, should work with ovs-dpdk too, so once that is there, while the dataplane won't enforce the minimum, the scheduler will prevent oversubscription. If you enforce a max bandwidth policy equal to the min bandwidth guarantee, with a large burst rate, for all instances on a host, you can fake a min bandwidth guarantee, but it's not quite the same thing as using a network backend that truly supports it. So yes, max bandwidth QoS and DSCP policies work with ovs-dpdk, and there may be support for min bandwidth in Stein/Train. regards sean > From what I can tell the two are incompatible but I'm hoping someone has some information otherwise.
> > > Thanks, > Cristian Calin From ed at leafe.com Tue Jan 15 22:08:18 2019 From: ed at leafe.com (Ed Leafe) Date: Tue, 15 Jan 2019 16:08:18 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: <479CAA53-24C1-4C7B-8810-7D870E7F082B@leafe.com> On Jan 15, 2019, at 5:01 AM, Chris Dent wrote: > > Then I implied that the TC cannot do anything like actionable and > unified technical leadership because they have little to no real > executive power and what power they do have (for example, trying to > make openstack-wide goals) is in conflict (because of the limits of > time and space) with the goals that PTLs (and others) are trying to > enact. > > Thus: What if the TC and PTLs were the same thing?
Would it become > more obvious that there's too much in play to make progress in a > unified direction (on the thing called OpenStack), leading us to > choose less to do, and choose more consistency and actionable > leadership? And would it enable some power to execute on that > leadership. Early on in the development of OpenStack there was a lot of discussion about having a group of elected people act as a sort of BDFL (or, more properly, BDFT, where s/Life/Term). Remember, this was when there was Nova and Swift, and not much else. That approach was abandoned in favor of a more bottom-up approach, and as the number of projects and teams in OpenStack grew, so did the opinions on technical direction. As a result, OpenStack has attracted the sort of people who tend to bristle at the notion of a leader or group of leaders determining technical policy and direction. I always have been and still continue to favor a unified direction for OpenStack as a whole, but I recognize that that ship sailed a long, long time ago, and that trying to add some of that back now would not sit well with most of the community. -- Ed Leafe From openstack at nemebean.com Tue Jan 15 22:56:20 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 15 Jan 2019 16:56:20 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <292b70c6-677e-4f6b-7b65-7062c2875d9f@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> <292b70c6-677e-4f6b-7b65-7062c2875d9f@nemebean.com> Message-ID: TLDR: We now need to look at the thread namespace instead of the process namespace. Many, many details below. 
On 1/15/19 11:51 AM, Ben Nemec wrote: > > > On 1/15/19 11:16 AM, Ben Nemec wrote: >> >> >> On 1/15/19 6:49 AM, Doug Hellmann wrote: >>> Ben Nemec writes: >>> >>>> I tried to set up a test environment for this, but I'm having some >>>> issues. My local environment is defaulting to python 3, while the gate >>>> job appears to have been running under python 2. I'm not sure why it's >>>> doing that since the tox env definition doesn't specify python 3 (maybe >>>> something to do with https://review.openstack.org/#/c/622415/ ?), but >>>> either way I keep running into import issues. >>>> >>>> I'll take another look tomorrow, but in the meantime I'm afraid I >>>> haven't made any meaningful progress. :-( >>> >>> If no version is specified in the tox.ini then tox defaults to the >>> version of python used to install it. >>> >> >> Ah, good to know. I think I installed tox as just "tox" instead of >> "python-tox", which means I got the py3 version. >> >> Unfortunately I'm still having trouble running the failing test (and >> not for the expected reason ;-). The daemon is failing to start with: >> >> ImportError: No module named tests.functional.utils No idea why, but updating the fwaas capabilities to match core neutron by adding c.CAP_DAC_OVERRIDE and c.CAP_DAC_READ_SEARCH made this go away. Those are related to file permission checks, but the permissions on my source tree are, well, permissive, so I'm not sure why that would be a problem. >> >> I'm not seeing any log output from the daemon either for some reason >> so it's hard to debug. There must be some difference between this and >> the neutron test environment because in neutron I was getting daemon >> log output in /opt/stack/logs. > > Figured this part out. tox.ini wasn't inheriting some values in the same > way as neutron. Fix proposed in https://review.openstack.org/#/c/631035/ Actually, I discovered that these logs were happening, they were just in /tmp. 
So that change is probably not necessary, especially since it's breaking ci.

> Now hopefully I can make progress on the rest of it.

And sure enough, I did. :-) In short, we need to look at the thread-specific network namespace in this test instead of the process-specific one. When we change the namespace it only affects the thread, unless the call is made from the process's main thread. Here's a simple(?) example:

#!/usr/bin/env python
import ctypes
import os
import threading

from pyroute2 import netns

# The python threading identifier is useless here,
# we need to make a syscall
libc = ctypes.CDLL('libc.so.6')


def do_the_thing(ns):
    tid = libc.syscall(186)  # This id varies by platform :-/
    # Check the starting netns
    print('process %s' % os.readlink('/proc/self/ns/net'))
    print('thread %s' % os.readlink('/proc/self/task/%s/ns/net' % tid))
    # Change the netns
    print('changing to %s' % ns)
    netns.setns(ns)
    # Check again. It should be different
    print('process %s' % os.readlink('/proc/self/ns/net'))
    print('thread %s\n' % os.readlink('/proc/self/task/%s/ns/net' % tid))


# Run in main thread
do_the_thing('foo')

# Run in new thread
t = threading.Thread(target=do_the_thing, args=('bar',))
t.start()
t.join()

# Run in main thread again to show difference
do_the_thing('bar')

# Clean up after ourselves
netns.remove('foo')
netns.remove('bar')

And here's the output:

process net:[4026531992]
thread net:[4026531992]
changing to foo
process net:[4026532196]  <- Running in the main thread changes both
thread net:[4026532196]
process net:[4026532196]
thread net:[4026532196]
changing to bar
process net:[4026532196]  <- Child thread only changes the thread
thread net:[4026532254]
process net:[4026532196]
thread net:[4026532196]
changing to bar
process net:[4026532254]  <- Main thread gets them back in sync
thread net:[4026532254]

So, to get this test passing I think we need to change [1] so it looks for the thread id and uses a replacement for [2] that allows the thread id to be injected as
above. And it's the end of my day so I'm going to leave it there. :-) 1: https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/tests/functional/utils.py#L23 2: https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/utils.py#L25 -Ben From mriedemos at gmail.com Wed Jan 16 00:25:45 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 15 Jan 2019 18:25:45 -0600 Subject: [placement] update 19-01 In-Reply-To: References: Message-ID: <93d9f88a-4689-708f-9102-02e0b1ea8418@gmail.com> On 1/11/2019 9:44 AM, Chris Dent wrote: > I've been saying for a few weeks that "progress continues on > gpu-reshaping for libvirt and xen" but it looks like the work at: > > * > > > > is actually stalled. Anyone have some insight on the status of that > work? This came up in yesterday's placement/scheduler meeting as well. Sylvain was pinged about the libvirt change being stalled waiting for updates from him (he might have lost this change since it's owned by me in gerrit, but I'm not working on it - I'd suggest starring it to keep tabs on it in gerrit). As for the xenapi changes, the original author is no longer working on this and the replacement hasn't picked it up, so also stalled. Without a major push on the reshaper stuff (feature freeze is March 7) those are likely going to be deferred, including an (arguably optional for now) hook to do reshapes during fast-forward upgrades (which would unfortunately have to be run on every compute host as the mdev information is on the host, not the DB). For the extraction work, I think the main thing we're looking for (and Sylvain signed up for in Berlin [1]) is to have a functional test which creates a server in a flat provider tree, reshapes the tree to move inventory and allocations to a child provider, and then schedules another server to the same resource class and makes sure that all works properly.
We have existing functional tests that do the reshape stuff with a fake virt driver, just not tests that hook in the scheduler aspects, I don't think. [1] https://etherpad.openstack.org/p/BER-placement-extract -- Thanks, Matt From tony at bakeyournoodle.com Wed Jan 16 02:14:46 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Wed, 16 Jan 2019 13:14:46 +1100 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o In-Reply-To: <20190110235010.3ozo6hgxbgrvoqxx@pacific.linksys.moosehall> References: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> <20190110225442.GI28232@thor.bakeyournoodle.com> <20190110235010.3ozo6hgxbgrvoqxx@pacific.linksys.moosehall> Message-ID: <20190116021440.GB8374@thor.bakeyournoodle.com> On Thu, Jan 10, 2019 at 11:50:11PM +0000, Adam Spiers wrote: > Tony Breeds wrote: > > > We really only have > > https://docs.openstack.org/project-team-guide/stable-branches.html to > > link to but you'd feel less lonely ;P > > Indeed we would ;-) Done. https://review.openstack.org/#/c/631094/ Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From rico.lin.guanyu at gmail.com Wed Jan 16 03:48:27 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 16 Jan 2019 11:48:27 +0800 Subject: [heat]meeting cancelled this week Message-ID: Dear team, due to some personal stuff, I will not join the meeting today. Let's resume the meeting next week. -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 16 09:36:33 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 16 Jan 2019 10:36:33 +0100 Subject: [octavia][heat] queens loadbalancer issue In-Reply-To: References: Message-ID: Sorry for my email. I did not see that the addresses on the VIP network are sold out.
Ignazio On Wed, 16 Jan 2019 at 10:30, Ignazio Cassano < ignaziocassano at gmail.com> wrote: > Hello All, > > I am facing an issue with the octavia loadbalancer on queens. > Attached here is the heat stack I am using for creating a > loadbalancer between 2 virtual machines. > I launched the stack 4 times. > First, second and third work fine. > The fourth gives a lot of errors in the octavia worker log attached here. > It seems to loop. > > Please help me. > > Ignazio > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed Jan 16 10:54:56 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 16 Jan 2019 11:54:56 +0100 Subject: [neutron] Patch fails to build after rechecking. In-Reply-To: <32C216DF431DC842B4FB087109584B420473B0BD@shsmsx102.ccr.corp.intel.com> References: <32C216DF431DC842B4FB087109584B420473B0BD@shsmsx102.ccr.corp.intel.com> Message-ID: <75C22B7A-05AE-4E53-ACE8-286DA05D85C7@redhat.com> Hi, It is most probably caused by issue [1]. It is another issue exposed, I think, by the new oslo.privsep version with concurrency. We are currently working to find the root cause and a fix for this issue. [1] https://bugs.launchpad.net/neutron/+bug/1811515 > On 15.01.2019, at 10:10, Ni, Qi wrote: > > Hello guys, > > I've been working on a patch for neutron https://review.openstack.org/#/c/626109/11 and it keeps failing in the build. I'm sure my code is bug-free, and I found that Liu Yulong's patch has a similar condition to mine: https://review.openstack.org/#/c/627285/. He has rechecked many times and it fails every time. His code is nearly the same as before except for a line of comment. > > I'm not sure if there's a bug in this checking system; please take a look at our issue, thank you.
— Slawek Kaplonski Senior software engineer Red Hat From thierry at openstack.org Wed Jan 16 11:21:08 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 16 Jan 2019 12:21:08 +0100 Subject: [Release-job-failures] Release of openstack/puppet-* failed In-Reply-To: References: Message-ID: <66df7a16-01e2-7c04-64dd-175e649be3f5@openstack.org> Puppet-OpenStack Stein M2 release job (triggered after https://review.openstack.org/#/c/629753/ merged) failed for all modules with errors similar to: zuul at openstack.org wrote: > Build failed. > > - release-openstack-puppet http://logs.openstack.org/b1/b1434f41935eb5c4cb50fc5aea2b6406a6d0f63d/release/release-openstack-puppet/0c37729/ : POST_FAILURE in 2m 58s > - announce-release announce-release : SKIPPED This is due to the recent introduction of a puppetforge upload job: https://review.openstack.org/#/c/627573/5/zuul.d/jobs.yaml That job tries to install a few gems and that fails because the ruby header files are not available (missing ruby-dev / ruby-devel package). Tarball upload jobs are then skipped, as well as announcement or releases.o.o documentation. Solution is probably to add that missing package to the "build" bindep.txt profile (for every affected puppet-* repository) that is loaded in pre-run of that publication job. Currently that profile only contains "puppet" and that stops short of installing the ruby development header package (at least on Ubuntu hosts). Once that's done, I recommend pushing a new Stein release for Puppet-OpenStack (similar to https://review.openstack.org/#/c/629753/, bumping .Z for all packages) so that we can test that it properly works before end-of-release crush. 
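For illustration, the bindep change described above might look like this — a sketch only, with the profile name and platform tags needing verification against each puppet-* repository:

```
# bindep.txt "build" profile (hypothetical sketch)
puppet [build]
ruby-dev [platform:dpkg build]
ruby-devel [platform:rpm build]
```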
-- Thierry Carrez (ttx) From tobias.urdin at binero.se Wed Jan 16 11:36:38 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Wed, 16 Jan 2019 12:36:38 +0100 Subject: [Release-job-failures] Release of openstack/puppet-* failed In-Reply-To: <66df7a16-01e2-7c04-64dd-175e649be3f5@openstack.org> References: <66df7a16-01e2-7c04-64dd-175e649be3f5@openstack.org> Message-ID: Hello Thierry, I have pushed a fix for the upload-puppetforge role [1] and tested it in a disposable Ubuntu Xenial machine. Best regards Tobias On 01/16/2019 12:23 PM, Thierry Carrez wrote: > Puppet-OpenStack Stein M2 release job (triggered after > https://review.openstack.org/#/c/629753/ merged) failed for all modules > with errors similar to: > > zuul at openstack.org wrote: >> Build failed. >> >> - release-openstack-puppet http://logs.openstack.org/b1/b1434f41935eb5c4cb50fc5aea2b6406a6d0f63d/release/release-openstack-puppet/0c37729/ : POST_FAILURE in 2m 58s >> - announce-release announce-release : SKIPPED > This is due to the recent introduction of a puppetforge upload job: > https://review.openstack.org/#/c/627573/5/zuul.d/jobs.yaml > > That job tries to install a few gems and that fails because the ruby > header files are not available (missing ruby-dev / ruby-devel package). > Tarball upload jobs are then skipped, as well as announcement or > releases.o.o documentation. > > Solution is probably to add that missing package to the "build" > bindep.txt profile (for every affected puppet-* repository) that is > loaded in pre-run of that publication job. Currently that profile only > contains "puppet" and that stops short of installing the ruby > development header package (at least on Ubuntu hosts). > > Once that's done, I recommend pushing a new Stein release for > Puppet-OpenStack (similar to https://review.openstack.org/#/c/629753/, > bumping .Z for all packages) so that we can test that it properly works > before end-of-release crush. 
> From tobias.urdin at binero.se Wed Jan 16 11:37:31 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Wed, 16 Jan 2019 12:37:31 +0100 Subject: [Release-job-failures] Release of openstack/puppet-* failed In-Reply-To: <66df7a16-01e2-7c04-64dd-175e649be3f5@openstack.org> References: <66df7a16-01e2-7c04-64dd-175e649be3f5@openstack.org> Message-ID: <78204504-e8a9-a3ba-ce29-95cb66192319@binero.se> Ops, missed the link [1]. [1] https://review.openstack.org/#/c/631194/ On 01/16/2019 12:23 PM, Thierry Carrez wrote: > Puppet-OpenStack Stein M2 release job (triggered after > https://review.openstack.org/#/c/629753/ merged) failed for all modules > with errors similar to: > > zuul at openstack.org wrote: >> Build failed. >> >> - release-openstack-puppet http://logs.openstack.org/b1/b1434f41935eb5c4cb50fc5aea2b6406a6d0f63d/release/release-openstack-puppet/0c37729/ : POST_FAILURE in 2m 58s >> - announce-release announce-release : SKIPPED > This is due to the recent introduction of a puppetforge upload job: > https://review.openstack.org/#/c/627573/5/zuul.d/jobs.yaml > > That job tries to install a few gems and that fails because the ruby > header files are not available (missing ruby-dev / ruby-devel package). > Tarball upload jobs are then skipped, as well as announcement or > releases.o.o documentation. > > Solution is probably to add that missing package to the "build" > bindep.txt profile (for every affected puppet-* repository) that is > loaded in pre-run of that publication job. Currently that profile only > contains "puppet" and that stops short of installing the ruby > development header package (at least on Ubuntu hosts). > > Once that's done, I recommend pushing a new Stein release for > Puppet-OpenStack (similar to https://review.openstack.org/#/c/629753/, > bumping .Z for all packages) so that we can test that it properly works > before end-of-release crush. 
> From cdent+os at anticdent.org Wed Jan 16 12:33:40 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 16 Jan 2019 12:33:40 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C280E0B@EX10MBOX03.pnnl.gov> References: <20190114201650.GA6655@sm-workstation> , <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov>, <1A3C52DFCD06494D8528644858247BF01C280E0B@EX10MBOX03.pnnl.gov> Message-ID: On Tue, 15 Jan 2019, Fox, Kevin M wrote: > If it's optional, then that might not happen properly. Delegating that authority might be a good solution as you suggest. But then delegation of prioritizing of reviews and other such things might need to go along with it at the same time to be effective? The idea is for there to be a flow of leadership from the TC to the project for some of the cross project work, but if that flow can't happen, then nothing really changed. Not so sure a delegate can have enough power to really be effective that way? Thoughts? Delegation also simply spreads things around over a wider base, rather than addressing one of the core issues: there are too many things going on at once for there to be anything that could be labeled a "unified direction". If we want (do we?) a unified direction, there have to be fewer directions. Positioning things as "a delegate will do it" is a way of choosing to not limit the number of things being done. There are two competing voices around these issues, both have merit, but they remain in competition. One says: the role of the TC is to enable healthy contribution in all its many and diverse forms in the context of a community called OpenStack. Another says: the role of the TC is to help shape a product [1] called OpenStack. The first sounds rather positive: We're enabling lots of stuff, look at all the stuff!
The second is often perceived as rather negative: That stuff could be so much better! None of these perceptions are complete or fully accurate, there are positives and negatives in all views, but what seems to come out of these conversations is that there are some very small and vocal minorities that care about these issues and have strong opinions, and everyone else doesn't care. Do they (you, silent readers!) not care, are they not interested, or is it a matter of "we've been over this before and nothing ever changes?" [1] I bristle at the term "product" because it sounds like we're trying to sell something. I'm thinking more in terms of "a thing that is being produced and has a purpose". -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From smooney at redhat.com Wed Jan 16 13:13:07 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 16 Jan 2019 13:13:07 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> , <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280E0B@EX10MBOX03.pnnl.gov> Message-ID: On Wed, 2019-01-16 at 12:33 +0000, Chris Dent wrote: > On Tue, 15 Jan 2019, Fox, Kevin M wrote: > > > If its optional, then that might not happen properly. Delegating that authority might be a good solution as you > > suggest. But then delegation of prioritizing of reviews and other such things might need to go along with it at the > > same time to be effective? The idea is for there to be a flow of leadership from the TC to the project for some of > > the cross project work, but if that flow cant happen, then nothing really changed. Not so sure a delegate can have > > enough power to really be effective that way? Thoughts? 
> > Delegation also simply spreads things around over a wider base, > rather than addressing one of the core issues: there are too many > things going on at once for there to be anything that could be labeled > a "unified direction". > > If we want (do we?) a unified direction, there have to be fewer > directions. Positioning things as "a delegate will do it" is a way > of choosing to not limit the number of things being done. Well, the delegate suggestion was more: if a PTL feels they are too busy with their other duties, both in the community and in their company, they could ask another core member to attend the TC meeting on their behalf, so the number of voices would not change vs. the PTL attending. I had assumed it would be the role of the TC liaison, be that the PTL or a delegate, to bring back an update to the project team, e.g. nova in the nova team meeting. Collectively the core team would then discuss the direction from the TC and work with their contributors to determine if that direction is viable, and either implement it or provide feedback to the TC as to why it is not viable for the project. I think a lot of the velocity of OpenStack has come from the fact that it has traditionally been grass-roots driven. It is true that this has also slowed progress on some cross-project issues, and in some cases internal project topics where different companies or contributors have different priorities, but I also don't see how the TC would ever truly mitigate company priorities conflicting with TC goals. > > There are two competing voices around these issues, both have merit, > but they remain in competition. > > One says: the role of the TC is to enable healthy contribution in > all its many and diverse forms in the context of a community called > OpenStack. > > Another says: the role of the TC is to help shape a product [1] > called OpenStack. > > The first sounds rather positive: We're enabling lots of stuff, look at > all the stuff!
> > The second is often perceived as rather negative: That stuff could > be so much better! > > None of these perceptions are complete or fully accurate, there are > positives and negatives in all views, but what seems to come out of > these conversations is that there are some very small and vocal > minorities that care about these issues and have strong opinions, > and everyone else doesn't care. Do they (you, silent readers!) not > care, are they not interested, or is it a matter of "we've been over > this before and nothing ever changes?" Personally, I have always seen the role of the TC as providing guidance on the policies that govern OpenStack, its process, and long-term direction, but not providing direction to individual projects directly. E.g. engaging with all projects to develop and implement a timely transition to python3-first and python3-only is the TC's role in my view, both in terms of making sure a timeline is created and a process to get there is formulated, but I don't necessarily think it's the TC's role to guide a specific project, say swift or nova, in how they should migrate to python3 specifically. E.g. if swift had a lib dependency that only it used, and it needed to change it to move to python 3, then the swift core team should be free to make that decision without TC guidance. As a counterpoint, if the library was used by several projects, say eventlet, and suppose eventlet was never going to support python 3, then I think the TC would have a cross-project role in determining a path forward so we can have a similar technology stack across our projects. So if I were to summarise my personal viewpoint, I think the TC's role is largely in the form of facilitating cross-project goals (enabling openstack to "power" the edge) and governance/policy like the PTI, but I think individual projects should still have the freedom to prioritise their work and define how they achieve that goal and whether it is relevant to them.
I don't think devstack, for example, needs a dedicated effort to make it ready for the edge; nova and neutron might need just a tad more effort if they wanted to achieve that goal. > > [1] I bristle at the term "product" because it sounds like we're > trying to sell something. I'm thinking more in terms of "a thing > that is being produced and has a purpose". I have a similar reaction to the term product. From e0ne at e0ne.info Wed Jan 16 13:15:28 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Wed, 16 Jan 2019 15:15:28 +0200 Subject: [horizon] Unused xstatic-* projects retirement Message-ID: Hi team, There are some xstatic packages that we never started to use in Horizon or its plugins, and we never did a release of them. During the last meeting [1] we agreed to mark them as retired. I'll start the retirement procedure [2] today. If you're going to use them, please let me know. The list of the projects to be retired: - xstatic-angular-ui-router - xstatic-bootstrap-datepicker - xstatic-hogan - xstatic-jquery-migrate - xstatic-jquery.quicksearch - xstatic-jquery.tablesorter - xstatic-rickshaw - xstatic-spin - xstatic-vis [1] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-110 Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.rydberg at citynetwork.eu Wed Jan 16 13:17:23 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Wed, 16 Jan 2019 14:17:23 +0100 Subject: [publiccloud-wg][sigs] Reminder weekly meeting Public Cloud WG Message-ID: <62e669c1-8e34-58a7-fc93-1661c5684d12@citynetwork.eu> Hi everyone, Time for a new meeting for Public Cloud WG - tomorrow 1400 UTC in #openstack-publiccloud! 
Agenda found at https://etherpad.openstack.org/p/publiccloud-wg Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From mriedemos at gmail.com Wed Jan 16 14:34:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 Jan 2019 08:34:17 -0600 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> Message-ID: <9f0ee7b0-6a8b-39dc-fb59-d84f3259fe06@gmail.com> On 1/14/2019 6:04 PM, Matt Riedemann wrote: > I have a fix proposed for a pretty old bug (1675791 [1]). This > originally came up because of a scenario where an admin shelves a server > and then the owner of the shelved server cannot unshelve it since they > do not have access to the shelve snapshot image. > > The same is true for normal snapshot and backup operations though, see > this proposed spec for Stein [2]. > > It also came up during the cross-cell resize spec review [3] since that > solution depends on snapshot to get the root disk from one cell to another. > > In a nutshell, when creating a snapshot now, the compute API will check > if the project creating the snapshot is the same as the project owner of > the server. If not, the image is created with visibility=shared and the > project owner of the instance is granted member access to the image, > which allows them to GET the image directly via the ID, but not list it > by default (the tenant user has to accept the pending membership for > that). I have tested this out in devstack today and everything seems to > work well. 
> > I am posting this to (a) raise awareness of the bug and proposed fix > since it is sort of a behavior change in the > createImage/createBackup/shelve APIs and (b) to make sure the glance > team is aware and acknowledges this is an OK thing to do, i.e. are there > any kind of unforeseen side effects of automatically granting image > membership like this (I would think not since the owner of the instance > has access to the root disk of the server anyway - it is their data). > > Also note that some really crusty legacy code in most of the in-tree > virt drivers had to be removed (some virt drivers would change the image > visibility back to private during the actual data upload to glance) > which could mean out of tree drivers have the same issue. > > [1] https://bugs.launchpad.net/nova/+bug/1675791 > [2] https://review.openstack.org/#/c/616843/ > [3] > https://review.openstack.org/#/c/616037/3/specs/stein/approved/cross-cell-resize.rst at 233 One more note about this - I have not yet tested doing a snapshot of a volume-backed server where the admin creates the snapshot and the tenant user tries to create a new server from that snapshot. In that case the tenant user should have access to the snapshot image, but they might not have access to the volume snapshot so that could still fail. For that to work, we'd likely need to force an ownership transfer of the volume to the tenant user (owner of the server) I guess. 
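The check-then-share logic described in the quoted text can be sketched as a tiny pure function. This is only an illustration of the behavior under discussion; the function and variable names here are hypothetical, not nova's actual internals (the real change lives in the compute API code under review):

```python
# Sketch of the snapshot-visibility decision described above.
# If the snapshot creator is the instance owner, keep the image
# private; otherwise make it 'shared' and grant the instance's
# project membership so it can GET the image directly by ID.

def snapshot_image_policy(creator_project_id, instance_project_id):
    """Return (visibility, member_projects) for a server snapshot image."""
    if creator_project_id == instance_project_id:
        # Owner snapshotting their own server: plain private image.
        return "private", []
    # Someone else (e.g. an admin shelving the server) took the
    # snapshot: share it back to the instance owner.
    return "shared", [instance_project_id]


# Owner snapshots their own server: nothing changes.
assert snapshot_image_policy("tenant-a", "tenant-a") == ("private", [])
# Admin shelves/snapshots a tenant's server: image is shared with the owner.
assert snapshot_image_policy("admin", "tenant-a") == ("shared", ["tenant-a"])
```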
-- Thanks, Matt From rosmaita.fossdev at gmail.com Wed Jan 16 14:34:31 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 16 Jan 2019 09:34:31 -0500 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: <18811cdc-5bba-29e9-90d0-a1f7e1bfc06d@fried.cc> References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> <18811cdc-5bba-29e9-90d0-a1f7e1bfc06d@fried.cc> Message-ID: On 1/15/19 8:40 AM, Eric Fried wrote: > > On 1/14/2019 6:04 PM, Matt Riedemann wrote: >> I have a fix proposed > > The proposed fix is here: https://review.openstack.org/#/c/630769/ > >> for a pretty old bug (1675791 [1]). This >> originally came up because of a scenario where an admin shelves a server >> and then the owner of the shelved server cannot unshelve it since they >> do not have access to the shelve snapshot image. >> >> The same is true for normal snapshot and backup operations though, see >> this proposed spec for Stein [2]. >> >> It also came up during the cross-cell resize spec review [3] since that >> solution depends on snapshot to get the root disk from one cell to another. >> >> In a nutshell, when creating a snapshot now, the compute API will check >> if the project creating the snapshot is the same as the project owner of >> the server. If not, the image is created with visibility=shared and the >> project owner of the instance is granted member access to the image, >> which allows them to GET the image directly via the ID, but not list it >> by default (the tenant user has to accept the pending membership for >> that). I have tested this out in devstack today and everything seems to >> work well. >> >> I am posting this to (a) raise awareness of the bug and proposed fix >> since it is sort of a behavior change in the >> createImage/createBackup/shelve APIs and (b) to make sure the glance >> team is aware and acknowledges this is an OK thing to do, i.e. 
are there >> any kind of unforeseen side effects of automatically granting image >> membership like this (I would think not since the owner of the instance >> has access to the root disk of the server anyway - it is their data). This is an OK thing to do. The Glance image sharing philosophy is that the image owner can share with anyone and a shared image is immediately usable by any of its members. It's not accessible to any non-owner non-members, and as you point out, it won't "spam" an image member by showing up in the member's default image-list unless the member (not the sharer) decides otherwise. The only thing I can think of that might impact your proposal is the 'image_member_quota' configuration option in glance-api.conf, which limits how many projects a single image may be shared with. (It's not really a quota, it's a hard limit across all users.) The default is 128. There's no limit to how many images a single project can be a member of, so you don't need to worry about that. >> Also note that some really crusty legacy code in most of the in-tree >> virt drivers had to be removed (some virt drivers would change the image >> visibility back to private during the actual data upload to glance) >> which could mean out of tree drivers have the same issue. Yes, this is an important point to note. Since Ocata, when the 'shared' and 'community' visibility values were added, an image with visibility of 'private' is not accessible to any members of the image. (The visibility transition does not affect the member-list. The member-list persists even when the visibility has changed to a value that renders it inert.) 
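The semantics Brian describes (members can GET a 'shared' image by ID, it only shows in their default listing after they accept the pending membership, and the member-list persists but goes inert when visibility changes) can be modeled with a toy class. This is an illustration of the rules only, not Glance code; the class and method names are invented for the sketch:

```python
# Toy model of the Glance v2 sharing rules discussed above.

class Image:
    def __init__(self, owner, visibility="private"):
        self.owner = owner
        self.visibility = visibility
        self.members = {}  # project_id -> "pending" | "accepted"

    def add_member(self, project_id):
        # New members start out pending; the sharer cannot accept for them.
        self.members[project_id] = "pending"

    def accessible_by(self, project_id):
        # Direct GET by ID works for the owner, and for members of a
        # 'shared' image; for other visibilities the member-list is inert.
        if project_id == self.owner:
            return True
        return self.visibility == "shared" and project_id in self.members

    def in_default_listing(self, project_id):
        # A shared image only appears in a member's default image-list
        # once that member has accepted the membership.
        if project_id == self.owner:
            return True
        return (self.visibility == "shared"
                and self.members.get(project_id) == "accepted")


img = Image(owner="admin", visibility="shared")
img.add_member("tenant-a")
assert img.accessible_by("tenant-a")           # can GET by image ID
assert not img.in_default_listing("tenant-a")  # no "spam" until accepted
img.members["tenant-a"] = "accepted"
assert img.in_default_listing("tenant-a")
img.visibility = "private"                     # member-list persists...
assert not img.accessible_by("tenant-a")       # ...but is now inert
```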
>> [1] https://bugs.launchpad.net/nova/+bug/1675791 >> [2] https://review.openstack.org/#/c/616843/ >> [3] >> https://review.openstack.org/#/c/616037/3/specs/stein/approved/cross-cell-resize.rst at 233 >> >> > From artem.goncharov at gmail.com Wed Jan 16 14:45:16 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Wed, 16 Jan 2019 15:45:16 +0100 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Yes, it is working. Thanks Jim On Wed, Jan 16, 2019 at 3:39 PM Jim Phillips wrote: > @Artem and @Surya, the issue that you've reported should be working > correctly now. Can you please double check and let me know if it's not? > > On Thu, Jan 10, 2019 at 2:10 PM Boris Renski wrote: > >> Hey guys! thanks for the heads up on this. Let us check and fix ASAP. >> >> On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov < >> artem.goncharov at gmail.com> wrote: >> >>> Hi, >>> >>> I can repeat the issue - stackalytics stopped showing my affiliation >>> correctly (user: gtema, entry in default_data.json is present) >>> >>> Regards, >>> Artem >>> >>> On Thu, Jan 10, 2019 at 5:48 AM Surya Singh < >>> singh.surya64mnnit at gmail.com> wrote: >>> >>>> Hi Boris >>>> >>>> Great to see new facelift of Stackalytics. Its really good. >>>> >>>> I have a query regarding contributors name is not listed as per company >>>> affiliation. >>>> Before facelift to stackalytics it was showing correct whether i have >>>> entry in >>>> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json >>>> or not. >>>> Though now i have pushed the patch for same >>>> https://review.openstack.org/629150, but another thing is one of my >>>> colleague Vishal Manchanda name is also showing as independent contributor >>>> rather than NEC contributor. While his name entry already in >>>> etc/default_data.json. >>>> >>>> Would be great if you check the same. 
>>>> >>>> --- >>>> Thanks >>>> Surya >>>> >>>> >>>> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski >>>> wrote: >>>> >>>>> Folks, >>>>> >>>>> Happy New Year! We wanted to start the year by giving a facelift to >>>>> stackalytics.com (based on stackalytics openstack project). Brief >>>>> summary of updates: >>>>> >>>>> - >>>>> >>>>> We have new look and feel at stackalytics.com >>>>> - >>>>> >>>>> We did away with DriverLog >>>>> and Member Directory >>>>> , which were not very >>>>> actively used or maintained. Those are still available via direct links, >>>>> but not in the menu on the top >>>>> - >>>>> >>>>> BIGGEST CHANGE: You can now track some of the CNCF and >>>>> Unaffiliated project commits via a separate subsection accessible via top >>>>> menu. Before this was all bunched up in Project Type -> Complimentary >>>>> >>>>> Happy to hear comments or feedback. >>>>> >>>>> -Boris >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From elfosardo at gmail.com Wed Jan 16 14:57:57 2019 From: elfosardo at gmail.com (elfosardo) Date: Wed, 16 Jan 2019 15:57:57 +0100 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hello, this doesn't seem to be working correctly for co-authored patches. For example, looking at my user report for https://review.openstack.org/#/c/622898/ I still see the affiliation to my old company. Thanks, Riccardo On Wed, 16 Jan 2019 at 15:48, Artem Goncharov wrote: > > Yes, it is working. > > Thanks Jim > > On Wed, Jan 16, 2019 at 3:39 PM Jim Phillips wrote: >> >> @Artem and @Surya, the issue that you've reported should be working correctly now. Can you please double check and let me know if it's not? >> >> On Thu, Jan 10, 2019 at 2:10 PM Boris Renski wrote: >>> >>> Hey guys! thanks for the heads up on this. Let us check and fix ASAP. 
>>> >>> On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov wrote: >>>> >>>> Hi, >>>> >>>> I can repeat the issue - stackalytics stopped showing my affiliation correctly (user: gtema, entry in default_data.json is present) >>>> >>>> Regards, >>>> Artem >>>> >>>> On Thu, Jan 10, 2019 at 5:48 AM Surya Singh wrote: >>>>> >>>>> Hi Boris >>>>> >>>>> Great to see new facelift of Stackalytics. Its really good. >>>>> >>>>> I have a query regarding contributors name is not listed as per company affiliation. >>>>> Before facelift to stackalytics it was showing correct whether i have entry in https://github.com/openstack/stackalytics/blob/master/etc/default_data.json or not. >>>>> Though now i have pushed the patch for same https://review.openstack.org/629150, but another thing is one of my colleague Vishal Manchanda name is also showing as independent contributor rather than NEC contributor. While his name entry already in etc/default_data.json. >>>>> >>>>> Would be great if you check the same. >>>>> >>>>> --- >>>>> Thanks >>>>> Surya >>>>> >>>>> >>>>> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote: >>>>>> >>>>>> Folks, >>>>>> >>>>>> >>>>>> Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: >>>>>> >>>>>> We have new look and feel at stackalytics.com >>>>>> >>>>>> We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the menu on the top >>>>>> >>>>>> BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible via top menu. Before this was all bunched up in Project Type -> Complimentary >>>>>> >>>>>> Happy to hear comments or feedback. 
>>>>>> >>>>>> -Boris >>>>>> >>>>>> From mriedemos at gmail.com Wed Jan 16 14:59:30 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 Jan 2019 08:59:30 -0600 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> <18811cdc-5bba-29e9-90d0-a1f7e1bfc06d@fried.cc> Message-ID: <1b8b95f6-2052-17d4-95e2-58a7e3f3731a@gmail.com> On 1/16/2019 8:34 AM, Brian Rosmaita wrote: > The only thing I can think of that might impact your propoal is the > 'image_member_quota' configuration option in glance-api.conf, which > limits how many projects a single image may be shared with. (It's not > really a quota, it's a hard limit across all users.) The default is 128. This shouldn't be a problem in this case. The image is still essentially private (protected?) since it's only shared between the admin tenant and the user tenant. -- Thanks, Matt From edmondsw at us.ibm.com Wed Jan 16 15:22:02 2019 From: edmondsw at us.ibm.com (William M Edmonds) Date: Wed, 16 Jan 2019 10:22:02 -0500 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: <9f0ee7b0-6a8b-39dc-fb59-d84f3259fe06@gmail.com> References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> <9f0ee7b0-6a8b-39dc-fb59-d84f3259fe06@gmail.com> Message-ID: Matt Riedemann wrote on 01/16/2019 09:34:17 AM: > > One more note about this - I have not yet tested doing a snapshot of a > volume-backed server where the admin creates the snapshot and the tenant > user tries to create a new server from that snapshot. In that case the > tenant user should have access to the snapshot image, but they might not > have access to the volume snapshot so that could still fail. For that to > work, we'd likely need to force an ownership transfer of the volume to > the tenant user (owner of the server) I guess. 
Or depend on a new cinder capability for sharing volumes across projects, similar to how images can be shared across projects. This would be useful for other things as well. I do agree that in this case, forcing ownership transfer would probably be better than sharing. In fact, I wish we could do that for images here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From singh.surya64mnnit at gmail.com Wed Jan 16 15:31:14 2019 From: singh.surya64mnnit at gmail.com (Surya Singh) Date: Wed, 16 Jan 2019 21:01:14 +0530 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: @Jim thanks for looking into this, but FYI still showing as independent* https://www.stackalytics.com/?user_id=confisurya. Maybe merging this https://review.openstack.org/#/c/629150/ will fix the issue. On Wed, Jan 16, 2019 at 8:09 PM Jim Phillips wrote: > @Artem and @Surya, the issue that you've reported should be working > correctly now. Can you please double check and let me know if it's not? > > On Thu, Jan 10, 2019 at 2:10 PM Boris Renski wrote: > >> Hey guys! thanks for the heads up on this. Let us check and fix ASAP. >> >> On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov < >> artem.goncharov at gmail.com> wrote: >> >>> Hi, >>> >>> I can repeat the issue - stackalytics stopped showing my affiliation >>> correctly (user: gtema, entry in default_data.json is present) >>> >>> Regards, >>> Artem >>> >>> On Thu, Jan 10, 2019 at 5:48 AM Surya Singh < >>> singh.surya64mnnit at gmail.com> wrote: >>> >>>> Hi Boris >>>> >>>> Great to see new facelift of Stackalytics. Its really good. >>>> >>>> I have a query regarding contributors name is not listed as per company >>>> affiliation. >>>> Before facelift to stackalytics it was showing correct whether i have >>>> entry in >>>> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json >>>> or not. 
>>>> Though now i have pushed the patch for same >>>> https://review.openstack.org/629150, but another thing is one of my >>>> colleague Vishal Manchanda name is also showing as independent contributor >>>> rather than NEC contributor. While his name entry already in >>>> etc/default_data.json. >>>> >>>> Would be great if you check the same. >>>> >>>> --- >>>> Thanks >>>> Surya >>>> >>>> >>>> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski >>>> wrote: >>>> >>>>> Folks, >>>>> >>>>> Happy New Year! We wanted to start the year by giving a facelift to >>>>> stackalytics.com (based on stackalytics openstack project). Brief >>>>> summary of updates: >>>>> >>>>> - >>>>> >>>>> We have new look and feel at stackalytics.com >>>>> - >>>>> >>>>> We did away with DriverLog >>>>> and Member Directory >>>>> , which were not very >>>>> actively used or maintained. Those are still available via direct links, >>>>> but not in the menu on the top >>>>> - >>>>> >>>>> BIGGEST CHANGE: You can now track some of the CNCF and >>>>> Unaffiliated project commits via a separate subsection accessible via top >>>>> menu. Before this was all bunched up in Project Type -> Complimentary >>>>> >>>>> Happy to hear comments or feedback. >>>>> >>>>> -Boris >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Wed Jan 16 15:53:43 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Wed, 16 Jan 2019 16:53:43 +0100 Subject: [dev][puppet] Register PTG team attendance Message-ID: Dear Puppeteers, The foundation would like all projects to report if they will be attending the PTG and I'm in charge of reporting that for the Puppet OpenStack project. I'm now reaching out to hear if you are attending and would like the Puppet OpenStack project to have an official attendance and therefore possible requirements of rooms etc? 
I personally am not sure that I will attend the PTG even though it's directly after the Summit, which I'm hoping to be able to attend this year. In earlier years we haven't had any real reasons to have dedicated rooms or events from what I know, but please let me know if you have any plans. I don't think not placing our team as attending prevents anybody from actually being at the PTG, I hope not at least (maybe somebody can chime in on that part). I need to fill in the survey by January 20th, 2019 at 7:00 UTC so please let me know urgently; sorry for the short timespan, it's entirely my fault on that part. Please get back to me if you will be attending and need to register the Puppet OpenStack team. Best regards Tobias From pabelanger at redhat.com Wed Jan 16 16:28:14 2019 From: pabelanger at redhat.com (Paul Belanger) Date: Wed, 16 Jan 2019 11:28:14 -0500 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: <20190108220522.iczyv2yz5rfg4qci@yuggoth.org> References: <20190108220522.iczyv2yz5rfg4qci@yuggoth.org> Message-ID: <20190116162814.GA4591@localhost.localdomain> On Tue, Jan 08, 2019 at 10:05:23PM +0000, Jeremy Stanley wrote: > On 2019-01-08 10:21:32 -0800 (-0800), Boris Renski wrote: > > Happy New Year! We wanted to start the year by giving a facelift to > > stackalytics.com (based on stackalytics openstack project). > [...] > > Happy to hear comments or feedback. > > Looks slick! When you say "based on" I guess you mean "forked from?" > I don't see those modifications in the repository at > https://git.openstack.org/cgit/openstack/stackalytics nor proposed > to it through https://review.openstack.org/ so presumably the source > code now lives elsewhere. Is Stackalytics still open source, or has > it become proprietary? I second this; it would be good to restart the work to move this under openstack-infra (opendev), but we need help from the current maintainers of stackalytics. 
- Paul From lyarwood at redhat.com Wed Jan 16 16:53:43 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 16 Jan 2019 16:53:43 +0000 Subject: [nova][placement][tripleo] Extracted Placement enablement status update #1 Message-ID: <20190116165343.basfiqgyhr7ddqew@lyarwood.usersys.redhat.com> Hello, This is a brief status progress report on my work to enable the deployment of the extracted Placement service within TripleO ahead of the extraction check-in call later today. tl;dr TripleO isn't ready to switch to an extracted Placement service or for code to be removed from openstack/nova. # RDO (packaging) https://review.rdoproject.org/#/q/topic:placement-add This was started at the PTG in Denver and mostly completed some time ago now. I do however have a single change outstanding that includes the puppet-placement package as a dependency of puppet-tripleo but aside from that this is complete: puppet-tripleo: Require puppet-placement https://review.rdoproject.org/r/18330 FWIW in the context of the extraction check-in call, both RDO and TripleO are not ready to deal with the original placement code being deleted from the openstack/nova project at this point. This will continue to be the case until the following work is completed to enable deployments using the extracted project in TripleO. 
# Kolla (containers) https://review.openstack.org/#/q/topic:split-placement+OR+topic:upgrade-placement The initial Kolla change to produce a placement-api container landed some time ago after some small delays due to the Stein UCA not being used for the Ubuntu binary jobs: Split placement-api from nova https://review.openstack.org/#/c/613589/ FWIW an additional change to copy across the DB migration script has also been posted and is pending a package promotion in RDO: Copy placement database migration script https://review.openstack.org/#/c/626382/ The above change isn't required by TripleO, I just wanted to highlight that kolla-ansible are looking to use the script during upgrades. Unfortunately, due to various issues with TripleO promotions, the TripleO container for the placement-api was only uploaded yesterday after I manually built it: https://hub.docker.com/r/tripleomaster/centos-binary-placement-api # Puppet https://review.openstack.org/#/q/topic:tripleo-placement-extraction The initial and massive puppet-placement cookiecutter change has landed: Initial cookiecutter run and import from puppet-nova https://review.openstack.org/#/c/604182/ This is currently only tested on CentOS7 by the following POI or puppet-openstack-integration change that has itself been stuck behind a raft of Ubuntu py3 and Stein UCA issues: placement: Extract the service from Nova https://review.openstack.org/#/c/615568/ I'm attempting to unblock this by working my way through the following changes to support py3 and the Stein UCA within the Ubuntu jobs: https://review.openstack.org/#/q/status:open+topic:ubuntu-py3 https://review.openstack.org/#/q/status:open+topic:inherit_pyvers Help with these changes from any openstack-puppet Ubuntu users would obviously be appreciated here! 
Finally I'm currently working on the following puppet-tripleo change that makes this all usable within TripleO itself: WIP placement: Initial extraction of the Placement service from Nova https://review.openstack.org/#/c/624335/ # TripleO https://review.openstack.org/#/q/topic:tripleo-placement-extraction On the TripleO front I almost have undercloud deployment completing in a reproducer environment with various local hacks to workaround changes not being merged yet. The main tripleo-heat-template change can be found below: WIP placement: Extract the service from Nova https://review.openstack.org/#/c/630644/ # TODO Obviously there's still a lot to do here but to begin with I'll be focusing on the following tasks: - Landing the puppet-openstack-integration and puppet-tripleo changes. - Moving the tripleo-heat-templates change out of WIP. - Posting draft upgrade_tasks in a new change using the db migration script. Hopefully this is of some use to folks in the Placement and TripleO teams! Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: From mriedemos at gmail.com Wed Jan 16 18:48:39 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 Jan 2019 12:48:39 -0600 Subject: [nova][placement][tripleo] Extracted Placement enablement status update #1 In-Reply-To: <20190116165343.basfiqgyhr7ddqew@lyarwood.usersys.redhat.com> References: <20190116165343.basfiqgyhr7ddqew@lyarwood.usersys.redhat.com> Message-ID: On 1/16/2019 10:53 AM, Lee Yarwood wrote: > Hopefully this is of some use to folks in the Placement and TripleO teams! Lee, thanks for the detailed update and your work on this, much appreciated. 
-- Thanks, Matt From openstack at nemebean.com Wed Jan 16 18:50:11 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 16 Jan 2019 12:50:11 -0600 Subject: [tripleo] OVB is now on Gerrit Message-ID: <2bff76a6-bea0-aed1-9431-d83c4f6b7ebf@nemebean.com> Just a heads up that the import of OVB to Gerrit is complete. If you're using OVB you should switch your clones from the Github repo to https://git.openstack.org/cgit/openstack/openstack-virtual-baremetal I've marked the Github repo deprecated so please don't submit issues or PRs there anymore. For the moment the approval process won't meaningfully change. I'll still be reviewing and single-approving patches, and the core reviewer list for OVB is the same as it was on Github, with the addition of Juan (as TripleO PTL) and Harald (as someone who has recently done a ton of good work in OVB). See the full list here: https://review.openstack.org/#/admin/groups/1993,members Once we have E2E testing set up on the repo (that isn't me manually testing changes against my local cloud) I expect we'll open that list up more and can discuss moving to a more typical 2 +2 model. Thanks. -Ben From cboylan at sapwetik.org Wed Jan 16 18:52:55 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 16 Jan 2019 10:52:55 -0800 Subject: [horizon] Unused xstatic-* projects retirement In-Reply-To: References: Message-ID: <1547664775.3690931.1636415712.6FC8681B@webmail.messagingengine.com> On Wed, Jan 16, 2019, at 5:15 AM, Ivan Kolodyazhny wrote: > Hi team, > > There are some xstatic packages which I didn't start to use in Horizon or > plugins. We didn't do any release of them. > > During the last meeting [1] we agreed to mark them as retired. I'll start > retired procedure [2] today. If you're going to use them, please let me > know. There is actually some neat tooling that pypi links to that shows you who depends on certain packages. 
As an example you can see the list for xstatic-hogan at https://libraries.io/pypi/XStatic-Hogan/usage. A lot of this is OpenStack related but some of it appears not to be? You may want to do a quick check of your users since the xstatic packages are generic packaging of js libraries and are not openstack specific. > > The list of the projects to be retired: > - xstatic-angular-ui-router > - xstatic-bootstrap-datepicker > - xstatic-hogan > - xstatic-jquery-migrate > - xstatic-jquery.quicksearch > - xstatic-jquery.tablesorter > - xstatic-rickshaw > - xstatic-spin > - xstatic-vis > > > [1] > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-110 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ From mriedemos at gmail.com Wed Jan 16 18:53:48 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 Jan 2019 12:53:48 -0600 Subject: [nova][glance] Granting image member access for snapshots (bug 1675791) In-Reply-To: References: <2847e346-2f60-2d33-4686-1fd654992d8e@gmail.com> <9f0ee7b0-6a8b-39dc-fb59-d84f3259fe06@gmail.com> Message-ID: <8dca0118-ac84-82c0-6f0c-caa8b4386f82@gmail.com> On 1/16/2019 9:22 AM, William M Edmonds wrote: > I do agree that in this case, forcing ownership transfer would probably > be better than sharing. In fact, I wish we could do that for images here. I'm pretty sure we can force the glance snapshot image to be owned privately by the tenant user that owns the instance by simply specifying: image['owner'] = instance.project_id The reason I didn't just do that was because it's a more drastic change in behavior than what we have today with the image being owned by the tenant that created the image. I also thought about making that configurable, but that is (1) kind of gross since it's config-driven API behavior which also makes it (2) not really interoperable, although that behavior could probably be discoverable by end users. 
If you go back to the proposed spec from Brin Zhang [1] the proposal there is to change the compute API to allow passing in the owner project_id so nova doesn't have to fumble with this. I could see that being reasonable for snapshots and backups, but I think the original bug about shelve is really just that - a bug, and easily fixed with what I've proposed (and is not a problem for volume-backed servers because shelve doesn't create a snapshot image for those). [1] https://review.openstack.org/#/c/616843/ -- Thanks, Matt From Kevin.Fox at pnnl.gov Wed Jan 16 19:08:29 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Wed, 16 Jan 2019 19:08:29 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> , <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> , <1A3C52DFCD06494D8528644858247BF01C280D4F@EX10MBOX03.pnnl.gov>, <1A3C52DFCD06494D8528644858247BF01C280E0B@EX10MBOX03.pnnl.gov>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C2847CB@EX10MBOX03.pnnl.gov> With my operator hat on, my experience has taught me to consider user feedback very carefully. It is rare that users actually bother to report bugs. The majority of users tend to not report issues and then grumble about something being broken and just try to work around it. If it goes on long enough, they tend to find entirely different solutions, without ever telling you why you have an issue. Thus leading to your project's decline or even demise. So those users that actually report issues are great. They provide the opportunity to fix something before it is too late. So, it's not that the silent majority doesn't care. They do, but tend not to be vocal. Those that do speak up don't tend to have many voices in and of themselves, but represent a great many other voices that don't tend to speak up. So few voices speaking doesn't mean you're doing well and there are few real problems. 
Just that the feedback isn't happening with many voices. Just my $0.02. There may be other experiences out there. Thanks, Kevin ________________________________________ From: Chris Dent [cdent+os at anticdent.org] Sent: Wednesday, January 16, 2019 4:33 AM To: OpenStack-discuss at lists.openstack.org Subject: RE: [tc] [all] Please help verify the role of the TC On Tue, 15 Jan 2019, Fox, Kevin M wrote: > If it's optional, then that might not happen properly. Delegating that authority might be a good solution as you suggest. But then delegation of prioritizing of reviews and other such things might need to go along with it at the same time to be effective? The idea is for there to be a flow of leadership from the TC to the project for some of the cross project work, but if that flow can't happen, then nothing really changed. Not so sure a delegate can have enough power to really be effective that way? Thoughts? Delegation also simply spreads things around over a wider base, rather than addressing one of the core issues: there are too many things going on at once for there to be anything that could be labeled a "unified direction". If we want (do we?) a unified direction, there have to be fewer directions. Positioning things as "a delegate will do it" is a way of choosing to not limit the number of things being done. There are two competing voices around these issues; both have merit, but they remain in competition. One says: the role of the TC is to enable healthy contribution in all its many and diverse forms in the context of a community called OpenStack. Another says: the role of the TC is to help shape a product [1] called OpenStack. The first sounds rather positive: We're enabling lots of stuff, look at all the stuff! The second is often perceived as rather negative: That stuff could be so much better!
None of these perceptions are complete or fully accurate, there are positives and negatives in all views, but what seems to come out of these conversations is that there are some very small and vocal minorities that care about these issues and have strong opinions, and everyone else doesn't care. Do they (you, silent readers!) not care, are they not interested, or is it a matter of "we've been over this before and nothing ever changes?" [1] I bristle at the term "product" because it sounds like we're trying to sell something. I'm thinking more in terms of "a thing that is being produced and has a purpose". -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From mriedemos at gmail.com Wed Jan 16 19:29:12 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 Jan 2019 13:29:12 -0600 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: On 1/14/2019 12:01 PM, Chris Dent wrote: > As discussed in the recent pupdate [1] there will be a meeting this > Wednesday at 1700 UTC to discuss the current state of the placement > extraction and get some idea on the critical items that need to be > addressed to feel comfy. > > If you're interested in this topic, meet near that time in the > #openstack-placement IRC channel and someone will produce links > for a hangout, etherpad, whatever is required. > > Thanks. > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001666.html Here is my attempt at summarizing the call we had. Notes are in the etherpad [1]. Deployment tools: * Lee is working on TripleO support for extracted placement and estimates 3 more weeks for just deploy (base install) support to be done, and at least 3 more weeks for upgrade support after that. Read Lee's status update for details [2]. 
* If nova were to go ahead and drop placement code and require extracted placement before TripleO is ready, they would have to pin nova to a git SHA before that which would delay their Stein release. * Having the extraction span release boundaries would ease the upgrade pain for TripleO. Nested providers / reshaper / VGPU: * VGPU reshaper work for nested resource providers in libvirt and xenapi drivers has stalled and there is still hesitation to move forward with extracting placement before that reshape flow, including in an upgrade, is tested to know that nova does not need any last minute data migrations which require direct access to the placement database. In other words, we have not yet confirmed that the placement reshaper API will be fully sufficient until a real driver is using it. * Matt (me!) has agreed to rebase and address the comments on the libvirt patch [3] to try and push that forward. * We still need someone to write a functional test which creates a server with a flat resource structure, reshapes that to nested, and then creates another server against the same provider tree. Data migration: * The only placement-specific online data migration in nova is "create_incomplete_consumers" and we agreed to copy that into placement and add a placement-status upgrade check for it. The data migration code will build on top of Tetsuro's work [4]. Matt is signed up to work on both of those commands. Miscellaneous: * Placement release notes will start at the current release and reference the nova release notes for anything older (Ocata->Rocky). * Chris is already working on some other things like docs and small governance changes (os-traits), but those are all on hold until the placement code in nova is dropped which is dependent on the deployment tooling support and reshaper changes above. * We agreed to checkpoint again in three weeks so Wednesday February 6 at let's say the same time, 1700 UTC. 
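[Editor's note] A rough illustration of what a "create_incomplete_consumers"-style check has to decide: whether any allocation consumer records still lack project/user information, i.e. whether the online data migration has work left before an upgrade can proceed. This is not placement's real code; the function name and the dict-based rows are illustrative assumptions.

```python
# Sketch of an upgrade-readiness check over allocation consumer records.
# Pre-Rocky consumer rows may be missing project_id/user_id; the online
# data migration backfills them, and an upgrade check like the one
# discussed for placement-status would report whether any remain.

def check_incomplete_consumers(consumers):
    """Return (ok, count): ok is True when no consumer lacks project/user."""
    incomplete = [c for c in consumers
                  if c.get('project_id') is None or c.get('user_id') is None]
    return (len(incomplete) == 0, len(incomplete))


rows = [
    {'uuid': 'c1', 'project_id': 'p1', 'user_id': 'u1'},
    {'uuid': 'c2', 'project_id': None, 'user_id': None},  # unmigrated row
]
ok, remaining = check_incomplete_consumers(rows)
print(ok, remaining)  # False 1
```

In the real tooling the equivalent logic would run as a `placement-status upgrade check` subcommand and fail (or warn) until the migration has completed.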
[1] https://etherpad.openstack.org/p/placement-extract-stein-5 [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001783.html [3] https://review.openstack.org/#/c/599208/ [4] https://review.openstack.org/#/c/624942/ -- Thanks, Matt From francois.magimel at alumni.enseeiht.fr Wed Jan 16 21:36:27 2019 From: francois.magimel at alumni.enseeiht.fr (=?UTF-8?Q?Fran=c3=a7ois_Magimel?=) Date: Wed, 16 Jan 2019 22:36:27 +0100 Subject: [horizon] Unused xstatic-* projects retirement In-Reply-To: References: Message-ID: <985ba1a5-d5de-b5d8-91f0-2e92fca1b00e@alumni.enseeiht.fr> Hi ! Le 16/01/2019 à 14:15, Ivan Kolodyazhny a écrit : > Hi team, > > There are some xstatic packages which I didn't start to use in Horizon or plugins. We didn't do any release of them. > > During the last meeting [1] we agreed to mark them as retired. I'll start retired procedure [2] today. If you're going to use them, please let me know. > > The list of the projects to be retired: > - xstatic-angular-ui-router > - xstatic-bootstrap-datepicker > - xstatic-hogan > - xstatic-jquery-migrate > - xstatic-jquery.quicksearch > - xstatic-jquery.tablesorter > - xstatic-rickshaw We have some work in progress [2] in the CloudKitty dashboard plugin to use xstatic-rickshaw instead of including the JS script our the repo. [2] https://storyboard.openstack.org/#!/story/2003578 François > - xstatic-spin > - xstatic-vis > > > [1] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-110 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From emilien at redhat.com Wed Jan 16 23:41:05 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 17 Jan 2019 00:41:05 +0100 Subject: [nova][placement][tripleo] Extracted Placement enablement status update #1 In-Reply-To: <20190116165343.basfiqgyhr7ddqew@lyarwood.usersys.redhat.com> References: <20190116165343.basfiqgyhr7ddqew@lyarwood.usersys.redhat.com> Message-ID: Hi Lee, I reviewed all I could in this list. It's impressive work! Thanks a lot. I'll keep an eye on it and try to help. Thanks, On Wed, Jan 16, 2019 at 5:58 PM Lee Yarwood wrote: > Hello, > > This is a brief status progress report on my work to enable the deployment > of > the extracted Placement service within TripleO ahead of the extraction > check-in > call later today. > > tl;dr TripleO isn't ready to switch to an extracted Placement service or > for > code to be removed from openstack/nova. > > # RDO (packaging) > > https://review.rdoproject.org/#/q/topic:placement-add > > This was started at the PTG in Denver and mostly completed some time ago > now. I > do however have a single change outstanding that includes the > puppet-placement > package as a dependency of puppet-tripleo but aside from that this is > complete: > > puppet-tripleo: Require puppet-placement > https://review.rdoproject.org/r/18330 > > FWIW in the context of the extraction check-in call, both RDO and TripleO > are > not ready to deal with the original placement code being deleted from the > openstack/nova project at this point. This will continue to be the case > until > the following work is completed to enable deployments using the > extracted project in TripleO. 
> > # Kolla (containers) > > > https://review.openstack.org/#/q/topic:split-placement+OR+topic:upgrade-placement > > The initial Kolla change to produce a placement-api container landed some > time > ago after some small delays due to the Stein UCA not being used for the > Ubuntu > binary jobs: > > Split placement-api from nova > https://review.openstack.org/#/c/613589/ > > FWIW an additional change to copy across the DB migration script has also > been > posted and is pending a package promotion in RDO: > > Copy placement database migration script > https://review.openstack.org/#/c/626382/ > > The above change isn't required by TripleO, I just wanted to highlight that > kolla-ansible are looking to use the script during upgrades. > > Unfortunately due various issues with TripleO promotions the TripleO > container > for the placement-api was only uploaded yesterday after I manually built > it: > > https://hub.docker.com/r/tripleomaster/centos-binary-placement-api > > # Puppet > > https://review.openstack.org/#/q/topic:tripleo-placement-extraction > > The initial and massive puppet-placement cookiecutter change has landed: > > Initial cookiecutter run and import from puppet-nova > https://review.openstack.org/#/c/604182/ > > This is currently only tested on CentOS7 by the following POI or > puppet-openstack-integration change that has itself been stuck behind a > raft of > Ubuntu py3 and Stein UCA issues: > > placement: Extract the service from Nova > https://review.openstack.org/#/c/615568/ > > I'm attempting to unblock this by working my way through the following > changes to support py3 and the Stein UCA within the Ubuntu jobs: > > https://review.openstack.org/#/q/status:open+topic:ubuntu-py3 > > https://review.openstack.org/#/q/status:open+topic:inherit_pyvers > > Help with these changes from any openstack-puppet Ubuntu users would > obviously > be appreciated here! 
> > Finally I'm currently working on the following puppet-tripleo change that > makes > this all usable within TripleO itself: > > WIP placement: Initial extraction of the Placement service from Nova > https://review.openstack.org/#/c/624335/ > > # TripleO > > https://review.openstack.org/#/q/topic:tripleo-placement-extraction > > On the TripleO front I almost have undercloud deployment completing in a > reproducer environment with various local hacks to workaround changes not > being merged yet. The main tripleo-heat-template change can be found below: > > WIP placement: Extract the service from Nova > https://review.openstack.org/#/c/630644/ > > # TODO > > Obviously there's still a lot to do here but to begin with I'll be > focusing on > the following tasks: > > - Landing the puppet-openstack-integration and puppet-tripleo changes. > - Moving the tripleo-heat-templates change out of WIP. > - Posting draft upgrade_tasks in a new change using the db migration > script. > > Hopefully this is of some use to folks in the Placement and TripleO teams! > > Cheers, > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 > 2D76 > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From tanmingxiao at outlook.com Thu Jan 17 01:37:52 2019 From: tanmingxiao at outlook.com (=?utf-8?B?6LCtIOaYjuWutQ==?=) Date: Thu, 17 Jan 2019 01:37:52 +0000 Subject: [bifrost][ironic-inspector] how could i use bifrost to inspector the server Message-ID: Hi all: I use the bifrost to deploy ironic standalone,but I don’t know how to use it for ironic-inspector,Thanks for your help -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zbitter at redhat.com Thu Jan 17 01:40:13 2019 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 17 Jan 2019 14:40:13 +1300 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> References: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> Message-ID: <376cf6a6-ae31-d6e2-41d2-0fa36061df9c@redhat.com> On 8/01/19 10:43 AM, Jay Bryant wrote: > Julia and Chris, > > Thanks for putting this together.  Wanted to share some thoughts in-line > below: > > On 1/4/2019 9:53 AM, Julia Kreger wrote: >> As some of you may or may not have heard, recently the Technical >> Committee approved a technical vision document [1]. >> >> The goal of the technical vision document is to try to provide a >> reference point for cloud infrastructure software in an ideal >> universe. It is naturally recognized that not all items will apply to >> all projects. > > The document is a really good high level view of what each OpenStack > project should hopefully conform to.  I think it would be good to get > this into the Upstream Institute education in some way as I think it is > something that new contributors should understand and keep in mind.  It > certainly would have helped me as a newbie to think about this. I love this idea. Looking at https://docs.openstack.org/upstream-training/ it seems that the syllabus for Upstream Institute is (or will eventually be) effectively the Contributor Guide, so a good first step would be to link to the vision from the Contributor Guide: https://review.openstack.org/631366 Any idea what else would be involved in making this happen? thanks, Zane. 
> > >> We envision the results of the evaluation to be added to each >> project's primary contributor documentation tree >> (/doc/source/contributor/vision-reflection.rst) as a list of bullet >> points detailing areas where a project feels they need adjustment to >> better align with the technical vision, and if the project already has >> visibility into a path forward, that as well. > > > > Good idea to have teams go through this.  I will work on doing the above > for Cinder. > > Jay > > From lujinluo at gmail.com Thu Jan 17 06:38:47 2019 From: lujinluo at gmail.com (Lujin Luo) Date: Wed, 16 Jan 2019 22:38:47 -0800 Subject: [neutron] [upgrade] No meeting on Jan. 17th Message-ID: Hi team, I will not be able to chair the meeting tomorrow. And since from the review list, we do not have many updates. Let's resume next week. Sorry for any inconvenience caused. Best regards, Lujin From zbitter at redhat.com Thu Jan 17 07:41:49 2019 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 17 Jan 2019 20:41:49 +1300 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> On 16/01/19 12:01 AM, Chris Dent wrote: > On Mon, 14 Jan 2019, Fox, Kevin M wrote: > >> Been chewing on this thread for a while.... I think I should advocate >> the other direction. > > I'm not sure where to rejoin this thread, so picking here as it > provides a reasonable entry point. First: thanks to everyone who has > joined in, I honestly do feel that as annoying as these discussions > can be, they often reveal something useful. > > Second, things went a bit sideways from the point I was trying to > reach. I wasn't trying to say that PTLs are the obvious and > experienced choice for TC leadership, nor that they were best placed > to represent the community. 
I hope that my own behavior over the > past few years has made it clear that I very definitely do not feel > that way. > > However, as most respondents on this thread have pointed out, both > TC members and PTLs are described as being over-tasked. What I'm > trying to tease out or ask is: Are they over-tasked because they are > working on too many things (or at least trying to sort through the > too many things); a situation that results from _no unified > technical leadership for the community_. > > My initial assertion was that the TC is insufficiently involved in > defining and performing technical leadership. > > Then I implied that the TC cannot do anything like actionable and > unified technical leadership because they have little to no real > executive power and what power they do have (for example, trying to > make openstack-wide goals) is in conflict (because of the limits of > time and space) with the goals that PTLs (and others) are trying to > enact. Thanks for clarifying this, it's a really interesting question to consider. > Thus: What if the TC and PTLs were the same thing? Would it become > more obvious that there's too much in play to make progress in a > unified direction (on the thing called OpenStack), leading us to > choose less to do, and choose more consistency and actionable > leadership? And would it enable some power to execute on that > leadership. I'm not sure we need to speculate, because as you know the TC and PTLs literally were the same thing prior to 2014-ish. My recollection is that there were pluses and minuses, but on the whole I don't think it had the effect you're suggesting it might. On the plus side there was in a sense more diversity of opinion, because every project had an ex officio representative on the TC. 
Direct election tends to favour the most visible members of the community and, because the most visible folks often have similar roles, for a while that led to a big chunk of the TC all looking at OpenStack from only a couple of different directions. That diversity was limited to existing projects though. That led to the TC effectively becoming a bottleneck for folks that were working on things it didn't need to stand in the way of, as already-overworked folks whose attention was by definition consumed with managing the details of their individual silos lacked the time to do deep investigation into the edges of the big picture. The project structure reform came about in large part to resolve this, which removed the bottleneck but didn't make it any easier for PTLs to focus on the big picture. I don't recall a time when the TC used the opportunity of having the PTLs as its members to manage cross-project goals, though I'd be interested in hearing examples if somebody has a different recollection. It doesn't seem that any of the various permutations of the PTLs-as-TC-members proposal in this thread are workable, for reasons that others have already covered plus a few more: all cause a perverse incentive to create new official projects; many rely on coercing people who are already capable of winning seats on their own merit to work as TC members when they have chosen not to for whatever reason. > Those are questions, not assertions. > >> Getting some diversity of ideas from outside of those from PTL's >> is probably a good idea for the overall health of OpenStack. What >> about Users that have never been PTL's? Not developers? > > So, to summarize: While I agree we need a diversity of ideas, I > don't think we lack for ideas, nor have we ever. What we lack > is a small enough set of ideas to act on them with significant > enough progress to make a real difference. 
How can we make the list > small I think I have more to say about this another day, but here is a crazy thought: what if the list is too big because the ideas are too small? What if we can't agree because the stakes are so low? > and (to bring this back to the TC role) empower the TC to > execute on that list? > > And, to be complete, should we? > > And, to be extra really complete, I'm not sure if we should or not, > which is why I'm asking. > From sbauza at redhat.com Thu Jan 17 10:39:13 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 17 Jan 2019 11:39:13 +0100 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: On Wed, Jan 16, 2019 at 8:33 PM Matt Riedemann wrote: > On 1/14/2019 12:01 PM, Chris Dent wrote: > > As discussed in the recent pupdate [1] there will be a meeting this > > Wednesday at 1700 UTC to discuss the current state of the placement > > extraction and get some idea on the critical items that need to be > > addressed to feel comfy. > > > > If you're interested in this topic, meet near that time in the > > #openstack-placement IRC channel and someone will produce links > > for a hangout, etherpad, whatever is required. > > > > Thanks. > > > > [1] > > > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001666.html > > Here is my attempt at summarizing the call we had. Notes are in the > etherpad [1]. > > Thanks for the notes. I apologize but I had other things to do at that time which led me unable to attend this call. Deployment tools: > > * Lee is working on TripleO support for extracted placement and > estimates 3 more weeks for just deploy (base install) support to be > done, and at least 3 more weeks for upgrade support after that. Read > Lee's status update for details [2]. 
> * If nova were to go ahead and drop placement code and require extracted > placement before TripleO is ready, they would have to pin nova to a git > SHA before that which would delay their Stein release. > * Having the extraction span release boundaries would ease the upgrade > pain for TripleO. > > That sounds like a reasonable trade-off IMHO. Nested providers / reshaper / VGPU: > > * VGPU reshaper work for nested resource providers in libvirt and xenapi > drivers has stalled and there is still hesitation to move forward with > extracting placement before that reshape flow, including in an upgrade, > is tested to know that nova does not need any last minute data > migrations which require direct access to the placement database. In > other words, we have not yet confirmed that the placement reshaper API > will be fully sufficient until a real driver is using it. > * Matt (me!) has agreed to rebase and address the comments on the > libvirt patch [3] to try and push that forward. > Thanks Matt for it. The rebase should be quite straightforward since it's just a matter of exploding methods into smaller ones, but the devil can be in the details and some UTs could require some extra work. * We still need someone to write a functional test which creates a > server with a flat resource structure, reshapes that to nested, and then > creates another server against the same provider tree. > > I won't be on perpetual PTO like I was in December/early January and I certainly hope to finish all my internal/customer duties hopefully next week (or the customer wouldn't be happy). So, if you still trust me about having time for upstream, this is then the priority for me. Data migration: > > * The only placement-specific online data migration in nova is > "create_incomplete_consumers" and we agreed to copy that into placement > and add a placement-status upgrade check for it. The data migration code > will build on top of Tetsuro's work [4]. Matt is signed up to work on > both of those commands.
> > Miscellaneous: > > * Placement release notes will start at the current release and > reference the nova release notes for anything older (Ocata->Rocky). > * Chris is already working on some other things like docs and small > governance changes (os-traits), but those are all on hold until the > placement code in nova is dropped which is dependent on the deployment > tooling support and reshaper changes above. > * We agreed to checkpoint again in three weeks so Wednesday February 6 > at let's say the same time, 1700 UTC. > > Worth adding it in agendas, then. -Sylvain [1] https://etherpad.openstack.org/p/placement-extract-stein-5 > [2] > > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001783.html > [3] https://review.openstack.org/#/c/599208/ > [4] https://review.openstack.org/#/c/624942/ > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Thu Jan 17 11:16:08 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Thu, 17 Jan 2019 11:16:08 +0000 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: <1547723762.31652.7@smtp.office365.com> On Wed, Jan 16, 2019 at 8:29 PM, Matt Riedemann wrote: > > Nested providers / reshaper / VGPU: > > * VGPU reshaper work for nested resource providers in libvirt and > xenapi drivers has stalled and there is still hesitation to move > forward with extracting placement before that reshape flow, including > in an upgrade, is tested to know that nova does not need any last > minute data migrations which require direct access to the placement > database. In other words, we have not yet confirmed that the > placement reshaper API will be fully sufficient until a real driver > is using it. > * Matt (me!) has agreed to rebase and address the comments on the > libvirt patch [3] to try and push that forward. 
> * We still need someone to write a functional test which creates a > server with a flat resource structure, reshapes that to nested, and > then creates another server against the same provider tree. > There is a functional test [1] that uses a fake virt driver and simulates reshape. My first attempt was to add an extra instance creation after the end of the reshape. But this test reshapes the provider tree in a way that the resulting tree uses a sharing disk provider and doesn't have inventory on the compute node RP any more (cpu and mem moved under NUMA). Unfortunately nova does not yet support scheduling against such a tree. Shall I try to add a new functional test with the fake virt driver or try to add a functional test with the libvirt driver on top of the VGPU reshaper patch? Cheers, gibi [1] nova.tests.functional.test_servers.ProviderTreeTests#test_reshape From cdent+os at anticdent.org Thu Jan 17 12:00:22 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 17 Jan 2019 12:00:22 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> Message-ID: On Thu, 17 Jan 2019, Zane Bitter wrote: >> Thus: What if the TC and PTLs were the same thing? Would it become >> more obvious that there's too much in play to make progress in a >> unified direction (on the thing called OpenStack), leading us to >> choose less to do, and choose more consistency and actionable >> leadership? And would it enable some power to execute on that >> leadership. > > I'm not sure we need to speculate, because as you know the TC and PTLs > literally were the same thing prior to 2014-ish.
My recollection is that > there were pluses and minuses, but on the whole I don't think it had the > effect you're suggesting it might. Part and parcel of what I'm suggesting is that less stuff would be considered in the domain of "what do we do?" such that the tyranny of the old/existing projects that you describe is a feature not a bug, as an in-built constraint. It's not a future I really like, but it is one strategy for enabling moving in one direction: cut some stuff. Stop letting so many flowers bloom. Letting those flowers bloom is in the camp of "contribution in all its many and diverse forms". > It doesn't seem that any of the various permutations of the > PTLs-as-TC-members proposal in this thread are workable, for reasons that > others have already covered plus a few more: all cause a perverse incentive > to create new official projects; many rely on coercing people who are already > capable of winning seats on their own merit to work as TC members when they > have chosen not to for whatever reason. Yes to all that. I don't think there's an easy solution here, but it's useful to explore the problem space, even if it is simply to see if people think there's a problem. > I think I have more to say about this another day, but here is a crazy > thought: what if the list is too big because the ideas are too small? What if > we can't agree because the stakes are so low? I don't think that's a crazy thought, and thank you for bringing it around to this, I hoped someone would get there eventually. This is in the camp of "stuff could be so much better". Which makes it pretty clear that the two voices I described are not actually in opposition. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Thu Jan 17 12:07:27 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 17 Jan 2019 12:07:27 +0000 (GMT) Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: On Wed, 16 Jan 2019, Matt Riedemann wrote: > Here is my attempt at summarizing the call we had. Notes are in the etherpad > [1]. Thanks for writing this up, this aligns pretty well with what I recall. Some additional notes/comments within. > Deployment tools: > > * Lee is working on TripleO support for extracted placement and estimates 3 > more weeks for just deploy (base install) support to be done, and at least 3 > more weeks for upgrade support after that. Read Lee's status update for > details [2]. > * If nova were to go ahead and drop placement code and require extracted > placement before TripleO is ready, they would have to pin nova to a git SHA > before that which would delay their Stein release. > * Having the extraction span release boundaries would ease the upgrade pain > for TripleO. Can you (or Dan?) clarify if spanning the release boundaries is useful specifically for tooling that chooses to upgrade everything at once and thus is forced to run Stein nova with Stein placement? And if someone were able/willing to run Rocky nova with Stein placement (briefly), are the challenges less of a concern? I'm not asking because I disagree with the assertion, I just want to be sure I understand (and by proxy our adoring readers do as well) what "ease" really means in this context as the above bullet doesn't really explain it. > * Placement release notes will start at the current release and reference the > nova release notes for anything older (Ocata->Rocky). This is ready to go with https://review.openstack.org/#/c/631308/ and https://review.openstack.org/#/c/618708/ . Both need one more +2.
> [1] https://etherpad.openstack.org/p/placement-extract-stein-5 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From doug at doughellmann.com Thu Jan 17 12:57:10 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 17 Jan 2019 07:57:10 -0500 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> Message-ID: Chris Dent writes: > On Thu, 17 Jan 2019, Zane Bitter wrote: > >>> Thus: What if the TC and PTLs were the same thing? Would it become >>> more obvious that there's too much in play to make progress in a >>> unified direction (on the thing called OpenStack), leading us to >>> choose less to do, and choose more consistency and actionable >>> leadership? And would it enable some power to execute on that >>> leadership. >> >> I'm not sure we need to speculate, because as you know the TC and PTLs >> literally were the same thing prior to 2014-ish. My recollection is that >> there were pluses and minuses, but on the whole I don't think it had the >> effect you're suggesting it might. > > Part and parcel of what I'm suggesting is that less stuff would be > considered in the domain of "what do we do?" such that the tyranny of > the old/existing projects that you describe is a feature not a bug, > as an in-built constraint. > > It's not a future I really like, but it is one strategy for enabling > moving in one direction: cut some stuff. Stop letting so many > flowers bloom. > > Letting those flowers bloom is in the camp of "contribution in > all its many and diverse forms". What would you prune? -- Doug From jaosorior at redhat.com Thu Jan 17 13:08:53 2019 From: jaosorior at redhat.com (Juan Antonio Osorio Robles) Date: Thu, 17 Jan 2019 15:08:53 +0200 Subject: [tripleo] LP bug bash Message-ID: Hello folks! 
It has come to our attention that our Launchpad bug list has been growing and some bugs have gone stale. We have decided to go through the list weekly and triage or update bugs as needed in order to address this. This would be done one hour before our weekly meeting (so Tuesday at 13:00 UTC). Volunteers or people interested are welcome! There will be more details on the medium (video or IRC) on that day in the #tripleo channel. Best regards! From james.slagle at gmail.com Thu Jan 17 13:39:25 2019 From: james.slagle at gmail.com (James Slagle) Date: Thu, 17 Jan 2019 08:39:25 -0500 Subject: [TripleO] Scaling TripleO for Edge Computing Message-ID: Within the TripleO Edge squad, we've been collecting ideas and looking at other current in-progress work as it relates to scaling TripleO. The high-level goal is to brainstorm about how we can scale to manage deployments that are an order of magnitude larger than we do today (thousands of nodes instead of hundreds). We've been collecting these ideas in this google doc: https://docs.google.com/document/d/12tPc4NC5fo8ytGuFZ4DSZXXyzes1x3U7oYz9neaPP_o/ As always, any feedback or other ideas are welcome. It should be open to view/comment/edit without any account required. Or, feel free to reply on the ML. Thanks! -- -- James Slagle -- From tobias.rydberg at citynetwork.eu Thu Jan 17 14:40:48 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Thu, 17 Jan 2019 15:40:48 +0100 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: Message-ID: Hi, Thanks a lot for pushing this, Adrian, and that etherpad is a really good start! I'm happy to help champion this if that is of any use and if it's chosen as one of the community goals!
Cheers Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED On 2019-01-11 07:18, Adrian Turjak wrote: > Hello OpenStackers! > > As discussed at the Berlin Summit, one of the proposed community goals > was project deletion and resource clean-up. > > Essentially the problem here is that for almost any company that is > running OpenStack we run into the issue of how to delete a project and > all the resources associated with that project. What we need is an > OpenStack wide solution that every project supports which allows > operators of OpenStack to delete everything related to a given project. > > Before we can choose this as a goal, we need to define what the actual > proposed solution is, and what each service is either implementing or > contributing to. > > I've started an Etherpad here: > https://etherpad.openstack.org/p/community-goal-project-deletion > > Please add to it if I've missed anything about the problem description, > or to flesh out the proposed solutions, but try to mostly keep any > discussion here on the mailing list, so that the Etherpad can hopefully > be more of a summary of where the discussions have led. > > This is mostly a starting point, and I expect there to be a lot of > opinions and probably some push back from doing anything too big. That > said, this is a major issue in OpenStack, and something we really do > need because OpenStack is too big and too complicated for this not to > exist in a smart cross-project manner. > > Let's solve this the best we can! 
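[Editorial aside: the per-project cleanup Adrian describes (each service exposing a way to delete everything a project owns, driven from a single operator-facing entry point) could take a shape like the following registry of per-service hooks. This is purely an illustrative sketch: no such upstream mechanism exists yet, and every name in it is invented.]

```python
# Illustrative sketch only: a registry of per-service cleanup hooks.
# Nothing like this exists upstream; all names here are invented.

CLEANUP_HOOKS = {}


def register_cleanup(service_name):
    """Decorator each service would use to expose its purge logic."""
    def decorator(func):
        CLEANUP_HOOKS[service_name] = func
        return func
    return decorator


@register_cleanup('compute')
def purge_compute(project_id):
    # A real hook would delete servers, key pairs, etc. owned by the project.
    return 'purged compute resources of %s' % project_id


@register_cleanup('volume')
def purge_volume(project_id):
    # A real hook would delete volumes, snapshots, and backups.
    return 'purged volume resources of %s' % project_id


def purge_project(project_id):
    """Walk every registered service hook for one project."""
    return {name: hook(project_id)
            for name, hook in sorted(CLEANUP_HOOKS.items())}


print(purge_project('9e1e6b33'))
```

The point of the registry shape is that the hard part of the goal is per-service (knowing what a "compute resource" or "volume resource" is), while the cross-project part is just orchestration.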
> > Cheers, > > Adrian Turjak > > > > From mriedemos at gmail.com Thu Jan 17 15:05:57 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Jan 2019 09:05:57 -0600 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: <1547723762.31652.7@smtp.office365.com> References: <1547723762.31652.7@smtp.office365.com> Message-ID: On 1/17/2019 5:16 AM, Balázs Gibizer wrote: > There is a functional test [1] that uses a fake virt driver and > simulates reshape. My first attempt was to add an extra instance > creation after the end of the reshape. But this test reshapes the > provider tree in a way that the resulting tree uses a sharing disk > provider and doesn't have inventory on the compute node RP any more > (cpu and mem moved under NUMA). Unfortunately nova does not yet support > scheduling against such a tree. That's probably the one I mentioned on the call then. It uses a fake virt driver but stubs out the update_provider_tree method (from what I remember) and wouldn't be an easy fit for doing what I think we need to do for a new functional test. > > Shall I try to add a new functional test with the fake virt driver or > try to add a functional test with the libvirt driver on top of the VGPU > reshaper patch? I'm personally OK with a fake virt driver (it could even be special-purpose, like some of our fake virt drivers for testing things like live migration rollback and resize failure/reschedule). Writing anything on top of the libvirt driver is still going to require stubbing out large parts of the libvirt driver code, which essentially makes it a fake driver.
I know we have some functional tests for the libvirt driver that stub other stuff (Stephen is familiar with these) so it might be possible, but if I were going to write a new test I'd just use a fake virt driver and have the test be more like our traditional functional tests where we use the API to create a server, then reshape to nested, and then schedule another server to the nested resource class and assert everything is OK, since I think what we're really trying to test here is the API and scheduler interaction more than the virt driver itself. -- Thanks, Matt From mriedemos at gmail.com Thu Jan 17 15:09:24 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 Jan 2019 09:09:24 -0600 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: <5c80b99e-e7b3-bc65-9556-c80608de0347@gmail.com> On 1/17/2019 6:07 AM, Chris Dent wrote: >> Deployment tools: >> >> * Lee is working on TripleO support for extracted placement and >> estimates 3 more weeks for just deploy (base install) support to be >> done, and at least 3 more weeks for upgrade support after that. Read >> Lee's status update for details [2]. >> * If nova were to go ahead and drop placement code and require >> extracted placement before TripleO is ready, they would have to pin >> nova to a git SHA before that which would delay their Stein release. >> * Having the extraction span release boundaries would ease the upgrade >> pain for TripleO. > > Can you (or Dan?) clarify if spanning the release boundaries is > usefully specifically for tooling that chooses to upgrade everything > at once and thus is forced to run Stein nova with Stein placement? > > And if someone were able/willing to run Rocky nova with Stein > placement (briefly) the challenges are less of a concern? 
> > I'm not asking because I disagree with the assertion, I just want to > be sure I understand (and by proxy our adoring readers do as well) > what "ease" really means in this context as the above bullet doesn't > really explain it. I didn't go into details on that point because honestly I also could use some written words explaining the differences for TripleO in doing the upgrade and migration in-step with the Stein upgrade versus upgrading to Stein and then upgrading to Train, and how the migration with that is any less painful. I know Dan talked about it on the call, but I can't say I followed it all well enough to be able to summarize the pros/cons (which is why I didn't in my summary email). This might already be something I know about, but the lights just aren't turning on right now. -- Thanks, Matt From openstack at nemebean.com Thu Jan 17 15:27:12 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 17 Jan 2019 09:27:12 -0600 Subject: [tripleo] OVB is now on Gerrit In-Reply-To: <2bff76a6-bea0-aed1-9431-d83c4f6b7ebf@nemebean.com> References: <2bff76a6-bea0-aed1-9431-d83c4f6b7ebf@nemebean.com> Message-ID: <385f8245-7893-4dbb-308c-d80d0f0afe2b@nemebean.com> One thing that occurred to me as I was adding CI jobs is that we need to decide what to do with the docs. Currently they live on readthedocs[1], and I believe we have an RTD publishing job so we could leave them there. That would involve the least churn, so I tend to prefer it. Alternatively we could start publishing the docs independently on docs.openstack.org. That would require some way to redirect people from the RTD pages to d.o.o, which is probably doable but more work. I'm not sure how much benefit there is to doing this either. Finally, we could migrate all of the OVB docs to tripleo-docs, but that would be a bunch more work (and tripleo-docs is already large enough ;-) so unless someone comes up with a very compelling reason I'd be opposed to it. 
1: https://openstack-virtual-baremetal.readthedocs.io/en/latest/introduction.html On 1/16/19 12:50 PM, Ben Nemec wrote: > Just a heads up that the import of OVB to Gerrit is complete. If you're > using OVB you should switch your clones from the Github repo to > https://git.openstack.org/cgit/openstack/openstack-virtual-baremetal > I've marked the Github repo deprecated so please don't submit issues or > PRs there anymore. > > For the moment the approval process won't meaningfully change. I'll > still be reviewing and single-approving patches, and the core reviewer > list for OVB is the same as it was on Github, with the addition of Juan > (as TripleO PTL) and Harald (as someone who has recently done a ton of > good work in OVB). See the full list here: > https://review.openstack.org/#/admin/groups/1993,members > > Once we have E2E testing set up on the repo (that isn't me manually > testing changes against my local cloud) I expect we'll open that list up > more and can discuss moving to a more typical 2 +2 model. > > Thanks. > > -Ben > From openstack at nemebean.com Thu Jan 17 15:35:38 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 17 Jan 2019 09:35:38 -0600 Subject: [all] Etcd as DLM In-Reply-To: <3147d433-13b4-3582-c831-25c29a5799ca@nemebean.com> References: <3147d433-13b4-3582-c831-25c29a5799ca@nemebean.com> Message-ID: This thread got a bit sidetracked with potential use-cases for etcd3 (which seems to happen a lot with this topic...), but we still need to decide how we're going to actually communicate with etcd from OpenStack services. Does anyone have input on that? Thanks. -Ben On 12/3/18 4:48 PM, Ben Nemec wrote: > Hi, > > I wanted to revisit this topic because it has come up in some downstream > discussions around Cinder A/A HA and the last time we talked about it > upstream was a year and a half ago[1]. There have certainly been changes > since then so I think it's worth another look. 
For context, the > conclusion of that session was: > > "Let's use etcd 3.x in the devstack CI, projects that are eventlet based > can use the etcd v3 http experimental API and those that don't can use > the etcd v3 gRPC API. Dims will submit a patch to tooz for the new > driver with v3 http experimental API. Projects should feel free to use > the DLM based on tooz+etcd3 from now on. Other projects can figure out > other use cases for etcd3." > > The main question that has come up is whether this is still the best > practice or if we should revisit the preferred drivers for etcd. Gorka > has gotten the grpc-based driver working in a Cinder driver that needs > etcd[2], so there's a question as to whether we still need the HTTP > etcd-gateway or if everything should use grpc. I will admit I'm nervous > about trying to juggle eventlet and grpc, but if it works then my only > argument is general misgivings about doing anything clever that involves > eventlet. :-) > > It looks like the HTTP API for etcd has moved out of experimental > status[3] at this point, so that's no longer an issue. There was some > vague concern from a downstream packaging perspective that the grpc > library might use a funky build system, whereas the etcd3-gateway > library only depends on existing OpenStack requirements. > > On the other hand, I don't know how much of a hassle it is to deploy and > manage a grpc-gateway. I'm kind of hoping someone has already been down > this road and can advise about what they found. > > Thanks.
> > -Ben > > 1: https://etherpad.openstack.org/p/BOS-etcd-base-service > 2: > https://github.com/embercsi/ember-csi/blob/5bd4dffe9107bc906d14a45cd819d9a659c19047/ember_csi/ember_csi.py#L1106-L1111 > > 3: https://github.com/grpc-ecosystem/grpc-gateway From abishop at redhat.com Thu Jan 17 16:21:45 2019 From: abishop at redhat.com (Alan Bishop) Date: Thu, 17 Jan 2019 11:21:45 -0500 Subject: [all] Etcd as DLM In-Reply-To: References: <3147d433-13b4-3582-c831-25c29a5799ca@nemebean.com> Message-ID: On Thu, Jan 17, 2019 at 10:37 AM Ben Nemec wrote: > This thread got a bit sidetracked with potential use-cases for etcd3 > (which seems to happen a lot with this topic...), but we still need to > decide how we're going to actually communicate with etcd from OpenStack > services. Does anyone have input on that? > I have been successful testing the cinder-volume service using etcd3-gateway [1] to access etcd3 via tooz.coordination. Works great, although I haven't stress tested the setup. [1] https://github.com/dims/etcd3-gateway Alan Thanks. > > -Ben > > On 12/3/18 4:48 PM, Ben Nemec wrote: > > Hi, > > > > I wanted to revisit this topic because it has come up in some downstream > > discussions around Cinder A/A HA and the last time we talked about it > > upstream was a year and a half ago[1]. There have certainly been changes > > since then so I think it's worth another look. For context, the > > conclusion of that session was: > > > > "Let's use etcd 3.x in the devstack CI, projects that are eventlet based > > can use the etcd v3 http experimental API and those that don't can use > > the etcd v3 gRPC API. Dims will submit a patch to tooz for the new > > driver with v3 http experimental API. Projects should feel free to use > > the DLM based on tooz+etcd3 from now on. Other projects can figure out > > other use cases for etcd3." > > > > The main question that has come up is whether this is still the best > > practice or if we should revisit the preferred drivers for etcd.
Gorka > > has gotten the grpc-based driver working in a Cinder driver that needs > > etcd[2], so there's a question as to whether we still need the HTTP > > etcd-gateway or if everything should use grpc. I will admit I'm nervous > > about trying to juggle eventlet and grpc, but if it works then my only > > argument is general misgivings about doing anything clever that involves > > eventlet. :-) > > > > It looks like the HTTP API for etcd has moved out of experimental > > status[3] at this point, so that's no longer an issue. There was some > > vague concern from a downstream packaging perspective that the grpc > > library might use a funky build system, whereas the etcd3-gateway > > library only depends on existing OpenStack requirements. > > > > On the other hand, I don't know how much of a hassle it is to deploy and > > manage a grpc-gateway. I'm kind of hoping someone has already been down > > this road and can advise about what they found. > > > > Thanks. > > > > -Ben > > > > 1: https://etherpad.openstack.org/p/BOS-etcd-base-service > > 2: > > > https://github.com/embercsi/ember-csi/blob/5bd4dffe9107bc906d14a45cd819d9a659c19047/ember_csi/ember_csi.py#L1106-L1111 > > > > 3: https://github.com/grpc-ecosystem/grpc-gateway > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Thu Jan 17 16:32:01 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 17 Jan 2019 16:32:01 +0000 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: <5c80b99e-e7b3-bc65-9556-c80608de0347@gmail.com> References: <5c80b99e-e7b3-bc65-9556-c80608de0347@gmail.com> Message-ID: On Thu, 2019-01-17 at 09:09 -0600, Matt Riedemann wrote: > On 1/17/2019 6:07 AM, Chris Dent wrote: > > > Deployment tools: > > > > > > * Lee is working on TripleO support for extracted placement and > > > estimates 3 more weeks for just deploy (base install) support to be > > > done, and at least 3 more weeks for upgrade support after that. Read > > > Lee's status update for details [2]. > > > * If nova were to go ahead and drop placement code and require > > > extracted placement before TripleO is ready, they would have to pin > > > nova to a git SHA before that which would delay their Stein release. > > > * Having the extraction span release boundaries would ease the upgrade > > > pain for TripleO. > > > > Can you (or Dan?) clarify if spanning the release boundaries is > > usefully specifically for tooling that chooses to upgrade everything > > at once and thus is forced to run Stein nova with Stein placement? > > > > And if someone were able/willing to run Rocky nova with Stein > > placement (briefly) the challenges are less of a concern? > > > > I'm not asking because I disagree with the assertion, I just want to > > be sure I understand (and by proxy our adoring readers do as well) > > what "ease" really means in this context as the above bullet doesn't > > really explain it. > > I didn't go into details on that point because honestly I also could use > some written words explaining the differences for TripleO in doing the > upgrade and migration in-step with the Stein upgrade versus upgrading to > Stein and then upgrading to Train, and how the migration with that is > any less painful. 
> I know Dan talked about it on the call, but I can't > say I followed it all well enough to be able to summarize the pros/cons > (which is why I didn't in my summary email). This might already be > something I know about, but the lights just aren't turning on right now. A general question on this topic: is there any update on support for deploying and upgrading to extracted placement with other deployment tools? The main ones beyond TripleO that come to mind are kolla-ansible, juju, openstack-ansible, and openstack-helm; there are obviously others. But before we remove the code in nova, I assume we will want to ensure that tools beyond devstack, grenade, and TripleO can actually deploy Stein with extracted placement, and ideally upgrade. Was this covered in the placement extraction meeting? From fungi at yuggoth.org Thu Jan 17 16:33:51 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Jan 2019 16:33:51 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> Message-ID: <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> On 2019-01-17 20:41:49 +1300 (+1300), Zane Bitter wrote: [...] > I'm not sure we need to speculate, because as you know the TC and > PTLs literally were the same thing prior to 2014-ish. [...] Minor historical notes: the role now occupied by the TC was originally filled by a governance body known as the Project Oversight Committee, which then later became the Project Policy Board (PPB).
A description of our pre-foundation technical governance can still be found undisturbed and rotting in our wiki at the moment, should you be in the mood for a bit of light reading: https://wiki.openstack.org/wiki/Governance/OldModel The PPB was replaced by (but essentially renamed to) the Technical Committee in September 2012, as required in appendix 4 of the bylaws for the then-newly-formed OpenStack Foundation (note that the text there defining the initial TC election is slated for removal in the bylaws amendment currently up for a vote of the individual members): https://www.openstack.org/legal/technical-committee-member-policy/ The very first two TC elections did still include PTLs who had guaranteed TC seats: https://wiki.openstack.org/wiki/Governance/TCElectionsFall2012 https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 But the subsequent election in late 2013 switched to the free-for-all model we've come to know today with the adoption of the new TC Charter: https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 Now I'm wondering whether we should form an OpenStack Historical Preservation Society. ;) -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Jan 17 16:38:43 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Jan 2019 16:38:43 +0000 Subject: [tripleo] OVB is now on Gerrit In-Reply-To: <385f8245-7893-4dbb-308c-d80d0f0afe2b@nemebean.com> References: <2bff76a6-bea0-aed1-9431-d83c4f6b7ebf@nemebean.com> <385f8245-7893-4dbb-308c-d80d0f0afe2b@nemebean.com> Message-ID: <20190117163842.7es7zvjwjse4tuzt@yuggoth.org> On 2019-01-17 09:27:12 -0600 (-0600), Ben Nemec wrote: [...] > I believe we have an RTD publishing job so we could leave them > there. [...] 
Note that the job to trigger RTD builds through their webhook is currently broken by https://github.com/rtfd/readthedocs.org/issues/4986 and awaiting a new deployment on RTD to fix, as best as we can tell. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sean.mcginnis at gmx.com Thu Jan 17 16:45:56 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 17 Jan 2019 10:45:56 -0600 Subject: [all] Etcd as DLM In-Reply-To: References: <3147d433-13b4-3582-c831-25c29a5799ca@nemebean.com> Message-ID: <20190117164556.GA22643@sm-workstation> On Thu, Jan 17, 2019 at 11:21:45AM -0500, Alan Bishop wrote: > On Thu, Jan 17, 2019 at 10:37 AM Ben Nemec wrote: > > > This thread got a bit sidetracked with potential use-cases for etcd3 > > (which seems to happen a lot with this topic...), but we still need to > > decide how we're going to actually communicate with etcd from OpenStack > > services. Does anyone have input on that? > > > > I have been successful testing the cinder-volume service using > etcd3-gateway [1] to access etcd3 via tooz.coordination. Works great, > although I haven't stress tested the setup. > > [1] https://github.com/dims/etcd3-gateway > > Alan > Devstack by default (and therefore most gate testing by default) has been using etcd3 via tooz for Cinder lock coordination for over a year now. Oh, almost two years apparently: https://review.openstack.org/#/c/466298/14/lib/cinder Other than some intermittent etcd availability issues that Matt Riedemann has noticed recently, it appears to be working fine. Fine enough that I wasn't even aware of it until Matt brought it up at least.
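[Editorial aside: for readers following along, the tooz pattern being discussed (a service taking a distributed lock through the etcd3-gateway driver) looks roughly like the sketch below. The endpoint URL, member id, and lock name are illustrative assumptions, not values from the thread.]

```python
def run_with_lock(backend_url, member_id, lock_name, critical_section):
    """Acquire a tooz distributed lock, run the callable, then release.

    backend_url would be something like 'etcd3+http://127.0.0.1:2379'
    for the etcd3-gateway driver discussed above (an assumed example,
    not a value from the thread).
    """
    from tooz import coordination  # provided by the tooz library

    coordinator = coordination.get_coordinator(backend_url, member_id)
    coordinator.start()
    lock = coordinator.get_lock(lock_name)
    try:
        if lock.acquire(blocking=True):
            try:
                return critical_section()
            finally:
                lock.release()
    finally:
        coordinator.stop()


# Usage (requires a reachable etcd, so not run here):
# run_with_lock('etcd3+http://127.0.0.1:2379', b'cinder-worker-1',
#               b'volume-9e1e6b33', lambda: 'did work under the lock')
```

Because tooz hides the backend behind one interface, swapping the HTTP gateway driver for the gRPC one is, in principle, only a change to the backend URL.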
Sean From colleen at gazlene.net Thu Jan 17 16:46:47 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 17 Jan 2019 17:46:47 +0100 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> Message-ID: <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> On Tue, Jan 15, 2019, at 12:01 PM, Chris Dent wrote: [snipped] > > Then I implied that the TC cannot do anything like actionable and > unified technical leadership because they have little to no real > executive power and what power they do have (for example, trying to > make openstack-wide goals) is in conflict (because of the limits of > time and space) with the goals that PTLs (and others) are trying to > enact. While I understand that the TC may feel frustrated that they do not always feel like they have sufficient insight and influence into the ongoings of individual projects, I actually believe that this is the better way to operate. If individual team leaders were also tasked with leading the entire community as well, there would be significant conflicts of interest. PTLs are responsible for doing what is in the best interest for their project, and the TC is responsible for doing what is in the best interest of the whole community, and in the places where those do not 100% line up there is discussion and compromise. It is hard and sometimes painful and it means progress is very slow, but it is healthy. If a single body was acting as dictator for the whole community, progress might speed up but we would be losing out on the diversity of opinion that makes this community great. But as others have pointed out, in reality the TC is already largely made up of people who do have influence and insight into a large part of the individual projects and I'm not sure it really helps. 
TC members are always trying to be very careful about taking one hat off as they put another on and it creates quite a cognitive burden. I'm fairly sure Jeremy, to take one example, could have a heated debate by himself from at least eight different perspectives on any community topic. Everyone on the TC has sufficient influence to enact whatever change they decide is needed. What's lacking is agreement on what to act on... > > Thus: What if the TC and PTLs were the same thing? Would it become > more obvious that there's too much in play to make progress in a > unified direction (on the thing called OpenStack), leading us to > choose less to do, and choose more consistency and actionable > leadership? And would it enable some power to execute on that > leadership. > > Those are questions, not assertions. > > > Getting some diversity of ideas from outside of those from PTL's > > is probably a good idea for the overall health of OpenStack. What > > about Users that have never been PTL's? Not developers? > > So, to summarize: While I agree we need a diversity of ideas, I > don't think we lack for ideas, nor have we ever. What we lack > is a small enough set of ideas to act on them with significant > enough progress to make a real difference. How can we make the list > small and (to bring this back to the TC role) empower the TC to > execute on that list? Indeed. There are very very many improvements that could be made. None of them are so critical that it's obvious what to start with. OpenStack has matured enough that the community and the software are working pretty well most of the time, there aren't really any emergencies. > > And, to be complete, should we? > > And, to be extra really complete, I'm not sure if we should or not, > which is why I'm asking. > Returning to your original request for feedback, my expectation of the TC is much more passive than you imply it should be. 
I'm happy for the TC to do the work of approving new projects, acting as bridges to the board and foundation, mediating conflicts which can't otherwise be resolved, and providing guidance as needed when the community needs it or asks for it. Grassroots change can start with any community member, whether or not they are elected to the TC. I don't think it needs to be the TC's job to drive grandiose changes for the sake of progress. Colleen From sean.mcginnis at gmx.com Thu Jan 17 16:48:51 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 17 Jan 2019 10:48:51 -0600 Subject: [release] Release countdown for week R-11, January 21-25 Message-ID: <20190117164850.GB22643@sm-workstation> Welcome to the weekly countdown email.

Development Focus
-----------------

Teams should be focused on implementing planned work for the cycle. It is also a good time to review those plans and reprioritize anything if needed based on what progress has been made and what looks realistic to complete in the next few weeks.

General Information
-------------------

Looking ahead to Stein-3, please be aware of the various freeze dates. These vary by deliverable type, starting with non-client libraries, then client libraries, then finally services. This is to ensure we have time for requirements updates and resolving any issues prior to RC. Just as a reminder, we have freeze dates ahead of the first RC in order to stabilize our requirements. Updating global requirements close to overall code freeze increases the risk of an unforeseen side effect being introduced too late in the cycle to properly fix. For this reason, we first freeze the non-client libraries that may be used by service and client libraries, followed a week later by the client libraries. This minimizes the ripple effects that have caused projects to scramble to fix last-minute issues. Please keep these deadlines in mind as you work towards wrapping up feature work that may require library changes to complete.
Upcoming Deadlines & Dates
--------------------------

Non-client library freeze: February 28
Client library freeze: March 7
Stein-3 milestone: March 7

-- Sean McGinnis (smcginnis) From ukalifon at redhat.com Tue Jan 15 16:58:37 2019 From: ukalifon at redhat.com (Udi Kalifon) Date: Tue, 15 Jan 2019 17:58:37 +0100 Subject: [qa] dynamic credentials with the tempest swift client Message-ID: Hello. I am developing GUI tests (Selenium) for the openstack director. I am now trying to make use of the tempest Swift client in one of the tests, which needs to fetch a file from the tripleo plans. I added this code in our base class to access the client:

    class GUITestCase(test.BaseTestCase):
        credentials = ('admin', 'primary')

        @classmethod
        def setup_clients(cls):
            super(GUITestCase, cls).setup_clients()
            # Clients for Swift
            cls.account_client = cls.os_admin.account_client
            cls.container_client = cls.os_admin.container_client
            cls.object_client = cls.os_admin.object_client

I then try to list the objects in the "overcloud" container, which is where the default plans are found:

    class TestSwift(GUITestCase):
        def test_swift(self):
            (resp, body) = self.container_client.list_container_objects(
                "overcloud")
            print resp
            print body

It returns a "not found" error. I'm pretty sure that the reason for it not finding the container (which is definitely there) is that it creates a project and a user for itself, and uses those credentials for its interactions with the undercloud. I can see the POST calls in tempest.log that show that it's creating the dynamic credentials for itself:

INFO tempest.lib.common.rest_client [req-54a... ] Request (TestSwift:setUpClass): 201 POST http://192.168.24.3:35357/v3/projects 0.365s
INFO tempest.lib.common.rest_client [req-d3f... ] Request (TestSwift:setUpClass): 201 POST http://192.168.24.3:35357/v3/users 0.493s
INFO tempest.lib.common.rest_client [req-480...
] Request (TestSwift:setUpClass): 200 GET http://192.168.24.3:35357/v3/roles 0.297s
INFO tempest.lib.common.rest_client [req-e0a... ] Request (TestSwift:setUpClass): 204 PUT http://192.168.24.3:35357/v3/projects/9e1e6b33dcbd4aebb546c56a7258d5e0/users/cc715537a0c048b7a8692a0c8304b94d/roles/a7e0a38246d24ad58c9ca06d2db98099 0.270s
INFO tempest.lib.common.dynamic_creds [-] *Acquired dynamic creds:* credentials: Credentials: {'username': u'tempest-TestSwift-1494291785', 'project_name': u'tempest-TestSwift-1494291785', 'project_domain_id': u'default', 'user_domain_id': u'default', 'tenant_id': u'9e1e6b33dcbd4aebb546c56a7258d5e0', 'user_domain_name': u'Default', 'domain_name': u'Default', 'tenant_name': u'tempest-TestSwift-1494291785', 'user_id': u'cc715537a0c048b7a8692a0c8304b94d', 'project_id': u'9e1e6b33dcbd4aebb546c56a7258d5e0', 'domain_id': u'default', 'project_domain_name': u'Default'}, Network: None, Subnet: None, Router: None

So I'm looking for a way to utilize the client without it automatically creating dynamic credentials for itself; it has to use the already-existing admin credentials on the admin project in order to see the container with the plans. What's the right way to do that, please? Thanks a lot in advance! Regards, Udi Kalifon; Senior QE; RHOS-UI Automation -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmnfths at protonmail.com Wed Jan 16 08:04:41 2019 From: cmnfths at protonmail.com (cmnfths) Date: Wed, 16 Jan 2019 08:04:41 +0000 Subject: Neutron tagging Message-ID: Hi everyone! Recently I've encountered a neutron issue related to the port tagging mechanism, and wonder if anyone else ever has. The initial source of the issue was, as it seems, identical numbering of external VLAN tags and OVS (specifically br-int) port tags. A VLAN tagged 222 was added to bridge br-ex and everything worked, but at some point the firewall reported unusual traffic.
Thus I found that there was a loop and 2 qr-* ports with the same tag 222
attached to br-int. After I deleted them, the loop was gone.

So the question is: what exactly are these tags? The same ethernet frame
bits as 802.1Q tags? Where is the pool they're taken from? Is there any
possibility to shape such a pool via configs or patches? Has anyone ever
encountered such a situation?

Thanks,
Andrew Th.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ignaziocassano at gmail.com  Wed Jan 16 09:30:46 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Wed, 16 Jan 2019 10:30:46 +0100
Subject: [octavia][heat] queens loadbalancer issue
Message-ID: 

Hello All,
I am facing an issue with the octavia loadbalancer on queens. Attached is
the heat stack I am using for creating a loadbalancer between 2 virtual
machines. I launched the stack 4 times. The first, second and third work
fine. The fourth gives a lot of errors in the octavia worker log attached
here. It seems to loop.
Please help me.
Ignazio
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loadbalancertls.yaml
Type: application/x-yaml
Size: 3938 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: worker.log
Type: text/x-log
Size: 36732 bytes
Desc: not available
URL: 

From masha.atakova at mail.com  Wed Jan 16 11:00:38 2019
From: masha.atakova at mail.com (Masha Atakova)
Date: Wed, 16 Jan 2019 12:00:38 +0100
Subject: [neutron]
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From jphillips at mirantis.com  Wed Jan 16 14:39:40 2019
From: jphillips at mirantis.com (Jim Phillips)
Date: Wed, 16 Jan 2019 09:39:40 -0500
Subject: [openstack-discuss] [stackalytics] Stackalytics facelift
In-Reply-To: 
References: 
Message-ID: 

@Artem and @Surya, the issue that you've reported should be fixed now. Can
you please double check and let me know if it's not?

On Thu, Jan 10, 2019 at 2:10 PM Boris Renski wrote:
> Hey guys! thanks for the heads up on this. Let us check and fix ASAP.
>
> On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov <
> artem.goncharov at gmail.com> wrote:
>> Hi,
>>
>> I can repeat the issue - stackalytics stopped showing my affiliation
>> correctly (user: gtema, entry in default_data.json is present)
>>
>> Regards,
>> Artem
>>
>> On Thu, Jan 10, 2019 at 5:48 AM Surya Singh wrote:
>>> Hi Boris
>>>
>>> Great to see the new facelift of Stackalytics. It's really good.
>>>
>>> I have a query regarding a contributor's name not being listed as per
>>> company affiliation. Before the facelift, stackalytics showed this
>>> correctly whether I had an entry in
>>> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json
>>> or not. I have now pushed a patch for the same,
>>> https://review.openstack.org/629150, but another thing is that one of my
>>> colleagues, Vishal Manchanda, is also showing as an independent
>>> contributor rather than an NEC contributor, while his name entry is
>>> already in etc/default_data.json.
>>>
>>> Would be great if you check the same.
>>>
>>> ---
>>> Thanks
>>> Surya
>>>
>>>
>>> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote:
>>>> Folks,
>>>>
>>>> Happy New Year! We wanted to start the year by giving a facelift to
>>>> stackalytics.com (based on stackalytics openstack project).
>>>> Brief summary of updates:
>>>>
>>>> - We have a new look and feel at stackalytics.com
>>>>
>>>> - We did away with DriverLog and Member Directory, which were not very
>>>>   actively used or maintained. Those are still available via direct
>>>>   links, but not in the menu on the top
>>>>
>>>> - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated
>>>>   project commits via a separate subsection accessible via the top
>>>>   menu. Before this was all bunched up in Project Type -> Complementary
>>>>
>>>> Happy to hear comments or feedback.
>>>>
>>>> -Boris
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jphillips at mirantis.com  Wed Jan 16 15:46:07 2019
From: jphillips at mirantis.com (Jim Phillips)
Date: Wed, 16 Jan 2019 10:46:07 -0500
Subject: [openstack-discuss] [stackalytics] Stackalytics facelift
In-Reply-To: 
References: 
Message-ID: 

@Surya, that is the issue. I've added your info to the data file running in
production. You should see the correct information on the site tomorrow.
This https://review.openstack.org/#/c/629150/ will still need to be merged
so we don't lose the changes in the future, but I don't have the ability to
approve it.

On Wed, Jan 16, 2019 at 10:31 AM Surya Singh wrote:
> @Jim thanks for looking into this, but FYI still showing as independent:
> https://www.stackalytics.com/?user_id=confisurya. Maybe merging this
> https://review.openstack.org/#/c/629150/ will fix the issue.
>
> On Wed, Jan 16, 2019 at 8:09 PM Jim Phillips wrote:
>> @Artem and @Surya, the issue that you've reported should be fixed now.
>> Can you please double check and let me know if it's not?
>>
>> On Thu, Jan 10, 2019 at 2:10 PM Boris Renski wrote:
>>> Hey guys! thanks for the heads up on this. Let us check and fix ASAP.
>>> >>> On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov < >>> artem.goncharov at gmail.com> wrote: >>> >>>> Hi, >>>> >>>> I can repeat the issue - stackalytics stopped showing my affiliation >>>> correctly (user: gtema, entry in default_data.json is present) >>>> >>>> Regards, >>>> Artem >>>> >>>> On Thu, Jan 10, 2019 at 5:48 AM Surya Singh < >>>> singh.surya64mnnit at gmail.com> wrote: >>>> >>>>> Hi Boris >>>>> >>>>> Great to see new facelift of Stackalytics. Its really good. >>>>> >>>>> I have a query regarding contributors name is not listed as per >>>>> company affiliation. >>>>> Before facelift to stackalytics it was showing correct whether i have >>>>> entry in >>>>> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json >>>>> or not. >>>>> Though now i have pushed the patch for same >>>>> https://review.openstack.org/629150, but another thing is one of my >>>>> colleague Vishal Manchanda name is also showing as independent contributor >>>>> rather than NEC contributor. While his name entry already in >>>>> etc/default_data.json. >>>>> >>>>> Would be great if you check the same. >>>>> >>>>> --- >>>>> Thanks >>>>> Surya >>>>> >>>>> >>>>> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski >>>>> wrote: >>>>> >>>>>> Folks, >>>>>> >>>>>> Happy New Year! We wanted to start the year by giving a facelift to >>>>>> stackalytics.com (based on stackalytics openstack project). Brief >>>>>> summary of updates: >>>>>> >>>>>> - >>>>>> >>>>>> We have new look and feel at stackalytics.com >>>>>> - >>>>>> >>>>>> We did away with DriverLog >>>>>> and Member Directory >>>>>> , which were not very >>>>>> actively used or maintained. Those are still available via direct links, >>>>>> but not in the menu on the top >>>>>> - >>>>>> >>>>>> BIGGEST CHANGE: You can now track some of the CNCF and >>>>>> Unaffiliated project commits via a separate subsection accessible via top >>>>>> menu. 
Before this was all bunched up in Project Type -> Complementary >>>>>> >>>>>> Happy to hear comments or feedback. >>>>>> >>>>>> -Boris >>>>>> >>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From balazs.gibizer at ericsson.com  Thu Jan 17 16:54:45 2019
From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=)
Date: Thu, 17 Jan 2019 16:54:45 +0000
Subject: [nova] [placement] [packaging] placement extraction check in meeting
In-Reply-To: 
References: <1547723762.31652.7@smtp.office365.com>
Message-ID: <1547744080.31652.9@smtp.office365.com>

On Thu, Jan 17, 2019 at 4:05 PM, Matt Riedemann wrote:
> On 1/17/2019 5:16 AM, Balázs Gibizer wrote:
>> There is a functional test [1] that uses a fake virt driver and
>> simulates reshape. My first attempt was to add an extra instance
>> creation after the end of the reshape. But this test reshapes the
>> provider tree in such a way that the resulting tree uses a sharing
>> disk provider and doesn't have inventory on the compute node RP any
>> more (cpu and mem moved under NUMA). Unfortunately nova does not yet
>> support scheduling against such a tree.
>
> That's probably the one I mentioned on the call then. It uses a fake
> virt driver but stubs out the update_provider_tree method (from what
> I remember) and wouldn't be an easy fit for doing what I think we
> need to do for a new functional test.

Yes, it is the one.

>
>> Shall I try to add a new functional test with the fake virt driver or
>> try to add a functional test with the libvirt driver on top of the
>> VGPU reshaper patch?
>
> I'm personally OK with a fake virt driver (it could even be special
> purpose like some of our fake virt drivers for testing things like
> live migration rollback and resize failure/reschedule). Writing
> anything on top of the libvirt driver is still going to require
> stubbing out large parts of the libvirt driver code, which
> essentially makes it a fake driver.
I know we have some functional
> tests for the libvirt driver that stub other stuff (Stephen is
> familiar with these) so it might be possible, but if I were going to
> write a new test I'd just use a fake virt driver and have the test be
> more like our traditional functional tests where we use the API to
> create a server, then reshape to nested, and then schedule another
> server to the nested resource class and assert everything is OK,
> since I think what we're really trying to test here is the API and
> scheduler interaction more than the virt driver itself.

I managed to hack together a functional test[1] that exercises the vgpu
reshape code in the libvirt driver (thanks to fakelibvirt.py) with instances
booted both before and after the reshape.

Cheers,
gibi

[1] https://review.openstack.org/#/c/631559

>
> --
>
> Thanks,
>
> Matt

From cdent+os at anticdent.org  Thu Jan 17 17:00:29 2019
From: cdent+os at anticdent.org (Chris Dent)
Date: Thu, 17 Jan 2019 17:00:29 +0000 (GMT)
Subject: [tc] [all] Please help verify the role of the TC
In-Reply-To: 
References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com>
Message-ID: 

On Thu, 17 Jan 2019, Doug Hellmann wrote:

[i wrote]:
>> It's not a future I really like, but it is one strategy for enabling
>> moving in one direction: cut some stuff. Stop letting so many
>> flowers bloom.
>>
>> Letting those flowers bloom is in the camp of "contribution in
>> all its many and diverse forms".
>
> What would you prune?

I don't think it should be up to me to decide. That would be a thing "we"
(in whatever form) would decide. But if pressed to make a list for the sake
of conversation I would endeavor to limit things based on a couple of
strawperson criteria (below). As I said above I'm not clear that this is
the right thing to do, but it is a potential strategy.
Being included in the examples below isn't a suggestion that the thing listed is no good, or should not exist. Rather that it _might_ be healthier with a boundary between it and OpenStack. A clear boundary could allow these flowers to bloom nearby, as well as others. Strategies to figure out what could be removed: * Stuff that could be done via existing non-openstack tools or more generic tools that address cloudy-stuff in general. Much of the stuff in telemetry or orchestration would fit here and some utilities: Telemetry, Cloudkitty, Freezer, Heat, Karbor, Magnum, Monasca, Zun to name just some. If the remaining services are produced in a way that provides suitable observability, then tools like Prometheus, the ELK stack, what have you can play a big part and ansible, terraform, related friends can too. * Deployment tooling (Tripleo, Kolla, Openstackansible, etc) and packaging. It all needs to exist, but it clouds the picture and direction of OpenStack, the thing you have once it is deployed or installed. In a very bad analogy: I make gabbi and I'm not responsible for it becoming an RPM, nor for maintaining pip which is used to install it, but RPMs and pip are very important. If we had a one-true-deployment tool, then having that in this new and smaller OpenStack could make sense, but if we want there to be many tools enabling them to be independent communities from each other could be refreshing. Do I think this is a good idea? I don't know. It's just thinking out loud with half-baked ideas for the sake of conversation with the ever-optimistic hope that it might inspire an actually good idea. Do I think something like this is likely to happen? Not really. Ship sailed. Could a hybrid that includes some of this happen? Potentially, especially if the idea of top-level projects in the foundation grows. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent

From fungi at yuggoth.org  Thu Jan 17 17:11:18 2019
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Thu, 17 Jan 2019 17:11:18 +0000
Subject: [tc] [all] Please help verify the role of the TC
In-Reply-To: <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com>
References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com>
Message-ID: <20190117171118.gdtbm7beqyxjqto5@yuggoth.org>

On 2019-01-17 17:46:47 +0100 (+0100), Colleen Murphy wrote:
[...]
> I'm fairly sure Jeremy, to take one example, could have a heated
> debate by himself from at least eight different perspectives on
> any community topic.
[...]

So what you're saying is I spend a lot of time talking to myself? That
would certainly explain why I'm always so hoarse. ;)

[with my jester's cap on]
--
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From mriedemos at gmail.com  Thu Jan 17 17:32:55 2019
From: mriedemos at gmail.com (Matt Riedemann)
Date: Thu, 17 Jan 2019 11:32:55 -0600
Subject: [nova] [placement] [packaging] placement extraction check in meeting
In-Reply-To: 
References: <5c80b99e-e7b3-bc65-9556-c80608de0347@gmail.com>
Message-ID: <624e7894-c0df-65ff-2659-38725f5b71d6@gmail.com>

On 1/17/2019 10:32 AM, Sean Mooney wrote:
> a general question on this topic.
> is there any update on support for deploying and upgrading to extracted
> placement with other deployment tools
>
> the main ones beyond tripleo that come to mind are
> kolla-ansible, juju, openstack-ansible, openstack helm
>
> there are obviously others but before we remove the code in nova
> i assume we will want to ensure that other tools beyond devstack, grenade
> and tripleo can actually deploy stein with extracted placement and
> ideally upgrade.
>
> was this covered in the placement extraction meeting?

Chris has links for this in the etherpad and mentions it in the placement
update emails. Off the top of my head, I want to say kolla can deploy and is
working on upgrades from source tarballs (until debs are available). OSA has
a change up for install which isn't merged yet. I don't know about juju or
helm.

--

Thanks,

Matt

From derekh at redhat.com  Thu Jan 17 17:45:31 2019
From: derekh at redhat.com (Derek Higgins)
Date: Thu, 17 Jan 2019 17:45:31 +0000
Subject: [ironic][neutron] nf_conntrack_helper now disabled by default
In-Reply-To: <72ed6b5b-a28b-8fa2-dfce-fcf31ccc40a6@gmail.com>
References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> <72ed6b5b-a28b-8fa2-dfce-fcf31ccc40a6@gmail.com>
Message-ID: 

On Mon, 14 Jan 2019 at 16:47, Brian Haley wrote:
>
> On 1/7/19 12:42 PM, Julia Kreger wrote:
> > On Mon, Jan 7, 2019 at 9:11 AM Clark Boylan wrote:
> >>
> >> On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote:
> > [trim]
> >>>
> >>> Doing so allows us to raise this behavior change to operators,
> >>> minimizing the need of them having to troubleshoot it in production,
> >>> and gives them a choice in the direction that they wish to take.
> >>
> >> https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to
> >> cover this. Basically you should explicitly enable specific helpers
> >> when you need them rather than relying on the auto helper rules.
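The explicit-helper setup that article describes boils down to something
like the following for the tftp case discussed in this thread (a sketch
only, untested here; the chains and rule placement would need adapting to
the actual deployment, and the commands need root):

```shell
# Load the tftp helper module; with automatic helper assignment off
# (the new kernel default that prompted this thread), loading alone
# no longer attaches it to any traffic.
modprobe nf_conntrack_tftp
sysctl -w net.netfilter.nf_conntrack_helper=0

# Explicitly attach the tftp helper only to the traffic that needs it.
iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp

# Let the RELATED data connection that the helper tracks through the
# filter table.
iptables -A INPUT -m conntrack --ctstate RELATED \
    -m helper --helper tftp -j ACCEPT
```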
> >>
> >> Maybe even avoid the configuration option entirely if ironic and
> >> neutron can set the required helper for tftp when tftp is used?
> >>
> > Great link Clark, thanks!
> >
> > It could be viable to ask operators to explicitly set their security
> > groups for tftp to be passed.
> >
> > I guess we actually have multiple cases where there are issues and the
> > only non-impacted case is when the ironic conductor host is directly
> > attached to the flat network the machine is booting from. In the case
> > of a flat network, it doesn't seem viable for us to change rules
> > ad-hoc since we would need to be able to signal that the helper is
> > needed, but it does seem viable to say "make sure connectivity works x
> > way". Whereas with multitenant networking, we use dedicated networks,
> > so conceivably it is just a static security group setting that an
> > operator can keep in place. Explicit static rules like that seem less
> > secure to me without conntrack helpers. :(
> >
> > Does anyone in Neutron land have any thoughts?

I am from Neutron land, sorry for the slow reply.

First, I'm trying to get in contact with someone who knows more about
nf_conntrack_helper than me; I'll follow up here or in the patch.

Great, thanks

In neutron, as in most projects, the goal is to have things configured so
admins don't need to set any extra options, so we've typically done things
like set sysctl values to make sure we don't get tripped-up by such issues.
Mostly these settings have been in the L3 code, so are done in namespaces
and have limited "impact" on the system hypervisor on the compute node.

Since this is security group related it is different, since that isn't done
in a namespace - we add a rule for related/established connections in the
"root" namespace, for example in the iptables_hybrid case.
For
> that reason it's not obvious to me that setting this sysctl is bad -
> it's not in the VM itself, and the packets aren't going to the
> hypervisor, so is there any impact we need to worry about besides just
> having it loaded?

As far as I've been able to figure out, we'd need the kernel modules
loaded, one per supported protocol (e.g. nf_conntrack_tftp,
nf_conntrack_sip, etc.), and to set the sysctl inside the namespace. My
testing was on devstack, where the node being deployed with ironic was on
the same host. It may be that the sysctl is also needed in the root
namespace in a more realistic scenario where ironic is controlling real
baremetal nodes; I'll see if I can find out if this is the case.

>
> The other option would be to add more rules when SG rules are added that
> are associated with a protocol that has a helper. IMO that's not a
> great solution as there is no way for the user to control what filters
> (like IP addresses) are allowed, for example a SIP helper IP address.

Ya, it doesn't sound ideal; also this would require specific SG rules to
enable outgoing traffic, which isn't normally the case

>
> Hopefully I'm understanding things correctly.
>
> Thanks,
>
> -Brian
>

From zufar at onf-ambassador.org  Thu Jan 17 18:09:40 2019
From: zufar at onf-ambassador.org (Zufar Dhiyaulhaq)
Date: Fri, 18 Jan 2019 01:09:40 +0700
Subject: [ironic] boot.ipxe input/output error
Message-ID: 

Hi,

I am trying to install ironic but get a boot.ipxe input/output error when
trying to clean the node (error image: https://ibb.co/mhN5GYc).
This is my ironic configuration:

    [DEFAULT]
    enabled_drivers = pxe_ipmitool
    enabled_hardware_types = ipmi
    log_dir = /var/log/ironic
    transport_url = rabbit://guest:guest@10.60.60.10:5672/
    auth_strategy = keystone
    notification_driver = messaging

    [conductor]
    send_sensor_data = true
    automated_clean = true

    [swift]
    region_name = RegionOne
    project_domain_id = default
    user_domain_id = default
    project_name = services
    password = IRONIC_PASSWORD
    username = ironic
    auth_url = http://10.60.60.10:5000/v3
    auth_type = password

    [pxe]
    tftp_root = /tftpboot
    tftp_server = 10.60.60.10
    ipxe_enabled = True
    pxe_bootfile_name = undionly.kpxe
    uefi_pxe_bootfile_name = ipxe.efi
    pxe_config_template = $pybasedir/drivers/modules/ipxe_config.template
    uefi_pxe_config_template = $pybasedir/drivers/modules/ipxe_config.template
    ipxe_use_swift = True

    [deploy]
    http_root = /httpboot
    http_url = http://10.60.60.10:8080

    [service_catalog]
    insecure = True
    auth_uri = http://10.60.60.10:5000/v3
    auth_type = password
    auth_url = http://10.60.60.10:35357
    project_domain_id = default
    user_domain_id = default
    project_name = services
    username = ironic
    password = IRONIC_PASSWORD
    region_name = RegionOne

    [database]
    connection = mysql+pymysql://ironic:IRONIC_DBPASSWORD@10.60.60.10/ironic?charset=utf8

    [keystone_authtoken]
    auth_url = http://10.60.60.10:35357
    www_authenticate_uri = http://10.60.60.10:5000
    auth_type = password
    username = ironic
    password = IRONIC_PASSWORD
    user_domain_name = Default
    project_name = services
    project_domain_name = Default

    [neutron]
    www_authenticate_uri = http://10.60.60.10:5000
    auth_type = password
    auth_url = http://10.60.60.10:35357
    project_domain_name = Default
    project_name = services
    user_domain_name = Default
    username = ironic
    password = IRONIC_PASSWORD
    cleaning_network = 461a6663-e015-4ecf-9076-d1b502c3db25
    provisioning_network = 461a6663-e015-4ecf-9076-d1b502c3db25

    [glance]
    region_name = RegionOne
    project_domain_id = default
    user_domain_id = default
    project_name = services
    password = IRONIC_PASSWORD
    username = ironic
    auth_url = http://10.60.60.10:5000/v3
    auth_type = password
    swift_endpoint_url = http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s
    swift_account = AUTH_f3bb39ae2e0946e1bbf812bcde6e08a7
    swift_container = glance
    swift_temp_url_key = secret

I try to open the URL http://10.60.60.10:8080/boot.ipxe but always get a
BAD URL error. Can the baremetal node not download boot.ipxe? Is my swift
broken?

Thank you.
Best Regards,
Zufar Dhiyaulhaq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kennelson11 at gmail.com  Thu Jan 17 18:59:49 2019
From: kennelson11 at gmail.com (Kendall Nelson)
Date: Thu, 17 Jan 2019 10:59:49 -0800
Subject: [releases] Additions to releases-core
In-Reply-To: <20190115151123.GA1297@sm-workstation>
References: <20190115151123.GA1297@sm-workstation>
Message-ID: 

Thanks to Sean and the rest of release management for this opportunity! I
am excited to be a part of the team and help out :) And congrats to you too
JP :) We'll celebrate at FOSDEM ;)

-Kendall (diablo_rojo)

On Tue, Jan 15, 2019 at 7:12 AM Sean McGinnis wrote:
> I'm very happy to announce I have added Kendall Nelson and Jean-Philippe
> Evrard to the releases-core group. These two have been doing release
> request reviews and asking questions to learn about the release process,
> and we feel they are ready for more.
>
> Initially we will just be looking at +2's and saving approvals for one of
> the existing core members until everyone has more confidence. I wouldn't
> expect this +2-only period to last long though.
>
> We will also need to work out reallocation of review days. Right now, the
> current cores have taken different days for them to be the point person
> for processing reviews. With the expanded team, I think we should take a
> look at those days again and get the work divided up amongst us.
>
> Thanks Kendall and JP for getting involved and helping with this
> important piece of the process, and welcome to the releases-core team!
>
> Sean
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com  Thu Jan 17 20:12:42 2019
From: skaplons at redhat.com (Slawomir Kaplonski)
Date: Thu, 17 Jan 2019 21:12:42 +0100
Subject: [neutron] CI issue related to pyroute2 and latest oslo.privsep
Message-ID: <37C0CF62-A6FA-4837-8C31-4628FCFA339A@redhat.com>

Hi,

Recently we had one more issue related to oslo.privsep and pyroute2. This
caused many failures in the Neutron CI. See [1] for details. A fix (more
like a workaround) for this issue is now merged [2]. So if you saw
tempest/scenario jobs failing on your patch, and the failed tests had
issues with SSH to the instance through the floating IP, please rebase your
patch now. It should be better :)

[1] https://bugs.launchpad.net/neutron/+bug/1811515
[2] https://review.openstack.org/#/c/631275/

—
Slawek Kaplonski
Senior software engineer
Red Hat

From kennelson11 at gmail.com  Thu Jan 17 20:18:23 2019
From: kennelson11 at gmail.com (Kendall Nelson)
Date: Thu, 17 Jan 2019 12:18:23 -0800
Subject: [ALL] Train TC Election Season (Yes, TC not PTL)
Message-ID: 

Hello All!

TC and not PTL you might ask? Yes! Due to the timing of the summit and
release, and to optimize timing so that they don't overlap, the technical
elections this cycle will trade places and we will have the TC election
first and then the PTL election. If you're curious about how this decision
was made, feel free to check out the conversation during TC office hours
here[1].

Election details: https://governance.openstack.org/election/

Please read the stipulations and timelines for candidates and electorate
contained in this governance documentation. There will be further
announcements posted to the mailing list as action is required from the
electorate or candidates. This email is for information purposes only.

If you have any questions which you feel affect others please reply to this
email thread.
If you have any questions that you wish to discuss in private please email
any of the election officials[2] so that we may address your concerns.

Thank you,

-Kendall (diablo_rojo)

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2019-01-16.log.html#t2019-01-16T01:02:18
[2] https://governance.openstack.org/election/#election-officials
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From openstack at nemebean.com  Thu Jan 17 20:20:10 2019
From: openstack at nemebean.com (Ben Nemec)
Date: Thu, 17 Jan 2019 14:20:10 -0600
Subject: [tc] [all] Please help verify the role of the TC
In-Reply-To: <20190117171118.gdtbm7beqyxjqto5@yuggoth.org>
References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> <20190117171118.gdtbm7beqyxjqto5@yuggoth.org>
Message-ID: 

On 1/17/19 11:11 AM, Jeremy Stanley wrote:
> [with my jester's cap on]

Hey, Foundation folks, I have an idea for swag at the next summit. ;-)

And so this isn't a completely content-less email, I will provide my
perspective on the actual topic of the thread too.

Reading the document, it seems to me that it describes less a "Technical"
Committee and more a "Governance" Committee. It's right there in the first
two headings, and I would argue that the collaboration and maybe scope
sections fit better with that too. I will grant that the release goals
don't really fit my theory as those are technical first and foremost, but
they're also sort of the exception, not the rule.

When I vote for members of a body with "technical" in its name, I expect
those people to be driving the technical direction of the project.
Per the document, and based on my past observation of the TC, I would say that it has actively avoided driving the technical direction of OpenStack in favor of the bottom-up philosophy mentioned elsewhere in this thread. That has always created some cognitive dissonance for me. I feel like a lot of the discussion in this thread has been around whether the TC should be a primarily technical body or a primarily governance body. While I don't want to get lost bikeshedding over naming, I hadn't previously considered that maybe the TC isn't (and shouldn't be?) an actual technical group. I'm not sure I have a definite answer for that right now though, so consider this a WIP opinion and maybe a useful alternate way of looking at the question. -Ben From openstack at nemebean.com Thu Jan 17 20:37:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 17 Jan 2019 14:37:03 -0600 Subject: [neutron][oslo] CI issue related to pyroute2 and latest oslo.privsep In-Reply-To: <37C0CF62-A6FA-4837-8C31-4628FCFA339A@redhat.com> References: <37C0CF62-A6FA-4837-8C31-4628FCFA339A@redhat.com> Message-ID: I think it's worth noting that this has actually demonstrated a rather significant issue with threaded privsep, which is that forking from a Python thread is really not a safe thing to do.[1][2] Sure, we could just say "don't fork in privileged code", but in this case the fork wasn't even in our code, it was in a library we were using. There are a few options, none of which I'm crazy about at this point: * Provide a way for callers to specify that a call needs to run in-process rather than in the thread-pool. Two problems with this: 1) It requires the callers to know that forking is happening and 2) I'm not sure it actually fixes all of the potential problems. You might need to have a completely separate privsep daemon to avoid the potential bad fork/thread interactions. * Switch to multiprocessing so calls execute in their own process. 
I may be wrong, but I think this requires all of the parameters passed in to be pickleable, which I bet is not remotely the case right now. I'm open to suggestions that are better than playing whack-a-mole with these bugs using a threaded and un-threaded daemon. -Ben 1: https://rachelbythebay.com/w/2011/06/07/forked/ 2: https://rachelbythebay.com/w/2014/08/16/forkenv/ On 1/17/19 2:12 PM, Slawomir Kaplonski wrote: > Hi, > > Recently we had one more issue related to oslo.privsep and pyroute2. This caused many failures in Neutron CI. See [1] for details. Now fix (more like a workaround) for this issue is merged [2]. So if You saw in Your patch failing tempest/scenario jobs and in failed tests there were issues with SSH to instance through floating IP, please now rebase Your patch. It should be better :) > > [1] https://bugs.launchpad.net/neutron/+bug/1811515 > [2] https://review.openstack.org/#/c/631275/ > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > From ed at leafe.com Thu Jan 17 20:47:40 2019 From: ed at leafe.com (Ed Leafe) Date: Thu, 17 Jan 2019 14:47:40 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> <20190117171118.gdtbm7beqyxjqto5@yuggoth.org> Message-ID: On Jan 17, 2019, at 2:20 PM, Ben Nemec wrote: > > Reading the document, it seems to me that it describes less a "Technical" Committee and more a "Governance" Committee. It's right there in first two headings, and I would argue that the collaboration and maybe scope sections fit better with that too. I will grant that the release goals don't really fit my theory as those are technical first and foremost, but they're also sort of the exception not the rule. 
> > When I vote for members of a body with "technical" in its name, I expect those people to be driving the technical direction of the project. Per the document, and based on my past observation of the TC, I would say that it has actively avoided driving the technical direction of OpenStack in favor of the bottom-up philosophy mentioned elsewhere in this thread. That has always created some cognitive dissonance for me. That observation has been made by many (myself included) in the past. The TC has actively avoided any sort of appearance of forcing or mandating any particular technical approach or solution. -- Ed Leafe From cboylan at sapwetik.org Thu Jan 17 21:03:01 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 17 Jan 2019 13:03:01 -0800 Subject: Infra team upgrading review.openstack.org Gerrit from 2.13.9 to 2.13.12 January 18 at about 1700UTC Message-ID: <1547758981.2237058.1637449632.37C152F2@webmail.messagingengine.com> We will be performing a minor Gerrit upgrade to version 2.13.12 tomorrow (January 18, 2019) at 1700UTC. We've tested this upgrade on our dev server, https://review-dev.openstack.org, and expect it to be a quick upgrade. Any outage shouldn't last more than 10 minutes. We will let our configuration management tooling manage the upgrade so we won't have an exact time, but will try to get it as close to 1700UTC as possible. Feel free to test out the new version on the dev server. We are happy to answer any questions you might have as well. 
Sorry for the short notice and the cross post, Clark From zbitter at redhat.com Thu Jan 17 21:25:14 2019 From: zbitter at redhat.com (Zane Bitter) Date: Fri, 18 Jan 2019 10:25:14 +1300 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> Message-ID: On 18/01/19 5:33 AM, Jeremy Stanley wrote: > On 2019-01-17 20:41:49 +1300 (+1300), Zane Bitter wrote: > [...] >> I'm not sure we need to speculate, because as you know the TC and >> PTLs literally were the same thing prior to 2014-ish. > [...] > > Minor historical notes: the role now occupied by the TC was > originally filled by a governance body known as the Project > Oversight Committee which then later became the Project Policy Board > (PPB). 
A description of our pre-foundation technical governance can > still be found undisturbed and rotting in our wiki at the moment, > should you be in the mood for a bit of light reading: > https://wiki.openstack.org/wiki/Governance/OldModel > > The PPB was replaced by (but essentially renamed to) the Technical > Committee in September 2012, as required in appendix 4 of the bylaws > for the then-newly-formed OpenStack Foundation (note that the text > there defining the initial TC election is slated for removal in the > bylaws amendment currently up for a vote of the individual members): > https://www.openstack.org/legal/technical-committee-member-policy/ > > The very first two TC elections did still include PTLs who had > guaranteed TC seats: > https://wiki.openstack.org/wiki/Governance/TCElectionsFall2012 > https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 > > But the subsequent election in late 2013 switched to the > free-for-all model we've come to know today with the adoption of the > new TC Charter: > https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 > > Now I'm wondering whether we should form an OpenStack Historical > Preservation Society. ;) Thanks for the history lesson! I started keeping up with TC business probably soon after that first election (Heat was applying for incubation, and was accepted in November 2012 - the same day as the US Presidential election IIRC), but I don't think I was aware (or at least I had completely forgotten) that that was the first election since the TC replaced the PPB. 
- ZB From amotoki at gmail.com Thu Jan 17 21:45:37 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Fri, 18 Jan 2019 06:45:37 +0900 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> <8eb6964f-506f-848b-a838-935bb972c9f5@nemebean.com> <292b70c6-677e-4f6b-7b65-7062c2875d9f@nemebean.com> Message-ID: Thanks, Ben, for digging into the details. I made some more tests based on your test script. From my test results, pyroute2 and "ip" command operations against netns seem to work fine even if the network namespaces of the process and thread are different. The test script is http://paste.openstack.org/show/742886/ and the result is http://paste.openstack.org/show/742887/. > So, to get this test passing I think we need to change [1] so it looks > for the thread id and uses a replacement for [2] that allows the thread > id to be injected as above. I confirmed network namespace operations work well, so it looks safe. Considering the situation, I proposed a change on the failing test to check a list of network devices inside a netns. https://review.openstack.org/#/c/631654/ Thanks, Akihiro Motoki (irc: amotoki) On Wed, Jan 16, 2019 at 7:56, Ben Nemec wrote: > TLDR: We now need to look at the thread namespace instead of the process > namespace. Many, many details below. > > On 1/15/19 11:51 AM, Ben Nemec wrote: > > > > > > On 1/15/19 11:16 AM, Ben Nemec wrote: > >> > >> > >> On 1/15/19 6:49 AM, Doug Hellmann wrote: > >>> Ben Nemec writes: > >>> > >>>> I tried to set up a test environment for this, but I'm having some > >>>> issues. My local environment is defaulting to python 3, while the gate > >>>> job appears to have been running under python 2.
I'm not sure why it's > >>>> doing that since the tox env definition doesn't specify python 3 > (maybe > >>>> something to do with https://review.openstack.org/#/c/622415/ ?), but > >>>> either way I keep running into import issues. > >>>> > >>>> I'll take another look tomorrow, but in the meantime I'm afraid I > >>>> haven't made any meaningful progress. :-( > >>> > >>> If no version is specified in the tox.ini then tox defaults to the > >>> version of python used to install it. > >>> > >> > >> Ah, good to know. I think I installed tox as just "tox" instead of > >> "python-tox", which means I got the py3 version. > >> > >> Unfortunately I'm still having trouble running the failing test (and > >> not for the expected reason ;-). The daemon is failing to start with: > >> > >> ImportError: No module named tests.functional.utils > > No idea why, but updating the fwaas capabilities to match core neutron > by adding c.CAP_DAC_OVERRIDE and c.CAP_DAC_READ_SEARCH made this go > away. Those are related to file permission checks, but the permissions > on my source tree are, well, permissive, so I'm not sure why that would > be a problem. > > >> > >> I'm not seeing any log output from the daemon either for some reason > >> so it's hard to debug. There must be some difference between this and > >> the neutron test environment because in neutron I was getting daemon > >> log output in /opt/stack/logs. > > > > Figured this part out. tox.ini wasn't inheriting some values in the same > > way as neutron. Fix proposed in https://review.openstack.org/#/c/631035/ > > Actually, I discovered that these logs were happening, they were just in > /tmp. So that change is probably not necessary, especially since it's > breaking ci. > > > > > Now hopefully I can make progress on the rest of it. > > And sure enough, I did. :-) > > In short, we need to look at the thread-specific network namespace in > this test instead of the process-specific one. 
When we change the
> namespace it only affects the thread, unless the call is made from the
> process's main thread. Here's a simple(?) example:
>
> #!/usr/bin/env python
>
> import ctypes
> import os
> import threading
>
> from pyroute2 import netns
>
> # The python threading identifier is useless here,
> # we need to make a syscall
> libc = ctypes.CDLL('libc.so.6')
>
> def do_the_thing(ns):
>     tid = libc.syscall(186)  # This id varies by platform :-/
>     # Check the starting netns
>     print('process %s' % os.readlink('/proc/self/ns/net'))
>     print('thread %s' % os.readlink('/proc/self/task/%s/ns/net' % tid))
>     # Change the netns
>     print('changing to %s' % ns)
>     netns.setns(ns)
>     # Check again. It should be different
>     print('process %s' % os.readlink('/proc/self/ns/net'))
>     print('thread %s\n' % os.readlink('/proc/self/task/%s/ns/net' % tid))
>
> # Run in main thread
> do_the_thing('foo')
> # Run in new thread
> t = threading.Thread(target=do_the_thing, args=('bar',))
> t.start()
> t.join()
> # Run in main thread again to show difference
> do_the_thing('bar')
>
> # Clean up after ourselves
> netns.remove('foo')
> netns.remove('bar')
>
> And here's the output:
>
> process net:[4026531992]
> thread net:[4026531992]
> changing to foo
> process net:[4026532196] <- Running in the main thread changes both
> thread net:[4026532196]
>
> process net:[4026532196]
> thread net:[4026532196]
> changing to bar
> process net:[4026532196] <- Child thread only changes the thread
> thread net:[4026532254]
>
> process net:[4026532196]
> thread net:[4026532196]
> changing to bar
> process net:[4026532254] <- Main thread gets them back in sync
> thread net:[4026532254]
>
> So, to get this test passing I think we need to change [1] so it looks
> for the thread id and uses a replacement for [2] that allows the thread
> id to be injected as above.
>
> And it's the end of my day so I'm going to leave it there.
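[Editor's sketch] Ben's proposed fix (look up the namespace through the thread id under /proc/self/task/ rather than through /proc/self/ns/net) can be condensed into a small helper. This is an illustrative sketch only, not the actual neutron-fwaas patch: the name get_thread_netns is hypothetical, and threading.get_native_id() (Python 3.8+) stands in for the raw gettid syscall, avoiding the platform-specific syscall number flagged in the example above. It assumes a Linux /proc filesystem.

```python
import os
import threading


def get_thread_netns():
    """Return the net namespace identifier for the calling thread.

    Reads /proc/self/task/<tid>/ns/net rather than /proc/self/ns/net,
    so a setns() performed by a worker thread is actually observed.
    """
    tid = threading.get_native_id()  # kernel thread id (Python 3.8+)
    return os.readlink('/proc/self/task/%d/ns/net' % tid)


if __name__ == '__main__':
    # In the main thread the per-thread view matches the process view;
    # after a setns() in a worker thread, only the task-level link
    # reflects the change.
    print(get_thread_netns())
    print(os.readlink('/proc/self/ns/net'))
```

A test helper along these lines could also accept the thread id as a parameter, matching the suggestion that the replacement for [2] allow the thread id to be injected.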
:-) > > 1: > > https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/tests/functional/utils.py#L23 > 2: > > https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/utils.py#L25 > > -Ben > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Jan 17 21:57:53 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Thu, 17 Jan 2019 22:57:53 +0100 Subject: Neutron tagging In-Reply-To: References: Message-ID: <148390DB-3A1F-47C7-9BD2-C664D496E630@redhat.com> Hi, > Message written by cmnfths on 16.01.2019 at 09:04: > > Hi everyone! > > Recently I've encountered a neutron issue related to the port tagging mechanism, and wonder if anyone else ever has. The initial source of the issue was, as it seems, identical numbering of external VLAN tags and OVS (specifically br-int) port tags. The VLAN tagged 222 was added to bridge br-ex and everything worked, but at some point the firewall reported unusual traffic. I then found that there was a loop and 2 qr-* ports with the same tag 222 attached to br-int. After I deleted them, the loop was gone. Were those qr- ports from the same network? > > So the question is: what exactly are these tags? The same ethernet frame bits as 802.1Q tags? Is there a pool or something they're taken from? Is there any way to configure such a pool via configs or patches? Has anyone ever encountered such a situation? Tags on ports in br-int are local tags used to separate traffic from different Neutron networks on one host. Those tags can be different for ports from the same network on different hosts. > > Thanks, > Andrew Th. > — Slawek Kaplonski Senior software engineer Red Hat From codeology.lab at gmail.com Thu Jan 17 22:57:35 2019 From: codeology.lab at gmail.com (Cody) Date: Thu, 17 Jan 2019 17:57:35 -0500 Subject: [neutron]How to ease congestion at neutron server nodes?
Message-ID: Hi Stackers, What solution(s) other than DVR could I use to avoid north-south traffic congestion at the neutron server nodes? Basically, I wish to let VMs with floating IPs route directly from their respective hypervisor hosts to the Internet. Thank you very much. Regards, Cody From fungi at yuggoth.org Thu Jan 17 23:10:25 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Jan 2019 23:10:25 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> Message-ID: <20190117231025.a52knzodpsv3blhu@yuggoth.org> On 2019-01-17 16:33:51 +0000 (+0000), Jeremy Stanley wrote: [...] > But the subsequent election in late 2013 switched to the > free-for-all model we've come to know today with the adoption of the > new TC Charter: > https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 [...] And for anyone who found themselves scratching their heads over this, yes, I meant to link https://wiki.openstack.org/wiki/TC_Elections_Fall_2013 instead. (Note also how back then we seemed to ignore the fact that naming things based on seasons excluded half of the world.) -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Jan 17 23:58:21 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 17 Jan 2019 23:58:21 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> <20190117171118.gdtbm7beqyxjqto5@yuggoth.org> Message-ID: <20190117235821.75wfwz2pkzvmkviu@yuggoth.org> On 2019-01-17 14:20:10 -0600 (-0600), Ben Nemec wrote: [...] > Reading the document, it seems to me that it describes less a > "Technical" Committee and more a "Governance" Committee. [...] I mused similarly some time back (in an ML post I'm having trouble finding now) that I consider the choice of naming for the "technical committee" unfortunate, as I see our role being one of community management and arbitration. Section 4.13.b.i of the OSF bylaws describes the responsibilities and powers of the TC thusly: "The Technical Committee shall have the authority to manage the OpenStack Project, including the authority to determine the scope of the OpenStack Technical Committee Approved Release..." (the latter is specifically with regard to application of the OpenStack trademark for products) https://www.openstack.org/legal/bylaws-of-the-openstack-foundation/ So I guess a lot of it comes down to how we interpret "manage" in that context. If you don't see the TC as the appropriate body to provide governance for the OpenStack project, then who do you think should take that on instead? Section 4.1.b.i of the bylaws mentions that "management of the technical matters relating to the OpenStack Project [...] 
shall be managed by the Technical Committee" and also "management of the technical matters for the OpenStack Project is designed to be a technical meritocracy" but doesn't go into details as to what it means by "technical matters" (beyond deciding what qualifies for trademark use). It seems to me that by delegating subproject-specific technical decisions to team leaders elected from each subproject, and then handling decisions which span projects (the technical vision document, project teams guide, cycle goals selection, et cetera), we meet both the letter and the spirit of the duties outlined for the OpenStack Technical Committee in the OSF bylaws. But as noted, a lot of this hinges on how people take the somewhat fuzzy terms above in the context with which they're given. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emilien at redhat.com Fri Jan 18 05:44:33 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 18 Jan 2019 06:44:33 +0100 Subject: [tripleo] LP bug bash In-Reply-To: References: Message-ID: On Thu, Jan 17, 2019 at 2:13 PM Juan Antonio Osorio Robles < jaosorior at redhat.com> wrote: > It has come to our attention that our Launchpad bug list has been > growing and some bugs have gone stale. We have decided to go through the > list weekly and triage or update bugs as needed in order to address > this. This would be done one hour before our weekly meeting (so Tuesday > at 13:00 UTC). > Good initiative. I also believe we should be more aggressive in our scripts which automatically close old and stalled bugs. I ran the scripts a few days ago if you remember, but I used the defaults. We might want to close everything that hasn't been "In progress" after one year, I guess. Also a bunch of "In progress" things are actually implemented, and those bugs need to be closed. Thanks for starting this effort.
Count me in! PS: let's do it with blueprints as well. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Fri Jan 18 06:56:16 2019 From: zbitter at redhat.com (Zane Bitter) Date: Fri, 18 Jan 2019 19:56:16 +1300 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190115153041.mnrwbp6uekaucygq@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <20190115153041.mnrwbp6uekaucygq@yuggoth.org> Message-ID: <02355e62-092c-1e84-9559-9c9ae5c0e528@redhat.com> On 16/01/19 4:30 AM, Jeremy Stanley wrote: > On 2019-01-15 11:01:09 +0000 (+0000), Chris Dent wrote: > [...] >> Then I implied that the TC cannot do anything like actionable and >> unified technical leadership because they have little to no real >> executive power and what power they do have (for example, trying to >> make openstack-wide goals) is in conflict (because of the limits of >> time and space) with the goals that PTLs (and others) are trying to >> enact. > [...] > > Maybe I'm reading between the lines too much, but are you thinking > that PTLs have any more executive power than TC members? At least my > experience as a former PTL and discussions I've had with other PTLs > suggest that the position is more to do with surfacing information > about what the team is working on and helping coordinate efforts, > not deciding what the team will work on. 
PTLs (and TC members, and > anyone in the community for that matter) can direct where they spend > their own time, and can also suggest to others where time might be > better spent, but other than the ability to prevent work from being > accepted (for example, by removing core reviewers who review > non-priority changes) there's not really much "executive power" > wielded by a PTL to decide on a project's direction, only influence > (usually influence gained by seeking consensus and not attempting to > assert team direction by fiat decree). This seems like a good lead in to the feedback I have on the current role-of-the-TC document (which I already touched on in the review: https://review.openstack.org/622400). This discussion (which we've had many times in many forms) always drives me bananas, and here's why: It is *NOT* about "executive power"! Think of it this way. If you drop a bunch of humans in a field in the middle of nowhere, the chances of them arriving together at some destination - any destination! - are approximately zero in the absence of some co-ordination. This is true even if you assume they all have the same destination in mind, which is already pretty unlikely. The minimum requirements for success would appear to be: 1) One or more people to stand up and say "I think we should go this way" and explain why; 2) A reason for each person to expect that everybody else is going to go in the same direction; and 3) When the going gets tough, a sense within each individual that they were part of making the decisions, even when they don't agree with them. In short: leadership. But not "executive power"! In fact, executive power is to be avoided, because exercise of executive power is inimical to #3. Where the action is at is #2. Generating #2 is the meatspace equivalent of the Byzantine Generals problem in computer science. 
It's a hard problem, but a problem to which we have instinctively known the solution for a long time (long before it was solved in computer science, even though it's essentially the same solution): you somehow bootstrap a positive feedback loop in which confidence begets confidence. In OpenStack we choose to bootstrap it by using elections. People generally expect other people to follow the elected leaders because they believe that those other people voted for them, which at least in the aggregate is true. Other projects have chosen the BDFL model, which is inherently unsustainable, as I believe the recent experiences of the Python community have shown. I think we made the right choice. But, having elected a group of folks to the TC - the only body that is elected by the community as a whole, and therefore the only folks that can set the direction of the project as a whole - what do we then say? "Well, we can't tell anybody what to do, so we have no choice but to just leave 'em in this field and hope for the best." Friends, I have feelings about this, but propriety precludes me from expressing them fully here. You're welcome to ask me about it some time. Suffice it to say that I believe this is a false dichotomy. Note that we don't need the TC to supply #1. But nobody else can supply #2. For these purposes you can think of "governance" - the stuff that the TC consistently does - as being the guardian of the process that ensures #3. The false dichotomy does not appear to exist at the level of individual projects, and IMHO the result is kind of what you'd expect: a bunch of projects that are individually successful but that struggle to cohere together. Positive feedback loops are a funny thing: they're inherently unstable, so sometimes it doesn't take much to make them run away in the wrong direction. Confidence begets confidence, but disappointment begets disappointment. (That's why e.g.
I'm opposed to project-wide goals where we expect from the outset at least one project to fail to complete it in a single release cycle.) It is to this - the fact that the TC does not have a great track record of convincing the community to all move in one direction - that I would attribute the substantial group of people who are, as Chris says, simply tired of this sort of navel-gazing. I'm sure those folks would say that we should just stop trying. I think the actual solution is to start succeeding. It remains to be seen who is right. cheers, Zane. From jaosorior at redhat.com Fri Jan 18 07:52:20 2019 From: jaosorior at redhat.com (Juan Antonio Osorio Robles) Date: Fri, 18 Jan 2019 09:52:20 +0200 Subject: [tripleo] LP bug bash In-Reply-To: References: Message-ID: <31fa97bc-6200-6815-9017-c2fab53fdbce@redhat.com> On 1/18/19 7:44 AM, Emilien Macchi wrote: > On Thu, Jan 17, 2019 at 2:13 PM Juan Antonio Osorio Robles > > wrote: > > It has come to our attention that our Launchpad bug list has been > growing and some bugs have gone stale. We have decided to go > through the > list weekly and triage or update bugs as needed in order to address > this. This would be done one hour before our weekly meeting (so > Tuesday > at 13:00 UTC). > > > Good initiative. I also believe we should be more aggressive in our > scripts which automatically close old and stalled bugs. I ran the > scripts a few days ago if you remember, but I used the defaults. We > might want to close everything that hasn't been "In progress" > after one year, I guess. Also a bunch of "In progress" things are > actually implemented, and bugs need to be closed. Mind bringing this up in the next weekly meeting? Sounds like a good proposal. > > Thanks for starting this effort. Count me in! > PS: let's do it with blueprints as well. I like this idea. > -- > Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed...
URL: From masayuki.igawa at gmail.com Fri Jan 18 10:02:44 2019 From: masayuki.igawa at gmail.com (Masayuki Igawa) Date: Fri, 18 Jan 2019 05:02:44 -0500 Subject: [qa] dynamic credentials with the tempest swift client In-Reply-To: References: Message-ID: <1aa6bce4-622e-4787-a73b-27de7ed9d224@www.fastmail.com> Hi, On Thu, Jan 17, 2019, at 17:58, Udi Kalifon wrote: > So I'm looking for a way to utilize the client without it automatically > creating itself dynamic credentials; it has to use the already-existing > admin credentials on the admin project in order to see the container > with the plans. What's the right way to do that, please? Thanks a lot > in advance! Do these pre-provisioned credentials help you? https://docs.openstack.org/tempest/latest/configuration.html#pre-provisioned-credentials -- Masayuki Igawa Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 From lyarwood at redhat.com Fri Jan 18 10:28:42 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Fri, 18 Jan 2019 10:28:42 +0000 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: <624e7894-c0df-65ff-2659-38725f5b71d6@gmail.com> References: <5c80b99e-e7b3-bc65-9556-c80608de0347@gmail.com> <624e7894-c0df-65ff-2659-38725f5b71d6@gmail.com> Message-ID: <20190118102842.abtk2vg574uv5bmu@lyarwood.usersys.redhat.com> On 17-01-19 11:32:55, Matt Riedemann wrote: > On 1/17/2019 10:32 AM, Sean Mooney wrote: > > A general question on this topic: > > is there any update on support for deploying and upgrading to extracted > > placement with other deployment tools? > > > > The main ones beyond tripleo that come to mind are > > kolla-ansible, juju, openstack-ansible, openstack helm. > > > > There are obviously others, but before we remove the code in nova > > I assume we will want to ensure that other tools beyond devstack, grenade
> > > > was this covered in the placement extration meeting. > > Chris has links for this in the etherpad and mentions it in the placement > update emails. Off the top of my head, I want to say kolla can deploy and is > working on upgrades from source tarballs (until debs are available). OSA has > a change up for install which isn't merged yet. I don't know about juju or > helm. I think I called this out during the meeting but only the core kolla change introducing the new placement images has landed so far. The kolla-ansible change required to deploy the extracted placement service hasn't but looks almost ready to go: WIP: Split placement from nova https://review.openstack.org/#/c/613629/ Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: From cdent+os at anticdent.org Fri Jan 18 10:38:02 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 18 Jan 2019 10:38:02 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <02355e62-092c-1e84-9559-9c9ae5c0e528@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <20190115153041.mnrwbp6uekaucygq@yuggoth.org> <02355e62-092c-1e84-9559-9c9ae5c0e528@redhat.com> Message-ID: On Fri, 18 Jan 2019, Zane Bitter wrote: > This seems like a good lead in to the feedback I have on the current > role-of-the-TC document (which I already touched on in the review: > https://review.openstack.org/622400). This discussion (which we've had many > times in many forms) always drives me bananas, and here's why: > > It is *NOT* about "executive power"! 
I basically agree with you that leadership is the key factor and my heart is with you on much of what you say throughout your message; however, as much as "executive power" makes me cringe, it felt necessary to introduce something else into the discussion to break the cycle. We keep talking about needing leadership but then seem to fail to do anything about it. Throwing "power" into the mix is largely in response to my observations and own personal experience that when a project or PTL is either: * acting in bad faith, contrary to the wider vision, or holding an effective veto over a positive change much of the rest of the community wants * feared that they might do any of those things in the prior point, even if they haven't demonstrated such the TC clams up, walks away, and tries to come at things from another angle which won't cause a disruption to the fragile peace. So, in a bit of reverse psychology: if the TC can't control the projects, maybe the projects should just be the TC? It's not a model I really agree with, but it is one that has managed to get some ideas and questions moving. In Jeremy's message he suggested that, while the main action of the PTL is to coordinate and surface information, they do have one important power, the power to say "no", and then he seems to suggest that's not a big deal. It's a huge deal. The TC has, by the current constitution, a similar power to say no, but it is a giant sledgehammer in the shape of making a project not official, and nobody wants to use that and: > But, having elected a group of folks to the TC - the only body that is > elected by the community as a whole, and therefore the only folks that can > set the direction of the project as a whole - what do we then say? > > "Well, we can't tell anybody what to do, so we have no choice but to just > leave 'em in this field and hope for the best."
My goal with asking for the TC role document to be evaluated by the community was to survey what feelings people have about the extent to which people want the TC to tell them what they could do ("could", not "should") and find the boundaries and speed bumps. It feels like there are three groups being vocal, two that I mentioned before: * enable diffuse but vaguely related collaboration (the TC as it has acted for a while) * lead with a much more strongly defined unified direction (something needs to change, methods differ) * PTL/Project driven direction is the right way (also the TC operating as usual, but with a different focal point) but I still have no clear idea what community members in general think. At least one person has suggested that if the TC is to continue as it has been for a few years, it might consider changing its name to get rid of "Technical". Is that the way to go? I hope not. I agree that we should reach and achieve higher, and let success be the fuel for still more. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From tobias.urdin at binero.se Fri Jan 18 11:37:22 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Fri, 18 Jan 2019 12:37:22 +0100 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: Message-ID: This is an amazing community goal! I think we've all had/are dealing with this pain on a daily basis, and there are probably a lot of in-house solutions for solving it, or projects in use, whether open source or not, like ospurge. I don't have much time to dedicate, but for us this is very important, so I'd love to get more details on how I could contribute some time to this; I'm not sure I could manage a champion role at this point. Best regards Tobias On 01/11/2019 07:22 AM, Adrian Turjak wrote: > Hello OpenStackers! > > As discussed at the Berlin Summit, one of the proposed community goals > was project deletion and resource clean-up.
> > Essentially the problem here is that for almost any company that is > running OpenStack we run into the issue of how to delete a project and > all the resources associated with that project. What we need is an > OpenStack wide solution that every project supports which allows > operators of OpenStack to delete everything related to a given project. > > Before we can choose this as a goal, we need to define what the actual > proposed solution is, and what each service is either implementing or > contributing to. > > I've started an Etherpad here: > https://etherpad.openstack.org/p/community-goal-project-deletion > > Please add to it if I've missed anything about the problem description, > or to flesh out the proposed solutions, but try to mostly keep any > discussion here on the mailing list, so that the Etherpad can hopefully > be more of a summary of where the discussions have led. > > This is mostly a starting point, and I expect there to be a lot of > opinions and probably some push back from doing anything too big. That > said, this is a major issue in OpenStack, and something we really do > need because OpenStack is too big and too complicated for this not to > exist in a smart cross-project manner. > > Let's solve this the best we can! > > Cheers, > > Adrian Turjak > > > > > From juliaashleykreger at gmail.com Fri Jan 18 11:49:42 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 18 Jan 2019 03:49:42 -0800 Subject: [ironic] Mid-cycle call times In-Reply-To: References: Message-ID: Greetings everyone! I've created a fairly simple schedule: January 21st 2:00 - 4:00 UTC 2:00 PM UTC - Discuss current status 2:30 PM UTC - Revising current plans and reprioritizing as necessary 3:30 PM UTC - Making Ironic more container friendly. January 22nd 2:00 - 4:00 UTC 2:00 PM UTC - Boot Management for in-band inspection 2:30 PM UTC - SmartNIC configuration support 3:00 PM UTC - Discuss any other items that arose during the earlier discussions.
3:30 PM UTC - Bug and RFE Triaging If there are no objections, we can use my bluejeans[1] account to discuss. Please see our planning/discussion etherpad[2]. Thanks! I look forward to chatting with everyone soon. -Julia [1]: https://bluejeans.com/u/jkreger [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle On Tue, Jan 15, 2019 at 12:00 PM Julia Kreger wrote: > > Greetings everyone, > > It seems the most popular times are January 21st and 22nd between 2 PM > and 6 PM UTC. > > Please add any topics for discussion to the etherpad[1] as soon as > possible. I will propose a schedule and agenda in the next day or two. > > -Julia > > [1]: https://etherpad.openstack.org/p/ironic-stein-midcycle > > On Tue, Jan 8, 2019 at 9:10 AM Julia Kreger wrote: > > > > Greetings everyone! > > > > It seems we have coalesced around January 21st and 22nd. I have posted > > a poll[1] with time windows in two hour blocks so we can reach a > > consensus on when we should meet. > > > > Please vote for your available time windows so we can find the best > > overlap for everyone. Additionally, if there are any topics or items > > that you feel would be a good use of the time, please feel free to add > > them to the planning etherpad[2]. > [trim] From dtantsur at redhat.com Fri Jan 18 12:17:57 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 18 Jan 2019 13:17:57 +0100 Subject: [ironic] Mid-cycle call times In-Reply-To: References: Message-ID: <4d9eb6d2-3033-1bce-43db-b6a6bee4ad9c@redhat.com> Thanks Julia! I guess we're canceling the weekly meeting because of the overlap? On 1/18/19 12:49 PM, Julia Kreger wrote: > Greetings everyone! > > I've created a fairly simple schedule: > > January 21st 2:00 - 4:00 UTC > > 2:00 PM UTC - Discuss current status > 2:30 PM UTC - Revising current plans and reprioritizing as necessary > 3:30 PM UTC - Making Ironic more container friendly. 
3:30pm - until the last person gets too tired to argue :D > > January 22nd 2:00 - 4:00 UTC > > 2:00 PM UTC - Boot Management for in-band inspection > 2:30 PM UTC - SmartNIC configuration support > 3:00 PM UTC - Discuss any other items that arose during the earlier discussions. > 3:30 PM UTC - Bug and RFE Triaging > > If there are no objections, we can use my bluejeans[1] account to discuss. > Please see our planning/discussion etherpad[2]. > > Thanks! I look forward to chatting with everyone soon. > > -Julia > > [1]: https://bluejeans.com/u/jkreger > [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle > > On Tue, Jan 15, 2019 at 12:00 PM Julia Kreger > wrote: >> >> Greetings everyone, >> >> It seems the most popular times are January 21st and 22nd between 2 PM >> and 6 PM UTC. >> >> Please add any topics for discussion to the etherpad[1] as soon as >> possible. I will propose a schedule and agenda in the next day or two. >> >> -Julia >> >> [1]: https://etherpad.openstack.org/p/ironic-stein-midcycle >> >> On Tue, Jan 8, 2019 at 9:10 AM Julia Kreger wrote: >>> >>> Greetings everyone! >>> >>> It seems we have coalesced around January 21st and 22nd. I have posted >>> a poll[1] with time windows in two hour blocks so we can reach a >>> consensus on when we should meet. >>> >>> Please vote for your available time windows so we can find the best >>> overlap for everyone. Additionally, if there are any topics or items >>> that you feel would be a good use of the time, please feel free to add >>> them to the planning etherpad[2]. 
>> [trim] > From mdulko at redhat.com Fri Jan 18 12:43:01 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Fri, 18 Jan 2019 13:43:01 +0100 Subject: [dev] [infra] [devstack] [qa] [kuryr] DevStack's etcd performance on gate VM's In-Reply-To: <73a791f2566f4f5618109b8570672a30edaa6008.camel@redhat.com> References: <1544720777.384097.1608336832.49746981@webmail.messagingengine.com> <73a791f2566f4f5618109b8570672a30edaa6008.camel@redhat.com> Message-ID: <80478d1c70ea93463639409f1794a0cdceb10d6c.camel@redhat.com> It's been a while, but I finally have some more info. The issue still persists. We've tracked that etcd problem to high fsync duration on GRA1 hosts by looking at etcd_disk_wal_fsync_duration_seconds* metrics. Seems like on other clouds we rarely get fsync duration higher than 1 second, but on GRA1 ~1.5% of all fsync's take longer than that. I'm still gathering more info and working with OVH folks to find the root cause. Meanwhile I've tested a simple idea to put etcd data directory on a RAM disk. This unsurprisingly seems to help, so I'll be preparing the patch [1] to be mergeable and can use some reviews once that's done. [1] https://review.openstack.org/#/c/626885/ On Thu, 2018-12-20 at 16:18 +0100, Michał Dulko wrote: > On Thu, 2018-12-13 at 09:06 -0800, Clark Boylan wrote: > > On Thu, Dec 13, 2018, at 4:39 AM, Michał Dulko wrote: > > > Hi, > > > > > > In Kuryr-Kubernetes we're using the DevStack-installed etcd as a > > > backend store for Kubernetes that we run on our gates. For some time we > > > can see its degraded performance manifesting like this [1] in the logs. > > > Later on this leads to various K8s errors [2], [3], up to missing > > > notifications from the API, which causes failures in Kuryr-Kubernetes > > > tempest tests. From what I've seen those etcd warnings normally mean > > > that disk latency is high. > > > > > > This seems to be mostly happening on OVH and RAX hosts. 
I've looked at > > > this with OVH folks and there isn't anything immediately alarming about > > > their hosts running gate VM's. > > > > That's interesting because we've been working with amorin at OVH over > > debugging similar IO problems and I think we both agree something is > > happening. We've disabled the BHS1 region as the vast majority of > > related failures were there, but kept GRA1 up and running which is > > where your example is from. My understanding is that a memory issue > > of some sort was found on the compute hypervisors (which could affect > > disk throughput if there isn't memory for caching available or if > > swap is using up available disk IO). We are currently waiting on > > amorin's go ahead to turn BHS1 back on after this is corrected. > > > > > Upgrading the etcd version doesn't seem to help, as well as patch [4] > > > which increases IO priority for etcd process. > > > > > > Any ideas of what I can try next? I think we're the only project that > > > puts so much pressure on the DevStack's etcd. Help would really be > > > welcomed, getting rid of this issue will greatly increase our gates > > > stability. > > > > It wouldn't surprise me if others aren't using etcd much. One thing > > that may help is to use the dstat data [5] from these failed jobs to > > rule out resource contention from within the job (cpu, io(ps), > > memory, etc). One thing we've found debugging these slower nodes is > > that it often exposes real bugs in our software by making them cost > > more. We should double check there isn't anything obvious like that > > happening here too. > > There are multiple moments we're experiencing this, but mostly in > interaction with K8s API, so of course software bug in there is > possible. We've seen it earlier, but seems like failures became more > often when we've updated K8s version. > > > I've been putting the csv file in https://lamada.eu/dstat-graph/ and > > that renders it for human consumption. 
But there are other tools out > there for this too. > > I've rendered it as well, but besides finding a bug in dstat [6] that > made it show that kuryr-daemon is using 2 TB of RAM, I didn't notice > anything special, especially on io operations. In one of the most recent > failures there's only a spike on processes and paging [7] around the > time we see the fatal failure. > > Any ideas how I should proceed? From the number of rechecks we're doing > you can easily see that this is hitting us hard. > > > > Thanks, > > > Michał > > > > > > [1] > > > http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-etcd.txt.gz#_Dec_12_17_19_33_618619 > > > [2] > > > http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-api.txt.gz#_Dec_12_17_20_19_772688 > > > [3] > > > http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-scheduler.txt.gz#_Dec_12_17_18_59_045347 > > > [4] https://review.openstack.org/#/c/624730/ > > > > > > > > > > [5] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/dstat-csv_log.txt > > > > [6] https://github.com/dagwieers/dstat/pull/162 > [7] https://i.imgur.com/B8HQM8t.png From ignaziocassano at gmail.com Fri Jan 18 12:24:29 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 18 Jan 2019 13:24:29 +0100 Subject: [keystone][nova][cinder] netapp queens trust scoped token Message-ID: Hello Everyone, I am using a client for backing up OpenStack virtual machines. The crux of the problem is that the client uses trust-based authentication for scheduling backup jobs on behalf of users. When a trust-scoped token is passed to the cinder client to take a snapshot, I expect the client to use the token to authenticate and perform the operation, which the cinder client does. 
However cinder volume service invokes novaclient as part of cinder nfs backend snapshot operation and novaclient tries to re-authenticate. Since keystone does not allow re-authentication using trust based tokens, cinder snapshot operation fails. So I get the following error: 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs [req-61d977c1-eef3-4309-ac02-aaa0eb880925 ab1bdb5dadc54312891f3a6410fef04d 6d1bffb04e3b4cdda30dc17aa96bfffc - default default] Call to Nova to create snapshot failed: Forbidden: You are not authorized to perform the requested action: Using trust-scoped token to create another token. Create a new trust-scoped token instead. (HTTP 403) (Request-ID: req-f55b682a-001b-4952-bfb1-abf4dd6bf459) > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs Traceback (most recent call last): > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/ remotefs.py", line 1452, in _create_snapshot_online > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs connection_info) > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs File "/usr/lib/python2.7/site-packages/cinder/compute/nova.py", line 188, in create_volume_snapshot > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs create_info=create_info) > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs File "/usr/lib/python2.7/site-packages/novaclient/v2/assisted_volume_snapshots.py", line 43, in create > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs return self._create('/os-assisted-volume-snapshots', body, 'snapshot') > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in _create > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. 
remotefs resp, body = self.api.client.post(url, body=body) > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 310, in post > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs return self.request(url, 'POST', **kwargs) My cinder volumes are on NetApp FAS8040 via NFS. Can anyone help me? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.page at canonical.com Fri Jan 18 12:48:21 2019 From: james.page at canonical.com (James Page) Date: Fri, 18 Jan 2019 12:48:21 +0000 Subject: [sig][upgrades] 2019 reboot/irc meeting on Monday Message-ID: Hi All During the PTG in Denver we agreed to move the Upgrades SIG IRC meeting to a 4 week cadence. I completed the re-scheduling shortly after the PTG, but we've failed to actually hold a meeting to date! The next scheduled meetings are 0900 and 1600 UTC this coming Monday; we'll go with the standing agenda that we had before: https://etherpad.openstack.org/p/upgrades-sig-meeting Have a great weekend, and I look forward to chatting with interested parties on Monday! Cheers James -------------- next part -------------- An HTML attachment was scrubbed... URL: From pkovar at redhat.com Fri Jan 18 13:42:33 2019 From: pkovar at redhat.com (Petr Kovar) Date: Fri, 18 Jan 2019 14:42:33 +0100 Subject: [docs] Nominating Alex Settle for openstack-doc-core Message-ID: <20190118144233.132eb0e427389da15e725141@redhat.com> Hi all, Alex Settle recently re-joined the Documentation Project after a few-month break. It's great to have her back and I want to formally nominate her for membership in the openstack-doc-core team, to follow the formal process for cores. Please let the ML know should you have any objections. 
Thanks, pk From thierry at openstack.org Fri Jan 18 13:52:18 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 18 Jan 2019 14:52:18 +0100 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> Message-ID: <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> Jeremy Stanley wrote: > On 2019-01-17 20:41:49 +1300 (+1300), Zane Bitter wrote: > [...] >> I'm not sure we need to speculate, because as you know the TC and >> PTLs literally were the same thing prior to 2014-ish. > [...] > > Minor historical notes: the role now occupied by the TC was > originally filled by a governance body known as the Project > Oversight Committee which then later became the Project Policy Board > (PPB). 
A description of our pre-foundation technical governance can > still be found undisturbed and rotting in our wiki at the moment, > should you be in the mood for a bit of light reading: > https://wiki.openstack.org/wiki/Governance/OldModel > > The PPB was replaced by (but essentially renamed to) the Technical > Committee in September 2012, as required in appendix 4 of the bylaws > for the then-newly-formed OpenStack Foundation (note that the text > there defining the initial TC election is slated for removal in the > bylaws amendment currently up for a vote of the individual members): > https://www.openstack.org/legal/technical-committee-member-policy/ > > The very first two TC elections did still include PTLs who had > guaranteed TC seats: > https://wiki.openstack.org/wiki/Governance/TCElectionsFall2012 > https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 > > But the subsequent election in late 2013 switched to the > free-for-all model we've come to know today with the adoption of the > new TC Charter: > https://wiki.openstack.org/wiki/TC_Elections_Spring_2013 > > Now I'm wondering whether we should form an OpenStack Historical > Preservation Society. 
;) The history is actually documented outside the wiki: https://docs.openstack.org/project-team-guide/introduction.html#a-quick-history-of-openstack-governance -- Thierry Carrez (ttx) From thierry at openstack.org Fri Jan 18 14:05:52 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 18 Jan 2019 15:05:52 +0100 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190117235821.75wfwz2pkzvmkviu@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> <20190117171118.gdtbm7beqyxjqto5@yuggoth.org> <20190117235821.75wfwz2pkzvmkviu@yuggoth.org> Message-ID: <711fb981-27e7-4330-3d60-a12d57ac8214@openstack.org> Jeremy Stanley wrote: > On 2019-01-17 14:20:10 -0600 (-0600), Ben Nemec wrote: > [...] >> Reading the document, it seems to me that it describes less a >> "Technical" Committee and more a "Governance" Committee. > [...] > > I mused similarly some time back (in an ML post I'm having trouble > finding now) that I consider the choice of naming for the "technical > committee" unfortunate, as I see our role being one of community > management and arbitration. Section 4.13.b.i of the OSF bylaws > describes the responsibilities and powers of the TC thusly: > > "The Technical Committee shall have the authority to manage the > OpenStack Project, including the authority to determine the scope of > the OpenStack Technical Committee Approved Release..." (the latter > is specifically with regard to application of the OpenStack > trademark for products) This comes back to the original foundation of the... ahem... Foundation. We used to have a "Project Policy Board" that covered it all. When the Foundation was formed, we wanted to make sure the open source project would be governed by its contributors, and not by the Foundation board of Directors. 
So the PPB's rights and duties were split between the Board of Directors (to stay out of technical matters) and a "technical committee". A better naming would have been "open source project governance group" or "upstream matters decisions group" (everything upstream from the release of the software). "Technical" is a pretty simplistic way of describing it, if only because there are "technical" things on the downstream side, like what the User Committee covers, or the interoperability programs. -- Thierry Carrez (ttx) From thierry at openstack.org Fri Jan 18 14:09:51 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 18 Jan 2019 15:09:51 +0100 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> Message-ID: Thierry Carrez wrote: >> Minor historical notes: the role now occupied by the TC was >> originally filled by a governance body known as the Project >> Oversight Committee which then later became the Project Policy Board >> (PPB). [...] Actually no, we started with an Advisory Board, an Architecture Board, one Technical Committee for Nova and one Technical Committee for Swift :) The POC was introduced early 2011. 
-- Thierry Carrez (ttx) From fungi at yuggoth.org Fri Jan 18 14:12:41 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 18 Jan 2019 14:12:41 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <20190115153041.mnrwbp6uekaucygq@yuggoth.org> <02355e62-092c-1e84-9559-9c9ae5c0e528@redhat.com> Message-ID: <20190118141240.am2ae6igzeyrncmm@yuggoth.org> On 2019-01-18 10:38:02 +0000 (+0000), Chris Dent wrote: [...] > In Jeremy's message he suggested that while the main action of the > PTL is to coordinate and surface they do have one important power: > the power to say "no" and then seems to suggest that's not a big > deal. It's a huge deal. [...] I certainly didn't mean to imply that it's "not a big deal." The context was that the PTL's ability to refuse work from individuals doesn't magically make them work on priority tasks for that team (and in my experience, more often simply causes them to go away); so rejecting proffered non-priority effort doesn't translate to getting priorities accomplished, though it can serve to reduce distractions for the remaining team members who actually do want to focus on those priorities, sure. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Fri Jan 18 14:16:17 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 18 Jan 2019 14:16:17 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> Message-ID: <20190118141617.gvdfcnzplfvs7eus@yuggoth.org> On 2019-01-18 15:09:51 +0100 (+0100), Thierry Carrez wrote: > Thierry Carrez wrote: > > > Minor historical notes: the role now occupied by the TC was > > > originally filled by a governance body known as the Project > > > Oversight Committee which then later became the Project Policy Board > > > (PPB). [...] > > Actually no, we started with an Advisory Board, an Architecture Board, one > Technical Committee for Nova and one Technical Committee for Swift :) The > POC was introduced early 2011. Thanks! I somehow missed that we had a similar overview in the Project Teams Guide which went back even farther than I could find in the old wiki reference articles (though it does still lack much of the detail retained in those older sources). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From smooney at redhat.com Fri Jan 18 14:20:47 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 14:20:47 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> Message-ID: <4a3ac0205da9d3fcc1572888519d0a8213df358e.camel@redhat.com> On Fri, 2019-01-18 at 15:09 +0100, Thierry Carrez wrote: > Thierry Carrez wrote: > > > Minor historical notes: the role now occupied by the TC was > > > originally filled by a governance body known as the Project > > > Oversight Committee which then later became the Project Policy Board > > > (PPB). [...] > > Actually no, we started with an Advisory Board, an Architecture Board, > one Technical Committee for Nova and one Technical Committee for Swift > :) The POC was introduced early 2011. > by the way, the fact that we can have these conversations openly in openstack and provide input regardless of whether we are members of the TC or not is, I think, one of the strengths of the openstack community and governance model/process. so I just wanted to say I'm glad we are having this conversation as a community. 
From fungi at yuggoth.org Fri Jan 18 14:39:06 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 18 Jan 2019 14:39:06 +0000 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: <20190118143906.2qqarb5xere4zorw@yuggoth.org> On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > Alex Settle recently re-joined the Documentation Project after a > few-month break. It's great to have her back and I want to > formally nominate her for membership in the openstack-doc-core > team, to follow the formal process for cores. > > Please let the ML know should you have any objections. I'm in no way core on Docs, but I still wanted to take the opportunity to welcome Alex back. You've been sorely missed! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Fri Jan 18 14:43:39 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 18 Jan 2019 14:43:39 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <4a3ac0205da9d3fcc1572888519d0a8213df358e.camel@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <20190117163351.l67v7vtay6c5mn4a@yuggoth.org> <68df54a3-b3d9-2bd7-4377-34b30a82d93b@openstack.org> <4a3ac0205da9d3fcc1572888519d0a8213df358e.camel@redhat.com> Message-ID: <20190118144339.vtqzydecmayuez6x@yuggoth.org> On 2019-01-18 14:20:47 +0000 (+0000), Sean Mooney wrote: [...] 
> by the way, the fact that we can have these conversations openly in > openstack and provide input regardless of whether we are members of the > TC or not is, I think, one of the strengths of the openstack > community and governance model/process. so I just wanted to say I'm > glad we are having this conversation as a community. ...and not only that, but the ability to also stand for election. We're less than a month from the time for declaring candidacy for available TC seats. Anyone who's interested in these matters should run if they can swing it. Be thinking about your platform statements, and let's get a great bunch of candidates on the ballot! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From simon.leinen at switch.ch Fri Jan 18 14:42:31 2019 From: simon.leinen at switch.ch (Simon Leinen) Date: Fri, 18 Jan 2019 15:42:31 +0100 Subject: [neutron] How to ease congestion at neutron server nodes? In-Reply-To: (Cody's message of "Thu, 17 Jan 2019 17:57:35 -0500") References: Message-ID: Cody writes: > What solution(s) other than DVR could I use to avoid north-south > traffic congestion at the neutron server nodes? Basically, I wish to > let VMs with floating IPs route directly from their respective > hypervisor hosts to the Internet. Isn't that the DEFINITION of what DVR does? :-) (Not using DVR myself, so I may be wrong.) We push everything through those central nodes - we call them "network nodes". We try to alleviate the congestion by distributing routers across multiple nodes, and within each node we make sure that the forwarding plane (Open vSwitch in our case) is capable of using the "multi-queue" feature of the underlying network cards, so that packet forwarding is distributed across the multiple cores of those servers. That helped us a lot at the time. -- Simon. 
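[Editor's note: the multi-queue NIC tuning Simon describes can be sketched roughly as below. This is a hedged illustration, not a recipe from the thread: the interface name `eno1` and the queue count of 8 are placeholders, and the available channel types depend on the NIC driver.]

```shell
# On a network node: show how many hardware rx/tx queues the NIC
# supports and how many are currently enabled.
ethtool -l eno1

# Enable as many combined rx/tx queues as there are cores you want
# participating in packet forwarding (8 here is illustrative).
ethtool -L eno1 combined 8

# Confirm the per-queue interrupts are spread across CPUs rather
# than all landing on a single core.
grep eno1 /proc/interrupts
```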
From a.settle at outlook.com Fri Jan 18 14:47:39 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Fri, 18 Jan 2019 14:47:39 +0000 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118143906.2qqarb5xere4zorw@yuggoth.org> References: <20190118144233.132eb0e427389da15e725141@redhat.com>, <20190118143906.2qqarb5xere4zorw@yuggoth.org> Message-ID: Awww thanks Jeremy! I've missed everyone so much! :D ________________________________ From: Jeremy Stanley Sent: 18 January 2019 14:39 To: openstack-discuss at lists.openstack.org Subject: Re: [docs] Nominating Alex Settle for openstack-doc-core On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > Alex Settle recently re-joined the Documentation Project after a > few-month break. It's great to have her back and I want to > formally nominate her for membership in the openstack-doc-core > team, to follow the formal process for cores. > > Please let the ML know should you have any objections. I'm in no way core on Docs, but I still wanted to take the opportunity to welcome Alex back. You've been sorely missed! -- Jeremy Stanley -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Fri Jan 18 14:59:01 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Fri, 18 Jan 2019 08:59:01 -0600 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: References: <20190118144233.132eb0e427389da15e725141@redhat.com> <20190118143906.2qqarb5xere4zorw@yuggoth.org> Message-ID: +1000 On Fri, Jan 18, 2019 at 8:48 AM Alexandra Settle wrote: > Awww thanks Jeremy! I've missed everyone so much! 
:D > > ------------------------------ > *From:* Jeremy Stanley > *Sent:* 18 January 2019 14:39 > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: [docs] Nominating Alex Settle for openstack-doc-core > > On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > > Alex Settle recently re-joined the Documentation Project after a > > few-month break. It's great to have her back and I want to > > formally nominate her for membership in the openstack-doc-core > > team, to follow the formal process for cores. > > > > Please let the ML know should you have any objections. > > I'm in no way core on Docs, but I still wanted to take the > opportunity to welcome Alex back. You've been sorely missed! > -- > Jeremy Stanley > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Fri Jan 18 14:59:44 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Fri, 18 Jan 2019 08:59:44 -0600 Subject: [keystone][nova][cinder] netapp queens trust scoped token In-Reply-To: References: Message-ID: On Fri, Jan 18, 2019 at 6:49 AM Ignazio Cassano wrote: > Hello Everyone, > I am using a client for backing up OpenStack virtual machines. > The crux of the problem is that the client uses trust-based authentication for > scheduling backup jobs on behalf of users. When a trust-scoped token is > passed to the cinder client to take a snapshot, I expect the client to use the > token to authenticate and perform the operation, which the cinder client does. > However cinder volume service invokes novaclient as part of cinder nfs > backend snapshot operation and novaclient tries to re-authenticate. Since > keystone does not allow re-authentication using trust based tokens, the cinder > snapshot operation fails. > > Keystone allows authorization by allowing users to scope tokens to trusts, but once a trust-token is scoped, it can't be rescoped. 
Instead, keystone requires that you build another authentication request for a new trust-scoped token using the trust [0][1]. [0] https://developer.openstack.org/api-ref/identity/v3-ext/index.html?expanded=consuming-a-trust-detail#id121 [1] https://git.openstack.org/cgit/openstack/keystone/tree/keystone/auth/plugins/token.py#n84 > So I get the following error: > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs [req-61d977c1-eef3-4309-ac02-aaa0eb880925 > ab1bdb5dadc54312891f3a6410fef04d 6d1bffb04e3b4cdda30dc17aa96bfffc - default > default] Call to Nova to create snapshot failed: Forbidden: You are not > authorized to perform the requested action: Using trust-scoped token to > create another token. Create a new trust-scoped token instead. (HTTP 403) > (Request-ID: req-f55b682a-001b-4952-bfb1-abf4dd6bf459) > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs Traceback > (most recent call last): > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/remotefs.py", > line 1452, in _create_snapshot_online > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > connection_info) > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > File "/usr/lib/python2.7/site-packages/cinder/compute/nova.py", line 188, > in create_volume_snapshot > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > create_info=create_info) > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > File > "/usr/lib/python2.7/site-packages/novaclient/v2/assisted_volume_snapshots.py", > line 43, in create > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > return self._create('/os-assisted-volume-snapshots', body, 'snapshot') > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in > _create > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > resp, body = self.api.client.post(url, body=body) > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 310, > in post > > > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers.remotefs > return self.request(url, 'POST', **kwargs) > > > My cinder volumes are on NetApp FAS8040 via NFS. > > Can anyone help me? > There is another feature in keystone that sounds like a better fit for what you're trying to do, called application credentials [2]. Application credentials were written as a way for users to grant authorization to services and scripts (e.g., the script asking cinder for a snapshot in your case). Application credentials aren't tokens, but your scripts can use them to authenticate for a token [3]. The keystoneauth library already supports application credentials, so if you use that for building a session you should be able to use it in other clients that already support keystoneauth [4]. [2] https://docs.openstack.org/keystone/latest/user/application_credentials.html [3] https://docs.openstack.org/keystone/latest/user/application_credentials.html#using-application-credentials [4] https://docs.openstack.org/keystoneauth/latest/authentication-plugins.html#application-credentials > > Regards > > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Jan 18 15:04:18 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 15:04:18 +0000 Subject: [neutron]How to ease congestion at neutron server nodes?
In-Reply-To: References: Message-ID: <779fc4bd5a99f9a43dde8c2df8f40608f09f89c1.camel@redhat.com> On Fri, 2019-01-18 at 15:42 +0100, Simon Leinen wrote: > Cody writes: > > What solution(s) other than DVR could I use to avoid north-south > > traffic congestion at the neutron server nodes? Well, the north-south traffic is processed by the nodes running the L3 agent, so if you run the neutron server/API on the controller nodes and have dedicated networking nodes for the L3 and DHCP agents, then that would achieve what you desire without DVR, in terms of not overloading the node that is running the neutron server. > > Basically, I wish to > > let VMs with floating IPs to route directly from their respective > > hypervisor hosts to the Internet. It would not, however, achieve ^ > > Isn't that the DEFINITION of what DVR does? :-) Yes, that is the use case that DVR with centralised SNAT tried to solve. DVR with distributed SNAT would obviously load-balance the SNAT traffic away from the network nodes, but I'm not sure we ever got that to work. > > (Not using DVR myself, so I may be wrong.) > > We push everything through those central nodes - we call them "network > nodes". We try to alleviate the congestion by distributing routers > across multiple nodes, and within each node we make sure that the > forwarding plane (Open vSwitch in our case) is capable of using the > "multi-queue" feature of the underlying network cards, so that packet > forwarding is distributed across the multiple cores of those servers. > That helped us a lot at the time. From smooney at redhat.com Fri Jan 18 15:07:27 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 15:07:27 +0000 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: References: <20190118144233.132eb0e427389da15e725141@redhat.com> , <20190118143906.2qqarb5xere4zorw@yuggoth.org> Message-ID: On Fri, 2019-01-18 at 14:47 +0000, Alexandra Settle wrote: > Awww thanks Jeremy! I've missed everyone so much!
:D > > From: Jeremy Stanley > Sent: 18 January 2019 14:39 > To: openstack-discuss at lists.openstack.org > Subject: Re: [docs] Nominating Alex Settle for openstack-doc-core > > On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > > Alex Settle recently re-joined the Documentation Project after a > > few-month break. It's great to have her back and I want to > > formally nominate her for membership in the openstack-doc-core > > team, to follow the formal process for cores. > > > > Please let the ML know should you have any objections. > > I'm in no way core on Docs, but I still wanted to take the > opportunity to welcome Alex back. You've been sorely missed! same, it's nice to see you return alex. i also can't vote on you returning to a core position on the docs team but i think you did great work before and would continue to have a good impact if the nomination is carried. From jimmy at openstack.org Fri Jan 18 15:18:21 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Fri, 18 Jan 2019 09:18:21 -0600 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: References: <20190118144233.132eb0e427389da15e725141@redhat.com> , <20190118143906.2qqarb5xere4zorw@yuggoth.org> Message-ID: <5C41EE3D.9060701@openstack.org> Go Alex!!!! > Sean Mooney > January 18, 2019 at 9:07 AM > same, it's nice to see you return alex. i also can't vote on you returning > to a core position on the docs team but i think you did great > work before and would continue to have a good impact if the nomination > is carried. > > > Alexandra Settle > January 18, 2019 at 8:47 AM > > Awww thanks Jeremy! I've missed everyone so much!
:D > > > ------------------------------------------------------------------------ > *From:* Jeremy Stanley > *Sent:* 18 January 2019 14:39 > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: [docs] Nominating Alex Settle for openstack-doc-core > On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > > Alex Settle recently re-joined the Documentation Project after a > > few-month break. It's great to have her back and I want to > > formally nominate her for membership in the openstack-doc-core > > team, to follow the formal process for cores. > > > > Please let the ML know should you have any objections. > > I'm in no way core on Docs, but I still wanted to take the > opportunity to welcome Alex back. You've been sorely missed! > -- > Jeremy Stanley -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Fri Jan 18 15:18:40 2019 From: aj at suse.com (Andreas Jaeger) Date: Fri, 18 Jan 2019 16:18:40 +0100 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: On 18/01/2019 14.42, Petr Kovar wrote: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. Glad to see you back, Alex! > Please let the ML know should you have any objections. the opposite ;) Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 
5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From eumel at arcor.de Fri Jan 18 15:34:13 2019 From: eumel at arcor.de (Frank Kloeker) Date: Fri, 18 Jan 2019 16:34:13 +0100 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: formally +2 wb Am 2019-01-18 14:42, schrieb Petr Kovar: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a > few-month > break. It's great to have her back and I want to formally nominate her > for > membership in the openstack-doc-core team, to follow the formal process > for > cores. > > Please let the ML know should you have any objections. > > Thanks, > pk From miguel at mlavalle.com Fri Jan 18 15:37:29 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Fri, 18 Jan 2019 09:37:29 -0600 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: Welcome back! On Fri, Jan 18, 2019 at 9:34 AM Frank Kloeker wrote: > formally +2 > > wb > > Am 2019-01-18 14:42, schrieb Petr Kovar: > > Hi all, > > > > Alex Settle recently re-joined the Documentation Project after a > > few-month > > break. It's great to have her back and I want to formally nominate her > > for > > membership in the openstack-doc-core team, to follow the formal process > > for > > cores. > > > > Please let the ML know should you have any objections. > > > > Thanks, > > pk > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amy at demarco.com Fri Jan 18 15:39:19 2019 From: amy at demarco.com (Amy Marrich) Date: Fri, 18 Jan 2019 09:39:19 -0600 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: Welcome back Alex we missed you! Amy (spotz) On Fri, Jan 18, 2019 at 7:48 AM Petr Kovar wrote: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. > > Please let the ML know should you have any objections. > > Thanks, > pk > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Jan 18 15:44:57 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 18 Jan 2019 15:44:57 +0000 (GMT) Subject: [placement] update 19-02 Message-ID: HTML: https://anticdent.org/placement-update-19-02.html Hi! It's a placement update! The main excitement this week is we had a meeting to check in on the state of extraction and figure out the areas that need the most attention. More on that in the extraction section within. # Most Important Work to complete and review changes to deployment to support extracted placement is the main thing that matters. # What's Changed * Placement is now able to publish release notes. * Placement is running python 3.7 unit tests in the gate, but not functional (yet). * We had that meeting and [Matt made some notes](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001789.html). # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 15. +1. * [In progress placement bugs](https://goo.gl/vzGGDQ) 16. Stable. 
# Specs Last week was spec freeze so I'll not list all the specs here, but for reference, there were 16 specs listed last week and all 16 of them are neither merged nor abandoned. # Main Themes The reshaper work was restarted after discussion at the meeting surfaced its stalled nature. The libvirt side of things is due some refactoring while the xenapi side is waiting for a new owner to come up to speed. Gibi has proposed a related functional test. All of that at: * Also making use of nested is this spectacular stack of code at bandwidth-resource-provider: * Eric's in the process of doing lots of cleanups to how often the ProviderTree in the resource tracker is checked against placement, and a variety of other "let's make this more right" changes in the same neighborhood: * Stack at: That stuff is very close to ready and will make lots of people happy when it merges. One of the main areas of concern is making sure it doesn't break things for Ironic. ## Extraction As noted above, there was a meeting which resulted in [Matt's Notes](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001789.html), an updated [extraction etherpad](https://etherpad.openstack.org/p/placement-extract-stein-5), and an improved understanding of where things stand. The critical work to ensure a healthy extraction is with getting deployment tools working. 
Here are _some_ of the links to that work: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) and [kolla upgrade](https://review.openstack.org/#/q/topic:upgrade-placement) We also worked out that getting the online database migrations happening on the placement side of the world would help: * [in placement](https://review.openstack.org/#/c/624942/) * [use in grenade](https://review.openstack.org/#/c/631614/) Documentation is mostly in-progress, but needs some review from packagers. A change to [openstack-manuals](https://review.openstack.org/#/c/628324/) depends on the [initial placement install docs](https://review.openstack.org/628220). There is a patch to [delete placement](https://review.openstack.org/#/c/618215/) from nova on which we've put an administrative -2 until it is safe to do the delete. # Other There are 13 [open changes](https://review.openstack.org/#/q/project:openstack/placement+status:open) in placement itself. Several of those are easy win cleanups. Of those placement changes, the [online-migration-related](https://review.openstack.org/#/q/topic:bug/1803925) ones are the most important. Outside of placement (I've decided to trim this list to just stuff that's seen a commit in the last two months): * Neutron minimum bandwidth implementation * WIP: add Placement aggregates tests (in tempest) * blazar: Consider the number of reservation inventory * Add placement client for basic GET operations (to tempest) # End Because I wanted to see what it might look like, I made a toy VM scheduler and placer, using etcd and placement. Then I wrote a [blog post](https://anticdent.org/etcd-placement-virt-install-compute.html). I wish there was more time for this kind of educational and exploratory playing. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From ukalifon at redhat.com Fri Jan 18 16:01:22 2019 From: ukalifon at redhat.com (Udi Kalifon) Date: Fri, 18 Jan 2019 17:01:22 +0100 Subject: [qa] dynamic credentials with the tempest swift client In-Reply-To: <1aa6bce4-622e-4787-a73b-27de7ed9d224@www.fastmail.com> References: <1aa6bce4-622e-4787-a73b-27de7ed9d224@www.fastmail.com> Message-ID: When I try this it just skips the tests, and doesn't say anywhere why. I added this to my tempest.conf: [auth] test_accounts_file = /home/ukalifon/src/tempest/cloud-01/accounts.yaml use_dynamic_credentials = False And my accounts.yaml looks like this: - username: 'admin' tenant_name: 'admin' password: 'cYsJrqtj7IvC581DxsLZkXlku' Regards, Udi Kalifon; Senior QE; RHOS-UI Automation On Fri, Jan 18, 2019 at 11:08 AM Masayuki Igawa wrote: > Hi, > > On Thu, Jan 17, 2019, at 17:58, Udi Kalifon wrote: > : > > So I'm looking for a way to utilize the client without it automatically > > creating itself dynamic credentials; it has to use the already-existing > > admin credentials on the admin project in order to see the container > > with the plans. What's the right way to do that, please? Thanks a lot > > in advance! > > Does this pre-provisioned credentials help you? > > https://docs.openstack.org/tempest/latest/configuration.html#pre-provisioned-credentials > > -- Masayuki Igawa > Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Fri Jan 18 16:11:01 2019 From: aj at suse.com (Andreas Jaeger) Date: Fri, 18 Jan 2019 17:11:01 +0100 Subject: Retiring openstack-infra/puppet-storyboard Message-ID: <53026ffc-076c-de1c-c38f-49e47a3c4435@suse.com> The repo openstack-infra/puppet-storyboard has been created with the intent to host storyboard by the OpenStack Infra team - and this did not happen. 
We will therefore retire the repository, I'm proposing patches now with topic retire-puppet-storyboard, Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From aj at suse.com Fri Jan 18 16:17:10 2019 From: aj at suse.com (Andreas Jaeger) Date: Fri, 18 Jan 2019 17:17:10 +0100 Subject: Retiring openstack-infra/puppet-stackalytics (was Retiring openstack-infra/puppet-storyboard) In-Reply-To: <53026ffc-076c-de1c-c38f-49e47a3c4435@suse.com> References: <53026ffc-076c-de1c-c38f-49e47a3c4435@suse.com> Message-ID: <3459a819-50bf-8d39-6c2b-328b5eb4878d@suse.com> this is about stackalytics, not storyboard! Please do global search & replace ;( So, I will retire puppet-stackalytics, Andreas On 18/01/2019 17.11, Andreas Jaeger wrote: > The repo openstack-infra/puppet-storyboard has been created with the > intent to host storyboard by the OpenStack Infra team - and this did not > happen. We will therefore retire the repository, > > I'm proposing patches now with topic retire-puppet-storyboard, > > Andreas > -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 
5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From mriedemos at gmail.com Fri Jan 18 16:55:13 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 18 Jan 2019 10:55:13 -0600 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: <4c203b5b-1166-b037-2f91-18608668cdf5@gmail.com> On 1/4/2019 7:20 AM, Sean Mooney wrote: > so in rocky the vm should boot, there will be no prevention of over subsciption in placement and netuon will configure > the minium bandwith policy if the network backend suports it. The ingress qos minium bandwith rules was only added in > neutron be egress qos minium bandwith support was added in newton with > https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec You said "The ingress qos minium bandwith rules was only added in neutron" - did you mean a release rather than "neutron", as in a release newer than newton, presumably much newer? > so there are will be a lot of existing cases where ports will have minium bandwith policies before stein. Isn't this all admin-only by default in neutron since newton? So how do we know there will be "a lot" of existing cases? Do we know of any public openstack clouds that enable this for their users? If not, I'm guessing by "a lot" maybe you mean a lot of telco private cloud openstack deployments that just have a single MANO tenant? 
-- Thanks, Matt From openstack at nemebean.com Fri Jan 18 17:02:38 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 18 Jan 2019 11:02:38 -0600 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <20190117235821.75wfwz2pkzvmkviu@yuggoth.org> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1547743607.271159.1637236992.7774859F@webmail.messagingengine.com> <20190117171118.gdtbm7beqyxjqto5@yuggoth.org> <20190117235821.75wfwz2pkzvmkviu@yuggoth.org> Message-ID: On 1/17/19 5:58 PM, Jeremy Stanley wrote: > On 2019-01-17 14:20:10 -0600 (-0600), Ben Nemec wrote: > [...] >> Reading the document, it seems to me that it describes less a >> "Technical" Committee and more a "Governance" Committee. > [...] > > I mused similarly some time back (in an ML post I'm having trouble > finding now) that I consider the choice of naming for the "technical > committee" unfortunate, as I see our role being one of community > management and arbitration. Section 4.13.b.i of the OSF bylaws > describes the responsibilities and powers of the TC thusly: > > "The Technical Committee shall have the authority to manage the > OpenStack Project, including the authority to determine the scope of > the OpenStack Technical Committee Approved Release..." (the latter > is specifically with regard to application of the OpenStack > trademark for products) > > https://www.openstack.org/legal/bylaws-of-the-openstack-foundation/ > > So I guess a lot of it comes down to how we interpret "manage" in > that context. If you don't see the TC as the appropriate body to > provide governance for the OpenStack project, then who do you think > should take that on instead? I didn't mean to imply that I thought the TC _shouldn't_ be providing governance. I was just observing that the TC's activity seems to be slanted toward governance and away from what I would consider technical. 
Based on my philosophy that things are rarely black and white, I suspect the best place for the TC is going to be some happy medium between technical and governance activity. Which I realize is a totally unhelpful stance to take because it basically means my answer to "Should the TC do X" is always going to be "It depends." Maybe it would be useful to revisit some specific historical situations where people feel the TC should or should not have stepped in? A lot of the discussion I've seen so far has been in the abstract, but some more concrete examples might help define what people want/expect from the TC. > > Section 4.1.b.i of the bylaws mentions that "management of the > technical matters relating to the OpenStack Project [...] shall be > managed by the Technical Committee" and also "management of the > technical matters for the OpenStack Project is designed to be a > technical meritocracy" but doesn't go into details as to what it > means by "technical matters" (beyond deciding what qualifies for > trademark use). It seems to me that by delegating > subproject-specific technical decisions to team leaders elected from > each subproject, and then handling decisions which span projects > (the technical vision document, project teams guide, cycle goals > selection, et cetera), we meet both the letter and the spirit of the > duties outlined for the OpenStack Technical Committee in the OSF > bylaws. But as noted, a lot of this hinges on how people take the > somewhat fuzzy terms above in the context with which they're given. > From juliaashleykreger at gmail.com Fri Jan 18 17:34:35 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 18 Jan 2019 09:34:35 -0800 Subject: [ironic] Mid-cycle call times In-Reply-To: <4d9eb6d2-3033-1bce-43db-b6a6bee4ad9c@redhat.com> References: <4d9eb6d2-3033-1bce-43db-b6a6bee4ad9c@redhat.com> Message-ID: I think so yes. I guess so yes. 
I will follow up with notes after the meeting and cover any announcements at the start of the call. And as for until the last person gets too tired to argue.... indeed. :) -Julia On Fri, Jan 18, 2019 at 4:23 AM Dmitry Tantsur wrote: > > Thanks Julia! > > I guess we're canceling the weekly meeting because of the overlap? > > On 1/18/19 12:49 PM, Julia Kreger wrote: > > Greetings everyone! > > > > I've created a fairly simple schedule: > > > > January 21st 2:00 - 4:00 UTC > > > > 2:00 PM UTC - Discuss current status > > 2:30 PM UTC - Revising current plans and reprioritizing as necessary > > 3:30 PM UTC - Making Ironic more container friendly. > > 3:30pm - until the last person gets too tired to argue :D > > > > > January 22nd 2:00 - 4:00 UTC > > > > 2:00 PM UTC - Boot Management for in-band inspection > > 2:30 PM UTC - SmartNIC configuration support > > 3:00 PM UTC - Discuss any other items that arose during the earlier discussions. > > 3:30 PM UTC - Bug and RFE Triaging > > > > If there are no objections, we can use my bluejeans[1] account to discuss. > > Please see our planning/discussion etherpad[2]. > > > > Thanks! I look forward to chatting with everyone soon. > > > > -Julia > > > > [1]: https://bluejeans.com/u/jkreger > > [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle > > > > On Tue, Jan 15, 2019 at 12:00 PM Julia Kreger > > wrote: > >> > >> Greetings everyone, > >> > >> It seems the most popular times are January 21st and 22nd between 2 PM > >> and 6 PM UTC. > >> > >> Please add any topics for discussion to the etherpad[1] as soon as > >> possible. I will propose a schedule and agenda in the next day or two. > >> > >> -Julia > >> > >> [1]: https://etherpad.openstack.org/p/ironic-stein-midcycle > >> > >> On Tue, Jan 8, 2019 at 9:10 AM Julia Kreger wrote: > >>> > >>> Greetings everyone! > >>> > >>> It seems we have coalesced around January 21st and 22nd.
I have posted > >>> a poll[1] with time windows in two hour blocks so we can reach a > >>> consensus on when we should meet. > >>> > >>> Please vote for your available time windows so we can find the best > >>> overlap for everyone. Additionally, if there are any topics or items > >>> that you feel would be a good use of the time, please feel free to add > >>> them to the planning etherpad[2]. > >> [trim] > > > > From balazs.gibizer at ericsson.com Fri Jan 18 18:07:16 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Fri, 18 Jan 2019 18:07:16 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1547486170.17957.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> Message-ID: <1547834830.1231.1@smtp.office365.com> Hello, Thank you for the productive call. Here is a short summary of the agreements: * There will be a new microversion in nova for creating (and deleting) a server with ports having resource requests. If such a request is sent with any old microversion then nova will reject the request. Nova will reject not just the create request with an old microversion but all the requests that would lead to an inconsistent resource view. These are: create, attachInterface, resize, migrate, live-migrate, evacuate, and unshelve of an offloaded server. * Delete of a server having resource aware ports will be supported with old microversions. Nova will delete the whole consumer in placement anyhow. * There will be a second new microversion (probably in Train) that will enable move operations for a server having resource aware ports.
This microversion split will allow us not to block the create / delete support for the feature in Stein. * The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g.: server create with one ovs backed resource aware port) and functionally verifying it before the whole generic create use case is ready and enabled. * A nova-manage command will be provided to heal the port allocations without moving the servers if there is enough resource inventory available for it on the current host. This tool will only work online as it will call neutron and placement APIs. * Server move operations with the second new microversion will automatically heal the server allocation. Cheers, gibi On Mon, Jan 14, 2019 at 6:16 PM, Balázs Gibizer wrote: > > > On Wed, Jan 9, 2019 at 5:56 PM, Balázs Gibizer > wrote: >> >> >> On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer >> wrote: >>> >>> >>> On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer >>> wrote: >>>> >>>> >>>>> But, let's chat more about it via a hangout the week after >>>>> next >>>>> (week >>>>> of January 14 when Matt is back), as suggested in >>>>> #openstack-nova >>>>> today. We'll be able to have a high-bandwidth discussion then >>>>> and >>>>> agree on a decision on how to move forward with this. >>>> >>>> Thank you all for the discussion. I agree to have a real-time >>>> discussion about the way forward. >>>> >>>> Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a >>>> hangouts[2]? >>> >> >> It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for >> the team. So I'm moving the call there. > > Sorry to change it again. I hope this is the final time. Friday 18th > of > Jan, 17:00 UTC [2]. > The discussion etherpad is updated with a bit more info [3].
> > Cheers, > gibi > > [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI > [2] > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190118T170000 > [3] https://etherpad.openstack.org/p/bandwidth-way-forward > > >> >> Cheers, >> gibi >> >> [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI >> [2] >> >> https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000 >> >> > > From shubjero at gmail.com Fri Jan 18 18:16:46 2019 From: shubjero at gmail.com (shubjero) Date: Fri, 18 Jan 2019 13:16:46 -0500 Subject: [neutron]How to ease congestion at neutron server nodes? In-Reply-To: <779fc4bd5a99f9a43dde8c2df8f40608f09f89c1.camel@redhat.com> References: <779fc4bd5a99f9a43dde8c2df8f40608f09f89c1.camel@redhat.com> Message-ID: Hi Simon and Sean, What amount of throughput are you able to get through your network nodes? At what point do you bottleneck north-south traffic and where is the bottleneck? Can you elaborate more on the multi-queue configuration? We have 3 controllers where the neutron server/api/L3 agents run, and the active L3 agent for a particular neutron router will drive the controller's CPU interrupts through the roof (400k), at which point it begins to cause instability (failovers) amongst all L3 agents on that controller node, well before we come close to saturating the 40Gbps (4x10Gbps lacp bond) of available bandwidth to it.
URL: From dms at danplanet.com Fri Jan 18 18:40:23 2019 From: dms at danplanet.com (Dan Smith) Date: Fri, 18 Jan 2019 10:40:23 -0800 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1547834830.1231.1@smtp.office365.com> (=?utf-8?Q?=22Bal?= =?utf-8?Q?=C3=A1zs?= Gibizer"'s message of "Fri, 18 Jan 2019 18:07:16 +0000") References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> <1547834830.1231.1@smtp.office365.com> Message-ID: > * There will be a second new microversion (probably in Train) that will > enable move operations for server having resource aware ports. This > microversion split will allow us not to block the create / delete > support for the feature in Stein. > > * The new microversions will act as a feature flag in the code. This > will allow merging single use cases (e.g.: server create with one ovs > backed resource aware port) and functionally verifying it before the > whole generic create use case is ready and enabled. > > * A nova-manage command will be provided to heal the port allocations > without moving the servers if there is enough resource inventory > available for it on the current host. This tool will only work online > as it will call neutron and placement APIs. > > * Server move operations with the second new microversion will > automatically heal the server allocation. I wasn't on this call, so apologies if I'm missing something important. Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me. What exactly is the point of that? To distinguish between Stein and Train systems purely because Stein didn't have time to finish the feature? 
IMHO, we should really avoid abusing microversions for that sort of thing. I would tend to err on the side of "if it's not ready, then it's not ready" for Stein, but I'm sure the desire to get this in (even if partially) is too strong for that sort of restraint. Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so... --Dan From e0ne at e0ne.info Fri Jan 18 19:32:12 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Fri, 18 Jan 2019 21:32:12 +0200 Subject: [horizon] Unused xstatic-* projects retirement In-Reply-To: <985ba1a5-d5de-b5d8-91f0-2e92fca1b00e@alumni.enseeiht.fr> References: <985ba1a5-d5de-b5d8-91f0-2e92fca1b00e@alumni.enseeiht.fr> Message-ID: Thank you Clark, François, My bad, I just checked via codesearch.openstack.org. I'll double-check for pypi dependencies and update my patch. Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Wed, Jan 16, 2019 at 11:37 PM François Magimel < francois.magimel at alumni.enseeiht.fr> wrote: > Hi! > > On 16/01/2019 at 14:15, Ivan Kolodyazhny wrote: > > Hi team, > > There are some xstatic packages which I didn't start to use in Horizon or > plugins. We didn't do any release of them. > > During the last meeting [1] we agreed to mark them as retired. I'll start > the retirement procedure [2] today. If you're going to use them, please let me > know.
> > The list of the projects to be retired: > - xstatic-angular-ui-router > - xstatic-bootstrap-datepicker > - xstatic-hogan > - xstatic-jquery-migrate > - xstatic-jquery.quicksearch > - xstatic-jquery.tablesorter > - xstatic-rickshaw > > > We have some work in progress [2] in the CloudKitty dashboard plugin to > use xstatic-rickshaw instead of including the JS script in our repo. > > [2] https://storyboard.openstack.org/#!/story/2003578 > > François > > > - xstatic-spin > - xstatic-vis > > > [1] > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-110 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Jan 18 20:56:14 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 20:56:14 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <4c203b5b-1166-b037-2f91-18608668cdf5@gmail.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4c203b5b-1166-b037-2f91-18608668cdf5@gmail.com> Message-ID: <34bbba0a70c42c6ce5ab2bd1e3edb4e3a68a3e14.camel@redhat.com> On Fri, 2019-01-18 at 10:55 -0600, Matt Riedemann wrote: > On 1/4/2019 7:20 AM, Sean Mooney wrote: > > so in rocky the vm should boot, there will be no prevention of over subsciption in placement and netuon will > > configure > > the minium bandwith policy if the network backend suports it. The ingress qos minium bandwith rules was only added > > in > > neutron be egress qos minium bandwith support was added in newton with > > https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec > > You said "The ingress qos minium bandwith rules was only added in > neutron" - did you mean a release rather than "neutron", as in a release > newer than newton, presumably much newer?
Yes, I meant to say minimum ingress qos was only added to neutron in Rocky, whereas minimum egress qos dates back to Newton. > > > so there will be a lot of existing cases where ports will have minimum bandwidth policies before stein. > > Isn't this all admin-only by default in neutron since newton? So how do > we know there will be "a lot" of existing cases? Do we know of any > public openstack clouds that enable this for their users? If not, I'm > guessing by "a lot" maybe you mean a lot of telco private cloud > openstack deployments that just have a single MANO tenant? Yes, telco/nfv deployments where a MANO system is used to manage openstack were the primary use case I was thinking about. Looking at the api definition https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/qos.py and api docs https://developer.openstack.org/api-ref/network/v2/index.html?expanded=create-minimum-bandwidth-rule-detail#qos-minimum-bandwidth-rules I don't see anything calling this api admin-only. I know qos in general was not intended to be admin-only. Looking at https://github.com/openstack/neutron/blob/master/neutron/conf/policies/qos.py it looks like you need admin rights to create, update and delete qos rules/policies, but I think any user can apply a qos policy that was created by an admin to a port or network. https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/qos.py#L91-L108 extends the port and network resources with a qos policy id. https://github.com/openstack/neutron/blob/master/neutron/conf/policies/qos.py https://github.com/openstack/neutron/blob/master/neutron/conf/policies/port.py https://github.com/openstack/neutron/blob/master/neutron/conf/policies/network.py do not set an admin-only policy on the qos policy id, so I assume the default of RULE_ANY ('rule:regular_user') or RULE_ADMIN_OR_OWNER ('rule:admin_or_owner') applies.
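As a rough illustration of how those defaults combine, here is a toy stand-in for oslo.policy's evaluation. The action names and credential dicts are invented for the example; neutron's real policy engine works on request contexts and policy strings, not plain dicts:

```python
# Toy evaluation of the default policy rules discussed above: a
# stand-in for oslo.policy's real engine, using plain dicts instead of
# request contexts. Action names are illustrative, not neutron's
# actual policy keys.

def rule_admin(creds, target):
    return "admin" in creds.get("roles", [])

def rule_admin_or_owner(creds, target):
    return rule_admin(creds, target) or creds.get("project_id") == target.get("project_id")

POLICY = {
    "create_qos_policy": rule_admin,                   # admin-only, per the qos.py defaults
    "attach_qos_policy_to_port": rule_admin_or_owner,  # owners may apply an existing policy
}

def enforce(action, creds, target):
    return POLICY[action](creds, target)

member = {"roles": ["member"], "project_id": "p1"}
admin = {"roles": ["admin"], "project_id": "p0"}
port = {"project_id": "p1"}

print(enforce("create_qos_policy", member, {}))            # False: members cannot create policies
print(enforce("attach_qos_policy_to_port", member, port))  # True: the tenant opt-in path
```

This mirrors the split Sean describes: policy creation gated on admin, attachment of an admin-created policy open to the port's owner.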
The intent of having policy creation be admin-only, but requesting a policy be available to all tenants, was to allow operators to charge for guaranteed bandwidth or prioritised traffic and enable tenants to opt in to that. If the admin did not define any qos policies and used the default api policies, then yes, there are likely few users of this outside of telco deployments. From smooney at redhat.com Fri Jan 18 21:12:05 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 21:12:05 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> <1547834830.1231.1@smtp.office365.com> Message-ID: <20dc210355e242fd8d2001f89fe99750a6500dc3.camel@redhat.com> On Fri, 2019-01-18 at 10:40 -0800, Dan Smith wrote: > > * There will be a second new microversion (probably in Train) that will > > enable move operations for server having resource aware ports. This > > microversion split will allow us not to block the create / delete > > support for the feature in Stein. > > > > * The new microversions will act as a feature flag in the code. This > > will allow merging single use cases (e.g.: server create with one ovs > > backed resource aware port) and functionally verifying it before the > > whole generic create use case is ready and enabled. > > > > * A nova-manage command will be provided to heal the port allocations > > without moving the servers if there is enough resource inventory > > available for it on the current host. This tool will only work online > > as it will call neutron and placement APIs. > > > > * Server move operations with the second new microversion will > > automatically heal the server allocation.
> > I wasn't on this call, so apologies if I'm missing something important. > > Having a microversion that allows move operations for an instance > configured with one of these ports seems really terrible to me. What > exactly is the point of that? To distinguish between Stein and Train > systems purely because Stein didn't have time to finish the feature? The intent of the second microversion was primarily discoverability. The option of having 1 vs 2 microversions was discussed; no one really pushed strongly for only one on the call, so I think we just converged on two mainly because of the discoverability aspect. > > IMHO, we should really avoid abusing microversions for that sort of > thing. I would tend to err on the side of "if it's not ready, then it's > not ready" for Stein, but I'm sure the desire to get this in (even if > partially) is too strong for that sort of restraint. So: don't merge it at all for Stein and merge it all in Train. It's an option, but yes, I think a lot of people would like to see at least some support in Stein. It would not be the only nova feature that does not support move operations, but maybe it will cause operators more headaches than it solves if we enable the feature without move support. > > Can we not return 403 in Stein, since moving instances is disable-able > anyway, and just make it work in Train? Having a new microversion with a > description of "nothing changed except we finished a feature so you can > do this very obscure thing now" seems like we're just using them as an > experimental feature flag, which was definitely not the intent. I know > returning 403 for "you can't do this right now" isn't *as* discoverable, > but you kinda have to handle 403 for operations that could be disabled > anyway, so... Ya, I guess that is a fair point. Microversions are not like neutron's api extensions; you can't mix and match microversions in the same way. In neutron the presence of the extension tells you the feature is available and enabled.
The nova microversion just tells you the code supports it, not that it's configured, so ya, you would have to handle 403s if move operations were disabled even if we had two microversions. > > --Dan > From smooney at redhat.com Fri Jan 18 21:59:32 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 18 Jan 2019 21:59:32 +0000 Subject: [neutron]How to ease congestion at neutron server nodes? In-Reply-To: References: <779fc4bd5a99f9a43dde8c2df8f40608f09f89c1.camel@redhat.com> Message-ID: <3bf03baa7b7016bde3b84aa0fc6b0051b64cfc97.camel@redhat.com> On Fri, 2019-01-18 at 13:16 -0500, shubjero wrote: > Hi Simon and Sean, > > What amount of throughput are you able to get through your network nodes? That will vary wildly depending on your nics and the networking solution you deployed. ovs? with or without dpdk? with or without an sdn controller? with or without hardware offload? vpp? calico? linux bridge? Maybe some operators can share their experience. > At what point do you bottleneck north-south traffic and where is the bottleneck? I'll assume you are using kernel ovs. With kernel ovs your bottleneck will likely be ovs and possibly the kernel routing stack. Kernel ovs can only handle about 1.8mpps in L2 phy-to-phy switching, maybe a bit more depending on your cpu frequency and kernel version. I have not been following this metric for ovs that closely, but it's somewhere in that neighbourhood. That is enough to switch about 1.2Gbps of 64-byte packets, but it can saturate a 10G link at ~512-byte packets. Unless you are using jumbo frames or some nic offloads you cannot, to my knowledge, saturate a 40G link with kernel ovs. Note: when I say nic offload I am not referring to hardware-offloaded ovs; I am referring to GSO, LRO and other offloads you enable via ethtool. Kernel ovs can switch in excess of 10Gbps of throughput quite easily with standard mtu packets.
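The wire-rate arithmetic behind figures like these is easy to reproduce. A minimal sketch, assuming standard Ethernet framing overhead of 38 bytes per frame (14 header + 4 FCS + 8 preamble + 12 inter-frame gap):

```python
# Wire-rate frame arithmetic: each Ethernet frame carries 18 bytes of
# header+FCS plus 20 bytes of preamble and inter-frame gap, so a
# 1500-byte MTU payload occupies 1538 bytes on the wire.
ETH_OVERHEAD = 38  # 14 header + 4 FCS + 8 preamble + 12 inter-frame gap

def pps(link_bps, payload_bytes):
    """Maximum packets per second for a given link speed and payload size."""
    return link_bps / ((payload_bytes + ETH_OVERHEAD) * 8)

print(round(pps(10e9, 1500)))  # 812744: the 10G / 1500 MTU figure
print(round(pps(40e9, 1500)))  # ~3.25M on 40G with 1500-byte MTU
print(round(pps(40e9, 9000)))  # ~553220 on 40G with 9K jumbo frames
print(round(pps(10e9, 64)))    # ~12.25M: why small packets are the hard case
```

Small frames are the expensive case because the per-packet costs (classification, header extraction) dominate, which matches the point that pps matters more than payload size.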
The 1500 MTU packet rate is calculated as: (10*10^9) bits/sec / (1538 bytes * 8) = 812,744 pps. For 40Gbps it would go up to ~3.3mpps, which is more than kernel ovs can forward at standard server cpu frequencies in the 2.4-3.6GHz range. If you are using 9K jumbo frames the packet forwarding rate drops to ~553,220 pps, which again is well within kernel ovs's ability to forward. The packet classification and header extraction is much more costly than copying the packet payload, so pps is more important than the size of the packet, but they are related. So depending on your traffic profile, kernel ovs may or may not be a bottleneck. The next bottleneck is the kernel routing speed. The linux kernel is pretty good at routing, but in the neutron case it not only has to do routing but also nat. The iptables snat and dnat actions are likely to be the next bottleneck after ovs. > Can you elaborate more on the multi-queue configuration? I believe Simon was referring to enabling multiple rx/tx queues on the nic attached to ovs, such that the nic's receive side scaling feature can be used to hash packets into a set of hardware receive queues which can then be processed by ovs using multiple kernel threads across several cpus. > > We have 3 controllers where the neutron server/api/L3 agents run and the active L3 agent for a particular neutron > router's will drive the controllers CPU interrupts through the roof (400k) at which point it begins to cause > instability (failovers) amongst all L3 agents on that controller node well before we come close to saturating the > 40Gbps (4x10Gbps lacp bond) of available bandwidth to it. One way to scale interrupt handling is irqbalance, assuming your nic supports interrupt steering, but most if not all 10G nics do. From assaf at redhat.com Fri Jan 18 22:22:42 2019 From: assaf at redhat.com (Assaf Muller) Date: Fri, 18 Jan 2019 17:22:42 -0500 Subject: [neutron]How to ease congestion at neutron server nodes?
In-Reply-To: References: <779fc4bd5a99f9a43dde8c2df8f40608f09f89c1.camel@redhat.com> Message-ID: On Fri, Jan 18, 2019 at 1:21 PM shubjero wrote: > > Hi Simon and Sean, > > What amount of throughput are you able to get through your network nodes? At what point do you bottleneck north-south traffic and where is the bottleneck? Can you elaborate more on the multi-queue configuration? > > We have 3 controllers where the neutron server/api/L3 agents run and the active L3 agent for a particular neutron router's will drive the controllers CPU interrupts through the roof (400k) at which point it begins to cause instability (failovers) amongst all L3 agents on that controller node well before we come close to saturating the 40Gbps (4x10Gbps lacp bond) of available bandwidth to it. I'd look in to either: 1) Splitting out the L3/DHCP/metadata/OVS agents from the controller nodes to dedicated network nodes 2) Enabling DVR 3) Moving over to OVN with DVR From mriedemos at gmail.com Sat Jan 19 02:21:52 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 18 Jan 2019 20:21:52 -0600 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> <1547834830.1231.1@smtp.office365.com> Message-ID: <5a29f0db-6d29-cd16-83d7-7ffe4b630fb3@gmail.com> On 1/18/2019 12:40 PM, Dan Smith wrote: > Having a microversion that allows move operations for an instance > configured with one of these ports seems really terrible to me. I agree with that sentiment. > Can we not return 403 in Stein, since moving instances is disable-able > anyway, and just make it work in Train? 
Having a new microversion with a > description of "nothing changed except we finished a feature so you can > do this very obscure thing now" seems like we're just using them as an > experimental feature flag, which was definitely not the intent. I know > returning 403 for "you can't do this right now" isn't *as* discoverable, > but you kinda have to handle 403 for operations that could be disabled > anyway, so... We didn't discuss it too much on the call, but in thinking about it afterward, I think I would be OK with treating this like a bug fix in Train. We can fail move operations until we support this, and then once we support it, we just do, without a microversion. As noted, clients have to deal with this kind of stuff already, and I don't remember saying when we support live migration with NUMA (which now fails unless configured otherwise) that we would add a microversion for that - it either just works or it doesn't. So I'm OK with not adding a second microversion for move operation support later. -- Thanks, Matt From ignaziocassano at gmail.com Sat Jan 19 07:44:54 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Sat, 19 Jan 2019 08:44:54 +0100 Subject: [keystone][nova][cinder] netapp queens trust scoped token In-Reply-To: References: Message-ID: Hello Lance, are application credentials also supported in queens? Thanks Ignazio On Fri, 18 Jan 2019 at 16:04, Lance Bragstad wrote: > > > On Fri, Jan 18, 2019 at 6:49 AM Ignazio Cassano > wrote: > >> Hello Everyone, >> I am using a client for backing up openstack virtual machines. >> The crux of the problem is the client uses trust based authentication for >> scheduling backup jobs on behalf of a user. When a trust scoped token is >> passed to cinder client to take a snapshot, I expect the client to use the >> token to authenticate and perform the operation which cinder client does.
>> However cinder volume service invokes novaclient as part of cinder nfs >> backend snapshot operation and novaclient tries to re-authenticate. Since >> keystone does not allow re-authentication using trust based tokens, cinder >> snapshot operation fails. >> >> > Keystone allows authorization by allowing users to scope tokens to trusts, > but once a trust-token is scoped, it can't be rescoped. Instead, keystone > requires that you build another authentication request for a new > trust-scoped token using the trust [0][1]. > > [0] > https://developer.openstack.org/api-ref/identity/v3-ext/index.html?expanded=consuming-a-trust-detail#id121 > [1] > https://git.openstack.org/cgit/openstack/keystone/tree/keystone/auth/plugins/token.py#n84 > > >> So I get the following error: >> >> 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs [req-61d977c1-eef3-4309-ac02-aaa0eb880925 >> ab1bdb5dadc54312891f3a6410fef04d 6d1bffb04e3b4cdda30dc17aa96bfffc - default >> default] Call to Nova to create snapshot failed: Forbidden: You are not >> authorized to perform the requested action: Using trust-scoped token to >> create another token. Create a new trust-scoped token instead. (HTTP 403) >> (Request-ID: req-f55b682a-001b-4952-bfb1-abf4dd6bf459) >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs Traceback >> (most recent call last): >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/ remotefs.py", >> line 1452, in _create_snapshot_online >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> connection_info) >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> File "/usr/lib/python2.7/site-packages/cinder/compute/nova.py", line 188, >> in create_volume_snapshot >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. 
remotefs >> create_info=create_info) >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> File >> "/usr/lib/python2.7/site-packages/novaclient/v2/assisted_volume_snapshots.py", >> line 43, in create >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> return self._create('/os-assisted-volume-snapshots', body, 'snapshot') >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in >> _create >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> resp, body = self.api.client.post(url, body=body) >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 310, >> in post >> >> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >> return self.request(url, 'POST', **kwargs) >> >> >> My cinder volumes are on a netapp fas8040 via nfs. >> >> Can anyone help me? >> > > There is another feature in keystone that sounds like a better fit for > what you're trying to do, called application credentials [2]. Application > credentials were written as a way for users to grant authorization to > services and scripts (e.g., the script asking cinder for a snapshot in your > case.) > > Application credentials aren't tokens, but your scripts can use them to > authenticate for a token [3]. The keystoneauth library already supports > application credentials, so if you use that for building a session you > should be able to use it in other clients that already support keystoneauth > [4].
> > [2] > https://docs.openstack.org/keystone/latest/user/application_credentials.html > [3] > https://docs.openstack.org/keystone/latest/user/application_credentials.html#using-application-credentials > [4] > https://docs.openstack.org/keystoneauth/latest/authentication-plugins.html#application-credentials > >> >> Regards >> >> Ignazio >> > From zufar at onf-ambassador.org Sat Jan 19 10:03:46 2019 From: zufar at onf-ambassador.org (Zufar Dhiyaulhaq) Date: Sat, 19 Jan 2019 17:03:46 +0700 Subject: [ironic] Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request Message-ID: Hi, I get an error when trying to create an instance for a bare-metal node. Below is my troubleshooting. I don't know what is happening. Any suggestions? *Ironic Error Log*: 2019-01-19 15:36:41.232 15780 ERROR ironic.drivers.modules.deploy_utils [req-200dac66-0995-41c4-8c8c-dff053d27e36 499299da0c284a4ba9214ea0d83867cc 62088a869020430392a4fb1a0c5d2863 - default default] Agent deploy supports only HTTP(S) URLs as instance_info['image_source'] or swift temporary URL. Either the specified URL is not a valid HTTP(S) URL or is not reachable for node 6c20755a-e36b-495a-98e1-a40f58e5ac3c. Error: Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.: ImageRefValidationFailed: Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.
2019-01-19 15:36:41.233 15780 ERROR ironic.conductor.manager [req-200dac66-0995-41c4-8c8c-dff053d27e36 499299da0c284a4ba9214ea0d83867cc 62088a869020430392a4fb1a0c5d2863 - default default] Error while preparing to deploy to node 6c20755a-e36b-495a-98e1-a40f58e5ac3c: Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.: ImageRefValidationFailed: Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request. *Nova Error Log:* 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'nova.virt.ironic.driver.IronicDriver._wait_for_active' failed: InstanceDeployFailure: Failed to provision instance 3$ 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall Traceback (most recent call last): 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall File "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 137, in _run_loop 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall result = func(*self.args, **self.kw) 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 505, in _wait_for_active 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall raise exception.InstanceDeployFailure(msg) 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall InstanceDeployFailure: Failed to provision instance 38c276b1-b88a-4f4b-924b-8b52377f3145: Failed to prepare to deploy: Validation of image href secre$ 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall 2019-01-19 16:35:52.640 13355 ERROR nova.virt.ironic.driver [req-e11c3fcc-2066-49c6-b47b-0e3879840ad0 7ad46602ac42417a8c798c69cb3105e5 f3bb39ae2e0946e1bbf812bcde6e08a7 - default default] Error deploying instanc$ 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [req-e11c3fcc-2066-49c6-b47b-0e3879840ad0 7ad46602ac42417a8c798c69cb3105e5 
f3bb39ae2e0946e1bbf812bcde6e08a7 - default default] [instance: 38c276b1-b88a-4$ 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] Traceback (most recent call last): 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2252, in _build_resources 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] yield resources 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2032, in _build_and_run_instance 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] block_device_info=block_device_info) 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1136, in spawn 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] 'node': node_uuid}) 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] self.force_reraise() 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] six.reraise(self.type_, self.value, self.tb) 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File 
"/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1128, in spawn 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] timer.start(interval=CONF.ironic.api_retry_interval).wait() 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] return hubs.get_hub().switch() 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] return self.greenlet.switch() 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 137, in _run_loop 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] result = func(*self.args, ** self.kw) 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 505, in _wait_for_active 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] raise exception.InstanceDeployFailure(msg) 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] InstanceDeployFailure: Failed to provision instance 38c276b1-b88a-4f4b-924b-8b52377f3145: Failed to prep$ 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: 38c276b1-b88a-4f4b-924b-8b52377f3145] *Ironic Configuration:* [DEFAULT] enabled_drivers=pxe_ipmitool enabled_hardware_types = ipmi 
log_dir=/var/log/ironic transport_url=rabbit://guest:guest at 10.60.60.10:5672/ auth_strategy=keystone notification_driver = messaging [conductor] send_sensor_data = true automated_clean=true [swift] region_name = RegionOne project_domain_id = default user_domain_id = default project_name = services password = IRONIC_PASSWORD username = ironic auth_url = http://10.60.60.10:5000/v3 auth_type = password [pxe] tftp_root=/tftpboot tftp_server=10.60.60.10 ipxe_enabled=True pxe_bootfile_name=undionly.kpxe uefi_pxe_bootfile_name=ipxe.efi pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template uefi_pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template pxe_append_params=coreos.autologin #ipxe_use_swift=True [agent] image_download_source = http [deploy] http_root=/httpboot http_url=http://10.60.60.10:8088 [service_catalog] insecure = True auth_uri=http://10.60.60.10:5000/v3 auth_type=password auth_url=http://10.60.60.10:35357 project_domain_id = default user_domain_id = default project_name = services username = ironic password = IRONIC_PASSWORD region_name = RegionOne [database] connection=mysql+pymysql:// ironic:IRONIC_DBPASSWORD at 10.60.60.10/ironic?charset=utf8 [keystone_authtoken] auth_url=http://10.60.60.10:35357 www_authenticate_uri=http://10.60.60.10:5000 auth_type=password username=ironic password=IRONIC_PASSWORD user_domain_name=Default project_name=services project_domain_name=Default [neutron] www_authenticate_uri=http://10.60.60.10:5000 auth_type=password auth_url=http://10.60.60.10:35357 project_domain_name=Default project_name=services user_domain_name=Default username=ironic password=IRONIC_PASSWORD cleaning_network = 461a6663-e015-4ecf-9076-d1b502c3db25 provisioning_network = 461a6663-e015-4ecf-9076-d1b502c3db25 [glance] region_name = RegionOne project_domain_id = default user_domain_id = default project_name = services password = IRONIC_PASSWORD username = ironic auth_url = http://10.60.60.10:5000/v3 auth_type = password 
temp_url_endpoint_type = swift swift_endpoint_url = http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s swift_account = AUTH_f3bb39ae2e0946e1bbf812bcde6e08a7 swift_container = glance swift_temp_url_key = secret *Temp-URL enable:* [root at zu-controller0 ~(keystone_admin)]# openstack object store account show +------------+---------------------------------------+ | Field | Value | +------------+---------------------------------------+ | Account | AUTH_f3bb39ae2e0946e1bbf812bcde6e08a7 | | Bytes | 996 | | Containers | 1 | | Objects | 1 | | properties | Temp-Url-Key='secret' | +------------+---------------------------------------+ *Swift Endpoint:* [root at zu-controller0 ~(keystone_admin)]# openstack endpoint list | grep swift | 07e9d544a44241f5b317f651dce5f0a4 | RegionOne | swift | object-store | True | public | http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | | dadfd168384542b0933fe41df87d9dc8 | RegionOne | swift | object-store | True | internal | http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | | e53aca9d357542868516d367a0bf13a6 | RegionOne | swift | object-store | True | admin | http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | Best Regards, Zufar Dhiyaulhaq -------------- next part -------------- An HTML attachment was scrubbed... URL: From elvis at eprinz.us Sat Jan 19 19:39:25 2019 From: elvis at eprinz.us (Prinz Elvis N) Date: Sat, 19 Jan 2019 20:39:25 +0100 Subject: [CFP-Denver] Co-Speaker for Hands-on Session in Denver Summit (Prinz Elvis Noudjeu) Message-ID: <2a45ec6cbcb941e8d263949e91834889@eprinz.us> Hi All, Just want to ask if someone needs a Co-Speaker in his team for CFP Denver Summit. For Hands-on Session with Topics around: Kolla, zun, Monitoring, Cinder or HPC. 
Cheers Elvis from Germany freenode: PrinzElvis tw: @PrinzElvis From openstack at medberry.net Sat Jan 19 19:48:02 2019 From: openstack at medberry.net (David Medberry) Date: Sat, 19 Jan 2019 12:48:02 -0700 Subject: Pre-Denver Summit [OFFTOPIC] Message-ID: FYI, There is a Denver traditional sci-fi fest the weekend prior to the Open Infra Summit. It is called "StarFest" and has been running for decades. https://starfestdenver.com/ So, if interested in such things, just come a couple days early. I don't expect this to interest many, but feel it is worthwhile to advertise it a tiny bit. I did bracket tag it offtopic. From mriedemos at gmail.com Sat Jan 19 23:08:25 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Sat, 19 Jan 2019 17:08:25 -0600 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: <1f492516-3375-b3dc-f68a-9a8b66431e9b@gmail.com> A couple of quick status updates below. On 1/16/2019 1:29 PM, Matt Riedemann wrote: > Nested providers / reshaper / VGPU: > > * We still need someone to write a functional test which creates a > server with a flat resource structure, reshapes that to nested, and then > creates another server against the same provider tree. Gibi started this: https://review.openstack.org/#/c/631559/ > > Data migration: > > * The only placement-specific online data migration in nova is > "create_incomplete_consumers" and we agreed to copy that into placement > and add a placement-status upgrade check for it. The data migration code > will build on top of Tetsuro's work [4]. Matt is signed up to work on > both of those commands. 
The create_incomplete_consumers data migration was copied to placement and is merged: https://review.openstack.org/#/c/631604/ And the upgrade check patch is up for review: https://review.openstack.org/#/c/631671/ -- Thanks, Matt From gmann at ghanshyammann.com Sun Jan 20 12:00:07 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 20 Jan 2019 21:00:07 +0900 Subject: [qa] dynamic credentials with the tempest swift client In-Reply-To: References: <1aa6bce4-622e-4787-a73b-27de7ed9d224@www.fastmail.com> Message-ID: <1686b21dc57.10adf6b3c12767.7382989113710382621@ghanshyammann.com> Pre-provisioned accounts are the way to use existing credentials to run Tempest. Can you check the tempest log for the reason for the test skip? You can take the pre-provisioned accounts gate job as a reference [1] [1] http://logs.openstack.org/50/628250/3/check/tempest-full-test-account-py3 -gmann ---- On Sat, 19 Jan 2019 01:01:22 +0900 Udi Kalifon wrote ---- > When I try this it just skips the tests, and doesn't say anywhere why. I added this to my tempest.conf: > [auth] > test_accounts_file = /home/ukalifon/src/tempest/cloud-01/accounts.yaml > use_dynamic_credentials = False > > And my accounts.yaml looks like this: - username: 'admin' > tenant_name: 'admin' > password: 'cYsJrqtj7IvC581DxsLZkXlku' > > Regards, > Udi Kalifon; Senior QE; RHOS-UI Automation > > > > On Fri, Jan 18, 2019 at 11:08 AM Masayuki Igawa wrote: > Hi, > > On Thu, Jan 17, 2019, at 17:58, Udi Kalifon wrote: > : > > So I'm looking for a way to utilize the client without it automatically > > creating itself dynamic credentials; it has to use the already-existing > > admin credentials on the admin project in order to see the container > > with the plans. What's the right way to do that, please? Thanks a lot > > in advance! > > Do these pre-provisioned credentials help you?
> https://docs.openstack.org/tempest/latest/configuration.html#pre-provisioned-credentials > > -- Masayuki Igawa > Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 > > From qi.ni at intel.com Mon Jan 21 06:31:58 2019 From: qi.ni at intel.com (Ni, Qi) Date: Mon, 21 Jan 2019 06:31:58 +0000 Subject: Patch needs to be re-approved Message-ID: <32C216DF431DC842B4FB087109584B420473BC5B@shsmsx102.ccr.corp.intel.com> Hey guys, Please take a look at my patch https://review.openstack.org/#/c/626109/. Since there are two code review +2 and workflow +1, why hasn't it been merged yet? Best regards, Nicky. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Mon Jan 21 07:07:36 2019 From: zbitter at redhat.com (Zane Bitter) Date: Mon, 21 Jan 2019 20:07:36 +1300 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <20190115153041.mnrwbp6uekaucygq@yuggoth.org> <02355e62-092c-1e84-9559-9c9ae5c0e528@redhat.com> Message-ID: <349b1e3b-2de8-849c-1bd8-58cb61827639@redhat.com> On 18/01/19 11:38 PM, Chris Dent wrote: > On Fri, 18 Jan 2019, Zane Bitter wrote: > >> This seems like a good lead in to the feedback I have on the current >> role-of-the-TC document (which I already touched on in the review: >> https://review.openstack.org/622400). This discussion (which we've had >> many times in many forms) always drives me bananas, and here's why: >> >> It is *NOT* about "executive power"! > > I basically agree with you that leadership is the key factor and my > heart is with you on much of what you say throughout your message; > however, as much as "executive power" makes me cringe, it felt > necessary to introduce something else into the discussion to break > the cycle. 
We keep talking about needing leadership but then seem to > fail to do anything about it. My point was that this happens at least in part because we too often conflate leadership with "telling people what to do". > Throwing "power" into the mix is largely in response to my > observations and own personal experience that when a project or PTL > is either: > > * acting in bad faith, contrary to the wider vision, or holding an >   effective veto over a positive change much of the rest of the >   community wants > * feared that they might do any of those things in the prior point, >   even if they haven't demonstrated such > > the TC clams up, walks away, and tries to come at things from > another angle which won't cause a disruption to the fragile peace. We should assume that those kinds of situations come about due to people having different ideas about what OpenStack is supposed to be, rather than acting in bad faith or putting the wellbeing of their own project ahead of the whole community (which would be in contravention of our community principles: https://governance.openstack.org/tc/reference/principles.html#openstack-first-project-team-second-company-third). Under that assumption, I agree with you that it's important to force a conversation that leads to some resolution (after all, it's entirely possible that the project/PTL that is in conflict with the rest of the community is right!), rather than trying to paper over the issue. It's very difficult to tell somebody that they're acting "contrary to the wider vision" if you can't tell them what the wider vision is though. I'm hoping that having actually documented a vision for OpenStack clouds now, we have something to point to and ask "which part of this do you think should change?". It's... strange, if not exactly surprising, to me that facilitating those kinds of conversations (starting with making sure they happen) isn't something we have consensus on as being part of the TC's role. 
> So, in a bit of reverse psychology: If the TC can't control the > projects, maybe the projects should just be the TC? It's an interesting idea - and a great discussion - but ultimately if a PTL is not negotiating with the rest of the community now, what about putting them on the TC (presumably against their will, as many could run and quite likely win a seat already if they actually wanted) would prompt them to start? Noblesse oblige? I don't see a viable alternative to actually herding the cats, and getting the folks who are working in a different direction to articulate where they disagree and adjust course if necessary. (And if the TC does not do this, it will continue to remain un-done, because there is no other group that possesses the moral authority to try.) cheers, Zane. From massimo.sgaravatto at gmail.com Mon Jan 21 07:42:15 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Mon, 21 Jan 2019 08:42:15 +0100 Subject: [nova][ops] Problems disabling notifications on state changes Message-ID: I am disabling ceilometer on my OpenStack Ocata cloud. As fas as I understand, besides stopping the ceilometer services, I can also apply the following changes to nova.conf on the compute nodes: instance_usage_audit = True --> false notify_on_state_change = vm_and_task_state --> None I have a problem with the latter change: # grep ^notify_on_state_change /etc/nova/nova.conf notify_on_state_change=None but in the nova log: 2019-01-21 08:31:48.246 6349 ERROR nova ConfigFileValueError: Value for option notify_on_state_change is not valid: Valid values are [None, vm_state, vm_and_task_state], but found 'None' I have also tried setting notify_on_state_change= but it complains that it is not a valid value I can simply comment that line, but I am afraid there is a problem somewhere Thanks, Massimo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jean-philippe at evrard.me Mon Jan 21 09:10:00 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 21 Jan 2019 10:10:00 +0100 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: Message-ID: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> On Thu, 2019-01-17 at 15:40 +0100, Tobias Rydberg wrote: > Hi, > > Thanks a lot for pushing this Adrian and that etherpad is a really > good > start! > > I'm happy to help out champion this if that is of any use and if > it's > chosen as one of the community goals! > > Cheers > > Tobias Rydberg > Senior Developer > Twitter & IRC: tobberydberg > > www.citynetwork.eu | www.citycloud.com > > INNOVATION THROUGH OPEN IT INFRASTRUCTURE > ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED > Hello, It's a pleasure to see people so thrilled about a community goal :) I see the etherpad [0] explains an opinion on how to do things, but I don't see many answers there, so I am not sure if there is a consensus yet. I think it would be great to have a larger community feedback, or at least a API SIG feedback, analysing this pattern. Regards, Jean-Philippe Evrard [0]: https://etherpad.openstack.org/p/community-goal-project-deletion From zbitter at redhat.com Mon Jan 21 09:15:41 2019 From: zbitter at redhat.com (Zane Bitter) Date: Mon, 21 Jan 2019 22:15:41 +1300 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> Message-ID: <118e96b2-e8cf-0711-1ea0-6d6f23f34eae@redhat.com> On 18/01/19 1:57 AM, Doug Hellmann wrote: > Chris Dent writes: > >> On Thu, 17 Jan 2019, Zane Bitter wrote: >> >>>> Thus: What if the TC and PTLs were the same thing? 
Would it become >>>> more obvious that there's too much in play to make progress in a >>>> unified direction (on the thing called OpenStack), leading us to >>>> choose less to do, and choose more consistency and actionable >>>> leadership? And would it enable some power to execute on that >>>> leadership. >>> >>> I'm not sure we need to speculate, because as you know the TC and PTLs >>> literally were the same thing prior to 2014-ish. My recollection is that >>> there were pluses and minuses, but on the whole I don't think it had the >>> effect you're suggesting it might. >> >> Part and parcel of what I'm suggesting is that less stuff would be >> considered in the domain of "what do we do?" such that the tyranny of >> the old/existing projects that you describe is a feature not a bug, >> as an in-built constraint. >> >> It's not a future I really like, but it is one strategy for enabling >> moving in one direction: cut some stuff. Stop letting so many >> flowers bloom. >> >> Letting those flowers bloom is in the camp of "contribution in, >> all its many and diverse forms". > > What would you prune? As a frequent and loud advocate for allowing all of those new projects in, I feel like this is a good moment to take stock and consider whether I might have been mistaken to do so, if only to reassure other folks that they can attempt to answer the question without me yelling at them ;) I do think Chris offers a valid line of enquiry, even though (like him) I don't really like the future that it leads to. I would identify two classes of project that we might consider for pruning in this scenario. * There are a number of projects that in a perfect world would arguably be just a feature rather than a separate service. 
The general pattern was usually that they had to do something on the compute node that was easier *socially* to get implemented in a separate project; often they also had to do something in the control plane that could potentially have been handled by a combination of other services, but again it was easier to throw that code into the project too rather than force multiple hard dependencies on cloud operators that wanted the feature. Pruning these projects could in theory lead to a more technically justifiable design for the features they support, and help build a critical mass of users for the more generic control plane services (I'm thinking of e.g. Mistral) that might have been used by multiple features, instead of being effectively reimplemented in various hard-coded configurations by multiple projects. * There are a number of projects that proceeded a long way down the path despite containing fundamental design flaws due to workarounds for missing features in services they depended on. In at least one case, multiple companies toiled away diligently for years taking over from one another as each, successively, ran out of runway while still waiting for features to build a sustainable design on top of. In the meantime, we added them to OpenStack and encouraged/demanded that they spend a good fraction of their time and effort on not breaking existing users from release to release. Pruning these projects might folks interested in them the opportunity to forego backwards-compatibility in favour of ensuring the features they need are present first, and then rapidly iterating toward a long-term sustainable design. The problem I still see with this it is that we made all of these decisions for good reasons, which were about getting feedback. We encouraged projects to guarantee backwards compatibility because that's needed to get users to use it for real and give feedback. 
We added projects that depended on missing features in part to provide feedback to other teams on what features were needed. We added projects that were really features because users needed those features, and there was no other way to hear their feedback. Clearly in some cases, that was not enough. But it's very hard to see how we can get the features users want done with even _less_ feedback. It could be that we don't actually want to get those features done, but interestingly (and slightly surprisingly) during the technical vision exercise nobody suggested we delete every design goal except for "Basic Physical Data Center Management". (If you *do* think we should do that, please propose it as a patch so we can discuss it.) It seems like we all actually kinda agree on where we want to get, but some of the critical paths to getting there may be blocked by other priorities. At this point I actually wouldn't be too unhappy to see a reset, where we said OK we are not going to worry about this other stuff until we've re-architected the building blocks to operate in such a way that they can support all of the additional services we want. Especially if we had a specific plan for prioritising those aspects. But how are we going to get feedback on what exactly it is we need to do without folks in the community building those additional services and features, and users using them? That's not a rhetorical question; if you have ideas I'd like to hear them. cheers, Zane. From balazs.gibizer at ericsson.com Mon Jan 21 09:49:22 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 21 Jan 2019 09:49:22 +0000 Subject: [nova][ops] Problems disabling notifications on state changes In-Reply-To: References: Message-ID: <1548064159.1231.4@smtp.office365.com> On Mon, Jan 21, 2019 at 8:42 AM, Massimo Sgaravatto wrote: > I am disabling ceilometer on my OpenStack Ocata cloud. 
> > As fas as I understand, besides stopping the ceilometer services, I > can also apply the following changes to nova.conf on the compute > nodes: > > instance_usage_audit = True --> false > notify_on_state_change = vm_and_task_state --> None > > > I have a problem with the latter change: > > # grep ^notify_on_state_change /etc/nova/nova.conf > notify_on_state_change=None > > but in the nova log: > > 2019-01-21 08:31:48.246 6349 ERROR nova ConfigFileValueError: Value > for option notify_on_state_change is not valid: Valid values are > [None, vm_state, vm_and_task_state], but found 'None' > > > I have also tried setting > > notify_on_state_change= > > > but it complains that it is not a valid value > > I can simply comment that line, but I am afraid there is a problem > somewhere If you want to turn off all the notification sending from Nova, then I suggest adding the following [1] to nova.conf: [oslo_messaging_notifications] driver = noop I verified in a devstack that if you do not specify the notify_on_state_change attribute in the config file then it properly defaults to None. I think there is no way to specify the None value otherwise in the config file. Cheers, gibi [1] https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_notifications.driver > > Thanks, Massimo > From smooney at redhat.com Mon Jan 21 09:51:42 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 21 Jan 2019 09:51:42 +0000 Subject: Patch needs to be re-approved In-Reply-To: <32C216DF431DC842B4FB087109584B420473BC5B@shsmsx102.ccr.corp.intel.com> References: <32C216DF431DC842B4FB087109584B420473BC5B@shsmsx102.ccr.corp.intel.com> Message-ID: On Mon, 2019-01-21 at 06:31 +0000, Ni, Qi wrote: > Hey guys, > > Please take a look at my patch https://review.openstack.org/#/c/626109/. > > Since there are two code review +2 and workflow +1, why hasn’t it been merged yet? as far as i can tell it merged on january 18th at 18:19 so it merged 3 days ago.
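[As an aside on the notify_on_state_change error discussed above: the cause is simply that every value read from nova.conf is a string, so `notify_on_state_change=None` produces the four-character string 'None', which never equals the Python None in the option's list of valid choices; the None choice can only be selected by leaving the option out entirely. A minimal stdlib sketch of that check, not oslo.config's actual code:]

```python
# Values parsed from an INI-style config file are always strings.
# The option's choices list contains the Python object None, which
# no string read from a file can ever equal -- hence the error.
choices = [None, 'vm_state', 'vm_and_task_state']

def validate(raw_value):
    """Mimic the choices check; raw_value is the string from nova.conf."""
    if raw_value not in choices:
        raise ValueError(
            "Value for option notify_on_state_change is not valid: "
            "Valid values are [None, vm_state, vm_and_task_state], "
            "but found %r" % raw_value)
    return raw_value

validate('vm_and_task_state')   # a real string choice passes
try:
    validate('None')            # the literal string from the config file
except ValueError as e:
    print(e)                    # reproduces the error from the nova log
```

[So commenting the line out, as Massimo did, is the only way to get the None default; nothing you can write on the right-hand side of the `=` will parse to it.]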
did you link the incorrect patch? it can take a few hours for patches to merged as the gate job need to scheduled and run before the patch can be submited by zuul > > Best regards, > Nicky. > From balazs.gibizer at ericsson.com Mon Jan 21 10:45:09 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 21 Jan 2019 10:45:09 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> <1547834830.1231.1@smtp.office365.com> <=?utf-8?Q?=22Bal?= =?utf-8?Q?=C3=A1zs?= Gibizer"'s message of "Fri, 18 Jan 2019 18:07:16 +0000"> Message-ID: <1548067505.1231.5@smtp.office365.com> On Fri, Jan 18, 2019 at 7:40 PM, Dan Smith wrote: >> * There will be a second new microversion (probably in Train) that >> will >> enable move operations for server having resource aware ports. This >> microversion split will allow us not to block the create / delete >> support for the feature in Stein. >> >> * The new microversions will act as a feature flag in the code. This >> will allow merging single use cases (e.g.: server create with one >> ovs >> backed resource aware port) and functionally verifying it before the >> whole generic create use case is ready and enabled. >> >> * A nova-manage command will be provided to heal the port >> allocations >> without moving the servers if there is enough resource inventory >> available for it on the current host. This tool will only work >> online >> as it will call neutron and placement APIs. >> >> * Server move operations with the second new microversion will >> automatically heal the server allocation. 
> I wasn't on this call, so apologies if I'm missing something > important. > > Having a microversion that allows move operations for an instance > configured with one of these ports seems really terrible to me. What > exactly is the point of that? To distinguish between Stein and Train > systems purely because Stein didn't have time to finish the feature? I think in Stein we have time to finish the boot / delete use case of the feature but most probably do not have time to finish the move use cases. I believe that the boot / delete use case is already useful for end users. There are plenty of features in nova that are enabled before supporting all the cases, like move operations with NUMA. > > IMHO, we should really avoid abusing microversions for that sort of > thing. I would tend to err on the side of "if it's not ready, then > it's > not ready" for Stein, but I'm sure the desire to get this in (even if > partially) is too strong for that sort of restraint. Why is it an abuse of the microversion to use it to signal that a new use case is supported? I'm confused. I was asked to use microversions to signal that a feature is ready. So I'm not sure why in the case of a feature (feature == one or more use case(s)) it is OK to use a microversion but not OK when a use case (e.g. boot/delete) is completed. > > Can we not return 403 in Stein, since moving instances is disable-able > anyway, and just make it work in Train? Having a new microversion > with a > description of "nothing changed except we finished a feature so you > can > do this very obscure thing now" seems like we're just using them as an I think "nothing is changed" would not be true. Some operation (e.g. server move) that was rejected before (or even accepted but caused unintentional resource overallocation) now works properly. Isn't it the "you can do this very obscure thing now" documentation of a microversion that makes the new API behavior discoverable?
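[The pattern being debated here, gating a not-yet-finished operation on the request microversion, can be sketched roughly as follows. This is a hypothetical handler with a made-up version number, not nova's actual microversion plumbing:]

```python
# Hypothetical gate: move operations on servers with resource-aware
# ports are only allowed at or above the microversion that declares
# the feature complete. 2.72 here is invented for illustration.
MOVE_WITH_BANDWIDTH_PORTS = (2, 72)

def parse_version(header_value):
    """Parse an 'OpenStack-API-Version: compute 2.72' style value."""
    service, _, version = header_value.partition(' ')
    major, _, minor = version.partition('.')
    return (int(major), int(minor))

def migrate_server(server_has_bandwidth_ports, api_version_header):
    requested = parse_version(api_version_header)
    if server_has_bandwidth_ports and requested < MOVE_WITH_BANDWIDTH_PORTS:
        return 403   # rejected: not supported at this microversion
    return 202       # accepted

print(migrate_server(True, 'compute 2.71'))   # -> 403
print(migrate_server(True, 'compute 2.72'))   # -> 202
```

[Dan's objection, in these terms, is that the 403 branch exists only because the code was unfinished in Stein, while gibi's position is that the new version legitimately documents when the use case became supported.]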
> experimental feature flag, which was definitely not the intent. I know > returning 403 for "you can't do this right now" isn't *as* > discoverable, > but you kinda have to handle 403 for operations that could be disabled > anyway, so... The boot / delete use case would not be experimental, that would be final. 403 is a client error but in this case, in Stein, move operations would not be implemented yet. So for me that error is not a client error (e.g. there is no way a client can fix it) but a server error, like HTTP 501. Cheers, gibi > > --Dan From gmann at ghanshyammann.com Mon Jan 21 10:53:35 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 21 Jan 2019 19:53:35 +0900 Subject: [Interop-wg] [dev] [cinder] [qa] Strict Validation for Volume API using JSON Schema In-Reply-To: <16771ca2086.c57557fb28769.9145911937981210050@ghanshyammann.com> References: <16760426d56.ef4c345622903.2195899647060980382@ghanshyammann.com> <29d271ff-d5a7-2a28-53c1-3be7b868ad20@gmail.com> <8337FB0D-81D3-4E6C-9039-47BD749C3862@vmware.com> <16771ca2086.c57557fb28769.9145911937981210050@ghanshyammann.com> Message-ID: <168700b4bdd.f69941ed26006.3898869334284004658@ghanshyammann.com> ---- On Mon, 03 Dec 2018 10:58:51 +0900 Ghanshyam Mann wrote ---- > ---- On Sat, 01 Dec 2018 02:58:45 +0900 Mark Voelker wrote ---- > > > > > On Nov 29, 2018, at 9:28 PM, Matt Riedemann wrote: > > > > > > On 11/29/2018 10:17 AM, Ghanshyam Mann wrote: > > >> - To improve the volume API testing to avoid the backward compatible changes. Sometime we accidentally change the API in backward incompatible way and strict validation with JSON schema help to block those. > > > > > > +1 this is very useful to avoid unintentionally breaking the API. > > > > > >> We want to hear from cinder and interop team about any impact of this change to them. > > > > > > I'm mostly interested in what the interop WG would do about this given it's a potentially breaking change for interop without changes to the guidelines. 
Would there be some sort of grace period for clouds to conform to the changes in tempest? > > > > > > > That’s more or less what eventually happened when we began enforcing strict validation on Nova a few years ago after considerable debate. Clouds that were compliant with the interop guidelines before the strict validation patch landed and started failing once it went in could apply for a waiver while they worked on removing or upstreaming the nonstandard stuff. For those not familiar, here’s the patch that created a waiver program: > > > > https://review.openstack.org/#/c/333067/ > > > > Note that this expired with the 2017.01 Guideline: > > > > https://review.openstack.org/#/c/512447/ > > > > While not everyone was totally happy with the solution, it seemed to work out as a middle ground solution that helped get clouds on a better path in the end. I think we’ll discuss whether we’d need to do something like this again here. I’d love to hear: > > > > 1.) If anyone knows of clouds/products that would be fail interop testing because of this. Not looking to name and shame, just to get an idea of whether or not we have a concrete problem and how big it is. > > > > 2.) Opinions on how the waiver program went last time and whether the rest of the community feels like it’s something we should consider again. > > > > Personally I’m supportive of the general notion of improving API interoperability here…as usual it’s figuring out the mechanics of the transition that take a little figuring. =) > > Thanks Mark for response. I think point 1 is important, it is good to get the list of clouds or failure due to this this strict validation change. And accordingly, we can wait on Tempest side to merge those changes for this cycle (but personally I do not want to delay that if everything is fine), so that we can avoid the immediate failure of interop program. Any update/feedback from interop/cloud provider side on strict API validation ? 
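[For readers unfamiliar with what "strict validation" means in practice: the response schemas reject any field a cloud returns beyond the documented ones, which is why vendor extensions that previously passed can start failing. A minimal pure-Python sketch of the idea, not Tempest's actual JSON schema code:]

```python
# Documented fields of a hypothetical volume API response; in JSON
# Schema terms, strict validation sets additionalProperties: false.
schema = {
    'properties': {'id': str, 'size': int, 'status': str},
    'additionalProperties': False,
}

def validate(response, schema):
    """Reject undocumented keys and wrongly typed values."""
    for key, value in response.items():
        if key not in schema['properties']:
            if not schema['additionalProperties']:
                raise ValueError('undocumented field: %s' % key)
        elif not isinstance(value, schema['properties'][key]):
            raise ValueError('wrong type for field: %s' % key)

# A standard response passes:
validate({'id': 'abc', 'size': 10, 'status': 'available'}, schema)
try:
    # A vendor extension that passed under lax validation now fails:
    validate({'id': 'abc', 'size': 10, 'x-vendor-tier': 'gold'}, schema)
except ValueError as e:
    print(e)   # undocumented field: x-vendor-tier
```

[This is what makes the change valuable for catching accidental backward-incompatible API changes, and simultaneously what makes it a potential interop-guideline break for clouds returning nonstandard fields.]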
We are holding the Tempest patches to merge and waiting to hear form interop group. -gmann > > -gmann > > > > > At Your Service, > > > > Mark T. Voelker > > > > > > > -- > > > > > > Thanks, > > > > > > Matt > > > > > > _______________________________________________ > > > Interop-wg mailing list > > > Interop-wg at lists.openstack.org > > > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.openstack.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Finterop-wg&data=02%7C01%7Cmvoelker%40vmware.com%7C82a07fe28afe488c2eea08d6566b9734%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636791417437738014&sdata=lEx%2BbbTVzC%2FRC7ebmARDrFhfMsToM7Rwx8EKYtE7iFM%3D&reserved=0 > > > > > > > > From jean-philippe at evrard.me Mon Jan 21 10:58:41 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 21 Jan 2019 11:58:41 +0100 Subject: [all] Come and join us @ FOSDEM 2019, BXL, Belgium (Feb 2-3) Message-ID: Hello everyone, Like the last year(s), the OpenStack community has a booth during FOSDEM in Brussels! Everyone is welcome to come by, and a little help by taking a shift at our stand would be super appreciated! I've written the event's details in an Etherpad [0]. This year is a little different, as the booth will have more topics than usual: - We are now welcoming distributions to talk about their way to distribute OpenStack. You could see RDO information there, for example. - We will have marketing material for the OpenInfrastructure pilot projects too. I look forward to see you there! Regards, Jean-Philippe Evrard (evrardjp) [0]: https://etherpad.openstack.org/p/fosdem-2019 From skaplons at redhat.com Mon Jan 21 11:16:02 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 21 Jan 2019 12:16:02 +0100 Subject: [neutron][python3] Help needed Message-ID: Hi, Since some time we are trying to move our neutron-functional job to be run with python 3 - see patch [1]. 
Unfortunately we are still hitting same issue, some subunit parser error with message like "Parser Error: {{{Short read - got 4090 bytes, wanted 4376 bytes}}}”. See for example here: [2]. That cause that only around half of tests are executed and our job is failed. We found similar bug in Cinder [3] and we were trying to limit number of output produced by tests but this doesn’t help much. I don’t know how to move forward with this issue. Maybe someone more familiar with ostestr/subunit can take a look at it and help us to make this job finally working :) [1] https://review.openstack.org/#/c/577383/ [2] http://logs.openstack.org/83/577383/28/check/neutron-functional/649ba74/logs/testr_results.html.gz [3] https://bugs.launchpad.net/cinder/+bug/1728640 — Slawek Kaplonski Senior software engineer Red Hat From smooney at redhat.com Mon Jan 21 11:31:24 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 21 Jan 2019 11:31:24 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1548067505.1231.5@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> <1547052955.1128.1@smtp.office365.com> <1547486170.17957.0@smtp.office365.com> <1547834830.1231.1@smtp.office365.com> <=?utf-8?Q?=22Bal?= =?utf-8?Q?=C3=A1zs?= Gibizer"'s message of "Fri, 18 Jan 2019 18:07:16 +0000"> <1548067505.1231.5@smtp.office365.com> Message-ID: <1667691443b4159149b88595ebb13d8a51b0fffb.camel@redhat.com> On Mon, 2019-01-21 at 10:45 +0000, Balázs Gibizer wrote: > > On Fri, Jan 18, 2019 at 7:40 PM, Dan Smith wrote: > > > * There will be a second new microversion (probably in Train) that > > > will > > > enable move operations for server having resource aware ports. 
This > > > microversion split will allow us not to block the create / delete > > > support for the feature in Stein. > > > > > > * The new microversions will act as a feature flag in the code. This > > > will allow merging single use cases (e.g.: server create with one > > > ovs > > > backed resource aware port) and functionally verifying it before the > > > whole generic create use case is ready and enabled. > > > > > > * A nova-manage command will be provided to heal the port > > > allocations > > > without moving the servers if there is enough resource inventory > > > available for it on the current host. This tool will only work > > > online > > > as it will call neutron and placement APIs. > > > > > > * Server move operations with the second new microversion will > > > automatically heal the server allocation. > > > > I wasn't on this call, so apologies if I'm missing something > > important. > > > > Having a microversion that allows move operations for an instance > > configured with one of these ports seems really terrible to me. What > > exactly is the point of that? To distinguish between Stein and Train > > systems purely because Stein didn't have time to finish the feature? > > I think in Stein we have time to finish the boot / delete use case of > the feature but most probably do not have time to finish the move use > cases. I belive that the boot / delete use case is already useful for > end users. There are plenty of features in nova that are enabled before > supporting all the cases, like move operations with NUMA. that is true, however numa in particular was due to an oversight, not by design. as is the case with macvtap sriov, numa had intended to support live migration from its introduction even if they are only now being completed. numa, even without artom's work, has always supported cold migrations; the same is true of cpu pinning, hugepages, and pci/sriov pass-through. > > > > > IMHO, we should really avoid abusing microversions for that sort of > > thing.
I would tend to err on the side of "if it's not ready, then > > it's > > not ready" for Stein, but I'm sure the desire to get this in (even if > > partially) is too strong for that sort of restraint. > > Why it is an abuse of the microversion to use it to signal that a new > use case is supported? I'm confused. I was asked to use microversions > to signal that a feature is ready. So I'm not sure why in case of a > feature (feature == one ore more use case(s)) it is OK to use a > microversion but not OK when a use case (e.g. boot/delete) is completed. dan can speak for himself but i would assume because it does not signal that the use case is supported. it merely signals that the codebase could support it, as move operations can be disabled via config or may not be supported by the selected hypervisor (ironic), the presence of the microversion alone is not enough to determine the use case is supported. unlike neutron extensions, micro versions are not advertised individually and can't be enabled only when the deployment is configured to support a feature. > > > > > Can we not return 403 in Stein, since moving instances is disable-able > > anyway, and just make it work in Train? Having a new microversion > > with a > > description of "nothing changed except we finished a feature so you > > can > > do this very obscure thing now" seems like we're just using them as an > > I think "nothing is changed" would not be true. Some operation (e.g. > server move) that was rejected before (or even accepted but caused > unintentional resource overallocation) now works properly. since the min bandwidth before was best effort, any overallocation was not a bug or unintentional; it was allowed by design, given that we initially planned to delegate the bandwidth management to the sdn controller. as matt pointed out, the apis for creating qos rules and policies are admin only, as are most of the move operations.
a tenant could have chosen to apply the QOS policy but the admin had to create it in the first place. > Isn't it the > "you can do this very obscure thing now" documentation of a > microversion that makes the new API behavior discoverable? > > > experimental feature flag, which was definitely not the intent. I know > > returning 403 for "you can't do this right now" isn't *as* > > discoverable, > > but you kinda have to handle 403 for operations that could be disabled > > anyway, so... > > The boot / delete use case would not be experimental, that would be > final. > > 403 is a client error but in this case, in Stein, move operations would > not be implemented yet. So for me that error is not a client error > (e.g. there is no way a client can fix it) but a server error, like > HTTP 501. a 501 "not implemented" would be a valid error code to use with the new microversion that declares support for bandwidth based scheduling. resize today does not return 501 https://developer.openstack.org/api-ref/compute/?expanded=resize-server-resize-action-detail#resize-server-resize-action nor do shelve/unshelve https://developer.openstack.org/api-ref/compute/#shelve-server-shelve-action https://developer.openstack.org/api-ref/compute/#unshelve-restore-shelved-server-unshelve-action the same is true of migrate and live migrate https://developer.openstack.org/api-ref/compute/?expanded=#migrate-server-migrate-action https://developer.openstack.org/api-ref/compute/?expanded=#live-migrate-server-os-migratelive-action as such, for older microversions returning 501 would be incorrect as it's a change in the set of response codes that existing clients should expect from those endpoints. while i agree it is not a client error, being consistent with existing behavior would be preferable as clients presumably know how to deal with it.
> > Cheers, > gibi > > > > > --Dan > > From jean-philippe at evrard.me Mon Jan 21 11:31:32 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 21 Jan 2019 12:31:32 +0100 Subject: [releases] Additions to releases-core In-Reply-To: References: <20190115151123.GA1297@sm-workstation> Message-ID: <44951afdf919914866ee0a6dd6c2a0f7d3e5fa20.camel@evrard.me> On Thu, 2019-01-17 at 10:59 -0800, Kendall Nelson wrote: > Thanks to Sean and the rest of release management for this > opportunity! > > I am excited to be apart of the team and help out :) > > And congrats to you too JP :) We'll celebrate at FOSDEM ;) > > -Kendall (diablo_rojo) > > Likewise. It's a pleasure to join the team. I look forward to helping the team more and, if possible, onboarding more people into the wonderful life of release management. And... Congrats to you too Kendall! FOSDEM will indeed be THE occasion to chat about that (who said celebrate?!) around the beverage/food/game of your choice :p JP (evrardjp) From yikunkero at gmail.com Mon Jan 21 11:37:52 2019 From: yikunkero at gmail.com (Yikun Jiang) Date: Mon, 21 Jan 2019 19:37:52 +0800 Subject: [Nova] [Scheduler] Change Nova weight_multiplier method to accept host_state Message-ID: In blueprint [1], we approved allowing weight multipliers to be set on host aggregates, for more flexibility when scheduling. To let a weigher access the aggregate metadata, we propose adding a new parameter, "host_state", to the weight_multiplier method [2], just like what we currently do in the filters' host_passes [3]. That means that if you implement a custom weigher, you should change it to accept a host_state. Any thoughts or feedback are welcome, thanks!
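To illustrate the shape of the proposed change, a custom weigher would move from a no-argument multiplier to something like the sketch below (the class and attribute names are illustrative stand-ins, not the exact nova code):

```python
class MyWeigher:
    # before: def weight_multiplier(self): return 1.0
    def weight_multiplier(self, host_state):
        # With host_state available, the multiplier can honor a
        # per-aggregate metadata override and fall back to a default.
        for agg in getattr(host_state, "aggregates", []):
            if "my_weight_multiplier" in agg.metadata:
                return float(agg.metadata["my_weight_multiplier"])
        return 1.0

# Minimal stand-ins for nova's Aggregate and HostState objects:
class FakeAggregate:
    def __init__(self, metadata):
        self.metadata = metadata

class FakeHostState:
    def __init__(self, aggregates):
        self.aggregates = aggregates

state = FakeHostState([FakeAggregate({"my_weight_multiplier": "2.0"})])
print(MyWeigher().weight_multiplier(state))  # 2.0
print(MyWeigher().weight_multiplier(FakeHostState([])))  # 1.0
```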
: ) [1] https://blueprints.launchpad.net/nova/+spec/per-aggregate-scheduling-weight [2] https://review.openstack.org/#/c/628163/12/nova/weights.py at 79 [3] https://github.com/openstack/nova/blob/master/nova/scheduler/filters/__init__.py#L46 Regards, Yikun ---------------------------------------- Jiang Yikun(Kero) Mail: yikunkero at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jan 21 11:42:24 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 21 Jan 2019 12:42:24 +0100 Subject: [magnum][queens] issues Message-ID: I am trying the patches you just released for magnum (git fetch git://git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git checkout FETCH_HEAD) and I got the same issues with the proxy. In the old version, with the help of Spyros, I modified the scripts under /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments because the PROXY variables are not inherited from /etc/sysconfig/heat-params. The PROXY and NO_PROXY variables are present, but we must modify configure-kubernetes-master.sh to force them: . /etc/sysconfig/heat-params echo "configuring kubernetes (master)" _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} export HTTP_PROXY=${HTTP_PROXY} export HTTPS_PROXY=${HTTPS_PROXY} export NO_PROXY=${NO_PROXY} echo "HTTP_PROXY IS ${HTTP_PROXY}" After exporting the above variables, when the external network has a proxy the master is installed, but the stack hangs creating the kube master Resource Group. Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jan 21 11:53:40 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 21 Jan 2019 12:53:40 +0100 Subject: [manila-ui][queens] cannot create share snapshot Message-ID: Hello, I installed manila on CentOS Queens, but it is not possible to create a share snapshot from the UI: Danger: An error occurred. Please try again later.
It works from the command line. Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Mon Jan 21 12:03:48 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 21 Jan 2019 12:03:48 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <118e96b2-e8cf-0711-1ea0-6d6f23f34eae@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> <118e96b2-e8cf-0711-1ea0-6d6f23f34eae@redhat.com> Message-ID: On Mon, 21 Jan 2019, Zane Bitter wrote: > It could be that we don't actually want to get those features done, but > interestingly (and slightly surprisingly) during the technical vision > exercise nobody suggested we delete every design goal except for "Basic > Physical Data Center Management". (If you *do* think we should do that, > please propose it as a patch so we can discuss it.) It seems like we all > actually kinda agree on where we want to get, but some of the critical paths > to getting there may be blocked by other priorities. Moving forward on "all actually kinda agree" is pretty much all we can do. I often fear that the lack of widespread engagement with threads like this one, and with reviews like the original vision one or your (Zane's) clarifications [1], represents disinterest. Thus, who is the "all"? However, as you've pointed out, the nature of TC elections means that the TC is the only community-wide representative body. If people don't speak up in the individual cases, then we don't really have any choice but to assume they have spoken when they elected the people they did. Do that many people vote? Thus my continuous pleas for input. But it's okay. We've had some interesting discussion here, and that's useful.
I'm not sure I'm able to make any concrete conclusions about what people want the role of the TC to be, other than that the same people who have always wanted it to be a reactive governance org still do, and the same people who have always wanted it to be an active leadership org still do. I guess those in the latter camp simply need to get on with it. [1] https://review.openstack.org/#/c/631435/ -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From dabarren at gmail.com Mon Jan 21 12:24:46 2019 From: dabarren at gmail.com (Eduardo Gonzalez) Date: Mon, 21 Jan 2019 13:24:46 +0100 Subject: Why COA exam is being retired? Message-ID: Reading the info on the COA site [0], it says the following: "The OpenStack Foundation is winding down the administration of the COA exam". Is there any reason for retiring the exam? I've tried to find a notice in the mailing list but have not found anything at all. [0] https://www.openstack.org/coa/ Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From adjandeye.sylla at orange.com Mon Jan 21 12:45:55 2019 From: adjandeye.sylla at orange.com (adjandeye.sylla at orange.com) Date: Mon, 21 Jan 2019 12:45:55 +0000 Subject: Question about Heat parser Message-ID: <12561_1548074756_5C45BF04_12561_456_1_C4EB90C743A33246994EC9C93505F19901928383@OPEXCAUBM32.corporate.adroot.infra.ftgroup> Dear all, In the context of my post-doc research, I'm working with HOT templates and I want to parse them. Instead of developing a new parser, I want to reuse the one that is used by Heat, but I don't want to connect to a running OpenStack to do the parsing. Is it possible to use Heat as a standalone parser (i.e. without connecting to a running OpenStack)? Thank you for your answer.
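For what it's worth, HOT templates can be written in JSON as well as YAML, so a purely structural pass needs nothing beyond a standard YAML/JSON library. This is only a sketch of that fallback (plain tree-walking, none of Heat's validation), not an answer about Heat's own parser:

```python
import json

hot = json.loads("""
{
  "heat_template_version": "2018-08-31",
  "parameters": {"flavor": {"type": "string"}},
  "resources": {
    "server": {
      "type": "OS::Nova::Server",
      "properties": {"flavor": {"get_param": "flavor"}}
    }
  }
}
""")

# Walk the parsed tree, e.g. to collect the resource types in use.
resource_types = sorted(r["type"] for r in hot["resources"].values())
print(resource_types)  # ['OS::Nova::Server']
```

The same walk works for YAML-format templates after a yaml.safe_load; whether Heat's own parsing and validation modules can be imported without a running cloud is exactly the open question here.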
Best regards, Adja SYLLA _________________________________________________________________________________________________________________________ This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From qi.ni at intel.com Mon Jan 21 06:23:42 2019 From: qi.ni at intel.com (Ni, Qi) Date: Mon, 21 Jan 2019 06:23:42 +0000 Subject: Patch needs to be re-approved. Message-ID: <32C216DF431DC842B4FB087109584B420473BC3B@shsmsx102.ccr.corp.intel.com> Hi, my patch https://review.openstack.org/#/c/623401/13 has passed the code review, but according to the Developer's Guide it will not be automatically merged when the dependent change has merged. It now needs another approval or a toggle of the approval. Please take a look at it. Thank you. Best Regards, Nicky. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From iurygregory at gmail.com Mon Jan 21 13:33:51 2019 From: iurygregory at gmail.com (Iury Gregory) Date: Mon, 21 Jan 2019 14:33:51 +0100 Subject: [ironic] Validation of image href secreturl failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request In-Reply-To: References: Message-ID: Hi Zufar, Since you are trying to use Agent deploy, you need to use HTTPS instead of HTTP. Try to run the services under HTTPS and see if the problem still happens. On Sat, Jan 19, 2019 at 11:06, Zufar Dhiyaulhaq < zufar at onf-ambassador.org> wrote: > Hi, > > I get some error when trying to create an instance for bare-metal node. > Bellow is my troubleshooting. I don't know what is happening. Any > suggestions? > > *Ironic Error Log*: > 2019-01-19 15:36:41.232 15780 ERROR ironic.drivers.modules.deploy_utils > [req-200dac66-0995-41c4-8c8c-dff053d27e36 499299da0c284a4ba9214ea0d83867cc > 62088a869020430392a4fb1a0c5d2863 - default default] Agent deploy supports > only HTTP(S) URLs as instance_info['image_source'] or swift temporary URL. > Either the specified URL is not a valid HTTP(S) URL or is not reachable for > node 6c20755a-e36b-495a-98e1-a40f58e5ac3c. Error: Validation of image href > secreturl failed, reason: Got HTTP code 404 instead of 200 in response to > HEAD request.: ImageRefValidationFailed: Validation of image href secreturl > failed, reason: Got HTTP code 404 instead of 200 in response to HEAD > request. > > 2019-01-19 15:36:41.233 15780 ERROR ironic.conductor.manager > [req-200dac66-0995-41c4-8c8c-dff053d27e36 499299da0c284a4ba9214ea0d83867cc > 62088a869020430392a4fb1a0c5d2863 - default default] Error while preparing > to deploy to node 6c20755a-e36b-495a-98e1-a40f58e5ac3c: Validation of image > href secreturl failed, reason: Got HTTP code 404 instead of 200 in response > to HEAD request.: ImageRefValidationFailed: Validation of image href > secreturl failed, reason: Got HTTP code 404 instead of 200 in response to > HEAD request.
> > *Nova Error Log:* > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall [-] Fixed > interval looping call > 'nova.virt.ironic.driver.IronicDriver._wait_for_active' failed: > InstanceDeployFailure: Failed to provision instance 3$ > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall Traceback > (most recent call last): > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall File > "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 137, > in _run_loop > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall result = > func(*self.args, **self.kw) > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall File > "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 505, in > _wait_for_active > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall raise > exception.InstanceDeployFailure(msg) > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall > InstanceDeployFailure: Failed to provision instance > 38c276b1-b88a-4f4b-924b-8b52377f3145: Failed to prepare to deploy: > Validation of image href secre$ > 2019-01-19 16:35:52.639 13355 ERROR oslo.service.loopingcall > 2019-01-19 16:35:52.640 13355 ERROR nova.virt.ironic.driver > [req-e11c3fcc-2066-49c6-b47b-0e3879840ad0 7ad46602ac42417a8c798c69cb3105e5 > f3bb39ae2e0946e1bbf812bcde6e08a7 - default default] Error deploying instanc$ > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager > [req-e11c3fcc-2066-49c6-b47b-0e3879840ad0 7ad46602ac42417a8c798c69cb3105e5 > f3bb39ae2e0946e1bbf812bcde6e08a7 - default default] [instance: > 38c276b1-b88a-4$ > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] Traceback (most recent call last): > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2252, in > _build_resources > 2019-01-19 16:35:52.641 13355 ERROR 
nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] yield resources > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2032, in > _build_and_run_instance > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] > block_device_info=block_device_info) > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1136, > in spawn > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] 'node': node_uuid}) > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in > __exit__ > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] self.force_reraise() > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in > force_reraise > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] six.reraise(self.type_, > self.value, self.tb) > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 1128, > in spawn > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] > timer.start(interval=CONF.ironic.api_retry_interval).wait() > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > 
"/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] return hubs.get_hub().switch() > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] return self.greenlet.switch() > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 137, > in _run_loop > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] result = func(*self.args, ** > self.kw) > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] File > "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 505, in > _wait_for_active > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] raise > exception.InstanceDeployFailure(msg) > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] InstanceDeployFailure: Failed to > provision instance 38c276b1-b88a-4f4b-924b-8b52377f3145: Failed to prep$ > 2019-01-19 16:35:52.641 13355 ERROR nova.compute.manager [instance: > 38c276b1-b88a-4f4b-924b-8b52377f3145] > > *Ironic Configuration:* > [DEFAULT] > enabled_drivers=pxe_ipmitool > enabled_hardware_types = ipmi > log_dir=/var/log/ironic > transport_url=rabbit://guest:guest at 10.60.60.10:5672/ > auth_strategy=keystone > notification_driver = messaging > > [conductor] > send_sensor_data = true > automated_clean=true > > [swift] > region_name = RegionOne > project_domain_id = default > user_domain_id = 
default > project_name = services > password = IRONIC_PASSWORD > username = ironic > auth_url = http://10.60.60.10:5000/v3 > auth_type = password > > [pxe] > tftp_root=/tftpboot > tftp_server=10.60.60.10 > ipxe_enabled=True > pxe_bootfile_name=undionly.kpxe > uefi_pxe_bootfile_name=ipxe.efi > pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template > uefi_pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template > pxe_append_params=coreos.autologin > #ipxe_use_swift=True > > [agent] > image_download_source = http > > [deploy] > http_root=/httpboot > http_url=http://10.60.60.10:8088 > > [service_catalog] > insecure = True > auth_uri=http://10.60.60.10:5000/v3 > auth_type=password > auth_url=http://10.60.60.10:35357 > project_domain_id = default > user_domain_id = default > project_name = services > username = ironic > password = IRONIC_PASSWORD > region_name = RegionOne > > [database] > connection=mysql+pymysql:// > ironic:IRONIC_DBPASSWORD at 10.60.60.10/ironic?charset=utf8 > > [keystone_authtoken] > auth_url=http://10.60.60.10:35357 > www_authenticate_uri=http://10.60.60.10:5000 > auth_type=password > username=ironic > password=IRONIC_PASSWORD > user_domain_name=Default > project_name=services > project_domain_name=Default > > [neutron] > www_authenticate_uri=http://10.60.60.10:5000 > auth_type=password > auth_url=http://10.60.60.10:35357 > project_domain_name=Default > project_name=services > user_domain_name=Default > username=ironic > password=IRONIC_PASSWORD > cleaning_network = 461a6663-e015-4ecf-9076-d1b502c3db25 > provisioning_network = 461a6663-e015-4ecf-9076-d1b502c3db25 > > [glance] > region_name = RegionOne > project_domain_id = default > user_domain_id = default > project_name = services > password = IRONIC_PASSWORD > username = ironic > auth_url = http://10.60.60.10:5000/v3 > auth_type = password > temp_url_endpoint_type = swift > swift_endpoint_url = http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s > swift_account = 
AUTH_f3bb39ae2e0946e1bbf812bcde6e08a7 > swift_container = glance > swift_temp_url_key = secret > > *Temp-URL enable:* > [root at zu-controller0 ~(keystone_admin)]# openstack object store account > show > +------------+---------------------------------------+ > | Field | Value | > +------------+---------------------------------------+ > | Account | AUTH_f3bb39ae2e0946e1bbf812bcde6e08a7 | > | Bytes | 996 | > | Containers | 1 | > | Objects | 1 | > | properties | Temp-Url-Key='secret' | > +------------+---------------------------------------+ > > *Swift Endpoint:* > [root at zu-controller0 ~(keystone_admin)]# openstack endpoint list | grep > swift > | 07e9d544a44241f5b317f651dce5f0a4 | RegionOne | swift | > object-store | True | public | > http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | > | dadfd168384542b0933fe41df87d9dc8 | RegionOne | swift | > object-store | True | internal | > http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | > | e53aca9d357542868516d367a0bf13a6 | RegionOne | swift | > object-store | True | admin | > http://10.60.60.10:8080/v1/AUTH_%(tenant_id)s | > > > Best Regards, > Zufar Dhiyaulhaq > -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From iurygregory at gmail.com Mon Jan 21 13:40:10 2019 From: iurygregory at gmail.com (Iury Gregory) Date: Mon, 21 Jan 2019 14:40:10 +0100 Subject: Patch needs to be re-approved. In-Reply-To: <32C216DF431DC842B4FB087109584B420473BC3B@shsmsx102.ccr.corp.intel.com> References: <32C216DF431DC842B4FB087109584B420473BC3B@shsmsx102.ccr.corp.intel.com> Message-ID: Hi Ni,Qi The last message from Zuul on your patch ( https://review.openstack.org/#/c/623401 ) says it's merged. 
Also, you can check the repository on git and you will see that it's merged ( https://github.com/openstack/neutron-lib/commits/master and https://github.com/openstack/neutron-lib/commit/e70f02a3e2c159518fb89c74ee6fbe1467934938 ) On Mon, Jan 21, 2019 at 14:24, Ni, Qi wrote: > > > Hi, > > my patch https://review.openstack.org/#/c/623401/13 has passed the code > review but according to Developer’s Guide, it will not automatically > merged when the dependent change has merged. It now needs another approval > or a toggle of the approval. Please take a look at it. Thank you. > > > > Best Regards, > > Nicky. > > > -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Mon Jan 21 14:34:08 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Mon, 21 Jan 2019 08:34:08 -0600 Subject: [heystone][nova][cinder] netapp queens trust scoped token In-Reply-To: References: Message-ID: Yes, they were originally implemented during the Queens release [0]. [0] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/queens/application-credentials.html On Sat, Jan 19, 2019 at 1:45 AM Ignazio Cassano wrote: > Hello Lance, application credential are also supported in queens? > Thanks > Ignazio > > > On Fri, Jan 18, 2019 at 16:04, Lance Bragstad > wrote: > >> >> >> On Fri, Jan 18, 2019 at 6:49 AM Ignazio Cassano >> wrote: >> >>> Hello Everyone, >>> I am using a client for backupping openstack virtual machine. >>> The crux of the problem is the client uses trust based authentication >>> for scheduling backup jobs on behalf of user.
When a trust scoped token is >>> passed to cinder client to take a snapshot, I expect the client use the >>> token to authenticate and perform the operation which cinder client does. >>> However cinder volume service invokes novaclient as part of cinder nfs >>> backend snapshot operation and novaclient tries to re-authenticate. Since >>> keystone does not allow re-authentication using trust based tokens, cinder >>> snapshot operation fails. >>> >>> >> Keystone allows authorization by allowing users to scope tokens to >> trusts, but once a trust-token is scoped, it can't be rescoped. Instead, >> keystone requires that you build another authentication request for a new >> trust-scoped token using the trust [0][1]. >> >> [0] >> https://developer.openstack.org/api-ref/identity/v3-ext/index.html?expanded=consuming-a-trust-detail#id121 >> [1] >> https://git.openstack.org/cgit/openstack/keystone/tree/keystone/auth/plugins/token.py#n84 >> >> >>> So I get the following error: >>> >>> 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs [req-61d977c1-eef3-4309-ac02-aaa0eb880925 >>> ab1bdb5dadc54312891f3a6410fef04d 6d1bffb04e3b4cdda30dc17aa96bfffc - default >>> default] Call to Nova to create snapshot failed: Forbidden: You are not >>> authorized to perform the requested action: Using trust-scoped token to >>> create another token. Create a new trust-scoped token instead. (HTTP 403) >>> (Request-ID: req-f55b682a-001b-4952-bfb1-abf4dd6bf459) >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs Traceback >>> (most recent call last): >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/ remotefs.py", >>> line 1452, in _create_snapshot_online >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> connection_info) >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. 
remotefs >>> File "/usr/lib/python2.7/site-packages/cinder/compute/nova.py", line 188, >>> in create_volume_snapshot >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> create_info=create_info) >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> File >>> "/usr/lib/python2.7/site-packages/novaclient/v2/assisted_volume_snapshots.py", >>> line 43, in create >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> return self._create('/os-assisted-volume-snapshots', body, 'snapshot') >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in >>> _create >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> resp, body = self.api.client.post(url, body=body) >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 310, >>> in post >>> >>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>> return self.request(url, 'POST', **kwargs) >>> >>> >>> My cinder volume are non netapp fas8040 via nfs. >>> >>> Anyone can help me ? >>> >> >> There is another feature in keystone that sounds like a better fit for >> what you're trying to do, called application credentials [2]. Application >> credentials were written as a way for users to grant authorization to >> services and scripts (e.g., the scipt asking cinder for a snapshot in your >> case.) >> >> Application credentials aren't tokens, but your scripts can use them to >> authenticate for a token [3]. The keystoneauth library already supports >> application credentials, so if you use that for building a session you >> should be able to use it in other clients that already support keystoneauth >> [4]. 
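As a concrete illustration (all values hypothetical), a clouds.yaml entry that authenticates with an application credential instead of a password looks roughly like this:

```yaml
clouds:
  backup-script:
    auth_type: v3applicationcredential
    auth:
      auth_url: https://keystone.example.com:5000/v3
      application_credential_id: APP_CRED_ID
      application_credential_secret: APP_CRED_SECRET
    region_name: RegionOne
```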
>> >> [2] >> https://docs.openstack.org/keystone/latest/user/application_credentials.html >> [3] >> https://docs.openstack.org/keystone/latest/user/application_credentials.html#using-application-credentials >> [4] >> https://docs.openstack.org/keystoneauth/latest/authentication-plugins.html#application-credentials >> >>> >>> Regards >>> >>> Ignazio >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jan 21 15:39:58 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 21 Jan 2019 16:39:58 +0100 Subject: [heystone][nova][cinder] netapp queens trust scoped token In-Reply-To: References: Message-ID: Many thanks Ignazio Il giorno lun 21 gen 2019 alle ore 15:39 Lance Bragstad ha scritto: > Yes, they were originally implemented during the Queens release [0]. > > [0] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/queens/application-credentials.html > > On Sat, Jan 19, 2019 at 1:45 AM Ignazio Cassano > wrote: > >> Hello Lance, application credential are also supported in queens? >> Thanks >> Ignazio >> >> >> Il giorno Ven 18 Gen 2019 16:04 Lance Bragstad ha >> scritto: >> >>> >>> >>> On Fri, Jan 18, 2019 at 6:49 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Hello Everyone, >>>> I am using a client for backupping openstack virtual machine. >>>> The crux of the problem is the client uses trust based authentication >>>> for scheduling backup jobs on behalf of user. When a trust scoped token is >>>> passed to cinder client to take a snapshot, I expect the client use the >>>> token to authenticate and perform the operation which cinder client does. >>>> However cinder volume service invokes novaclient as part of cinder nfs >>>> backend snapshot operation and novaclient tries to re-authenticate. Since >>>> keystone does not allow re-authentication using trust based tokens, cinder >>>> snapshot operation fails. 
>>>> >>>> >>> Keystone allows authorization by allowing users to scope tokens to >>> trusts, but once a trust-token is scoped, it can't be rescoped. Instead, >>> keystone requires that you build another authentication request for a new >>> trust-scoped token using the trust [0][1]. >>> >>> [0] >>> https://developer.openstack.org/api-ref/identity/v3-ext/index.html?expanded=consuming-a-trust-detail#id121 >>> [1] >>> https://git.openstack.org/cgit/openstack/keystone/tree/keystone/auth/plugins/token.py#n84 >>> >>> >>>> So I get the following error: >>>> >>>> 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs [req-61d977c1-eef3-4309-ac02-aaa0eb880925 >>>> ab1bdb5dadc54312891f3a6410fef04d 6d1bffb04e3b4cdda30dc17aa96bfffc - default >>>> default] Call to Nova to create snapshot failed: Forbidden: You are not >>>> authorized to perform the requested action: Using trust-scoped token to >>>> create another token. Create a new trust-scoped token instead. (HTTP 403) >>>> (Request-ID: req-f55b682a-001b-4952-bfb1-abf4dd6bf459) >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs Traceback >>>> (most recent call last): >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/ remotefs.py", >>>> line 1452, in _create_snapshot_online >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> connection_info) >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> File "/usr/lib/python2.7/site-packages/cinder/compute/nova.py", line 188, >>>> in create_volume_snapshot >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> create_info=create_info) >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. 
remotefs >>>> File >>>> "/usr/lib/python2.7/site-packages/novaclient/v2/assisted_volume_snapshots.py", >>>> line 43, in create >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> return self._create('/os-assisted-volume-snapshots', body, 'snapshot') >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 361, in >>>> _create >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> resp, body = self.api.client.post(url, body=body) >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 310, >>>> in post >>>> >>>> > 2018-12-12 04:07:14.759 2859 ERROR cinder.volume.drivers. remotefs >>>> return self.request(url, 'POST', **kwargs) >>>> >>>> >>>> My cinder volume are non netapp fas8040 via nfs. >>>> >>>> Anyone can help me ? >>>> >>> >>> There is another feature in keystone that sounds like a better fit for >>> what you're trying to do, called application credentials [2]. Application >>> credentials were written as a way for users to grant authorization to >>> services and scripts (e.g., the scipt asking cinder for a snapshot in your >>> case.) >>> >>> Application credentials aren't tokens, but your scripts can use them to >>> authenticate for a token [3]. The keystoneauth library already supports >>> application credentials, so if you use that for building a session you >>> should be able to use it in other clients that already support keystoneauth >>> [4]. 
>>> >>> [2] >>> https://docs.openstack.org/keystone/latest/user/application_credentials.html >>> [3] >>> https://docs.openstack.org/keystone/latest/user/application_credentials.html#using-application-credentials >>> [4] >>> https://docs.openstack.org/keystoneauth/latest/authentication-plugins.html#application-credentials >>> >>>> >>>> Regards >>>> >>>> Ignazio >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pkovar at redhat.com Mon Jan 21 16:36:17 2019 From: pkovar at redhat.com (Petr Kovar) Date: Mon, 21 Jan 2019 17:36:17 +0100 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <5C41EE3D.9060701@openstack.org> References: <20190118144233.132eb0e427389da15e725141@redhat.com> <20190118143906.2qqarb5xere4zorw@yuggoth.org> <5C41EE3D.9060701@openstack.org> Message-ID: <20190121173617.e3f6336543860f49e0f5b97b@redhat.com> Just updated the perms in Gerrit for the openstack-doc-core group. Welcome back, Alex. pk On Fri, 18 Jan 2019 09:18:21 -0600 Jimmy McArthur wrote: > Go Alex!!!! > > > Sean Mooney > > January 18, 2019 at 9:07 AM > > same its nice to see you return alex. i also cant vote on you returning > > to a core position on the docs team but i think you did great > > work before and would continue to have a good impact if the nomination > > is carried. > > > > > > Alexandra Settle > > January 18, 2019 at 8:47 AM > > > > Awww thanks Jeremy! I've missed everyone so much! :D > > > > > > ------------------------------------------------------------------------ > > *From:* Jeremy Stanley > > *Sent:* 18 January 2019 14:39 > > *To:* openstack-discuss at lists.openstack.org > > *Subject:* Re: [docs] Nominating Alex Settle for openstack-doc-core > > On 2019-01-18 14:42:33 +0100 (+0100), Petr Kovar wrote: > > > Alex Settle recently re-joined the Documentation Project after a > > > few-month break. 
It's great to have her back and I want to > > > formally nominate her for membership in the openstack-doc-core > > > team, to follow the formal process for cores. > > > > > > Please let the ML know should you have any objections. > > > > I'm in no way core on Docs, but I still wanted to take the > > opportunity to welcome Alex back. You've been sorely missed! > > -- > > Jeremy Stanley > From jungleboyj at gmail.com Mon Jan 21 17:16:47 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 21 Jan 2019 11:16:47 -0600 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: <9b602908-c2f4-af45-cb0a-490e1d50b0b8@gmail.com> Wonderful news!  Welcome back Alex! Jay On 1/18/2019 7:42 AM, Petr Kovar wrote: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. > > Please let the ML know should you have any objections. > > Thanks, > pk > > From jungleboyj at gmail.com Mon Jan 21 17:16:59 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 21 Jan 2019 11:16:59 -0600 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: <376cf6a6-ae31-d6e2-41d2-0fa36061df9c@redhat.com> References: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> <376cf6a6-ae31-d6e2-41d2-0fa36061df9c@redhat.com> Message-ID: > I love this idea. > > Looking at https://docs.openstack.org/upstream-training/ it seems that > the syllabus for Upstream Institute is (or will eventually be) > effectively the Contributor Guide, so a good first step would be to > link to the vision from the Contributor Guide: > https://review.openstack.org/631366 > > Any idea what else would be involved in making this happen? > > thanks, > Zane. 
> > Zane, I help with OUI so I can make sure when we do the next review of things that we include this in the syllabus. Thanks! Jay From ignaziocassano at gmail.com Mon Jan 21 17:32:42 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 21 Jan 2019 18:32:42 +0100 Subject: [magnum][queens] issues In-Reply-To: References: Message-ID: I think that the script used to write /etc/sysconfig/heat-params should insert an export for every variable it initializes. In any case, the resource group worked fine before applying the last patch. What has changed? Thanks in advance for any help. Regards Ignazio On Mon, Jan 21, 2019 at 12:42 Ignazio Cassano wrote: > I am trying the patches you just released for magnum (git fetch git:// > git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git > checkout FETCH_HEAD) > I got the same issues with the proxy. In the old version I modified, with the help of > spyros, the scripts under > /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments > because PROXY variables are not inherited. > In /etc/sysconfig/heat-params the PROXY and NO_PROXY variables are present, but > we must modify configure-kubernetes-master.sh to force > them: > . /etc/sysconfig/heat-params > echo "configuring kubernetes (master)" > _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} > export HTTP_PROXY=${HTTP_PROXY} > export HTTPS_PROXY=${HTTPS_PROXY} > export NO_PROXY=${NO_PROXY} > echo "HTTP_PROXY IS ${HTTP_PROXY}" > After exporting the above variables, when the external network has a proxy, the > master is installed but the stack hangs creating the kube master Resource Group > > Regards > Ignazio > > -------------- next part -------------- An HTML attachment was scrubbed... 
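The underlying problem described above -- variables assigned by sourcing /etc/sysconfig/heat-params are visible to the shell but not exported to child processes -- can also be handled generically with the POSIX `set -a` (allexport) option, so you don't have to list each variable by hand. This is a sketch of the mechanism only, not the actual magnum fragment; the file path and values are placeholders:

```shell
#!/bin/sh
# Plain `. file` sets shell variables but does NOT export them, so child
# processes (docker, kubelet, etc.) never see them. Wrapping the source
# in `set -a` ... `set +a` auto-exports every assignment.

cat > /tmp/heat-params.example <<'EOF'
HTTP_PROXY=http://proxy.example.com:3128
NO_PROXY=127.0.0.1,localhost
EOF

set -a                        # auto-export all subsequent assignments
. /tmp/heat-params.example
set +a

# A child process can now see the variables:
sh -c 'echo "child sees HTTP_PROXY=${HTTP_PROXY}"'
```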
URL: From grant at absolutedevops.io Mon Jan 21 17:40:52 2019 From: grant at absolutedevops.io (Grant Morley) Date: Mon, 21 Jan 2019 17:40:52 +0000 Subject: Read Only FS after ceph issue Message-ID: Hi all, We are in the process of retiring one of our old platforms and last night our ceph cluster went into an "Error" state briefly because 1 of the OSDs went close to full. The data got re-balanced fine and the health of ceph is now "OK" - however we have about 40% of our instances that now have corrupt disks which is a bit odd. Even more strange is that we cannot get them into rescue mode. As soon as we try we the instances seem to hang during the bootup process when they are trying to mount "/dev/vdb1" and we eventually get a kernel timeout error as below: Warning: fsck not present, so skipping root file system [ 5.644526] EXT4-fs (vdb1): INFO: recovery required on readonly filesystem [ 5.645583] EXT4-fs (vdb1): write access will be enabled during recovery [ 240.504873] INFO: task exe:332 blocked for more than 120 seconds. [ 240.506986] Not tainted 4.4.0-66-generic #87-Ubuntu [ 240.508782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 240.511438] exe D ffff88003714b878 0 332 1 0x00000000 [ 240.513809] ffff88003714b878 ffff88007c18e358 ffffffff81e11500 ffff88007be81c00 [ 240.516665] ffff88003714c000 ffff88007fc16dc0 7fffffffffffffff ffffffff81838cd0 [ 240.519546] ffff88003714b9d0 ffff88003714b890 ffffffff818384d5 0000000000000000 [ 240.522399] Call Trace: I have even tried using a different image for nova rescue and we are getting the same results. Has anyone come across this before? This system is running OpenStack Mitaka with Ceph Jewel. Any help or suggestions will be much appreciated. 
Regards, -- Grant Morley Cloud Lead Absolute DevOps Ltd Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP www.absolutedevops.io grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Fox at pnnl.gov Mon Jan 21 17:43:22 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Mon, 21 Jan 2019 17:43:22 +0000 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: <118e96b2-e8cf-0711-1ea0-6d6f23f34eae@redhat.com> References: <20190114201650.GA6655@sm-workstation> <3e46b54c-5e91-7e11-450c-c8dd155d987e@gmail.com> <1A3C52DFCD06494D8528644858247BF01C27E707@EX10MBOX03.pnnl.gov> <1e897f61-8f43-b7cb-bd69-42a1577fd575@redhat.com> , <118e96b2-e8cf-0711-1ea0-6d6f23f34eae@redhat.com> Message-ID: <1A3C52DFCD06494D8528644858247BF01C286C02@EX10MBOX03.pnnl.gov> I like the idea for which things to prune. Sounds reasonable to me. For the most part they were technological implementations around a social/governance problem. They can't be pruned until that is resolved. I was pushed that way too for my issues and rather than spawn a new project, I just gave up on creating a new project. So it has gone both ways. Either new projects sprang up or use cases were dropped. Either way, the governance/social aspect needs solving. I still think that is the TC's job to solve. How? Thanks, Kevin ________________________________________ From: Zane Bitter [zbitter at redhat.com] Sent: Monday, January 21, 2019 1:15 AM To: openstack-discuss at lists.openstack.org Subject: Re: [tc] [all] Please help verify the role of the TC On 18/01/19 1:57 AM, Doug Hellmann wrote: > Chris Dent writes: > >> On Thu, 17 Jan 2019, Zane Bitter wrote: >> >>>> Thus: What if the TC and PTLs were the same thing? 
Would it become >>>> more obvious that there's too much in play to make progress in a >>>> unified direction (on the thing called OpenStack), leading us to >>>> choose less to do, and choose more consistency and actionable >>>> leadership? And would it enable some power to execute on that >>>> leadership. >>> >>> I'm not sure we need to speculate, because as you know the TC and PTLs >>> literally were the same thing prior to 2014-ish. My recollection is that >>> there were pluses and minuses, but on the whole I don't think it had the >>> effect you're suggesting it might. >> >> Part and parcel of what I'm suggesting is that less stuff would be >> considered in the domain of "what do we do?" such that the tyranny of >> the old/existing projects that you describe is a feature not a bug, >> as an in-built constraint. >> >> It's not a future I really like, but it is one strategy for enabling >> moving in one direction: cut some stuff. Stop letting so many >> flowers bloom. >> >> Letting those flowers bloom is in the camp of "contribution in, >> all its many and diverse forms". > > What would you prune? As a frequent and loud advocate for allowing all of those new projects in, I feel like this is a good moment to take stock and consider whether I might have been mistaken to do so, if only to reassure other folks that they can attempt to answer the question without me yelling at them ;) I do think Chris offers a valid line of enquiry, even though (like him) I don't really like the future that it leads to. I would identify two classes of project that we might consider for pruning in this scenario. * There are a number of projects that in a perfect world would arguably be just a feature rather than a separate service. 
The general pattern was usually that they had to do something on the compute node that was easier *socially* to get implemented in a separate project; often they also had to do something in the control plane that could potentially have been handled by a combination of other services, but again it was easier to throw that code into the project too rather than force multiple hard dependencies on cloud operators that wanted the feature. Pruning these projects could in theory lead to a more technically justifiable design for the features they support, and help build a critical mass of users for the more generic control plane services (I'm thinking of e.g. Mistral) that might have been used by multiple features, instead of being effectively reimplemented in various hard-coded configurations by multiple projects. * There are a number of projects that proceeded a long way down the path despite containing fundamental design flaws due to workarounds for missing features in services they depended on. In at least one case, multiple companies toiled away diligently for years taking over from one another as each, successively, ran out of runway while still waiting for features to build a sustainable design on top of. In the meantime, we added them to OpenStack and encouraged/demanded that they spend a good fraction of their time and effort on not breaking existing users from release to release. Pruning these projects might give folks interested in them the opportunity to forego backwards-compatibility in favour of ensuring the features they need are present first, and then rapidly iterating toward a long-term sustainable design. The problem I still see with this is that we made all of these decisions for good reasons, which were about getting feedback. We encouraged projects to guarantee backwards compatibility because that's needed to get users to use it for real and give feedback. 
We added projects that depended on missing features in part to provide feedback to other teams on what features were needed. We added projects that were really features because users needed those features, and there was no other way to hear their feedback. Clearly in some cases, that was not enough. But it's very hard to see how we can get the features users want done with even _less_ feedback. It could be that we don't actually want to get those features done, but interestingly (and slightly surprisingly) during the technical vision exercise nobody suggested we delete every design goal except for "Basic Physical Data Center Management". (If you *do* think we should do that, please propose it as a patch so we can discuss it.) It seems like we all actually kinda agree on where we want to get, but some of the critical paths to getting there may be blocked by other priorities. At this point I actually wouldn't be too unhappy to see a reset, where we said OK we are not going to worry about this other stuff until we've re-architected the building blocks to operate in such a way that they can support all of the additional services we want. Especially if we had a specific plan for prioritising those aspects. But how are we going to get feedback on what exactly it is we need to do without folks in the community building those additional services and features, and users using them? That's not a rhetorical question; if you have ideas I'd like to hear them. cheers, Zane. From amy at demarco.com Mon Jan 21 17:48:12 2019 From: amy at demarco.com (Amy Marrich) Date: Mon, 21 Jan 2019 11:48:12 -0600 Subject: [Diversity] [OffTOPIC] OpenStack Diversity Survey - ends Feb 28 Message-ID: This is the last time the Diversity and Inclusion WG is asking for your assistance for our current survey in collecting data in regards to diversity. We will be closing the survey on February 28th, 2019 in order to start compiling the data. 
We revised the Diversity Survey that was originally distributed to the Community in the Fall of 2015 and reached out in August with our new survey. We are looking to update our view of the OpenStack community and its diversity. We are pleased to be working with members of the CHAOSS project who have signed confidentiality agreements in order to assist us in the following ways: 1) Assistance in analyzing the results 2) And feeding the results into the CHAOSS software and metrics development work so that we can help other Open Source projects Please take the time to fill out the survey and share it with others in the community. The survey can be found at: https://www.surveymonkey.com/r/OpenStackDiversity Thank you for assisting us in this important task! Please feel free to reach out to me via email, in Berlin, or to myself or any WG member in #openstack-diversity! Amy Marrich (spotz) Diversity and Inclusion Working Group Chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From ed at leafe.com Mon Jan 21 19:13:46 2019 From: ed at leafe.com (Ed Leafe) Date: Mon, 21 Jan 2019 13:13:46 -0600 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> Message-ID: On Jan 21, 2019, at 3:10 AM, Jean-Philippe Evrard wrote: > > I think it would be great to have a larger community feedback, or at > least a API SIG feedback, analysing this pattern. I would strongly prefer the approach of each service implementing an endpoint to be called by the Keystone when a project is deleted. Relying on a library that would somehow be able to understand all the parts a project touches within a service sounds a lot more error-prone. 
-- Ed Leafe From lbragstad at gmail.com Mon Jan 21 19:55:38 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Mon, 21 Jan 2019 13:55:38 -0600 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> Message-ID: On Mon, Jan 21, 2019 at 1:17 PM Ed Leafe wrote: > On Jan 21, 2019, at 3:10 AM, Jean-Philippe Evrard > wrote: > > > > I think it would be great to have a larger community feedback, or at > > least a API SIG feedback, analysing this pattern. > > I would strongly prefer the approach of each service implementing an > endpoint to be called by the Keystone when a project is deleted. Relying on > a library that would somehow be able to understand all the parts a project > touches within a service sounds a lot more error-prone. > Are you referring to the system scope approach detailed on line 38, here [0]? I might be misunderstanding something, but I didn't think keystone was going to iterate all available services and call clean-up APIs. I think it was just that services would be able to expose an endpoint that cleans up resources without a project scoped token (e.g., it would be system scoped [1]). [0] https://etherpad.openstack.org/p/community-goal-project-deletion [1] https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens > > -- Ed Leafe > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ed at leafe.com Mon Jan 21 20:17:54 2019 From: ed at leafe.com (Ed Leafe) Date: Mon, 21 Jan 2019 14:17:54 -0600 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> Message-ID: <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> On Jan 21, 2019, at 1:55 PM, Lance Bragstad wrote: > > Are you referring to the system scope approach detailed on line 38, here [0]? Yes. 
> I might be misunderstanding something, but I didn't think keystone was going to iterate all available services and call clean-up APIs. I think it was just that services would be able to expose an endpoint that cleans up resources without a project scoped token (e.g., it would be system scoped [1]). > > [0] https://etherpad.openstack.org/p/community-goal-project-deletion > [1] https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens It is more likely that I’m misunderstanding. Reading that etherpad, it appeared that it was indeed the goal to have project deletion in Keystone cascade to all the services, but I guess I missed line 19. So if it isn’t Keystone calling this API on all the services, what would be the appropriate actor? -- Ed Leafe From lbragstad at gmail.com Mon Jan 21 20:30:18 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Mon, 21 Jan 2019 14:30:18 -0600 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> Message-ID: On Mon, Jan 21, 2019 at 2:18 PM Ed Leafe wrote: > On Jan 21, 2019, at 1:55 PM, Lance Bragstad wrote: > > > > Are you referring to the system scope approach detailed on line 38, here > [0]? > > Yes. > > > I might be misunderstanding something, but I didn't think keystone was > going to iterate all available services and call clean-up APIs. I think it > was just that services would be able to expose an endpoint that cleans up > resources without a project scoped token (e.g., it would be system scoped > [1]). > > > > [0] https://etherpad.openstack.org/p/community-goal-project-deletion > > [1] > https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens > > It is more likely that I’m misunderstanding. 
Reading that etherpad, it > appeared that it was indeed the goal to have project deletion in Keystone > cascade to all the services, but I guess I missed line 19. > > So if it isn’t Keystone calling this API on all the services, what would > be the appropriate actor? > The actor could still be something like os-purge or adjutant [0]. Depending on how the implementation shakes out in each service, the implementation in the actor could be an iteration over all services, calling the same API for each one. I guess the benefit is that the actor doesn't need to manage the deletion order based on the dependencies of the resources (internal or external to a service). Adrian, and others, have given this a bunch more thought than I have. So I'm curious to hear if what I'm saying is in line with how they've envisioned things. I'm recalling most of this from Berlin. [0] https://adjutant.readthedocs.io/en/latest/ > > > -- Ed Leafe > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Mon Jan 21 20:49:27 2019 From: melwittt at gmail.com (melanie witt) Date: Mon, 21 Jan 2019 12:49:27 -0800 Subject: Read Only FS after ceph issue In-Reply-To: References: Message-ID: <367a4705-468c-f73a-ba4e-450e36af7609@gmail.com> On Mon, 21 Jan 2019 17:40:52 +0000, Grant Morley wrote: > Hi all, > > We are in the process of retiring one of our old platforms and last > night our ceph cluster went into an "Error" state briefly because 1 of > the OSDs went close to full. The data got re-balanced fine and the > health of ceph is now "OK" - however we have about 40% of our instances > that now have corrupt disks which is a bit odd. > > Even more strange is that we cannot get them into rescue mode. 
As soon > as we try we the instances seem to hang during the bootup process when > they are trying to mount "/dev/vdb1" and we eventually get a kernel > timeout error as below: > > Warning: fsck not present, so skipping root file system > [ 5.644526] EXT4-fs (vdb1): INFO: recovery required on readonly filesystem > [ 5.645583] EXT4-fs (vdb1): write access will be enabled during recovery > [ 240.504873] INFO: task exe:332 blocked for more than 120 seconds. > [ 240.506986] Not tainted 4.4.0-66-generic #87-Ubuntu > [ 240.508782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 240.511438] exe D ffff88003714b878 0 332 1 0x00000000 > [ 240.513809] ffff88003714b878 ffff88007c18e358 ffffffff81e11500 ffff88007be81c00 > [ 240.516665] ffff88003714c000 ffff88007fc16dc0 7fffffffffffffff ffffffff81838cd0 > [ 240.519546] ffff88003714b9d0 ffff88003714b890 ffffffff818384d5 0000000000000000 > [ 240.522399] Call Trace: > > I have even tried using a different image for nova rescue and we are > getting the same results. Has anyone come across this before? > > This system is running OpenStack Mitaka with Ceph Jewel. > > Any help or suggestions will be much appreciated. I don't know whether this is related, but what you describe reminded me of issues I have seen before in the past: https://bugs.launchpad.net/nova/+bug/1781878 See my comment #1 on the bug ^ for links to additional information on the same root issue. Hope this helps in some way, -melanie From ekcs.openstack at gmail.com Mon Jan 21 21:04:03 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Mon, 21 Jan 2019 13:04:03 -0800 Subject: [congress] nominating Akhil Jain for core reviewer Message-ID: Hi all, I'm writing to nominate Akhil Jain as a Congress core reviewer. He has been an active and consistent contributor in code, reviews, and community interactions for Rocky and Stein cycles. 
Some of his key contributions include data source drivers, working with other projects to drive key integrations, thorough code reviews, and many important nuts-and-bolts patches. He has been especially valuable to the project and the community through his technical designs informed by his gathering and understanding of operator requirements. Akhil: I look forward to continuing working together! Eric From tpb at dyncloud.net Mon Jan 21 21:14:24 2019 From: tpb at dyncloud.net (Tom Barron) Date: Mon, 21 Jan 2019 16:14:24 -0500 Subject: [manila-ui][queens] cannot create share snapshot In-Reply-To: References: Message-ID: <20190121211424.7unpcp4cytw7gqhj@barron.net> On 21/01/19 12:53 +0100, Ignazio Cassano wrote: >Hello, >I installed manila on centos queens but it is not possible creating share >snapshot from ui: >Danger: An error occurred. Please try again later. > >It works from command line . > > > > >Regards >Ignazio To go about debugging this I'd suggest ensuring that httpd logs and manila logs running at debug level and checking them while attempting to create the snapshot (1) from cli, and (2) from horizon. If it's not clear what's going on at that point you may want to open a launchpad bug at https://bugs.launchpad.net/manila-ui/+filebug to engage help from the manila development team. -- Tom Barron From ignaziocassano at gmail.com Mon Jan 21 21:30:58 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 21 Jan 2019 22:30:58 +0100 Subject: [manila-ui][queens] cannot create share snapshot In-Reply-To: <20190121211424.7unpcp4cytw7gqhj@barron.net> References: <20190121211424.7unpcp4cytw7gqhj@barron.net> Message-ID: Ok, I will do that. Thanks Ignazio Il giorno Lun 21 Gen 2019 22:14 Tom Barron ha scritto: > On 21/01/19 12:53 +0100, Ignazio Cassano wrote: > >Hello, > >I installed manila on centos queens but it is not possible creating share > >snapshot from ui: > >Danger: An error occurred. Please try again later. > > > >It works from command line . 
> > > > > > > > > >Regards > >Ignazio > > To go about debugging this I'd suggest ensuring that httpd logs and > manila logs running at debug level and checking them while attempting > to create the snapshot (1) from cli, and (2) from horizon. If it's > not clear what's going on at that point you may want to open a > launchpad bug at https://bugs.launchpad.net/manila-ui/+filebug to > engage help from the manila development team. > > -- Tom Barron > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michaelr at catalyst.net.nz Mon Jan 21 22:03:17 2019 From: michaelr at catalyst.net.nz (Michael Richardson) Date: Tue, 22 Jan 2019 11:03:17 +1300 Subject: [Trove] State of the Trove service tenant deployment model Message-ID: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> Hi all, Is anyone aware of the current state of the "service tenant" deployment model for Trove, and whether it is a viable option; or whether the historical method of customer tenants/projects are still the recommended approach? Any and all comments/thoughts/gotchas would be most appreciated! Cheers, Michael From ianyrchoi at gmail.com Mon Jan 21 22:24:07 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Tue, 22 Jan 2019 07:24:07 +0900 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: Really welcome back to doc-core, Alex :) With many thanks, /Ian Petr Kovar wrote on 1/18/2019 10:42 PM: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. > > Please let the ML know should you have any objections. 
> > Thanks, > pk > > From andy at andybotting.com Mon Jan 21 22:43:17 2019 From: andy at andybotting.com (Andy Botting) Date: Tue, 22 Jan 2019 09:43:17 +1100 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> Message-ID: Hi Michael, > > Is anyone aware of the current state of the "service tenant" deployment > model for Trove, and whether it is a viable option; or whether the > historical method of customer tenants/projects are still the recommended > approach? > > Any and all comments/thoughts/gotchas would be most appreciated! > We've had it running in production on the Nectar cloud for the last 12 months. The work I did made it upstream, so you should be right to go from at least Queens. You'll just need to set something like this in your Trove config: nova_proxy_admin_user = trove nova_proxy_admin_pass = nova_proxy_admin_tenant_name = trove remote_nova_client = trove.common.single_tenant_remote.nova_client_trove_admin remote_cinder_client = trove.common.single_tenant_remote.cinder_client_trove_admin remote_neutron_client = trove.common.single_tenant_remote.neutron_client_trove_admin cheers, Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: From michaelr at catalyst.net.nz Mon Jan 21 23:13:51 2019 From: michaelr at catalyst.net.nz (Michael Richardson) Date: Tue, 22 Jan 2019 12:13:51 +1300 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> Message-ID: <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> Hi Andy, On Tue, 22 Jan 2019 09:43:17 +1100 Andy Botting wrote: > We've had it running in production on the Nectar cloud for the last 12 > months. The work I did made it upstream, so you should be right to go from > at least Queens. 
> > You'll just need to set something like this in your Trove config: > > nova_proxy_admin_user = trove > nova_proxy_admin_pass = > nova_proxy_admin_tenant_name = trove > remote_nova_client = > trove.common.single_tenant_remote.nova_client_trove_admin > remote_cinder_client = > trove.common.single_tenant_remote.cinder_client_trove_admin > remote_neutron_client = > trove.common.single_tenant_remote.neutron_client_trove_admin > > cheers, > Andy Superb -- great to hear! Would it be fair to say that the old Rabbit message bus security issue (shared credentials that could be extracted from backups) is no longer an issue? (Apologies if this is long gone -- from an initial foray into the code it was hard to tell). Cheers, Michael From grant at absolutedevops.io Mon Jan 21 23:18:16 2019 From: grant at absolutedevops.io (Grant Morley) Date: Mon, 21 Jan 2019 23:18:16 +0000 Subject: Read Only FS after ceph issue In-Reply-To: <367a4705-468c-f73a-ba4e-450e36af7609@gmail.com> References: <367a4705-468c-f73a-ba4e-450e36af7609@gmail.com> Message-ID: Hi, Thanks for the email. We have managed to fix this by upgrading to the latest minor version patch of Ceph Jewel and restarting the OSDs. Seems like there might have been some write lock issues that were not being reported by ceph. Many Thanks, On 21/01/2019 20:49, melanie witt wrote: > On Mon, 21 Jan 2019 17:40:52 +0000, Grant Morley > wrote: >> Hi all, >> >> We are in the process of retiring one of our old platforms and last >> night our ceph cluster went into an "Error" state briefly because 1 >> of the OSDs went close to full. The data got re-balanced fine and the >> health of ceph is now "OK" - however we have about 40% of our >> instances that now have corrupt disks which is a bit odd. >> >> Even more strange is that we cannot get them into rescue mode. 
As >> soon as we try, the instances seem to hang during the bootup >> process when they are trying to mount "/dev/vdb1" and we eventually >> get a kernel timeout error as below: >> >> Warning: fsck not present, so skipping root file system >> [    5.644526] EXT4-fs (vdb1): INFO: recovery required on readonly >> filesystem >> [    5.645583] EXT4-fs (vdb1): write access will be enabled during >> recovery >> [  240.504873] INFO: task exe:332 blocked for more than 120 seconds. >> [  240.506986]       Not tainted 4.4.0-66-generic #87-Ubuntu >> [  240.508782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. >> [  240.511438] exe             D ffff88003714b878     0 332      1 >> 0x00000000 >> [  240.513809]  ffff88003714b878 ffff88007c18e358 ffffffff81e11500 >> ffff88007be81c00 >> [  240.516665]  ffff88003714c000 ffff88007fc16dc0 7fffffffffffffff >> ffffffff81838cd0 >> [  240.519546]  ffff88003714b9d0 ffff88003714b890 ffffffff818384d5 >> 0000000000000000 >> [  240.522399] Call Trace: >> >> I have even tried using a different image for nova rescue and we are >> getting the same results. Has anyone come across this before? >> >> This system is running OpenStack Mitaka with Ceph Jewel. >> >> Any help or suggestions will be much appreciated. > > I don't know whether this is related, but what you describe reminded > me of issues I have seen before in the past: > > https://bugs.launchpad.net/nova/+bug/1781878 > > See my comment #1 on the bug ^ for links to additional information on > the same root issue. > > Hope this helps in some way, > -melanie -- Grant Morley Cloud Lead Absolute DevOps Ltd Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP www.absolutedevops.io grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andy at andybotting.com Mon Jan 21 23:23:16 2019 From: andy at andybotting.com (Andy Botting) Date: Tue, 22 Jan 2019 10:23:16 +1100 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> Message-ID: Hi Michael, > Superb -- great to hear! > > Would it be fair to say that the old Rabbit message bus security issue > (shared credentials that could be extracted from backups) is no longer an > issue? (Apologies if this is long gone -- from an initial foray into the > code it was hard to tell). Good question - I'm not entirely sure. I did remember sitting on a presentation a while back saying they were fixing it. I haven't had a good look into the backups, so I'm not sure if the rabbit creds are actually in the backup file or not. cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalyst.net.nz Tue Jan 22 01:14:50 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Tue, 22 Jan 2019 14:14:50 +1300 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> Message-ID: I've expanded on the notes in the etherpad about why Keystone isn't the actor. At the summit we discussed this option, and all the people familiar with Keystone who were in the room (or in some later discussions), agreed that making Keystone the actor is a BAD idea. Keystone does not currently do any orchestration or workflow of this nature, making it do that adds a lot of extra logic which it just shouldn't need. After a project delete it would need to call all the APIs, and then confirm they succeeded, and maybe retry. 
This would have to be done asynchronously since waiting and confirming the deletion would take longer than a single API call to delete a project in Keystone should take. That kind of logic doesn't fit in Keystone. Not to mention there are issues on how Keystone would know which services support such an API, and where exactly it might be (although catalog + consistent API placement or discovery could solve that). Essentially, going down the route of "make this Keystone's problem" is in my opinion a hard NO, but I'll let the Keystone devs weigh in on that before we make that a very firm hard NO. As for solutions: ideally we do implement the APIs per service (that's the end goal), but we ALSO make libraries that do deletion of resources using the existing APIs. If the library sees that a service version is one with the purge API it uses it, otherwise it has a fallback for less efficient deletion. This has the major benefit of working for all existing deployments, and ones stuck on older OpenStack versions. This is a universal problem and we need to solve it backwards AND forwards. By doing both (with a first step focus on the libraries) we can actually give projects more time to build the purge API, and maybe have the API portion of the goal extend into another cycle if needed. Essentially, we'd make a purge library that uses the SDK to delete resources. If a service has a purge endpoint, then the library (via the SDK) uses that. The specifics of how the library purges, or if the library will be split into multiple libraries (one top level, and then one per service) is to be decided. A rough look at what a deletion process might look like: 1. Disable project in Keystone (so no new resources can be created or modified), or clear all role assignments (and api-keys) from project. 2. Purge platform orchestration services (Magnum, Sahara) 3. 
Purge Heat (Heat after Magnum, because magnum and such use Heat, and deleting Heat stacks without deleting the 'resource' which uses that stack can leave a mess) 4. Purge everything left (order to be decided or potentially dynamically chosen). 5. Delete or Disable Keystone project (disable is enough really). The actor is then first a CLI built into the purge library as a OSClient command, then secondly maybe an API or two in Adjutant which will use this library.  Or anyone can use the library and make anything they want an actor. Ideally if we can even make the library allow selectively choosing which services to purge (conditional on dependency chain), that could be useful for cases where a user wants to delete everything except maybe what's in Swift or Cinder. This is in many ways a HUGE goal, but one that we really need to accomplish. We've lived with this problem too long and the longer we leave it unsolved, the harder it becomes. On 22/01/19 9:30 AM, Lance Bragstad wrote: > > > On Mon, Jan 21, 2019 at 2:18 PM Ed Leafe > wrote: > > On Jan 21, 2019, at 1:55 PM, Lance Bragstad > wrote: > > > > Are you referring to the system scope approach detailed on line > 38, here [0]? > > Yes. > > > I might be misunderstanding something, but I didn't think > keystone was going to iterate all available services and call > clean-up APIs. I think it was just that services would be able to > expose an endpoint that cleans up resources without a project > scoped token (e.g., it would be system scoped [1]). > > > > [0] https://etherpad.openstack.org/p/community-goal-project-deletion > > [1] > https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens > > > It is more likely that I’m misunderstanding. Reading that > etherpad, it appeared that it was indeed the goal to have project > deletion in Keystone cascade to all the services, but I guess I > missed line 19. 
> > So if it isn’t Keystone calling this API on all the services, what > would be the appropriate actor? > > > The actor could still be something like os-purge or adjutant [0]. > Depending on how the implementation shakes out in each service, the > implementation in the actor could be an iteration of all services > calling the same API for each one. I guess the benefit is that the > actor doesn't need to manage the deletion order based on the > dependencies of the resources (internal or external to a service). > > Adrian, and others, have given this a bunch more thought than I have. > So I'm curious to hear if what I'm saying is in line with how they've > envisioned things. I'm recalling most of this from Berlin. > > [0] https://adjutant.readthedocs.io/en/latest/ >   > > > > -- Ed Leafe > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michaelr at catalyst.net.nz Tue Jan 22 03:50:20 2019 From: michaelr at catalyst.net.nz (Michael Richardson) Date: Tue, 22 Jan 2019 16:50:20 +1300 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> Message-ID: <20190122035020.GA32608@catalyst.net.nz> No problem -- shall dig much deeper, and post an update on this thread once more is known. Cheers, Michael On Tue, Jan 22, 2019 at 10:23:16AM +1100, Andy Botting wrote: > Hi Michael, > > > > Superb -- great to hear! > > > > Would it be fair to say that the old Rabbit message bus security issue > > (shared credentials that could be extracted from backups) is no longer an > > issue? (Apologies if this is long gone -- from an initial foray into the > > code it was hard to tell). > > > Good question - I'm not entirely sure. I did remember sitting on a > presentation a while back saying they were fixing it. 
> > I haven't had a good look into the backups, so I'm not sure if the rabbit > creds are actually in the backup file or not. > > cheers From anusha.iiitm at gmail.com Tue Jan 22 04:50:20 2019 From: anusha.iiitm at gmail.com (Anusha Ramineni) Date: Tue, 22 Jan 2019 10:20:20 +0530 Subject: [congress] nominating Akhil Jain for core reviewer In-Reply-To: References: Message-ID: +1 , doing consistent and good work for the past 2 cycles . Would be great addition to team. Thanks, Anusha On Tue, 22 Jan 2019, 2:35 am Eric K Hi all, > > I'm writing to nominate Akhil Jain as a Congress core reviewer. He has > been an active and consistent contributor in code, reviews, and > community interactions for Rocky and Stein cycles. > > Some of his key contributions include data source drivers, working > with other projects to drive key integrations, thorough code reviews, > and many important nuts-and-bolts patches. He has been especially > valuable to the project and the community through his technical > designs informed by his gathering and understanding of operator > requirements. > > Akhil: I look forward to continuing working together! > > Eric > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liliueecg at gmail.com Tue Jan 22 04:56:22 2019 From: liliueecg at gmail.com (Li Liu) Date: Mon, 21 Jan 2019 23:56:22 -0500 Subject: [Cyborg][IRC] The Cyborg IRC meeting will be held Wednesday at 0300 UTC Message-ID: The IRC meeting will be held Wednesday at 0300 UTC, which is 10:00 pm est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) This Week's Agenda: 1. Track Status on DB work 2. Plans on pushing for the deadline of Stein(1.5 month to go) a. Nova integration. . b. DB change, device drivers and discovery. -- Thank you Regards Li Liu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zbitter at redhat.com Tue Jan 22 06:18:04 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 22 Jan 2019 19:18:04 +1300 Subject: Question about Heat parser In-Reply-To: <12561_1548074756_5C45BF04_12561_456_1_C4EB90C743A33246994EC9C93505F19901928383@OPEXCAUBM32.corporate.adroot.infra.ftgroup> References: <12561_1548074756_5C45BF04_12561_456_1_C4EB90C743A33246994EC9C93505F19901928383@OPEXCAUBM32.corporate.adroot.infra.ftgroup> Message-ID: <70ebb9f4-45f0-b8a3-1eff-00d11ad4763a@redhat.com> On 22/01/19 1:45 AM, adjandeye.sylla at orange.com wrote: > Dear all, > > In the context of my post-doc research, I’m  working with HOT templates > and I want to parse them. > > Instead of developing a new parser, I want to reuse the one that is used > by Heat but I don’t want to connect to a running OpenStack to do the > parsing. > > Is it possible to use Heat as a standalone parser (i.e. without > connecting to a running OpenStack) ? Yes, that shouldn't be a problem. You can import Heat as a library ("from heat.engine import template"). As long as you don't call any of the methods that deal with the database (i.e. load()/store()), you should have no trouble. Note that there are a number of template APIs (the ones that deal with intrinsic functions) that take a 'stack' argument. Originally we passed a heat.engine.stack.Stack object, but now you should pass a heat.engine.stk_defn.StackDefinition object. That should make things easier for you, because a StackDefinition can be constructed from a Template, and it doesn't deal with the database at all, so you can skip the complexity of the Stack object. We look forward to hearing the results of your research :) cheers, Zane. 
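Zane's recipe above -- import Heat as a library and pass a StackDefinition rather than a Stack, avoiding the database-facing load()/store() methods -- boils down to resolving a template's intrinsic functions offline. The sketch below is a deliberately simplified, hypothetical stand-in for that work so that it runs without a Heat installation: the `resolve` helper and the inline HOT fragment are invented for illustration and are not Heat's real API.

```python
# Hypothetical miniature of what standalone HOT parsing amounts to:
# walk the template structure and resolve the `get_param` intrinsic.
# Heat's real classes (heat.engine.template, heat.engine.stk_defn)
# do this far more completely, as Zane describes.

def resolve(node, params):
    """Recursively replace {'get_param': name} nodes with values from params."""
    if isinstance(node, dict):
        if set(node) == {"get_param"}:
            return params[node["get_param"]]
        return {k: resolve(v, params) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, params) for v in node]
    return node  # plain scalar, nothing to do

# A tiny HOT-shaped structure (what yaml.safe_load of a template yields).
hot = {
    "heat_template_version": "2018-08-31",
    "parameters": {"flavor": {"type": "string", "default": "m1.small"}},
    "resources": {
        "server": {
            "type": "OS::Nova::Server",
            "properties": {"flavor": {"get_param": "flavor"}},
        }
    },
}

# Collect parameter defaults, then resolve the resources section.
defaults = {name: spec.get("default")
            for name, spec in hot.get("parameters", {}).items()}
resolved = resolve(hot["resources"], defaults)
print(resolved["server"]["properties"]["flavor"])  # -> m1.small
```

With Heat actually installed, the equivalent is constructing a Template, wrapping it in a heat.engine.stk_defn.StackDefinition, and letting Heat's own intrinsic-function machinery do the walk.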
From zbitter at redhat.com Tue Jan 22 06:29:25 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 22 Jan 2019 19:29:25 +1300 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> Message-ID: On 22/01/19 12:13 PM, Michael Richardson wrote: > Hi Andy, > > On Tue, 22 Jan 2019 09:43:17 +1100 Andy Botting wrote: > > >> We've had it running in production on the Nectar cloud for the last 12 >> months. The work I did made it upstream, so you should be right to go from >> at least Queens. >> >> You'll just need to set something like this in your Trove config: >> >> nova_proxy_admin_user = trove >> nova_proxy_admin_pass = >> nova_proxy_admin_tenant_name = trove >> remote_nova_client = >> trove.common.single_tenant_remote.nova_client_trove_admin >> remote_cinder_client = >> trove.common.single_tenant_remote.cinder_client_trove_admin >> remote_neutron_client = >> trove.common.single_tenant_remote.neutron_client_trove_admin >> >> cheers, >> Andy > > > Superb -- great to hear! > > Would it be fair to say that the old Rabbit message bus security issue (shared credentials that could be extracted from backups) is no longer an issue? (Apologies if this is long gone -- from an initial foray into the code it was hard to tell). Last time I heard (which was probably mid-2017), the Trove team had implemented encryption for messages on the RabbitMQ bus. IIUC each DB being managed had its own encryption keys, so that would theoretically prevent both snooping and spoofing of messages. That's the good news. The bad news is that AFAIK it's still using a shared RabbitMQ bus, so attacks like denial of service are still possible if you can extract the shared credentials from the VM. Not sure about replay attacks; I haven't actually investigated the implementation. cheers, Zane. 
From chkumar246 at gmail.com Tue Jan 22 06:52:21 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Tue, 22 Jan 2019 12:22:21 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update VII - Jan 22, 2019 Message-ID: Hello, Here is the seventh update (Jan 15 to Jan 22, 2019) on collaboration on os_tempest[1] role between TripleO and OpenStack-Ansible projects. Things got merged: os_tempest: * Add support for aarch64 images - https://review.openstack.org/#/c/620032/ * Configuration drives don't appear to work on aarch64+kvm - https://review.openstack.org/#/c/626592/ * Fix tempest workspace path - https://review.openstack.org/#/c/628182/ * Add libselinux-python package for Red Hat distro - https://review.openstack.org/#/c/631203/ * Use usr/local/share/ansible/roles for data_files - https://review.openstack.org/#/c/630917/ * Use tempest_domain_name var for setting domain - https://review.openstack.org/#/c/630957/ * Rename tempest_public_net_physical_{type to name} - https://review.openstack.org/#/c/631183/ * Add tempest_interface_name var for setting interface - https://review.openstack.org/#/c/630942/ ansible-config_template * Use usr/share/ansible/plugins for data_files - https://review.openstack.org/631214 ansible-role-python_venv_build * Use usr/local/share/ansible/roles for data_files - https://review.openstack.org/#/c/631777/ openstack-ansible-tests * Setup clouds.yaml on tempest node - https://review.openstack.org/631794 Tripleo-spec * New spec for stein: os_tempest tripleo integration - https://review.openstack.org/630654 Summary: * We have cleaned up os_tempest cloudname and network related vars, fixed tempest workspace upgrade issue and os_tempest got support of aarch64 images and nova Configuration drives. 
* Tripleo os_tempest spec published: https://specs.openstack.org/openstack/tripleo-specs/specs/stein/ostempest-tripleo.html Things in Progress: os_tempest * Always generate stackviz irrespective of tests pass or fail - https://review.openstack.org/631967 * Add telemetry distro plugin install for aodh - https://review.openstack.org/632125 * Added tempest.conf for heat_plugin - https://review.openstack.org/632021 * Use the correct heat tests - https://review.openstack.org/630695 * Use tempest_cloud_name in tempestconf - https://review.openstack.org/631708 * Adds tempest run command with --test-list option - https://review.openstack.org/631351 os_tempest integration with Tripleo * Added requirements for integrating os_tempest role - https://review.openstack.org/628421 * Run tempest using os_tempest role in standalone job - https://review.openstack.org/627500 * Use os_tempest for running tempest on standalone - https://review.openstack.org/628415 Upcoming week: * We are able to run tempest in Triplo CI standalone test results are here: http://logs.openstack.org/00/627500/65/check/tripleo-ci-centos-7-standalone-os-tempest/198ae77/logs/undercloud/var/log/tempest/stestr_results.html.gz * We will try to finish os_tempest integration with tripleo and heat support in os_tempest. Thanks to odyssey4me, cloudnull, jrosser (a lot of help on different patches), Panda and Sagi from Tripleo CI team pabelanger and dmsimard on figuring out os_tempest dependencies, action_plugins and nested ansible. Here is the 6th update [2]. Have queries, Feel free to ping us on #tripleo or #openstack-ansible channel. Links: [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest [2.] 
http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001730.html Thanks, Chandan Kumar From rico.lin.guanyu at gmail.com Tue Jan 22 07:38:03 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Tue, 22 Jan 2019 15:38:03 +0800 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> Message-ID: New SIG request sent, please help to review it. https://review.openstack.org/632252 On Sat, Jan 12, 2019 at 8:36 AM Rico Lin wrote: > > > Adam Spiers 於 2019年1月12日 週六,上午2:59寫道: > >> Fine by me - sounds like we have a consensus for autoscaling then? > > I think “Autoscaling SIG” gets the majority vote. Let’s give it few more > days for people in different time zones. > > >> >> Melvin Hillsman wrote: >> >+1 SIGs should have limited scope - shared interest in a particular area >> - >> >even if that area is something broad like security the mission and work >> >should be specific which could lead to working groups, additional SIGs, >> >projects, etc so I want to be careful how I word it but yes limited scope >> >is the ideal way to start a SIG imo. >> > >> >On Fri, Jan 11, 2019 at 11:14 AM Duc Truong >> wrote: >> > >> >> +1 on limiting the scope to autoscaling at first. I prefer the name >> >> autoscaling since the mission is to improve automatic scaling. If the >> >> mission is changed later, we can change the name of the SIG to reflect >> >> that. >> >> >> >> On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec >> wrote: >> >> > >> >> > >> >> > >> >> > On 1/11/19 10:14 AM, Adam Spiers wrote: >> >> > > Rico Lin wrote: >> >> > >> Dear all >> >> > >> >> >> > >> To continue the discussion of whether we should have new SIG for >> >> > >> autoscaling. >> >> > >> I think we already got enough time for this ML [1], and it's >> time to >> >> > >> jump to the next step. 
As we got a lot of positive feedbacks from >> ML >> >> > >> [1], I think it's definitely considered an action to create a new >> SIG, >> >> > >> do some init works, and finally Here are some things that we can >> start >> >> > >> right now, to come out with the name of SIG, the definition and >> >> mission. >> >> > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with >> given >> >> > >> initial mission to improve automatic scaling with (but not >> limited to) >> >> > >> OpenStack. As we discussed in forum [2], to have scenario tests >> and >> >> > >> documents will be considered as actions for the initial mission. I >> >> > >> gonna assume we will start from scenarios which already provide >> some >> >> > >> basic tests and documents which we can adapt very soon and use >> them to >> >> > >> build a SIG environment. And the long-term mission of this SIG is >> to >> >> > >> make sure we provide good documentation and test coverage for most >> >> > >> automatic functionality. >> >> > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make >> sure we >> >> > >> can provide more value if there are more needs in the future. Just >> >> > >> like the example which Adam raised `self-optimizing` from people >> who >> >> > >> are using watcher [3]. Let me know if you got any concerns about >> this >> >> > >> name. >> >> > > >> >> > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound >> >> > > quite right to me, because it's not clear what is being automated. 
>> For >> >> > > example from the outside people might think it was a SIG about CI, >> or >> >> > > about automated testing, or both - or even some kind of automatic >> >> > > creation of new SIGs ;-) >> >> > > Here are some alternative suggestions: >> >> > > - Optimization SIG >> >> > > - Self-optimization SIG >> >> > > - Auto-optimization SIG >> >> > > - Adaptive Cloud SIG >> >> > > - Self-adaption SIG >> >> > > - Auto-adaption SIG >> >> > > - Auto-configuration SIG >> >> > > >> >> > > although I'm not sure these are a huge improvement on "Autoscaling >> SIG" >> >> > > - maybe some are too broad, or too vague. It depends on how >> likely it >> >> > > is that the scope will go beyond just auto-scaling. Of course you >> >> could >> >> > > also just stick with the original idea of "Auto-scaling" :-) >> >> > >> >> > I'm inclined to argue that limiting the scope of this SIG is >> actually a >> >> > feature, not a bug. Better to have a tightly focused SIG that has >> very >> >> > specific, achievable goals than to try to boil the ocean by solving >> all >> >> > of the auto* problems in OpenStack. We all know how "one SIG to rule >> >> > them all" ends. ;-) >> >> > >> >> > >> And to clarify, there will definitely some cross SIG co-work >> between >> >> > >> this new SIG and Self-Healing SIG (there're some common >> requirements >> >> > >> even across self-healing and autoscaling features.). We also need >> to >> >> > >> make sure we do not provide any duplicated work against >> self-healing >> >> > >> SIG. As a start, let's only focus on autoscaling scenario, and >> make >> >> > >> sure we're doing it right before we move to multiple cases. >> >> > > >> >> > > Sounds good! >> >> > >> If no objection, I will create the new SIG before next weekend and >> >> > >> plan a short schedule in Denver summit and PTG. >> >> > > >> >> > > Thanks for driving this! 
>> >> > >> >> >> >> >> > >> >-- >> >Kind regards, >> > >> >Melvin Hillsman >> >mrhillsman at gmail.com >> >mobile: (832) 264-2646 >> >> -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.settle at outlook.com Tue Jan 22 09:35:47 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Tue, 22 Jan 2019 09:35:47 +0000 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: References: <20190118144233.132eb0e427389da15e725141@redhat.com>, Message-ID: Thank you all! Really overwhelmed by the support from you all :) Thank you for placing your trust in me once again. ________________________________ From: Ian Y. Choi Sent: 21 January 2019 22:24 To: openstack-discuss at lists.openstack.org Cc: Alex Settle Subject: Re: [docs] Nominating Alex Settle for openstack-doc-core Really welcome back to doc-core, Alex :) With many thanks, /Ian Petr Kovar wrote on 1/18/2019 10:42 PM: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. > > Please let the ML know should you have any objections. > > Thanks, > pk > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From adjandeye.sylla at orange.com Tue Jan 22 11:11:17 2019 From: adjandeye.sylla at orange.com (adjandeye.sylla at orange.com) Date: Tue, 22 Jan 2019 11:11:17 +0000 Subject: Question about Heat parser In-Reply-To: <70ebb9f4-45f0-b8a3-1eff-00d11ad4763a@redhat.com> References: <12561_1548074756_5C45BF04_12561_456_1_C4EB90C743A33246994EC9C93505F19901928383@OPEXCAUBM32.corporate.adroot.infra.ftgroup> <70ebb9f4-45f0-b8a3-1eff-00d11ad4763a@redhat.com> Message-ID: <20853_1548155478_5C46FA56_20853_368_1_C4EB90C743A33246994EC9C93505F19901928576@OPEXCAUBM32.corporate.adroot.infra.ftgroup> Thank you. I will import the Heat library and use it to do the parsing. I will give your news of my results :-) cheers, Adja -----Message d'origine----- De : Zane Bitter [mailto:zbitter at redhat.com] Envoyé : mardi 22 janvier 2019 07:18 À : openstack-discuss at lists.openstack.org Objet : Re: Question about Heat parser On 22/01/19 1:45 AM, adjandeye.sylla at orange.com wrote: > Dear all, > > In the context of my post-doc research, I'm  working with HOT templates > and I want to parse them. > > Instead of developing a new parser, I want to reuse the one that is used > by Heat but I don't want to connect to a running OpenStack to do the > parsing. > > Is it possible to use Heat as a standalone parser (i.e. without > connecting to a running OpenStack) ? Yes, that shouldn't be a problem. You can import Heat as a library ("from heat.engine import template"). As long as you don't call any of the methods that deal with the database (i.e. load()/store()), you should have no trouble. Note that there are a number of template APIs (the ones that deal with intrinsic functions) that take a 'stack' argument. Originally we passed a heat.engine.stack.Stack object, but now you should pass a heat.engine.stk_defn.StackDefinition object. 
That should make things easier for you, because a StackDefinition can be constructed from a Template, and it doesn't deal with the database at all, so you can skip the complexity of the Stack object. We look forward to hearing the results of your research :) cheers, Zane. _________________________________________________________________________________________________________________________ Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration, Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci. This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you. From Greg.Waines at windriver.com Tue Jan 22 12:35:04 2019 From: Greg.Waines at windriver.com (Waines, Greg) Date: Tue, 22 Jan 2019 12:35:04 +0000 Subject: [openstack-helm] Support for Docker Registry with authentication turned on ? Message-ID: <9ACD444D-18D0-48FA-803A-38FF561DA32C@windriver.com> Hey ... We’re relatively new to openstack-helm. We are trying to use the openstack-helm charts with a Docker Registry that has token authentication turned on. With the current charts, there does not seem to be a way to do this. I.e. there is not an ‘imagePullSecrets’ in the defined pods/containers or in the defined serviceAccounts . 
Our thinking would be to add a default imagePullSecret to all of the serviceAccounts defined in the openstack-helm serviceaccount template. OR is there another way to use openstack-helm charts with a Docker Registry with authentication turned on? Any info is appreciated, Greg / Angie / Jerry. -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Jan 22 12:49:27 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 22 Jan 2019 13:49:27 +0100 Subject: [nova][ops] Problems disabling notifications on state changes In-Reply-To: <1548064159.1231.4@smtp.office365.com> References: <1548064159.1231.4@smtp.office365.com> Message-ID: Thanks a lot! Cheers, Massimo On Mon, Jan 21, 2019 at 10:49 AM Balázs Gibizer wrote: > > > On Mon, Jan 21, 2019 at 8:42 AM, Massimo Sgaravatto > wrote: > > I am disabling ceilometer on my OpenStack Ocata cloud. > > > > As far as I understand, besides stopping the ceilometer services, I > > can also apply the following changes to nova.conf on the compute > > nodes: > > > > instance_usage_audit = True --> false > > notify_on_state_change = vm_and_task_state --> None > > > > > > I have a problem with the latter change: > > > > # grep ^notify_on_state_change /etc/nova/nova.conf > > notify_on_state_change=None > > > > but in the nova log: > > > > 2019-01-21 08:31:48.246 6349 ERROR nova ConfigFileValueError: Value > > for option notify_on_state_change is not valid: Valid values are > > [None, vm_state, vm_and_task_state], but found 'None' > > > > > > I have also tried setting > > > > notify_on_state_change= > > > > > > but it complains that it is not a valid value > > > > I can simply comment that line, but I am afraid there is a problem > > somewhere > > If you want to turn off all the notification sending from Nova, then I > suggest to add the following [1] to the nova.conf: > [oslo_messaging_notifications] > > driver = noop > > > I verified in a devstack that 
if you do not specify the > notify_on_state_change attribute in the config file then it properly > defaults to None. I think there is no way to specify the None value > otherwise in the config file. > > Cheers, > gibi > [1] > > https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_notifications.driver > > > > > Thanks, Massimo > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From natal at redhat.com Tue Jan 22 14:12:53 2019 From: natal at redhat.com (Natal Ngétal) Date: Tue, 22 Jan 2019 15:12:53 +0100 Subject: About gitignore policy. Message-ID: Hi everyone, I would like to have more information about the .gitignore file policy. First I made a patch to ignore backup files from the vim editor. The patches were refused because they don't respect the OpenStack policy: https://review.openstack.org/#/c/631790/ https://review.openstack.org/#/c/631789/ The policy is available in a cookiecutter template repository: https://git.openstack.org/cgit/openstack-dev/cookiecutter/tree/%7b%7bcookiecutter.repo_name%7d%7d/.gitignore?id=da86f5e042cbe069166d71809d50dea4c8b6aed0 Then I started to make patches, to align OpenStack projects with this policy: https://review.openstack.org/#/c/632086/ In the nova project the patch was refused; the first reason was that we have already talked about this subject, and the second reason was that this policy is only for new projects. Oslo is not a new project and it follows this policy. I know this is not a priority subject, but I just want to understand. For me a policy must be followed by all projects, to harmonize everything, but maybe I am wrong. For example, in this case, I would like to know what the rules are. Must the new projects follow the policy but not the old ones? In that case, what are the rules for old projects, and can we also define a policy for them? Thanks in advance for your help. 
My best regards From sean.mcginnis at gmx.com Tue Jan 22 14:32:39 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 22 Jan 2019 08:32:39 -0600 Subject: About gitignore policy. In-Reply-To: References: Message-ID: <20190122143239.GA27435@sm-workstation> > > Then I have started to make patch, to align OpenStack projects with this policy: > > https://review.openstack.org/#/c/632086/ > > In nova project the patch was refused, the first reason was we have > already talk about this subject. The second reason was this policy is > only for the new projects. Oslo is not a new project and is follow > this policy. > > I know is not a priority subject, but I just want understand. For me a > policy must be followed by all projects, to harmonize all, but maybe I > wrong. For example, in this case, I would to know what is the rules? > The new projects must follow the policy but not the old ones? In this > case, what the rules for old projects and can we defined also an > policy for it? Thanks in advance for your help. > > My best regards > It is somewhat up to each project and the core teams if, when, and how they want to enforce something like this. So while it is generally recommended not to put IDE, OS-specific, and other similar things in the per-project gitignore, there is also no real value added by going through all existing projects and removing things that are already there. Sean From jean-philippe at evrard.me Tue Jan 22 14:41:51 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 22 Jan 2019 15:41:51 +0100 Subject: [Release-job-failures][puppet] Release of openstack/puppet-aodh failed References: Message-ID: <24adfb19faa7e9af6a49c5d4f09026a0c179f0de.camel@evrard.me> Hello, puppet-aodh failed [3] to release due to an issue in the upload- puppetforge role, for a permission issue [0]. Maybe a privilege escalation is required in [1], or a deeper analysis needs to be done. 
The role was changed 5 days ago [2], so I am not sure this code had the chance to be tested in post-release pipelines. Could someone have a look, please? Thank you in advance. Jean-Philippe Evrard (evrardjp) [0]: http://logs.openstack.org/61/617ffad84b633618490ca1023f8a31d9694b31a9/release/release-openstack-puppet/c6e519d/ara-report/result/6311c199-c059-486a-be76-8dacf273819c/ [1]: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/upload-puppetforge/tasks/main.yaml#n17 [2]: https://github.com/openstack-infra/zuul-jobs/commit/fd8ffc6711d6481e2fcd40d158255f297156a1ff#diff-dcae4256ea4c9b5460a706b69e7d22a8 [3]: http://lists.openstack.org/pipermail/release-job-failures/2019-January/001065.html From tobias.urdin at binero.se Tue Jan 22 14:47:26 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Tue, 22 Jan 2019 15:47:26 +0100 Subject: [Release-job-failures][puppet] Release of openstack/puppet-aodh failed In-Reply-To: <24adfb19faa7e9af6a49c5d4f09026a0c179f0de.camel@evrard.me> References: <24adfb19faa7e9af6a49c5d4f09026a0c179f0de.camel@evrard.me> Message-ID: <25f92480-accd-bc55-ba35-8793efa45bde@binero.se> Hello, I have a review up [1] to solve the last issue we hit when trying the release flow. It just needs another zuul core to review it; then, with some coordination help from Infra, I'm hoping this will pass today. Best regards Tobias [1] https://review.openstack.org/#/c/632163/ On 01/22/2019 03:44 PM, Jean-Philippe Evrard wrote: > Hello, > > puppet-aodh failed [3] to release due to an issue in the upload- > puppetforge role, for a permission issue [0]. > > Maybe a privilege escalation is required in [1], or a deeper analysis > needs to be done. The role was changed 5 days ago [2], so I am not sure > this code had the chance to be tested in post release pipelines. > > Could someone have a look, please? > > Thank you in advance.
> > Jean-Philippe Evrard (evrardjp) > > > [0]: > http://logs.openstack.org/61/617ffad84b633618490ca1023f8a31d9694b31a9/release/release-openstack-puppet/c6e519d/ara-report/result/6311c199-c059-486a-be76-8dacf273819c/ > > [1]: > http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/upload-puppetforge/tasks/main.yaml#n17 > > [2]: > https://github.com/openstack-infra/zuul-jobs/commit/fd8ffc6711d6481e2fcd40d158255f297156a1ff#diff-dcae4256ea4c9b5460a706b69e7d22a8 > > [3]: > http://lists.openstack.org/pipermail/release-job-failures/2019-January/001065.html > > > From jean-philippe at evrard.me Tue Jan 22 14:51:41 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 22 Jan 2019 15:51:41 +0100 Subject: [Release-job-failures][puppet] Release of openstack/puppet-aodh failed In-Reply-To: <25f92480-accd-bc55-ba35-8793efa45bde@binero.se> References: <24adfb19faa7e9af6a49c5d4f09026a0c179f0de.camel@evrard.me> <25f92480-accd-bc55-ba35-8793efa45bde@binero.se> Message-ID: <71c1f79e4ce5ab12a0a7b89706f1817b5e7665d8.camel@evrard.me> On Tue, 2019-01-22 at 15:47 +0100, Tobias Urdin wrote: > Hello, > I have a review up [1] to solve the last issue we got when trying > the > release flow. > > It just needs another zuul core to review it then with some > coordination > help from Infra > I'm hoping this will pass today. > > Best regards > Tobias > > [1] https://review.openstack.org/#/c/632163/ > Yup, I think that should do the trick for the task failing :) Thanks! From tobias.urdin at binero.se Tue Jan 22 15:03:14 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Tue, 22 Jan 2019 16:03:14 +0100 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> Message-ID: Thanks for the thorough feedback Adrian. 
My opinion is also that Keystone should not be the actor executing this functionality; that belongs somewhere else, whether that is Adjutant or any other form (application, library, CLI, etc.). I would also like to bring up the point about knowing whether a project is "dirty" (it has provisioned resources). This is something that I think all business logic would benefit from. We've had issues with knowing when resources should be deleted; our solution is pretty much to look at metrics for the last X minutes, check if the project is disabled, and compare that to the business logic that says it should be deleted. While the above works, it defeats some of the logical point of disabling a project, since the only thing that knows whether the project should be deleted or merely disabled is the business logic application that recorded whether the user clicked delete rather than disable. Most of the functionality you are mentioning is what the ospurge project has been working to implement, and the maintainer even did a full rewrite which improved the dependency handling for resource removal. I think the biggest win for this community goal would be that the developers of each project would be available for input on the project-specific code that does the purging. There have been some really nasty bugs in ospurge in the past where, if executed with the admin user, you would wipe everything and not only that project, which is probably an issue that makes people think twice about using a purging toolkit at all. We should carefully consider what parts of ospurge could be reused, concept, code, or anything in between, to help decide what direction we want to push this goal. I'm excited :) Best regards Tobias On 01/22/2019 02:18 AM, Adrian Turjak wrote: > I've expanded on the notes in the etherpad about why Keystone isn't > the actor.
> > At the summit we discussed this option, and all the people familiar > with Keystone who were in the room (or in some later discussions), > agreed that making Keystone the actor is a BAD idea. > > Keystone does not currently do any orchestration or workflow of this > nature, making it do that adds a lot of extra logic which it just > shouldn't need. After a project delete it would need to call all the > APIs, and then confirm they succeeded, and maybe retry. This would > have to be done asynchronously since waiting and confirming the > deletion would take longer than a single API call to delete a project > in Keystone should take. That kind of logic doesn't fit in Keystone. > Not to mention there are issues on how Keystone would know which > services support such an API, and where exactly it might be (although > catalog + consistent API placement or discovery could solve that). > > Essentially, going down the route of "make this Keystone's problem" is > in my opinion a hard NO, but I'll let the Keystone devs weigh in on > that before we make that a very firm hard NO. > > As for solutions. Ideally we do implement the APIs per service (that's > the end goal), but we ALSO make libraries that do deletion of resource > using the existing APIs. If the library sees that a service version is > one with the purge API it uses it, otherwise it has a fallback for > less efficient deletion. This has the major benefit of working for all > existing deployments, and ones stuck on older OpenStack versions. This > is a universal problem and we need to solve it backwards AND forwards. > > By doing both (with a first step focus on the libraries) we can > actually give projects more time to build the purge API, and maybe > have the API portion of the goal extend into another cycle if needed. > > Essentially, we'd make a purge library that uses the SDK to delete > resources. If a service has a purge endpoint, then the library (via > the SDK) uses that. 
The specifics of how the library purges, or if the > library will be split into multiple libraries (one top level, and then > one per service) is to be decided. > > A rough look at what a deletion process might look like: > 1. Disable project in Keystone (so no new resources can be created or > modified), or clear all role assignments (and api-keys) from project. > 2. Purge platform orchestration services (Magnum, Sahara) > 3. Purge Heat (Heat after Magnum, because magnum and such use Heat, > and deleting Heat stacks without deleting the 'resource' which uses > that stack can leave a mess) > 4. Purge everything left (order to be decided or potentially > dynamically chosen). > 5. Delete or Disable Keystone project (disable is enough really). > > The actor is then first a CLI built into the purge library as an > OSClient command, then secondly maybe an API or two in Adjutant which > will use this library.  Or anyone can use the library and make > anything they want an actor. > > Ideally if we can even make the library allow selectively choosing > which services to purge (conditional on dependency chain), that could > be useful for cases where a user wants to delete everything except > maybe what's in Swift or Cinder. > > > This is in many ways a HUGE goal, but one that we really need to > accomplish. We've lived with this problem too long and the longer we > leave it unsolved, the harder it becomes. > > > On 22/01/19 9:30 AM, Lance Bragstad wrote: >> >> >> On Mon, Jan 21, 2019 at 2:18 PM Ed Leafe > > wrote: >> >> On Jan 21, 2019, at 1:55 PM, Lance Bragstad > > wrote: >> > >> > Are you referring to the system scope approach detailed on line >> 38, here [0]? >> >> Yes. >> >> > I might be misunderstanding something, but I didn't think >> keystone was going to iterate all available services and call >> clean-up APIs.
I think it was just that services would be able to >> expose an endpoint that cleans up resources without a project >> scoped token (e.g., it would be system scoped [1]). >> > >> > [0] >> https://etherpad.openstack.org/p/community-goal-project-deletion >> > [1] >> https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens >> >> >> It is more likely that I’m misunderstanding. Reading that >> etherpad, it appeared that it was indeed the goal to have project >> deletion in Keystone cascade to all the services, but I guess I >> missed line 19. >> >> So if it isn’t Keystone calling this API on all the services, >> what would be the appropriate actor? >> >> >> The actor could still be something like os-purge or adjutant [0]. >> Depending on how the implementation shakes out in each service, the >> implementation in the actor could be an interation of all services >> calling the same API for each one. I guess the benefit is that the >> actor doesn't need to manage the deletion order based on the >> dependencies of the resources (internal or external to a service). >> >> Adrian, and others, have given this a bunch more thought than I have. >> So I'm curious to hear if what I'm saying is in line with how they've >> envisioned things. I'm recalling most of this from Berlin. >> >> [0] https://adjutant.readthedocs.io/en/latest/ >> >> >> >> -- Ed Leafe >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Tue Jan 22 15:12:11 2019 From: jon at csail.mit.edu (Jonathan Proulx) Date: Tue, 22 Jan 2019 10:12:11 -0500 Subject: [User-committee] UC Election - Looking for Election Officials In-Reply-To: References: Message-ID: <20190122151211.cvv3vues2dvk2jtg@csail.mit.edu> Was on PTO last week when the ask came through, but if official are still needed I'm game. 
-Jon On Mon, Jan 14, 2019 at 01:12:09PM -0600, Matt Van Winkle wrote: :Hey Stackers, : : :We are getting ready for the Winter UC election and we need to have at :least two Election Officials. I was wondering if you would like to help us :on that process. You can find all the details of the election at :*https://governance.openstack.org/uc/reference/uc-election-feb2019.html :*. : : :I do want to point out to those who are new that Election Officials are :unable to run in the election itself but can of course vote. : : : :The election dates will be: :January 21 - February 03, 05:59 UTC: Open candidacy for UC positions :February 04 - February 10, 11:59 UTC: UC elections (voting) : : : :Please, reach out to any of the current UC members or simple reply to this :email if you can help us in this community process. : : : :Thanks, : : : :OpenStack User Committee : :Amy, Leong, Matt, Melvin, and Joseph :_______________________________________________ :User-committee mailing list :User-committee at lists.openstack.org :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee -- From mriedemos at gmail.com Tue Jan 22 15:20:45 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 22 Jan 2019 09:20:45 -0600 Subject: [nova] Do we need to copy pci_devices to target cell DB during cross-cell resize? Message-ID: <3bfc308a-87cf-a873-618e-7a4ebc58a7e7@gmail.com> While working on the code to create instance-related data in the target cell database during a cross-cell resize I noticed I wasn't copying over pci_devices [1] but when looking at the PciDevice object create/save methods, those aren't really what I'm looking for here. And looking at the data model, there is a compute_node_id field which is the compute_nodes.id primary key which won't match the target cell DB. 
I am not very familiar with the PCI device manager code and data model (it's my weakest area in nova lo these many years) but looking closer at this, am I correct in understanding that the PciDevice object and data model is really more about the actual inventory and allocations of PCI devices on a given compute node and therefore it doesn't really make sense to need to copy that data over to the target cell database. During a cross-cell resize, the scheduler is going to pick a target host in another cell and claim standard resources (VCPU, MEMORY_MB and DISK_GB) in placement, but things like NUMA/PCI claims won't happen until we do a ResourceTracker.resize_claim on the target host in the target cell. In that case, it seems the things I only need to care about mirroring is instance.pci_requests and instance.numa_topology, correct? Since those are the user-requested (via flavor/image/port) resources which will then result in PciDevice and NUMA allocations on the target host. I'm just looking for confirmation from others that better understand the data model in this area. [1] https://review.openstack.org/#/c/627892/5/nova/conductor/tasks/cross_cell_migrate.py at 115 -- Thanks, Matt From mvanwinkle at salesforce.com Tue Jan 22 15:24:00 2019 From: mvanwinkle at salesforce.com (Matt Van Winkle) Date: Tue, 22 Jan 2019 09:24:00 -0600 Subject: [User-committee] UC Election - Looking for Election Officials In-Reply-To: <20190122151211.cvv3vues2dvk2jtg@csail.mit.edu> References: <20190122151211.cvv3vues2dvk2jtg@csail.mit.edu> Message-ID: Thank you, Jonathan! I have you down as our second official. The UC will be back in touch with both of you shortly on next steps. Thanks! VW On Tue, Jan 22, 2019 at 9:12 AM Jonathan Proulx wrote: > > Was on PTO last week when the ask came through, but if official are > still needed I'm game. 
> > -Jon > > On Mon, Jan 14, 2019 at 01:12:09PM -0600, Matt Van Winkle wrote: > :Hey Stackers, > : > : > :We are getting ready for the Winter UC election and we need to have at > :least two Election Officials. I was wondering if you would like to help us > :on that process. You can find all the details of the election at > :*https://governance.openstack.org/uc/reference/uc-election-feb2019.html > : >*. > : > : > :I do want to point out to those who are new that Election Officials are > :unable to run in the election itself but can of course vote. > : > : > : > :The election dates will be: > :January 21 - February 03, 05:59 UTC: Open candidacy for UC positions > :February 04 - February 10, 11:59 UTC: UC elections (voting) > : > : > : > :Please, reach out to any of the current UC members or simple reply to this > :email if you can help us in this community process. > : > : > : > :Thanks, > : > : > : > :OpenStack User Committee > : > :Amy, Leong, Matt, Melvin, and Joseph > > :_______________________________________________ > :User-committee mailing list > :User-committee at lists.openstack.org > :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee > > > -- > -- Matt Van Winkle Senior Manager, Software Engineering | Salesforce Mobile: 210-445-4183 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Tue Jan 22 15:32:17 2019 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Tue, 22 Jan 2019 15:32:17 +0000 Subject: About gitignore policy. In-Reply-To: <20190122143239.GA27435@sm-workstation> References: <20190122143239.GA27435@sm-workstation> Message-ID: <128B8725-2B95-488D-9CB1-7F3EA4AEA4CF@redhat.com> We are talking about policies not fundamental laws of the universe. Policies are created in order to simplify decisions based on previous history, so clearly there were projects that existed before this policy was introduced. 
Try to look at it via a pragmatic approach: this policy tries to avoid a never-ending stream of reviews related to local files. For good reasons, people have the freedom to use whatever editor they want; so what happens when everyone wants his own editor's config files ignored? This is why such a policy was created. Obviously it appeared after the problem was observed, and I am sure there are lots of projects that have .gitignore files which do not "pass" this policy. Don't waste time trying to solve it; it is not worth the time... you may end up like "someone is wrong on the internet". ;) Harmony sounds nice on paper, but in real life it is impossible (or at least impractical) to achieve. If someone mentions it during a review, accept it and add the exception to your user .gitignore file. Cheers Sorin PS. I got a CR of mine rejected a few months ago for the same reason; I read about it and accepted it. It makes sense. > On 22 Jan 2019, at 14:32, Sean McGinnis wrote: > >> >> Then I have started to make patch, to align OpenStack projects with this policy: >> >> https://review.openstack.org/#/c/632086/ >> >> In nova project the patch was refused, the first reason was we have >> already talk about this subject. The second reason was this policy is >> only for the new projects. Oslo is not a new project and is follow >> this policy. >> >> I know is not a priority subject, but I just want understand. For me a >> policy must be followed by all projects, to harmonize all, but maybe I >> wrong. For example, in this case, I would to know what is the rules? >> The new projects must follow the policy but not the old ones? In this >> case, what the rules for old projects and can we defined also an >> policy for it? Thanks in advance for your help. >> >> My best regards >> > > It is somewhat up to each project and the core teams if, when, and how they > want to enforce something like this.
> > So while it is generally recommended not to put IDE, OS-specific, and other > similar things in the per-project gitignore, there is also no real value added > by going through all existing projects and removing things that are already > there. > > Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue Jan 22 15:56:10 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 22 Jan 2019 10:56:10 -0500 Subject: Fwd: [OpenStack Foundation] 2019 Individual Director election and Bylaws amendment results In-Reply-To: References: <256BFC05-F639-4F7C-A7E3-57F248782218@openstack.org> Message-ID: Congrats to the new openstack foundation directors! Just FYI the following went to the foundation mailing list. Chris From: Jonathan Bryce Date: Fri, Jan 18, 2019, 12:42 Subject: [OpenStack Foundation] 2019 Individual Director election and Bylaws amendment results To: Hello everyone, The 2019 election of Individual Directors has closed. Results are available at the following link: https://www.bigpulse.com/pollresults?code=1336400LYwRuQQZqR72BJRpKJYV The Bylaws amendments have now been passed by the Individual Member class. The elected and appointed directors will be seated at the Board meeting at the end of January. Congratulations to our new and returning directors. We actually had the highest number of voters for any of our elections participate this year, so thank you to everyone for joining in the process. 
Jonathan 210-317-2438 Individual Directors ----------------------------------------- Tim Bell ChangBo Guo Sean McGinnis Prakash Ramchandran Allison Randal Egle Sigler Monty Taylor Shane Wang Platinum Directors ----------------------------------------- Alan Clark Ruan He Anni Lai Mark McLoughlin Chris Price Imad Sousou Brian Stein Ryan Van Wyk Gold Directors ----------------------------------------- Mark Baker Johan Christenson Robert Esker Clemens Hardewig Arkady Kanevsky JunWei Liu Vijoy Pandey Joseph Wang _______________________________________________ Foundation mailing list Foundation at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/foundation -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at medberry.net Tue Jan 22 15:57:03 2019 From: openstack at medberry.net (David Medberry) Date: Tue, 22 Jan 2019 08:57:03 -0700 Subject: [User-committee] UC Election - Looking for Election Officials In-Reply-To: References: <20190122151211.cvv3vues2dvk2jtg@csail.mit.edu> Message-ID: and resending under my registered email so it can be processed properly... Holler if you need a third. Happy to assist. Sounds like you have sufficient but let me know. On Tue, Jan 22, 2019 at 8:55 AM David Medberry wrote: > > Holler if you need a third. Happy to assist. Sounds like you have > sufficient but let me know. > > On Tue, Jan 22, 2019 at 8:24 AM Matt Van Winkle > wrote: > > > > Thank you, Jonathan! > > > > I have you down as our second official. The UC will be back in touch with both of you shortly on next steps. > > > > Thanks! > > VW > > > > On Tue, Jan 22, 2019 at 9:12 AM Jonathan Proulx wrote: > >> > >> > >> Was on PTO last week when the ask came through, but if official are > >> still needed I'm game. 
> >> > >> -Jon > >> > >> On Mon, Jan 14, 2019 at 01:12:09PM -0600, Matt Van Winkle wrote: > >> :Hey Stackers, > >> : > >> : > >> :We are getting ready for the Winter UC election and we need to have at > >> :least two Election Officials. I was wondering if you would like to help us > >> :on that process. You can find all the details of the election at > >> :*https://governance.openstack.org/uc/reference/uc-election-feb2019.html > >> :*. > >> : > >> : > >> :I do want to point out to those who are new that Election Officials are > >> :unable to run in the election itself but can of course vote. > >> : > >> : > >> : > >> :The election dates will be: > >> :January 21 - February 03, 05:59 UTC: Open candidacy for UC positions > >> :February 04 - February 10, 11:59 UTC: UC elections (voting) > >> : > >> : > >> : > >> :Please, reach out to any of the current UC members or simple reply to this > >> :email if you can help us in this community process. > >> : > >> : > >> : > >> :Thanks, > >> : > >> : > >> : > >> :OpenStack User Committee > >> : > >> :Amy, Leong, Matt, Melvin, and Joseph > >> > >> :_______________________________________________ > >> :User-committee mailing list > >> :User-committee at lists.openstack.org > >> :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee > >> > >> > >> -- > > > > > > > > -- > > Matt Van Winkle > > Senior Manager, Software Engineering | Salesforce > > Mobile: 210-445-4183 From dave at medberry.net Tue Jan 22 15:55:33 2019 From: dave at medberry.net (David Medberry) Date: Tue, 22 Jan 2019 08:55:33 -0700 Subject: [User-committee] UC Election - Looking for Election Officials In-Reply-To: References: <20190122151211.cvv3vues2dvk2jtg@csail.mit.edu> Message-ID: Holler if you need a third. Happy to assist. Sounds like you have sufficient but let me know. On Tue, Jan 22, 2019 at 8:24 AM Matt Van Winkle wrote: > > Thank you, Jonathan! > > I have you down as our second official. 
The UC will be back in touch with both of you shortly on next steps. > > Thanks! > VW > > On Tue, Jan 22, 2019 at 9:12 AM Jonathan Proulx wrote: >> >> >> Was on PTO last week when the ask came through, but if official are >> still needed I'm game. >> >> -Jon >> >> On Mon, Jan 14, 2019 at 01:12:09PM -0600, Matt Van Winkle wrote: >> :Hey Stackers, >> : >> : >> :We are getting ready for the Winter UC election and we need to have at >> :least two Election Officials. I was wondering if you would like to help us >> :on that process. You can find all the details of the election at >> :*https://governance.openstack.org/uc/reference/uc-election-feb2019.html >> :*. >> : >> : >> :I do want to point out to those who are new that Election Officials are >> :unable to run in the election itself but can of course vote. >> : >> : >> : >> :The election dates will be: >> :January 21 - February 03, 05:59 UTC: Open candidacy for UC positions >> :February 04 - February 10, 11:59 UTC: UC elections (voting) >> : >> : >> : >> :Please, reach out to any of the current UC members or simple reply to this >> :email if you can help us in this community process. >> : >> : >> : >> :Thanks, >> : >> : >> : >> :OpenStack User Committee >> : >> :Amy, Leong, Matt, Melvin, and Joseph >> >> :_______________________________________________ >> :User-committee mailing list >> :User-committee at lists.openstack.org >> :http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee >> >> >> -- > > > > -- > Matt Van Winkle > Senior Manager, Software Engineering | Salesforce > Mobile: 210-445-4183 From openstack at nemebean.com Tue Jan 22 17:06:44 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 22 Jan 2019 11:06:44 -0600 Subject: [oslo] Meetings next two weeks Message-ID: <7489c0a1-3580-c737-dfd3-52834fda7d41@nemebean.com> I had some last-minute travel come up for next week (the 28th-1st) which means I may have spotty availability. I'm not sure if I'll be able to run the meeting on Monday. 
I guess you can watch for courtesy pings and if they don't happen then chances are I couldn't. :-) The next week (4th-8th) I'm on vacation so I definitely won't be around. Unless someone else wants to run the meeting, that one will not happen for sure. -Ben From smooney at redhat.com Tue Jan 22 17:29:03 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 22 Jan 2019 17:29:03 +0000 Subject: About gitignore policy. In-Reply-To: <128B8725-2B95-488D-9CB1-7F3EA4AEA4CF@redhat.com> References: <20190122143239.GA27435@sm-workstation> <128B8725-2B95-488D-9CB1-7F3EA4AEA4CF@redhat.com> Message-ID: <7798b3b51afc3aadd3537a317f1046ff9ab99082.camel@redhat.com> On Tue, 2019-01-22 at 15:32 +0000, Sorin Sbarnea wrote: > We are talking about policies not fundamental laws of the universe. > > Policies are created in order to simplify decisions based on previous history, so clearly there were projects that > existed before this policy was introduced. > > Try to look at it via pragmatic approach: this policy tries to avoid a never ending list of reviews related to local > files. -- for good reasons, people have the freedom to use whatever editor they want. So what if everyone is writing > his own editor with his own editor config files to be ignore? > > This is why such a policy was created. Obviously it appeared after it was observed and I am sure there are lots of > projects that do have .gitignore files that do not "pass" this policy. > > Don't waste time trying to solve it, it does not worth the time.... you may endup like "someone is wrong on the > internet". ;) > > harmony sounds nice on paper but in real life is impossible (or at least impractical) to achieve. > > If someone mentions it during a review, accept it and add the exception to your user .gitingore file. so i personally am in the camp of just merging patches for trivial changes like this when they are proposed, but also, as you said, you can just set these in your user's ~/.gitignore too.
there is some documentation about that here (https://git-scm.com/docs/gitignore), however i have found most people are not aware that there is such a thing as a user's git ignore, which is what prompts people to propose patches to change the project's .gitignore. so if we do say we don't support this because of policy x, we should try to tell people how they can update their local env to have the same effect. > > Cheers > Sorin > > PS. I got myself a CR rejected few months ago for the same reason, I read about it and accepted it. It makes sense. > > > On 22 Jan 2019, at 14:32, Sean McGinnis wrote: > > > > > Then I have started to make patch, to align OpenStack projects with this policy: > > > > > > https://review.openstack.org/#/c/632086/ > > > > > > In nova project the patch was refused, the first reason was we have > > > already talk about this subject. The second reason was this policy is > > > only for the new projects. Oslo is not a new project and is follow > > > this policy. > > > > > > I know is not a priority subject, but I just want understand. For me a > > > policy must be followed by all projects, to harmonize all, but maybe I > > > wrong. For example, in this case, I would to know what is the rules? > > > The new projects must follow the policy but not the old ones? In this > > > case, what the rules for old projects and can we defined also an > > > policy for it? Thanks in advance for your help. > > > > > > My best regards > > > > > > > It is somewhat up to each project and the core teams if, when, and how they > > want to enforce something like this.
> > Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From ukalifon at redhat.com Tue Jan 22 17:30:50 2019 From: ukalifon at redhat.com (Udi Kalifon) Date: Tue, 22 Jan 2019 18:30:50 +0100 Subject: [qa] dynamic credentials with the tempest swift client In-Reply-To: <1686b21dc57.10adf6b3c12767.7382989113710382621@ghanshyammann.com> References: <1aa6bce4-622e-4787-a73b-27de7ed9d224@www.fastmail.com> <1686b21dc57.10adf6b3c12767.7382989113710382621@ghanshyammann.com> Message-ID: Thanks to everyone who helped! I was able to disable dynamic credentials like this: 1) Add a reference to an accounts file in tempest.conf, and disable the use of dynamic credentials: [auth] test_accounts_file = /home/ukalifon/src/tempest/cloud-01/accounts.yaml use_dynamic_credentials = False 2) The accounts.yaml file must contain at least 2 users, one admin and one regular. Mine looks like this: - username: 'admin' tenant_name: 'admin' password: 'cYsJrqtj7IvC581DxsLZkXlku' roles: - 'admin' - username: 'johndoe' tenant_name: 'admin' password: 'johndoe' roles: - '_member_' 3) I needed to create the regular user that I placed in my accounts.yaml: openstack user create --project admin --password johndoe johndoe openstack role add --project admin --user johndoe _member_ Regards, Udi Kalifon; Senior QE; RHOS-UI Automation On Sun, Jan 20, 2019 at 1:06 PM Ghanshyam Mann wrote: > Pre-provisioned account is the way to use the existing cred to run the > Tempest. Can you check in tempest log about the reason > of test skip? You can take ref of gate job for pre-provisioned accounts [1] > > [1] > http://logs.openstack.org/50/628250/3/check/tempest-full-test-account-py3 > > -gmann > > > ---- On Sat, 19 Jan 2019 01:01:22 +0900 Udi Kalifon > wrote ---- > > When I try this it just skips the tests, and doesn't say anywhere why.
> I added this to my tempest.conf: > > [auth] > > test_accounts_file = /home/ukalifon/src/tempest/cloud-01/accounts.yaml > > use_dynamic_credentials = False > > > > And my accounts.yaml looks like this:- username: 'admin' > > tenant_name: 'admin' > > password: 'cYsJrqtj7IvC581DxsLZkXlku' > > > > Regards, > > Udi Kalifon; Senior QE; RHOS-UI Automation > > > > > > > > On Fri, Jan 18, 2019 at 11:08 AM Masayuki Igawa < > masayuki.igawa at gmail.com> wrote: > > Hi, > > > > On Thu, Jan 17, 2019, at 17:58, Udi Kalifon wrote: > > : > > > So I'm looking for a way to utilize the client without it > automatically > > > creating itself dynamic credentials; it has to use the > already-existing > > > admin credentials on the admin project in order to see the container > > > with the plans. What's the right way to do that, please? Thanks a > lot > > > in advance! > > > > Does this pre-provisioned credentials help you? > > > https://docs.openstack.org/tempest/latest/configuration.html#pre-provisioned-credentials > > > > -- Masayuki Igawa > > Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 22 17:35:29 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 22 Jan 2019 17:35:29 +0000 Subject: About gitignore policy. In-Reply-To: <7798b3b51afc3aadd3537a317f1046ff9ab99082.camel@redhat.com> References: <20190122143239.GA27435@sm-workstation> <128B8725-2B95-488D-9CB1-7F3EA4AEA4CF@redhat.com> <7798b3b51afc3aadd3537a317f1046ff9ab99082.camel@redhat.com> Message-ID: <4e78b092f7523c7c4dda4cca8aea7976db5b2b97.camel@redhat.com> On Tue, 2019-01-22 at 17:29 +0000, Sean Mooney wrote: > On Tue, 2019-01-22 at 15:32 +0000, Sorin Sbarnea wrote: > > We are talking about policies not fundamental laws of the universe. 
> > > > Policies are created in order to simplify decisions based on previous history, so clearly there were projects that > > existed before this policy was introduced. > > > > Try to look at it via pragmatic approach: this policy tries to avoid a never ending list of reviews related to local > > files. -- for good reasons, people have the freedom to use whatever editor they want. So what if everyone is writing > > his own editor with his own editor config files to be ignore? > > > > This is why such a policy was created. Obviously it appeared after it was observed and I am sure there are lots of > > projects that do have .gitignore files that do not "pass" this policy. > > > > Don't waste time trying to solve it, it does not worth the time.... you may endup like "someone is wrong on the > > internet". ;) > > > > harmony sounds nice on paper but in real life is impossible (or at least impractical) to achieve. > > > > If someone mentions it during a review, accept it and add the exception to your user .gitingore file. > > So I personally am in the camp of just merging patches for trivial changes like this when they are proposed, > but, as you said, you can also just set these in your user's ~/.gitignore. By the way, your user-level git ignore is at this location by default: $HOME/.config/git/ignore. If you want to use a ~/.gitignore you need to override that default (via the core.excludesFile setting). Using ~/.gitignore is generally not ideal if you are one of those people who like to commit their home directory to git to be able to easily version control it. Anyway, I hope that helps. > There is some documentation about that here: https://git-scm.com/docs/gitignore. However, > I have found most people are not aware that there is such a thing as a user's git ignore, > which is what prompts people to propose patches to change the project's .gitignore. > So if we do say we don't support this because of policy X, we should try to tell people how they > can update their local env to have the same effect. 
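A small self-contained sketch of that user-level ignore in action (it drives git from Python against a throwaway HOME, so no real configuration is touched; the patterns `*.swp` and `.idea/` are just example editor droppings):

```python
import os
import subprocess
import tempfile

# Use a throwaway HOME so the example does not touch a real user config.
home = tempfile.mkdtemp()
env = {**os.environ, "HOME": home}
env.pop("XDG_CONFIG_HOME", None)  # fall back to $HOME/.config/git/ignore

# Write the user-level ignore file git consults by default.
os.makedirs(os.path.join(home, ".config", "git"))
with open(os.path.join(home, ".config", "git", "ignore"), "w") as f:
    f.write("*.swp\n.idea/\n")

# Any repo now ignores those patterns without touching its .gitignore.
repo = os.path.join(home, "repo")
subprocess.run(["git", "init", "-q", repo], env=env, check=True)
open(os.path.join(repo, "foo.swp"), "w").close()

# check-ignore exits 0 when the path is ignored by some ignore source.
check = subprocess.run(["git", "check-ignore", "-q", "foo.swp"],
                       cwd=repo, env=env)
print(check.returncode)
```

The same effect without the sandbox is just `git config --global core.excludesFile ~/.gitignore`, or editing `~/.config/git/ignore` directly.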
> > > > > Cheers > > Sorin > > > > PS. I got myself a CR rejected few months ago for the same reason, I read about it and accepted it. It makes sense. > > > > > On 22 Jan 2019, at 14:32, Sean McGinnis wrote: > > > > > > > Then I have started to make patch, to align OpenStack projects with this policy: > > > > > > > > https://review.openstack.org/#/c/632086/ > > > > > > > > In nova project the patch was refused, the first reason was we have > > > > already talk about this subject. The second reason was this policy is > > > > only for the new projects. Oslo is not a new project and is follow > > > > this policy. > > > > > > > > I know is not a priority subject, but I just want understand. For me a > > > > policy must be followed by all projects, to harmonize all, but maybe I > > > > wrong. For example, in this case, I would to know what is the rules? > > > > The new projects must follow the policy but not the old ones? In this > > > > case, what the rules for old projects and can we defined also an > > > > policy for it? Thanks in advance for your help. > > > > > > > > My best regards > > > > > > > > > > It is somewhat up to each project and the core teams if, when, and how they > > > want to enforce something like this. > > > > > > So while it is generally recommended not to put IDE, OS-specific, and other > > > similar things in the per-project gitignore, there is also no real value added > > > by going through all existing projects and removing things that are already > > > there. 
> > > > > > Sean > > > > > > From smooney at redhat.com Tue Jan 22 17:54:58 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 22 Jan 2019 17:54:58 +0000 Subject: Fwd: [OpenStack Foundation] 2019 Individual Director election and Bylaws amendment results In-Reply-To: References: <256BFC05-F639-4F7C-A7E3-57F248782218@openstack.org> Message-ID: <7e752b4bca196619133da0c2d5577038c156b211.camel@redhat.com> On Tue, 2019-01-22 at 10:56 -0500, Chris Morgan wrote: > Congrats to the new openstack foundation directors! Just FYI the following went to the foundation mailing list. Congrats. For anyone who recently moved company, as I did in July, and previously used their company email for their OpenStack Foundation account to be able to vote in elections: it turns out you have to update that separately, by logging in to openstack.org and updating your email, as it is handled separately from Gerrit and Ubuntu One/Launchpad. So if you qualify to vote in the election but did not receive an email, maybe check that you have updated it too. > > Chris > > From: Jonathan Bryce > Date: Fri, Jan 18, 2019, 12:42 > Subject: [OpenStack Foundation] 2019 Individual Director election and Bylaws amendment results > To: > > > Hello everyone, > > The 2019 election of Individual Directors has closed. Results are available at the following link: > > https://www.bigpulse.com/pollresults?code=1336400LYwRuQQZqR72BJRpKJYV > > The Bylaws amendments have now been passed by the Individual Member class. The elected and appointed directors will be > seated at the Board meeting at the end of January. > > Congratulations to our new and returning directors. We actually had the highest number of voters for any of our > elections participate this year, so thank you to everyone for joining in the process. 
> > Jonathan > 210-317-2438 > > > Individual Directors > ----------------------------------------- > Tim Bell > ChangBo Guo > Sean McGinnis > Prakash Ramchandran > Allison Randal > Egle Sigler > Monty Taylor > Shane Wang > > Platinum Directors > ----------------------------------------- > Alan Clark > Ruan He > Anni Lai > Mark McLoughlin > Chris Price > Imad Sousou > Brian Stein > Ryan Van Wyk > > Gold Directors > ----------------------------------------- > Mark Baker > Johan Christenson > Robert Esker > Clemens Hardewig > Arkady Kanevsky > JunWei Liu > Vijoy Pandey > Joseph Wang > > > _______________________________________________ > Foundation mailing list > Foundation at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/foundation > > From ignaziocassano at gmail.com Tue Jan 22 18:04:05 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 22 Jan 2019 19:04:05 +0100 Subject: [Cinder][nova] queens backup Message-ID: Hi All, Please, I'd like to know if cinder backup and/or cinder snapshot call the qemu guest agent for fs freezing. If yes, does it freeze the file systems in the volume? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 22 18:19:21 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 22 Jan 2019 18:19:21 +0000 Subject: [nova] Do we need to copy pci_devices to target cell DB during cross-cell resize? In-Reply-To: <3bfc308a-87cf-a873-618e-7a4ebc58a7e7@gmail.com> References: <3bfc308a-87cf-a873-618e-7a4ebc58a7e7@gmail.com> Message-ID: <7d19a76eaf59950696a5cbd21cc08b2b85cbda76.camel@redhat.com> On Tue, 2019-01-22 at 09:20 -0600, Matt Riedemann wrote: > While working on the code to create instance-related data in the target > cell database during a cross-cell resize I noticed I wasn't copying over > pci_devices [1] but when looking at the PciDevice object create/save > methods, those aren't really what I'm looking for here. 
And looking at > the data model, there is a compute_node_id field which is the > compute_nodes.id primary key which won't match the target cell DB. > > I am not very familiar with the PCI device manager code and data model > (it's my weakest area in nova lo these many years) but looking closer at > this, am I correct in understanding that the PciDevice object and data > model is really more about the actual inventory and allocations of PCI > devices on a given compute node and therefore it doesn't really make > sense to need to copy that data over to the target cell database. > > During a cross-cell resize, the scheduler is going to pick a target host > in another cell and claim standard resources (VCPU, MEMORY_MB and > DISK_GB) in placement, but things like NUMA/PCI claims won't happen > until we do a ResourceTracker.resize_claim on the target host in the > target cell. In that case, it seems the things I only need to care about > mirroring is instance.pci_requests and instance.numa_topology, correct? Yes, I believe that is correct. The scheduler, when selecting the host in the remote cell, will need to run the PCI passthrough filter to validate that pci_request against the destination host. You are specifically doing a resize, so you don't need to regenerate a new XML on the source node before starting, since it's not a live migration, but you might want to preemptively allocate the PCI devices on the destination if you want to prevent a race with other hosts. That said, for Stein it may be better to declare that out of scope; it's really not any more racy than spawning an instance, as we don't claim the device until we get to the compute node anyway today. The instance.numa_topology should really be recalculated for the target host also; you do not want to require the destination host to place the VM with the original NUMA topology from the source node. So I think you need to propagate the NUMA-related requests, which are all in the flavor/image, but I don't think you need to copy the instance numa_topology object. It's not a live migration, so provided the NUMA topology filter says the host is valid, you are free to recalculate the NUMA topology from scratch when it lands on the compute node, based on the image and flavor values. > > Since those are the user-requested (via flavor/image/port) resources > which will then result in PciDevice and NUMA allocations on the target host. > > I'm just looking for confirmation from others that better understand the > data model in this area. > > [1] > https://review.openstack.org/#/c/627892/5/nova/conductor/tasks/cross_cell_migrate.py at 115 > From michaelr at catalyst.net.nz Tue Jan 22 18:21:00 2019 From: michaelr at catalyst.net.nz (Michael Richardson) Date: Wed, 23 Jan 2019 07:21:00 +1300 Subject: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: <20190122110317.9eff51b2ef0b79137fc99593@catalyst.net.nz> <20190122121351.31bdcc89a6ee52e9eaf3b02b@catalyst.net.nz> Message-ID: <20190122182100.GA26958@catalyst.net.nz> On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: > Last time I heard (which was probably mid-2017), the Trove team had > implemented encryption for messages on the RabbitMQ bus. IIUC each DB being > managed had its own encryption keys, so that would theoretically prevent > both snooping and spoofing of messages. That's the good news. > > The bad news is that AFAIK it's still using a shared RabbitMQ bus, so > attacks like denial of service are still possible if you can extract the > shared credentials from the VM. Not sure about replay attacks; I haven't > actually investigated the implementation. > > cheers, > Zane. Excellent - many thanks for the confirmation. 
Cheers, Michael From mnaser at vexxhost.com Tue Jan 22 18:28:51 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 22 Jan 2019 10:28:51 -0800 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update VII - Jan 22, 2019 In-Reply-To: References: Message-ID: Thanks for your updates, Chandan. On Mon, Jan 21, 2019 at 10:55 PM Chandan kumar wrote: > Hello, > > Here is the seventh update (Jan 15 to Jan 22, 2019) on collaboration > on os_tempest[1] role > between TripleO and OpenStack-Ansible projects. > > Things got merged: > os_tempest: > * Add support for aarch64 images - > https://review.openstack.org/#/c/620032/ > * Configuration drives don't appear to work on aarch64+kvm - > https://review.openstack.org/#/c/626592/ > * Fix tempest workspace path - https://review.openstack.org/#/c/628182/ > * Add libselinux-python package for Red Hat distro - > https://review.openstack.org/#/c/631203/ > * Use usr/local/share/ansible/roles for data_files - > https://review.openstack.org/#/c/630917/ > * Use tempest_domain_name var for setting domain - > https://review.openstack.org/#/c/630957/ > * Rename tempest_public_net_physical_{type to name} - > https://review.openstack.org/#/c/631183/ > * Add tempest_interface_name var for setting interface - > https://review.openstack.org/#/c/630942/ > > ansible-config_template > * Use usr/share/ansible/plugins for data_files - > https://review.openstack.org/631214 > > ansible-role-python_venv_build > * Use usr/local/share/ansible/roles for data_files - > https://review.openstack.org/#/c/631777/ > > openstack-ansible-tests > * Setup clouds.yaml on tempest node - https://review.openstack.org/631794 > > Tripleo-spec > * New spec for stein: os_tempest tripleo integration - > https://review.openstack.org/630654 > > Summary: > * We have cleaned up os_tempest cloudname and network related vars, > fixed tempest workspace upgrade > issue and os_tempest got support of aarch64 images and nova > Configuration drives. 
> * Tripleo os_tempest spec published: > > https://specs.openstack.org/openstack/tripleo-specs/specs/stein/ostempest-tripleo.html > > Things in Progress: > os_tempest > * Always generate stackviz irrespective of tests pass or fail - > https://review.openstack.org/631967 > * Add telemetry distro plugin install for aodh - > https://review.openstack.org/632125 > * Added tempest.conf for heat_plugin - https://review.openstack.org/632021 > * Use the correct heat tests - https://review.openstack.org/630695 > * Use tempest_cloud_name in tempestconf - > https://review.openstack.org/631708 > * Adds tempest run command with --test-list option - > https://review.openstack.org/631351 > > os_tempest integration with Tripleo > * Added requirements for integrating os_tempest role - > https://review.openstack.org/628421 > * Run tempest using os_tempest role in standalone job - > https://review.openstack.org/627500 > * Use os_tempest for running tempest on standalone - > https://review.openstack.org/628415 > > Upcoming week: > * We are able to run tempest in Triplo CI standalone > test results are here: > > http://logs.openstack.org/00/627500/65/check/tripleo-ci-centos-7-standalone-os-tempest/198ae77/logs/undercloud/var/log/tempest/stestr_results.html.gz > * We will try to finish os_tempest integration with tripleo and heat > support in os_tempest. > > Thanks to odyssey4me, cloudnull, jrosser (a lot of help on different > patches), Panda and Sagi from Tripleo CI team > pabelanger and dmsimard on figuring out os_tempest dependencies, > action_plugins and nested ansible. > > Here is the 6th update [2]. Have queries, Feel free to ping us on > #tripleo or #openstack-ansible channel. > > Links: > [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest > [2.] > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001730.html > > Thanks, > > Chandan Kumar > > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 
514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Tue Jan 22 18:40:33 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 22 Jan 2019 19:40:33 +0100 Subject: [all][oslo][coordination] remove method get_transport in transport.py Message-ID: Hello everybody! For a few months oslo has planned to remove a deprecated method (oslo_messaging.get_transport). This method was first deprecated to facilitate removing rpc_backend support; the get_transport alias was deprecated to simplify the code and the tests. The alias was first deprecated in oslo.messaging 5.20.0 during Pike, and now we want to remove this method for good. Support for the alias was planned to be removed during milestone stein-1. So I propose that all projects that use this deprecated method start to move to the supported oslo.messaging transport methods (get_rpc_transport or get_notification_transport, depending on your needs). This thread will try to coordinate projects during the migration. I have already submitted a patch[1] to oslo.messaging to start removing this method. To move forward on this topic, I personally plan to propose patches to the projects that I know use it. To ensure a clean removal, to be sure that all projects remove usages of this method, and to merge in the right order, I propose that you use the same topic in your Gerrit reviews by referencing the bug https://bugs.launchpad.net/oslo.messaging/+bug/1714945 in your commits (Related-Bug: #1714945). Please ensure that all your projects stop using this method. If you face an issue, please reply to this thread to centralize support during the migration. You can track this topic by visiting this page: https://review.openstack.org/#/q/topic:bug/1714945+(status:open+OR+status:merged) [1] https://review.openstack.org/632523 Cheers! Best! 
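For call sites the change is a mechanical rename. The following standalone sketch uses placeholder function bodies (not the real oslo.messaging internals; only the three function names come from the message above) to show the shape of the deprecated alias and its supported replacements:

```python
import warnings

# Stand-ins for the two supported oslo.messaging entry points.
def get_rpc_transport(conf, url=None):
    return ("rpc-transport", url)

def get_notification_transport(conf, url=None):
    return ("notification-transport", url)

# The deprecated alias slated for removal. In this sketch it simply warns
# and forwards to the RPC variant; the real library's internals differ.
def get_transport(conf, url=None):
    warnings.warn(
        "get_transport is deprecated; use get_rpc_transport or "
        "get_notification_transport instead",
        DeprecationWarning,
    )
    return get_rpc_transport(conf, url)

# Before: transport = get_transport(conf, url)
# After (RPC servers/clients): transport = get_rpc_transport(conf, url)
# After (notifications):       transport = get_notification_transport(conf, url)
```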
-- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Jan 22 18:45:44 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 22 Jan 2019 12:45:44 -0600 Subject: [Cinder][nova] queens backup In-Reply-To: References: Message-ID: <20190122184544.GA1219@sm-workstation> On Tue, Jan 22, 2019 at 07:04:05PM +0100, Ignazio Cassano wrote: > Hi All, > Please, I' d like to know if cinder backup and/or cinder snapshot call > qemu guest agent for fsfreezing. > If Yes, does it freeze file systems in the volume? > Regards > Ignazio Unfortunately no, initiating a snapshot or backup from Cinder does not call out to Nova to do any guest quiescing. There would need to be something else coordinating the calls between the guest OS and initiating the Cinder snapshot to do that. 
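As a purely illustrative sketch of what that external coordination could look like, with entirely assumed client objects (`guest` and `cinder` are placeholders an operator would have to wire up, not real OpenStack client APIs):

```python
# Freeze the guest filesystems (e.g. via qemu-guest-agent), snapshot the
# volume, then always thaw -- even if the snapshot fails.
def consistent_snapshot(guest, cinder, volume_id):
    frozen = False
    try:
        guest.fsfreeze()
        frozen = True
    except Exception:
        pass  # no guest agent available: fall back to an unquiesced snapshot
    try:
        # force=True because the volume is typically attached ("in-use")
        return cinder.create_snapshot(volume_id, force=True)
    finally:
        if frozen:
            guest.fsthaw()  # never leave the guest frozen
```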
Sean From hberaud at redhat.com Tue Jan 22 18:46:44 2019 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 22 Jan 2019 19:46:44 +0100 Subject: [scientific-sig] IRC meeting Wednesday 1100 UTC: 2FA with FreeIPA, OpenInfra Days London, ISC 2019 In-Reply-To: <95C77676-5EA4-4D33-A2C0-B0D2A95A9504@telfer.org> References: <95C77676-5EA4-4D33-A2C0-B0D2A95A9504@telfer.org> Message-ID: Hey! Do you have a courtesy ping system or something like that where I can put my nick to be pinged before meeting? Best. Le mar. 15 janv. 2019 à 22:00, Stig Telfer a écrit : > Hi All - > > We have a Scientific SIG IRC meeting coming up on Wednesday 16th at 1100 > UTC in channel #openstack-meeting. Everyone is welcome. > > Full agenda and details are here: > > > https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_16th_2019 > > We’ve got quite a bit to cover this week. Our headline item is a > presentation on experiences configuring 2FA for OpenStack using FreeIPA. > We also have some exciting events coming up, and discussion on a possible > Scientific OpenStack BoF at the International Supercomputer Conference in > Frankfurt in June. 
> > Cheers, > Stig > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -------------- next part -------------- An HTML attachment was scrubbed... URL: From fsbiz at yahoo.com Tue Jan 22 18:56:30 2019 From: fsbiz at yahoo.com (Farhad Sunavala) Date: Tue, 22 Jan 2019 18:56:30 +0000 (UTC) Subject: [openstack-dev] [neutron] References: <1488344635.1835770.1548183390982.ref@mail.yahoo.com> Message-ID: <1488344635.1835770.1548183390982@mail.yahoo.com> Hi, I am open to suggestions. We have a need to switch traffic from our project to other projects without first getting out on the internet, floating IPs, etc. The other projects will be sharing their networks with our project. As shown in the figure below, the orange network belongs to our project (10.0.0.0/26). The green network (172.31.0.0/24) belongs to another project and has an overlapping network with the red tenant (172.31.0.0/16). For now, the solution is to create VMs in our project and make sure none of the interfaces have overlapping CIDRs. Thus, there is a VM attached to the 'orange' and 'red' nets and another VM attached to the 'orange' and 'green' nets. Problem: Too many resources (VMs) will need to be created if we have 100 tenants with overlapping networks. Solution: Is there a way I can minimize VM resources in our project by not allocating a separate VM for shared networks with overlapping CIDRs? thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 1548182968204blob.jpg Type: image/png Size: 15177 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jan 22 19:17:59 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 22 Jan 2019 20:17:59 +0100 Subject: [Cinder][nova] queens backup In-Reply-To: <20190122184544.GA1219@sm-workstation> References: <20190122184544.GA1219@sm-workstation> Message-ID: Thanks for the info. Ignazio Il giorno Mar 22 Gen 2019 19:45 Sean McGinnis ha scritto: > On Tue, Jan 22, 2019 at 07:04:05PM +0100, Ignazio Cassano wrote: > > Hi All, > > Please, I' d like to know if cinder backup and/or cinder snapshot call > > qemu guest agent for fsfreezing. > > If Yes, does it freeze file systems in the volume? > > Regards > > Ignazio > > Unfortunately no, initiating a snapshot or backup from Cinder does not > call out > to Nova to do any guest quiescing. There would need to be something else > coordinating the calls between the guest OS and initiating the Cinder > snapshot > to do that. > > Sean > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig.openstack at telfer.org Tue Jan 22 19:26:54 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 22 Jan 2019 19:26:54 +0000 Subject: [scientific-sig] IRC meeting Wednesday 1100 UTC: 2FA with FreeIPA, OpenInfra Days London, ISC 2019 In-Reply-To: References: <95C77676-5EA4-4D33-A2C0-B0D2A95A9504@telfer.org> Message-ID: Hi Hervé - apologies, we don’t at present, but I’ll always try to mail out a reminder on openstack-discuss shortly before. 
Best wishes, Stig > On 22 Jan 2019, at 18:46, Herve Beraud wrote: > > Hey! > > Do you have a courtesy ping system or something like that where I can put my nick to be pinged before meeting? > > Best. > > Le mar. 15 janv. 2019 à 22:00, Stig Telfer a écrit : > Hi All - > > We have a Scientific SIG IRC meeting coming up on Wednesday 16th at 1100 UTC in channel #openstack-meeting. Everyone is welcome. > > Full agenda and details are here: > > https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_16th_2019 > > We’ve got quite a bit to cover this week. Our headline item is a presentation on experiences configuring 2FA for OpenStack using FreeIPA. We also have some exciting events coming up, and discussion on a possible Scientific OpenStack BoF at the International Supercomputer Conference in Frankfurt in June. > > Cheers, > Stig > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > From dkrol3 at gmail.com Tue Jan 22 20:09:29 2019 From: dkrol3 at gmail.com (=?UTF-8?Q?Darek_Kr=C3=B3l?=) Date: Tue, 22 Jan 2019 21:09:29 +0100 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment 
model Message-ID: On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: > Last time I heard (which was probably mid-2017), the Trove team had > implemented encryption for messages on the RabbitMQ bus. IIUC each DB being > managed had its own encryption keys, so that would theoretically prevent > both snooping and spoofing of messages. That's the good news. > > The bad news is that AFAIK it's still using a shared RabbitMQ bus, so > attacks like denial of service are still possible if you can extract the > shared credentials from the VM. Not sure about replay attacks; I haven't > actually investigated the implementation. > > cheers, > Zane. > Excellent - many thanks for the confirmation. > > Cheers, > Michael Hello Michael and Zane, sorry for the late reply. I believe Zane is referring to a video from 2017 [0]. Yes, messages from Trove instances are encrypted and the keys are kept in the Trove DB. It is still a shared message bus, but it can be a message bus dedicated to Trove only and separated from the message bus shared by other OpenStack services. DDoS attacks are also mentioned in the video as a potential threat, but with very little detail on possible solutions. Recently we had some internal discussion about this threat within the Trove team. Maybe we could use the RabbitMQ mechanisms for flow control mentioned in [1,2,3]? Another point: I'm wondering if this is a problem only in Trove, or something other services would be interested in also? 
Best, Darek [0] https://youtu.be/dzvcKlt3Lx8 [1] https://www.rabbitmq.com/flow-control.html [2] http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ [3] https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ From Kevin.Fox at pnnl.gov Tue Jan 22 20:24:55 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 22 Jan 2019 20:24:55 +0000 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: Message-ID: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> We tried to solve it as a cross-project issue for a while and then everyone gave up. Lots of projects have the same problem: trove, sahara, magnum, etc. Other than just control messages, there is also the issue of version skew between guest agents and controllers, and how to do rolling upgrades. It's messy today. I'd recommend at this point maybe just running Kubernetes across the VMs and pushing the guest agents/workload to them. You can still drive it via an OpenStack API, but doing rolling upgrades of guest agents or mysql containers or whatever is way simpler for operators to handle. We should embrace k8s as part of the solution rather than trying to reimplement it, IMO. Thanks, Kevin ________________________________________ From: Darek Król [dkrol3 at gmail.com] Sent: Tuesday, January 22, 2019 12:09 PM To: Michael Richardson Cc: openstack-discuss at lists.openstack.org Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: > Last time I heard (which was probably mid-2017), the Trove team had > implemented encryption for messages on the RabbitMQ bus. IIUC each DB being > managed had its own encryption keys, so that would theoretically prevent > both snooping and spoofing of messages. That's the good news. 
> > The bad news is that AFAIK it's still using a shared RabbitMQ bus, so > attacks like denial of service are still possible if you can extract the > shared credentials from the VM. Not sure about replay attacks; I haven't > actually investigated the implementation. > > cheers, > Zane. > Excellent - many thanks for the confirmation. > > Cheers, > Michael Hello Michael and Zane, sorry for the late reply. I believe Zane is referring to a video from 2017 [0]. Yes, messages from trove instances are encrypted and the keys are kept in Trove DB. It is still a shared message bus, but it can be a message bus dedicated for Trove only and separated from message bus shared by other Openstack services. DDOS attacks are also mentioned in the video as a potential threat but there is very little details and possible solutions. Recently we had some internal discussion about this threat within Trove team. Maybe we could user Rabbitmq mechanisms for flow control mentioned in [1,2,3] ? Another point, I'm wondering if this is a problem only in Trove or is it something other services would be interesting in also ? Best, Darek [0] https://youtu.be/dzvcKlt3Lx8 [1] https://www.rabbitmq.com/flow-control.html [2] http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ [3] https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ From ashlee at openstack.org Tue Jan 22 20:24:31 2019 From: ashlee at openstack.org (Ashlee Ferguson) Date: Tue, 22 Jan 2019 12:24:31 -0800 Subject: Denver Summit CFP Closing Tomorrow Message-ID: <8035102F-8831-4C93-AAD9-DBA38B775A3A@openstack.org> Hi everyone, The CFP closes in less than 48 hours, so make sure to submit your presentations, panels, and workshops for the Denver Open Infrastructure Summit before tomorrow, January 23 at 11:59pm PT (January 24 at 7:59am UTC). 
SUBMIT YOUR PRESENTATION Tracks: AI, Machine Learning & HPC CI/CD Container Infrastructure Edge Computing Hands-on Workshops Open Development (formerly Open Source Community) Open Infrastructure Basics Private & Hybrid Cloud Public Cloud Security - NEW! Telecom & NFV Full track descriptions For tips on your submissions, check out these articles for some advice from the Summit Programming Committees— http://superuser.openstack.org/articles/tips-for-the-open-infrastructure-summit/ http://superuser.openstack.org/articles/tips-open-development-track-denver-summit/ http://superuser.openstack.org/articles/tips-for-talks-on-the-public-cloud-track-for-the-denver-summit/ Denver Summit registration and sponsor sales are currently open. Learn more Please email speakersupport at openstack.org with any questions or feedback. Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig.openstack at telfer.org Tue Jan 22 20:31:39 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 22 Jan 2019 20:31:39 +0000 Subject: [scientific-sig] IRC meeting at 2100 UTC: Jitter on SR-IOV IB, Terraform and Kubespray Message-ID: Hi All - We have a Scientific SIG IRC meeting in about 30 minutes in channel #openstack-meeting. Everyone is welcome. Today’s agenda is at: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_22nd_2019 We have Jacob Anders from CSIRO talking about some recent research on latency and jitter for SR-IOV InfiniBand. We also have Martial Michel from DataMachines on Terraform and Kubespray. 
Cheers, Stig From mriedemos at gmail.com Tue Jan 22 20:34:40 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 22 Jan 2019 14:34:40 -0600 Subject: [Cinder][nova] queens backup In-Reply-To: <20190122184544.GA1219@sm-workstation> References: <20190122184544.GA1219@sm-workstation> Message-ID: On 1/22/2019 12:45 PM, Sean McGinnis wrote: > On Tue, Jan 22, 2019 at 07:04:05PM +0100, Ignazio Cassano wrote: >> Hi All, >> Please, I'd like to know if cinder backup and/or cinder snapshot calls the >> qemu guest agent for fs freezing. >> If yes, does it freeze file systems in the volume? >> Regards >> Ignazio > Unfortunately no, initiating a snapshot or backup from Cinder does not call out > to Nova to do any guest quiescing. There would need to be something else > coordinating the calls between the guest OS and initiating the Cinder snapshot > to do that. > > Sean > If you snapshot a volume-backed server in nova, the compute API will attempt to quiesce the guest before creating the snapshot of the volume: https://github.com/openstack/nova/blob/31956108e6e785407bdcc31dbc8ba99e6a28c96d/nova/compute/api.py#L3100 As for backups, the compute createBackup API does not support volume-backed servers: https://github.com/openstack/nova/blob/31956108e6e785407bdcc31dbc8ba99e6a28c96d/nova/compute/api.py#L2894 -- Thanks, Matt From dkrol3 at gmail.com Tue Jan 22 20:38:01 2019 From: dkrol3 at gmail.com (=?UTF-8?Q?Darek_Kr=C3=B3l?=) Date: Tue, 22 Jan 2019 21:38:01 +0100 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> Message-ID: Is there any documentation written down from this discussion? I would really like to read more about the problem and any ideas for possible solutions. Your recommendation about k8s sounds interesting, but I'm not sure if I understand it fully.
Would you like to have a k8s cluster for all tenants on top of VMs to handle Trove instances? And is upgrade a different problem from a DDoS attack on the message bus? Best, Darek On Tue, 22 Jan 2019 at 21:25, Fox, Kevin M wrote: > We tried to solve it as a cross-project issue for a while and then > everyone gave up. Lots of projects have the same problem: Trove, Sahara, > Magnum, etc. > > Other than just control messages, there is also the issue of version skew > between guest agents and controllers and how to do rolling upgrades. It's > messy today. > > I'd recommend at this point to maybe just run Kubernetes across the VMs > and push the guest agents/workload to them. You can still drive it via an > OpenStack API, but doing rolling upgrades of guest agents or MySQL > containers or whatever is way simpler for operators to handle. We should > embrace k8s as part of the solution rather than trying to reimplement it, > IMO. > > Thanks, > Kevin > ________________________________________ > From: Darek Król [dkrol3 at gmail.com] > Sent: Tuesday, January 22, 2019 12:09 PM > To: Michael Richardson > Cc: openstack-discuss at lists.openstack.org > Subject: Subject: Re: [Trove] State of the Trove service tenant deployment > model > > On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: > > Last time I heard (which was probably mid-2017), the Trove team had > > implemented encryption for messages on the RabbitMQ bus. IIUC each DB > being > > managed had its own encryption keys, so that would theoretically prevent > > both snooping and spoofing of messages. That's the good news.
> > > > Cheers, > > Michael > > Hello Michael and Zane, > > sorry for the late reply. > > I believe Zane is referring to a video from 2017 [0]. > Yes, messages from Trove instances are encrypted and the keys are kept > in the Trove DB. It is still a shared message bus, but it can be a message > bus dedicated to Trove only and separated from the message bus shared by > other OpenStack services. > > DDoS attacks are also mentioned in the video as a potential threat, but > there are very few details about possible solutions. Recently we had > some internal discussion about this threat within the Trove team. Maybe we > could use the RabbitMQ mechanisms for flow control mentioned in [1,2,3]? > > Another point: I'm wondering if this is a problem only in Trove, or is > it something other services would be interested in as well? > > Best, > Darek > > [0] https://youtu.be/dzvcKlt3Lx8 > [1] https://www.rabbitmq.com/flow-control.html > [2] > http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ > [3] > https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ltoscano at redhat.com Tue Jan 22 20:39:51 2019 From: ltoscano at redhat.com (Luigi Toscano) Date: Tue, 22 Jan 2019 21:39:51 +0100 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> Message-ID: <4627229.4aJM9VPmJv@whitebase.usersys.redhat.com> On Tuesday, 22 January 2019 21:24:55 CET Fox, Kevin M wrote: > We tried to solve it as a cross-project issue for a while and then everyone > gave up. Lots of projects have the same problem: Trove, Sahara, Magnum, > etc. > I would say that Sahara does not have the same problem. There was a discussion to use a local agent in 2014 but it died out.
Sahara communicates with each node through SSH, so there is no (ab)use of the message bus. Ciao -- Luigi From Kevin.Fox at pnnl.gov Tue Jan 22 20:44:46 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 22 Jan 2019 20:44:46 +0000 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: <4627229.4aJM9VPmJv@whitebase.usersys.redhat.com> References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov>, <4627229.4aJM9VPmJv@whitebase.usersys.redhat.com> Message-ID: <1A3C52DFCD06494D8528644858247BF01C2886E9@EX10MBOX03.pnnl.gov> Yeah, it has its own set of problems. The implementation chosen differs from the one chosen by Trove, but in a way that's even worse for operators, as each advanced service (Trove, Magnum, Sahara) has to be learned independently by operators: how to secure it, debug it, upgrade it, etc. Adding each one to the cluster increases the cognitive burden significantly, so it's rare to deploy multiple of them in an OpenStack cloud. Thanks, Kevin ________________________________________ From: Luigi Toscano [ltoscano at redhat.com] Sent: Tuesday, January 22, 2019 12:39 PM To: openstack-discuss at lists.openstack.org Cc: Fox, Kevin M; Darek Król; Michael Richardson Subject: Re: Subject: Re: [Trove] State of the Trove service tenant deployment model On Tuesday, 22 January 2019 21:24:55 CET Fox, Kevin M wrote: > We tried to solve it as a cross-project issue for a while and then everyone > gave up. Lots of projects have the same problem: Trove, Sahara, Magnum, > etc. > I would say that Sahara does not have the same problem. There was a discussion to use a local agent in 2014 but it died out. Sahara communicates with each node through SSH, so there is no (ab)use of the message bus.
Ciao -- Luigi From Kevin.Fox at pnnl.gov Tue Jan 22 20:49:55 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 22 Jan 2019 20:49:55 +0000 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C288706@EX10MBOX03.pnnl.gov> It's probably captured in summit notes from three to five years ago. Nothing specific I can point at without going through a lot of archaeology. Yeah, I think deploying the users' databases on top of Kubernetes in VMs would be easier to upgrade than pure VMs with a pile of debs/RPMs inside. It's tangentially related to the message bus stuff. If you solve the DDoS attack issue with the message bus, you still have the upgrade problem. But depending on how you choose to solve the communications channel issues, you can solve other issues such as upgrades easier, harder, or not at all. Thanks, Kevin ________________________________ From: Darek Król [dkrol3 at gmail.com] Sent: Tuesday, January 22, 2019 12:38 PM To: Fox, Kevin M Cc: Michael Richardson; openstack-discuss at lists.openstack.org Subject: Re: Subject: Re: [Trove] State of the Trove service tenant deployment model Is there any documentation written down from this discussion? I would really like to read more about the problem and any ideas for possible solutions. Your recommendation about k8s sounds interesting, but I'm not sure if I understand it fully.
Other then just control messages, there is also the issue of version skew between guest agents and controllers and how to do rolling upgrades. Its messy today. I'd recommend at this point to maybe just run kubernetes across the vms and push the guest agents/workload to them. You can still drive it via an openstack api, but doing rolling upgrades of guest agents or mysql containers or whatever is way simpler for operators to handle. We should embrace k8s as part of the solution rather then trying to reimplement it IMO. Thanks, Kevin ________________________________________ From: Darek Król [dkrol3 at gmail.com] Sent: Tuesday, January 22, 2019 12:09 PM To: Michael Richardson Cc: openstack-discuss at lists.openstack.org Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: > Last time I heard (which was probably mid-2017), the Trove team had > implemented encryption for messages on the RabbitMQ bus. IIUC each DB being > managed had its own encryption keys, so that would theoretically prevent > both snooping and spoofing of messages. That's the good news. > > The bad news is that AFAIK it's still using a shared RabbitMQ bus, so > attacks like denial of service are still possible if you can extract the > shared credentials from the VM. Not sure about replay attacks; I haven't > actually investigated the implementation. > > cheers, > Zane. > Excellent - many thanks for the confirmation. > > Cheers, > Michael Hello Michael and Zane, sorry for the late reply. I believe Zane is referring to a video from 2017 [0]. Yes, messages from trove instances are encrypted and the keys are kept in Trove DB. It is still a shared message bus, but it can be a message bus dedicated for Trove only and separated from message bus shared by other Openstack services. DDOS attacks are also mentioned in the video as a potential threat but there is very little details and possible solutions. 
Recently we had some internal discussion about this threat within the Trove team. Maybe we could use the RabbitMQ mechanisms for flow control mentioned in [1,2,3]? Another point: I'm wondering if this is a problem only in Trove, or is it something other services would be interested in as well? Best, Darek [0] https://youtu.be/dzvcKlt3Lx8 [1] https://www.rabbitmq.com/flow-control.html [2] http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ [3] https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue Jan 22 22:12:50 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Tue, 22 Jan 2019 17:12:50 -0500 Subject: [neutron] Neutron Bug Deputy report for week of Jan 14 Message-ID: <20190122221250.knvd66l26zidkmse@bishop> Neutron friends, Below is a summary of the neutron bugs that came in last week (Jan 14th). Any bugs with a preceding "(*)" are not yet in the "in progress" state. Looks like lajoskatona is this week's bug deputy.
Ports/IPAM - https://bugs.launchpad.net/bugs/1811905 Deferred IP allocation, port update with binding_host_id + set new mac address fails - https://bugs.launchpad.net/bugs/1812788 Port with no active binding mark as dead Gate Failures - (*) https://bugs.launchpad.net/bugs/1812364 Error "OSError: [Errno 22] failed to open netns" in l3-agent logs - https://bugs.launchpad.net/bugs/1812404 test_concurrent_create_port_forwarding_update_port failed with InvalidIpForSubnet - (*) https://bugs.launchpad.net/bugs/1812872 Trunk functional tests can interact with each other - https://bugs.launchpad.net/bugs/1812552 tempest-slow tests fails often L3 / DHCP - https://bugs.launchpad.net/bugs/1811873 get_l3_agent_with_min_routers fails with postgresql backend - (*) https://bugs.launchpad.net/bugs/1812118 Neutron doesn't allow to update router external subnets DNS - (*) https://bugs.launchpad.net/bugs/1812168 Neutron doesn't delete Designate entry when port is deleted QoS - https://bugs.launchpad.net/bugs/1812576 L2 agent do not clear old QoS rules after restart Oslo - https://bugs.launchpad.net/bugs/1812922 neutron functional tests break with oslo.utils 3.39.1 and above Docs - (*) https://bugs.launchpad.net/bugs/1812225 Firewall-as-a-Service (FWaaS) in neutron docs for Rocky still refer to plans for Ocata - (*) https://bugs.launchpad.net/bugs/1809578 Documentation might be wrong - https://bugs.launchpad.net/bugs/1812497 create vm failed, RequiredOptError: value required for option lock_path in group From mriedemos at gmail.com Tue Jan 22 22:47:36 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 22 Jan 2019 16:47:36 -0600 Subject: [nova] Do we need to copy pci_devices to target cell DB during cross-cell resize? 
In-Reply-To: <7d19a76eaf59950696a5cbd21cc08b2b85cbda76.camel@redhat.com> References: <3bfc308a-87cf-a873-618e-7a4ebc58a7e7@gmail.com> <7d19a76eaf59950696a5cbd21cc08b2b85cbda76.camel@redhat.com> Message-ID: <41b2f1f1-3989-fc6d-8821-295c2c377ba2@gmail.com> On 1/22/2019 12:19 PM, Sean Mooney wrote: > you are specifically doing a resize, so you don't need to regenerate a new xml > on the source node before starting a live migration since it's not a live migration, > but you might want to preemptively allocate the pci devices on the destination > if you want to prevent a race with other hosts. That said, for stein it may be better > to declare that out of scope. It's really not any more racy than spawning an instance, > as we don't claim the device until we get to the compute node anyway today. Correct, and the plan for cross-cell resize is to do the same RT.resize_claim on the target host in the target cell before trying to move anything, which is the same thing we do in ComputeManager.prep_resize for a normal resize within the same cell. > > the instance.numa_topology should really be recalculated for the target host also. > you do not want to require the destination host to place the vm with the original > numa topology from the source node. So I think you need to propagate the numa-related > requests, which are all in the flavor/image, but I don't think you need to copy > the instance numa_topology object.
> it's not a live migration so provided the The instance.numa_topology is calculated from the flavor and image during the initial server create: https://github.com/openstack/nova/blob/dd84e75260c3c919398536f7d05764713dc1c8cd/nova/compute/api.py#L799 And during the MoveClaim: https://github.com/openstack/nova/blob/dd84e75260c3c919398536f7d05764713dc1c8cd/nova/compute/claims.py#L294 So yeah, it looks like I don't really have to worry about updating/setting instance.numa_topology in the target cell DB during the resize, although it seems pretty weird that we leave that stale information in the instances table during a resize (it's also stale in the RequestSpec - I think Alex Xu reported a bug for that). -- Thanks, Matt From anlin.kong at gmail.com Tue Jan 22 23:08:32 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Wed, 23 Jan 2019 12:08:32 +1300 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov> Message-ID: On Wed, Jan 23, 2019 at 9:27 AM Fox, Kevin M wrote: > > I'd recommend at this point to maybe just run Kubernetes across the VMs > and push the guest agents/workload to them. > This sounds like overkill to me. Currently, different projects in OpenStack are solving this issue in different ways, e.g. Octavia is using a two-way SSL authentication API between the controller service and the amphora (which is the VM running an HTTP server inside), Magnum is using a heat-container-agent that communicates with Heat via API, etc. However, Trove chose another option, which has brought a lot of discussion over a long time. In the current situation, I don't think it's doable for every project to head toward one common solution, but Trove can learn from other projects to solve its own problem. Cheers, Lingxian Kong -------------- next part -------------- An HTML attachment was scrubbed...
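[Editor's note: the two-way SSL model Lingxian mentions, with controller and amphora each presenting and verifying certificates, can be sketched with Python's standard `ssl` module. This is a generic mutual-TLS illustration, not Octavia's actual code, and the certificate-path parameters are placeholders.]

```python
import ssl

def controller_context(ca_cert, server_cert, server_key):
    """Server side: present our certificate and *require* one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(server_cert, server_key)  # controller's own identity
    ctx.load_verify_locations(cafile=ca_cert)     # CA that signed agent certs
    ctx.verify_mode = ssl.CERT_REQUIRED           # this is the "two-way" part
    return ctx

def agent_context(ca_cert, agent_cert, agent_key):
    """Client side: verify the controller and present our own certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verifies the server by default
    ctx.load_verify_locations(cafile=ca_cert)
    ctx.load_cert_chain(agent_cert, agent_key)     # agent's own identity
    return ctx
```

Because each agent holds its own certificate, stealing one VM's key lets an attacker impersonate only that agent, and the control traffic never touches the shared message bus at all.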
URL: From adriant at catalyst.net.nz Tue Jan 22 23:21:27 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Wed, 23 Jan 2019 12:21:27 +1300 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> Message-ID: <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> Thanks for the input! I'm willing to bet there are many people excited about this goal, or who will be when they realise it exists! The 'dirty' state I think would be solved with a report API in each service (tell me everything a given project has, resource-wise). Such an API would be useful without needing to query each resource list, and could potentially be an easy thing to implement to help a purge library figure out what to delete. I know right now our method for checking if a project is 'dirty' is part of our quota-checking scripts, and it has to query a lot of APIs per service to build an idea of what a project has.
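[Editor's note: the per-service "report" idea could look something like the following sketch. The `ProjectResourceReporter` base class, the plugin names, and the aggregation function are all hypothetical, invented for illustration rather than taken from any agreed design.]

```python
import abc

class ProjectResourceReporter(abc.ABC):
    """Hypothetical per-service plugin: report what a project owns."""

    service = "base"

    @abc.abstractmethod
    def list_resources(self, project_id):
        """Return pairs like ("server", "uuid") owned by project_id."""

def project_report(reporters, project_id):
    """Aggregate per-service reports; 'dirty' means anything was found."""
    report = {r.service: r.list_resources(project_id) for r in reporters}
    report["dirty"] = any(report[r.service] for r in reporters)
    return report

class FakeCompute(ProjectResourceReporter):
    """Toy stand-in for a real service plugin backed by the SDK."""

    service = "compute"

    def __init__(self, servers):
        self._servers = servers  # server_id -> owning project_id

    def list_resources(self, project_id):
        return [("server", sid)
                for sid, pid in self._servers.items() if pid == project_id]
```

A real plugin would page through the service's own listing APIs; the value of a dedicated report endpoint is collapsing all of those calls into one.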
We would need robust and extensive testing for this, because deletion is critical, and we need it to work, but also not cause damage in ways it shouldn't. And you're right, purge tools purging outside of the scope asked for is a worry. Our own internal logic actually works by having the triggering admin user add itself to the project (and ensure no admin role), then scope a token to just that project, and delete resources form the point of view of a project user. That way it's kind of like a user deleting their own resources, and in truth having a nicer way to even do that (non-admin clearing of project) would be amazing for a lot of people who don't want to close their account or disable their project, but just want to delete stray resources and not get charged. On 23/01/19 4:03 AM, Tobias Urdin wrote: > Thanks for the thorough feedback Adrian. > > My opinion is also that Keystone should not be the actor in executing > this functionality but somewhere else > whether that is Adjutant or any other form (application, library, CLI > etc). > > I would also like to bring up the point about knowing if a project is > "dirty" (it has provisioned resources). > This is something that I think all business logic would benefit from, > we've had issue with knowing when > resources should be deleted, our solution is pretty much look at > metrics the last X minutes, check if project > is disabled and compare to business logic that says it should be deleted. > > While the above works it kills some of logical points of disabling a > project since the only thing that knows if > the project should be deleted or is actually disabled is the business > logic application that says they clicked the > deleted button and not disabled. > > Most of the functionality you are mentioning is things that the > ospurge project has been working to implement and the > maintainer even did a full rewrite which improved the dependency > arrangement for resource removal. 
> > I think the biggest win for this community goal would be the > developers of the projects would be available for input regarding > the project specific code that does purging. There has been some > really nasty bugs in ospurge in the past that if executed with the admin > user you would wipe everything and not only that project, which is > probably a issue that makes people think twice about > using a purging toolkit at all. > > We should carefully consider what parts of ospurge could be reused, > concept, code or anything in between that could help derive > what direction we wan't to push this goal. > > I'm excited :) > > Best regards > Tobias > From Kevin.Fox at pnnl.gov Tue Jan 22 23:24:53 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 22 Jan 2019 23:24:53 +0000 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: <1A3C52DFCD06494D8528644858247BF01C28869E@EX10MBOX03.pnnl.gov>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C2887E3@EX10MBOX03.pnnl.gov> Octavia is a slightly easier solution in that each tenanat's load balancer doesn't have to be a different version of the software being deployed, such as trove's users selection of mysql 5 vs mysq 10 or postgres, or sahara's choice in hadoop, etc. Permutations of user's version vs guest agent's version, etc. lets not get into rpms vs debs, which image building tool you use to stamp out the images, test frameworks, etc. Its simpler for a Developer, not to deal with it and just say its an Operators problem. But as a whole, including the Operators problems, its way more complex to deal with that way. 
just my $0.02 Thanks, Kevin ________________________________ From: Lingxian Kong [anlin.kong at gmail.com] Sent: Tuesday, January 22, 2019 3:08 PM To: openstack-discuss at lists.openstack.org Subject: Re: Subject: Re: [Trove] State of the Trove service tenant deployment model On Wed, Jan 23, 2019 at 9:27 AM Fox, Kevin M > wrote: I'd recommend at this point to maybe just run kubernetes across the vms and push the guest agents/workload to them. This sounds like an overkill to me. Currently, different projects in openstack are solving this issue in different ways, e.g. Octavia is using two-way SSL authentication API between the controller service and amphora(which is the vm running HTTP server inside), Magnum is using heat-container-agent that is communicating with Heat via API, etc. However, Trove chooses another option which has brought a lot of discussions over a long time. In the current situation, I don't think it's doable for each project heading to one common solution, but Trove can learn from other projects to solve its own problem. Cheers, Lingxian Kong -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Wed Jan 23 01:04:28 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 23 Jan 2019 14:04:28 +1300 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: References: Message-ID: <33eabff6-a7a5-d33f-9d5c-23ef2c60b064@redhat.com> On 23/01/19 9:09 AM, Darek Król wrote: > On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: >> Last time I heard (which was probably mid-2017), the Trove team had >> implemented encryption for messages on the RabbitMQ bus. IIUC each DB being >> managed had its own encryption keys, so that would theoretically prevent >> both snooping and spoofing of messages. That's the good news. 
>> >> The bad news is that AFAIK it's still using a shared RabbitMQ bus, so >> attacks like denial of service are still possible if you can extract the >> shared credentials from the VM. Not sure about replay attacks; I haven't >> actually investigated the implementation. >> >> cheers, >> Zane. > >> Excellent - many thanks for the confirmation. >> >> Cheers, >> Michael > > Hello Michael and Zane, > > sorry for the late reply. > > I believe Zane is referring to a video from 2017 [0]. > Yes, messages from trove instances are encrypted and the keys are kept > in Trove DB. It is still a shared message bus, but it can be a message > bus dedicated for Trove only and separated from message bus shared by > other Openstack services. > > DDOS attacks are also mentioned in the video as a potential threat but > there is very little details and possible solutions. Yes, in fact that was me asking the question in that video :) > Recently we had > some internal discussion about this threat within Trove team. Maybe we > could user Rabbitmq mechanisms for flow control mentioned in [1,2,3] ? > > Another point, I'm wondering if this is a problem only in Trove or is > it something other services would be interesting in also ? > > Best, > Darek > > [0] https://youtu.be/dzvcKlt3Lx8 > [1] https://www.rabbitmq.com/flow-control.html > [2] http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ > [3] https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ > From gmann at ghanshyammann.com Wed Jan 23 01:06:02 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 23 Jan 2019 10:06:02 +0900 Subject: [dev] [tc][all] TC office hours is started now on #openstack-tc Message-ID: <168783e1c77.107e728f023926.8901909246464360899@ghanshyammann.com> Hello everyone, TC office hour is started on #openstack-tc channel. Feel free to reach to us for anything you want discuss/input/feedback/help from TC. 
- gmann & TC From zh.f at outlook.com Wed Jan 23 06:12:19 2019 From: zh.f at outlook.com (Zhang Fan) Date: Wed, 23 Jan 2019 06:12:19 +0000 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model In-Reply-To: <33eabff6-a7a5-d33f-9d5c-23ef2c60b064@redhat.com> References: <33eabff6-a7a5-d33f-9d5c-23ef2c60b064@redhat.com> Message-ID: Hey all, Glad to see someone actually running Trove in production and giving feedback to the community; thanks for doing that, BTW. IIRC, back in 2017, we had a remote discussion during the PTG, and we were planning to adopt the Octavia solution. @huntxu drafted a spec, https://review.openstack.org/#/c/553679/, but as far as I know, he will not continue this work in the future. Best Wishes. Fan Zhang On Jan 23, 2019, at 09:04, Zane Bitter > wrote: On 23/01/19 9:09 AM, Darek Król wrote: On Tue, Jan 22, 2019 at 07:29:25PM +1300, Zane Bitter wrote: Last time I heard (which was probably mid-2017), the Trove team had implemented encryption for messages on the RabbitMQ bus. IIUC each DB being managed had its own encryption keys, so that would theoretically prevent both snooping and spoofing of messages. That's the good news. The bad news is that AFAIK it's still using a shared RabbitMQ bus, so attacks like denial of service are still possible if you can extract the shared credentials from the VM. Not sure about replay attacks; I haven't actually investigated the implementation. cheers, Zane. Excellent - many thanks for the confirmation. Cheers, Michael Hello Michael and Zane, sorry for the late reply. I believe Zane is referring to a video from 2017 [0]. Yes, messages from Trove instances are encrypted and the keys are kept in the Trove DB. It is still a shared message bus, but it can be a message bus dedicated to Trove only and separated from the message bus shared by other OpenStack services. DDoS attacks are also mentioned in the video as a potential threat, but there are very few details about possible solutions.
Yes, in fact that was me asking the question in that video :) Recently we had some internal discussion about this threat within the Trove team. Maybe we could use the RabbitMQ mechanisms for flow control mentioned in [1,2,3]? Another point: I'm wondering if this is a problem only in Trove, or is it something other services would be interested in as well? Best, Darek [0] https://youtu.be/dzvcKlt3Lx8 [1] https://www.rabbitmq.com/flow-control.html [2] http://www.rabbitmq.com/blog/2012/04/17/rabbitmq-performance-measurements-part-1/ [3] https://tech.labs.oliverwyman.com/blog/2013/08/31/controlling-fast-producers-in-a-rabbit-as-a-service/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkrol3 at gmail.com Wed Jan 23 06:34:48 2019 From: dkrol3 at gmail.com (=?UTF-8?Q?Darek_Kr=C3=B3l?=) Date: Wed, 23 Jan 2019 07:34:48 +0100 Subject: Subject: Re: [Trove] State of the Trove service tenant deployment model Message-ID: On Wed, Jan 23, 2019 at 9:27 AM Fox, Kevin M > wrote: > > I'd recommend at this point to maybe just run Kubernetes across the VMs and push the guest agents/workload to them. > This sounds like overkill to me. Currently, different projects in OpenStack are solving this issue > in different ways, e.g. Octavia is using a two-way SSL authentication API between the controller service and the amphora (which is the VM running an HTTP server inside), Magnum is using a heat-container-agent that communicates with Heat via API, etc. However, Trove chose another option, which has brought a lot of discussion over a long time. > In the current situation, I don't think it's doable for every project to head toward one common solution, but Trove can learn from other projects to solve its own problem. > Cheers, > Lingxian Kong The Octavia way of communication was discussed by Trove several times in the context of security. However, the security threat has been eliminated by encryption. I'm wondering if the Octavia way prevents DDoS attacks as well?
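[Editor's note: the flow-control mechanisms in [1] are broker-side: RabbitMQ throttles connections that publish faster than their messages can be consumed. A complementary application-side guard against one compromised guest flooding the bus is simple rate limiting. Below is a generic token-bucket sketch, illustrative only, and not Trove or RabbitMQ code.]

```python
import time

class TokenBucket:
    """Generic per-sender rate limiter: allow short bursts, cap sustained rate."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens replenished per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        """Spend one token if available; callers drop or delay on False."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A dispatcher could keep one bucket per instance ID and discard messages from any guest that exceeds its budget, so a single flooding sender degrades only its own traffic.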
Implementation of a two-way SSL authentication API could be included in the Trove priority list IMHO, if it solves all issues with security/DDOS attacks. This could also create some shared code between the two projects and help other services as well. Best, Darek From zigo at debian.org Wed Jan 23 07:51:36 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 23 Jan 2019 08:51:36 +0100 Subject: oslo.messaging, OpenSSL 1.1.1a, Eventlet and Python 3.7 Message-ID: Hi there! Just a quick report from my experience. While testing everything in the subject line in Debian Buster/Sid (testing a full deployment), I noticed that most (if not all) components wouldn't connect to RabbitMQ. The crash was like this: http://paste.openstack.org/show/743089/ This happened in: - cinder-scheduler - nova-conductor - neutron-api - neutron-rpc-server and probably many more. After some search-engine time, I found this patch: https://github.com/eventlet/eventlet/pull/531/commits This fixed the issue for me (on all OpenStack components). I've pushed it to the eventlet Debian package Git repository, and hopefully I'll be able to push it to Buster before the release (currently waiting on approval from the current package maintainer). Hoping this helps the project and/or some deployment people, Cheers, Thomas Goirand (zigo) P.S: More soon about the latest update of OCI. From hberaud at redhat.com Wed Jan 23 08:28:44 2019 From: hberaud at redhat.com (Herve Beraud) Date: Wed, 23 Jan 2019 09:28:44 +0100 Subject: [scientific-sig] IRC meeting Wednesday 1100 UTC: 2FA with FreeIPA, OpenInfra Days London, ISC 2019 In-Reply-To: References: <95C77676-5EA4-4D33-A2C0-B0D2A95A9504@telfer.org> Message-ID: Hi Stig, Thanks! Best On Tue, 22 Jan 2019 at 20:34, Stig Telfer wrote: > Hi Hervé - apologies, we don’t at present, but I’ll always try to mail out > a reminder on openstack-discuss shortly before. > > Best wishes, > Stig > > > > On 22 Jan 2019, at 18:46, Herve Beraud wrote: > > > > Hey!
> > > > Do you have a courtesy ping system or something like that where I can > put my nick to be pinged before the meeting? > > > > Best. > > > > On Tue, 15 Jan 2019 at 22:00, Stig Telfer > wrote: > > Hi All - > > > > We have a Scientific SIG IRC meeting coming up on Wednesday 16th at 1100 > UTC in channel #openstack-meeting. Everyone is welcome. > > > > Full agenda and details are here: > > > > > https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_16th_2019 > > > > We’ve got quite a bit to cover this week. Our headline item is a > presentation on experiences configuring 2FA for OpenStack using FreeIPA. > We also have some exciting events coming up, and discussion on a possible > Scientific OpenStack BoF at the International Supercomputer Conference in > Frankfurt in June. > > > > Cheers, > > Stig > > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+
Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Wed Jan 23 11:14:51 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 23 Jan 2019 12:14:51 +0100 Subject: [TripleO][Kolla] Reduce base layer of containers for security and size of images (maintenance) sakes: UPDATE Message-ID: <013cf955-7026-64e3-7e01-4e0091528934@redhat.com> Here is an update. The %{systemd_ordering} macro is proposed for lightening container images and removing the systemd dependency for containers. Please see & try the patches in the topic [0] for RDO, and [1][2][3][4][5] for generic Fedora 29 rpms. I'd very much appreciate it if anyone building Kolla containers for f29/(rhel8 yet?) could try these out as well. PS (somewhat internal facing but who cares): I wonder if we could see those changes caught up automagically for rhel8 repos as well?
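For context, the spec-file change behind that macro looks roughly like this. A sketch following Fedora packaging conventions; the exact expansion of the macro may differ:

```spec
# Before: hard scriptlet dependencies pull systemd into the image
Requires(post):   systemd
Requires(preun):  systemd
Requires(postun): systemd

# After: %{?systemd_ordering} only emits ordering hints
# (OrderWithRequires), so systemd is no longer a hard dependency
# and can be left out of container images entirely.
%{?systemd_ordering}
```

The leading `?` makes the macro expand to nothing on build systems that do not define it, which is what keeps the spec portable across distro releases.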
> I'm tracking systemd changes here [0],[1],[2], btw (if accepted, > it should be working as of fedora28(or 29) I hope) > > [0] https://review.rdoproject.org/r/#/q/topic:base-container-reduction > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1654659 > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1654672 [0] https://review.rdoproject.org/r/#/q/topic:base-container-reduction [1] https://bugzilla.redhat.com/show_bug.cgi?id=1654659 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1654672 [3] https://bugzilla.redhat.com/show_bug.cgi?id=1668688 [4] https://bugzilla.redhat.com/show_bug.cgi?id=1668687 [5] https://bugzilla.redhat.com/show_bug.cgi?id=1668678 -- Best regards, Bogdan Dobrelya, Irc #bogdando From zigo at debian.org Wed Jan 23 11:25:15 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 23 Jan 2019 12:25:15 +0100 Subject: oslo.messaging, OpenSSL 1.1.1a, Eventlet and Python 3.7 In-Reply-To: References: Message-ID: On 1/23/19 8:51 AM, Thomas Goirand wrote: > Hi there! > > Just a quick return from my experience. > > While testing all what's in the subject in Debian Buster/Sid (testing a > full deployment), I noticed that most (if not all) components wouldn't > connect to RabbitMQ. The crash was like this: > http://paste.openstack.org/show/743089/ > > This happened in: > - cinder-scheduler > - nova-conductor > - neutron-api > - neutron-rpc-server > > and probably many more. While at it, python3-amqp needs to be 2.4.0; otherwise there are other types of failures with SSL and Python 3. Looks like 2.4.0 deals well with Rocky, btw.
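A small runtime guard for the amqp minimum Thomas mentions could look like this. A sketch only: it requires Python 3.8+ for importlib.metadata, and the 2.4.0 threshold comes from the email, not from amqp's own documentation:

```python
from importlib import metadata

MINIMUM = (2, 4, 0)  # threshold from the email: SSL fixes needed for Python 3

def version_tuple(version: str) -> tuple:
    """Parse a dotted version like '2.4.0' into (2, 4, 0).

    Only the leading digits of each component are used, so a string
    like '2.4.0rc1' parses as (2, 4, 0).
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def amqp_is_new_enough() -> bool:
    """Return True if the installed amqp package meets the minimum."""
    try:
        return version_tuple(metadata.version("amqp")) >= MINIMUM
    except metadata.PackageNotFoundError:
        return False

print(version_tuple("2.4.0") >= MINIMUM)  # True
print(version_tuple("2.3.2") >= MINIMUM)  # False
```

In a deployment, packaging constraints (like the Debian dependency bump Thomas describes) are the proper fix; a check like this only makes the failure mode explicit instead of an SSL crash at connect time.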
Cheers, Thomas Goirand (zigo) From derekh at redhat.com Wed Jan 23 11:32:37 2019 From: derekh at redhat.com (Derek Higgins) Date: Wed, 23 Jan 2019 11:32:37 +0000 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> <72ed6b5b-a28b-8fa2-dfce-fcf31ccc40a6@gmail.com> Message-ID: On Thu, 17 Jan 2019 at 17:45, Derek Higgins wrote: > > On Mon, 14 Jan 2019 at 16:47, Brian Haley wrote: > > > > On 1/7/19 12:42 PM, Julia Kreger wrote: > > > On Mon, Jan 7, 2019 at 9:11 AM Clark Boylan wrote: > > >> > > >> On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > > > [trim] > > >>> > > >>> Doing so, allows us to raise this behavior change to operators minimizing the > > >>> need of them having to troubleshoot it in production, and gives them a choice > > >>> in the direction that they wish to take. > > >> > > >> https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. > > >> > > >> Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? > > >> > > > Great link Clark, thanks! > > > > > > It could be viable to ask operators to explicitly set their security > > > groups for tftp to be passed. > > > > > > I guess we actually have multiple cases where there are issues and the > > > only non-impacted case is when the ironic conductor host is directly > > > attached to the flat network the machine is booting from. In the case > > > of a flat network, it doesn't seem viable for us to change rules > > > ad-hoc since we would need to be able to signal that the helper is > > > needed, but it does seem viable to say "make sure connectivity works x > > > way". 
Whereas with multitenant networking, we use dedicated networks, > > > so conceivably it is just a static security group setting that an > > > operator can keep in place. Explicit static rules like that seem less > > > secure to me without conntrack helpers. :( > > > > > > Does anyone in Neutron land have any thoughts? > > > > I am from Neutron land, sorry for the slow reply. > > > > First, I'm trying to get in contact with someone that knows more about > > nf_conntrack_helper than me, I'll follow-up here or in the patch. > Great, thanks > > > > > In neutron, as in most projects, the goal is to have things configured > > so admins don't need to set any extra options, so we've typically done > > things like set sysctl values to make sure we don't get tripped-up by > > such issues. Mostly these settings have been in the L3 code, so are > > done in namespaces and have limited "impact" on the system hypervisor on > > the compute node. > > > > Since this is security group related it is different, since that isn't > > done in a namespace - we add a rule for related/established connections > > in the "root" namespace, for example in the iptables_hybrid case. For > > that reason it's not obvious to me that setting this sysctl is bad - > > it's not in the VM itself, and the packets aren't going to the > > hypervisor, so is there any impact we need to worry about besides just > > having it loaded? > > As far as I've been able to figure out we'd need to have the kernel > module loaded, > one per protocol supported, e.g. > nf_conntrack_tftp, nf_conntrack_sip etc... Looks like we also need "nf_nat_tftp". I'm also confused: we don't have any iptables rules inside the namespace matching any connections with --ctstate RELATED, so is this just a NAT issue, where the helpers are needed for NAT to work and iptables isn't blocking anything?
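For reference, the explicit per-protocol helper setup from the "secure use of helpers" link earlier in the thread looks roughly like this. A sketch only: it requires root, and the port and helper names are for the TFTP case discussed here:

```shell
# Illustrative sketch of the explicit-helper approach (run as root).
# Load just the helpers actually needed -- tftp here, plus its NAT
# companion, as identified in the thread:
modprobe nf_conntrack_tftp nf_nat_tftp

# Leave automatic helper assignment disabled (the new kernel default):
sysctl -w net.netfilter.nf_conntrack_helper=0

# Attach the tftp helper explicitly to the flows that need it:
iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp
```

The idea is that helpers are scoped to specific traffic instead of being auto-assigned globally, which is what the kernel change was protecting against.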
If that's the case, maybe we just set the sysctl as per the patch and document that the appropriate kernel modules should be loaded > > and set the sysctl inside the namespace, > > my testing was on devstack where the node being deployed with ironic was on the > same host. It may be that the sysctl is also needed in the root namespace in a > more realistic scenario where ironic is controlling real baremetal > nodes, I'll see if > I can find out if this is the case. I've set up a more realistic env and there were no additional requirements > > > > The other option would be to add more rules when SG rules are added that > > are associated with a protocol that has a helper. IMO that's not a > > great solution as there is no way for the user to control what filters > > (like IP addresses) are allowed, for example a SIP helper IP address. > > Ya, it doesn't sound ideal; also, this would require specific SG rules to enable > outgoing traffic, which isn't normally the case > > > > > Hopefully I'm understanding things correctly. > > > > Thanks, > > > > -Brian > > From hberaud at redhat.com Wed Jan 23 12:41:23 2019 From: hberaud at redhat.com (Herve Beraud) Date: Wed, 23 Jan 2019 13:41:23 +0100 Subject: [all][oslo][coordination] remove method get_transport in transport.py In-Reply-To: References: Message-ID: For friendlier topic tracking, I propose using the following topic on your reviews: topic/remove-deprecated-get-transport git review -t topic/remove-deprecated-get-transport On Tue, 22 Jan 2019 at 19:40, Herve Beraud wrote: > Hello everybody! > > For a few months, Oslo has been planning to remove a deprecated method > (oslo_messaging.get_transport). > > Initially, this method was deprecated to facilitate removing rpc_backend > support; the get_transport aliases were deprecated to simplify the code and the > tests. The aliases were first deprecated in oslo.messaging 5.20.0 during Pike, > and now we want to remove this method for good.
> > Support for these aliases was planned to be removed during milestone > stein-1. > > So I propose that all projects using this deprecated method start to > move to the supported oslo.messaging transport methods > (get_rpc_transport or get_notification_transport, depending on your needs). > > This thread will try to coordinate projects during the migration. > > I have already submitted a patch [1] to oslo.messaging to start removing this > method. > > To move forward on this topic I personally plan to propose patches to > projects that I know use it. > > To ensure a clean removal, to be sure that all projects remove usages of > this method, and to merge in the right order, I propose that you use the > same topic in your gerrit reviews by referring to the bug > https://bugs.launchpad.net/oslo.messaging/+bug/1714945 in your commit message > (Related-Bug: #1714945). > > Please ensure that all your projects stop using this method. > > If you are facing an issue, please reply to this thread to centralize support > during the migration. > > You can track this topic by visiting this page: > > https://review.openstack.org/#/q/topic:bug/1714945+(status:open+OR+status:merged) > > [1] https://review.openstack.org/632523 > > Cheers! > Best!
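Schematically, the migration Hervé asks for is a small change at each call site. A sketch; get_rpc_transport and get_notification_transport are the replacements named above, and which one applies depends on whether the transport is used for RPC or for notifications:

```diff
-transport = oslo_messaging.get_transport(conf)
+# For RPC clients and servers:
+transport = oslo_messaging.get_rpc_transport(conf)
+
+# ...or, for notification listeners and drivers:
+# transport = oslo_messaging.get_notification_transport(conf)
```

Splitting the two lets oslo.messaging apply the correct configuration options to each kind of transport, which is why the generic get_transport was deprecated in the first place.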
> > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jungleboyj at gmail.com Wed Jan 23 14:11:28 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 23 Jan 2019 08:11:28 -0600 Subject: [cinder] Proposing new Core Members ... In-Reply-To: <20190108223535.GA29520@sm-workstation> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> <20190108223535.GA29520@sm-workstation> Message-ID: <7c45be94-34c4-2555-9532-fb721ac783ed@gmail.com> All, There were no concerns with these nominations so I have added Yikun and Rajat to the core list. Welcome to the Cinder Core team! Thank you for all the efforts and I look forward to working with you both in the future! Jay (jungleboyj) On 1/8/2019 4:35 PM, Sean McGinnis wrote: > On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: >> Team, >> >> I would like propose two people who have been taking a more active role in >> Cinder reviews as Core Team Members: >> > >> I think that both Rajat and Yikun will be welcome additions to help replace >> the cores that have recently been removed. >> > +1 from me. Both have been doing a good job giving constructive feedback on > reviews and have been spending some time reviewing code other than their own > direct interests, so I think they would be welcome additions. > > Sean From mbooth at redhat.com Wed Jan 23 14:18:51 2019 From: mbooth at redhat.com (Matthew Booth) Date: Wed, 23 Jan 2019 14:18:51 +0000 Subject: [Cinder][nova] queens backup In-Reply-To: <20190122184544.GA1219@sm-workstation> References: <20190122184544.GA1219@sm-workstation> Message-ID: On Tue, 22 Jan 2019 at 18:51, Sean McGinnis wrote: > > On Tue, Jan 22, 2019 at 07:04:05PM +0100, Ignazio Cassano wrote: > > Hi All, > > Please, I' d like to know if cinder backup and/or cinder snapshot call > > qemu guest agent for fsfreezing. > > If Yes, does it freeze file systems in the volume? > > Regards > > Ignazio > > Unfortunately no, initiating a snapshot or backup from Cinder does not call out > to Nova to do any guest quiescing. 
There would need to be something else > coordinating the calls between the guest OS and initiating the Cinder snapshot > to do that. Although it wouldn't help making a consistent snapshot of an instance with multiple disks, for a single disk it shouldn't matter if the guest is quiesced as long as the backend can make an instantaneous snapshot. I'm pretty sure many (most?) backends would support that; certainly plain old LVM does. Does cinder use this functionality where it's available, and would that solve the problem you're trying to address? Matt > > Sean > -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From ignaziocassano at gmail.com Wed Jan 23 14:45:54 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 15:45:54 +0100 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> Message-ID: Hello Matthew, our backend is a netapp fas8040 via nfs and our instances often have more than one disk. Thanks Ignazio On Wed, 23 Jan 2019 at 15:19, Matthew Booth wrote: > On Tue, 22 Jan 2019 at 18:51, Sean McGinnis wrote: > > > > On Tue, Jan 22, 2019 at 07:04:05PM +0100, Ignazio Cassano wrote: > > > Hi All, > > > Please, I'd like to know if cinder backup and/or cinder snapshot > call > > > qemu guest agent for fsfreezing. > > > If yes, does it freeze file systems in the volume? > > > Regards > > > Ignazio > > > > Unfortunately no, initiating a snapshot or backup from Cinder does not > call out > > to Nova to do any guest quiescing. There would need to be something else > > coordinating the calls between the guest OS and initiating the Cinder > snapshot > > to do that. > > Although it wouldn't help making a consistent snapshot of an instance > with multiple disks, for a single disk it shouldn't matter if the > guest is quiesced as long as the backend can make an instantaneous > snapshot. I'm pretty sure many (most?)
backends would support that; > certainly plain old LVM does. Does cinder use this functionality where > it's available, and would that solve the problem you're trying to > address? > > Matt > > > > Sean > > > > > -- > Matthew Booth > Red Hat OpenStack Engineer, Compute DFG > > Phone: +442070094448 (UK) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 23 14:55:17 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 15:55:17 +0100 Subject: [cinder] volume snapshot on queens not working as in ocata Message-ID: Hello All, I have two different openstack installations: ocata and queens. On both I have the same cinder storage based on netapp via nfs. If I run the following steps on ocata, it works fine: [root at redhatcontrol ~(keystone_admin)]# openstack user list | grep -E 'demo|admin' | 1ced168db56f40819fef88d926dab69f | admin | | 66a09025a74448089bc110d5beded3d7 | demo | [root at redhatcontrol ~(keystone_admin)]# openstack project list +----------------------------------+---------------------+ | ID | Name | +----------------------------------+---------------------+ | 422a93f6de4f47c2971dcf38bd6db818 | demo | | 54d2cfa23b894357be9688b08374e1a9 | default test tenant | | 5e34c44cbf8b4db782187a01c82b890a | admin | | 95fe0e009128404aa3e0fac1e970f240 | Migration Tenant | | c266b36c8f0e4959b0cbdba2c11e4371 | services | +----------------------------------+---------------------+ [root at redhatcontrol ~(keystone_admin)]# openstack role list +----------------------------------+---------------+ | ID | Name | +----------------------------------+---------------+ | 4de19a9f2c3947a09b57fc6e5d09255c | SwiftOperator | | 71c439ed9e5c43ab87a81432e108739e | admin | | 9fe2ff9ee4384b1894a90878d3e92bab | _member_ | | fa8209d2929540d8a590703c00d2a83f | ResellerAdmin | +----------------------------------+---------------+ openstack trust create --impersonate --project
422a93f6de4f47c2971dcf38bd6db818 --role 9fe2ff9ee4384b1894a90878d3e92bab 66a09025a74448089bc110d5beded3d7 1ced168db56f40819fef88d926dab69f +--------------------+----------------------------------+ | Field | Value | +--------------------+----------------------------------+ | deleted_at | None | | expires_at | None | | id | 34e04d74345b42b987ed177b463de36f | | impersonation | True | | project_id | 422a93f6de4f47c2971dcf38bd6db818 | | redelegation_count | 0 | | remaining_uses | None | | roles | _member_ | | trustee_user_id | 1ced168db56f40819fef88d926dab69f | | trustor_user_id | 66a09025a74448089bc110d5beded3d7 | +--------------------+----------------------------------+ openstack --os-trust-id 34e04d74345b42b987ed177b463de36f --os-username admin --os-auth-url http://142.44.137.114:5000/v3 --os_identity_api_version 3 --os-password xxxxxx volume list +--------------------------------------+------+-----------+------+-------------+ | ID | Name | Status | Size | Attached to | +--------------------------------------+------+-----------+------+-------------+ | fde028b7-2dbb-4c7e-a1e9-7a21e05d6810 | vol1 | available | 1 | | +--------------------------------------+------+-----------+------+-------------+ openstack --os-trust-id 34e04d74345b42b987ed177b463de36f --os-username admin --os-password 1f53a5fd265b40b3 volume snapshot create fde028b7-2dbb-4c7e-a1e9-7a21e05d6810 +-------------+--------------------------------------+ | Field | Value | +-------------+--------------------------------------+ | created_at | 2019-01-22T22:31:09.633331 | | description | None | | id | 3e3bd85e-e223-42da-a682-168eb65e57c1 | | name | fde028b7-2dbb-4c7e-a1e9-7a21e05d6810 | | properties | | | size | 1 | | status | creating | | updated_at | None | | volume_id | fde028b7-2dbb-4c7e-a1e9-7a21e05d6810 | :: Same steps on queens produce errors: [root at controller ~(keystone_admin)]# openstack project list +----------------------------------+----------+ | ID | Name | 
+----------------------------------+----------+ | 65588a4fefc04676b58b7218cc806631 | test | | be28cd5925ef4e9a9a67880374702b92 | services | | d04bc32a65554223868ae3da8d38b986 | demo | | f31311f2b8e943c7ac0d9ffde3da301e | admin | +----------------------------------+----------+ [root at controller ~(keystone_admin)]# openstack user list +----------------------------------+-------------+ | ID | Name | +----------------------------------+-------------+ | 0abd6ac789d94e738d08b436cb3a4032 | neutron | | 115ce41c43d14ac8aba0703b45b64fd7 | sunil | | 138c168dc5b44b5a80de03249a955b6b | swift | | 2fdf04289f15459cb63ea8503e93449d | ceilometer | | 34d8b24e389940b192f17a83d9524966 | placement | | 499e005a7f3b4e0980289f894a8b9b80 | cinder | | 4fbbb1237d464795973544608986701a | triliovault | | 600374eb0884476ba4a2dd8c83684d7e | glance | | 849a4e937c1e481a8d0f414feb5022d5 | aodh | | 9731c45b5a2b48dda26b709b21d2149c | demo | | 9bf75dc88e7d4cc4a5d8eb8fa8428ede | gnocchi | | b032b818c1ef4ce7b8d7b8f9a34f632c | nova | | e66caefd407344bfbec86d079949132f | admin | +----------------------------------+-------------+ [root at controller ~(keystone_admin)]# openstack role list +----------------------------------+---------------+ | ID | Name | +----------------------------------+---------------+ | 51b3dbaa08634312ba4ca29f5d081752 | SwiftOperator | | 5282071e3609458dbbb08daf50d59dfb | ResellerAdmin | | 9fe2ff9ee4384b1894a90878d3e92bab | _member_ | | b42186fc8953476eb2f5b4c697ce1408 | admin | +----------------------------------+---------------+ [root at controller ~(keystone_admin)]# openstack volume list +--------------------------------------+------------+-----------+------+------------------------------------+ | ID | Name | Status | Size | Attached to | +--------------------------------------+------------+-----------+------+------------------------------------+ | a603c9a5-ebbf-4622-b836-cd430f10aa5d | iscsi test | available | 1 | | | 263be5ed-f120-4b3b-a15d-181d59c0f85f | vol1 | in-use | 1 | 
Attached to server1 on /dev/vdb | | 12bc6248-b624-4fb0-81f2-c1f986c4697c | amit_vol | in-use | 1 | Attached to amit_test on /dev/vdb | | a8db08ee-7beb-484d-bdfc-7c5b123c1457 | amit_vol | available | 1 | | | d6fb4fac-c4a7-48a2-8a73-86316ca624d5 | amit_vol | available | 1 | | | 97afbe02-79c7-4907-b7c2-e6ef81d9b765 | temp | available | 1 | | | 62a489e9-6524-4ad2-8b7e-c70206568b10 | temp | error | 1 | | | abfd441e-51fe-4949-bb8c-0dd9395fb687 | amit_vol | available | 1 | | +--------------------------------------+------------+-----------+------+------------------------------------+ [root at controller ~(keystone_admin)]# openstack trust create --project f31311f2b8e943c7ac0d9ffde3da301e --role b42186fc8953476eb2f5b4c697ce1408 --impersonate e66caefd407344bfbec86d079949132f 4fbbb1237d464795973544608986701a +--------------------+----------------------------------+ | Field | Value | +--------------------+----------------------------------+ | deleted_at | None | | expires_at | None | | id | 464864573bf24a08a5c4aadd90492b90 | | impersonation | True | | project_id | f31311f2b8e943c7ac0d9ffde3da301e | | redelegation_count | 0 | | remaining_uses | None | | roles | admin | | trustee_user_id | 4fbbb1237d464795973544608986701a | | trustor_user_id | e66caefd407344bfbec86d079949132f | +--------------------+----------------------------------+ [root at controller ~]# openstack --os-trust-id 464864573bf24a08a5c4aadd90492b90 --os-username triliovault --os-password ********** --os-auth-url http://192.168.4.205:5000/v3 --os-identity-api-version 3 volume snapshot create --force --volume 12bc6248-b624-4fb0-81f2-c1f986c4697c amit_test +-------------+--------------------------------------+ | Field | Value | +-------------+--------------------------------------+ | created_at | 2019-01-23T08:59:01.276145 | | description | None | | id | 8f148320-4952-4641-aede-8b393155ea97 | | name | amit_test | | properties | | | size | 1 | | status | creating | | updated_at | None | | volume_id | 
12bc6248-b624-4fb0-81f2-c1f986c4697c | +-------------+--------------------------------------+ [root at controller ~]# openstack --os-trust-id 464864573bf24a08a5c4aadd90492b90 --os-username triliovault --os-password ********** --os-auth-url http://192.168.4.205:5000/v3 --os-identity-api-version 3 volume snapshot show 8f148320-4952-4641-aede-8b393155ea97 +--------------------------------------------+--------------------------------------+ | Field | Value | +--------------------------------------------+--------------------------------------+ | created_at | 2019-01-23T08:59:01.000000 | | description | None | | id | 8f148320-4952-4641-aede-8b393155ea97 | | name | amit_test | | os-extended-snapshot-attributes:progress | 0% | | os-extended-snapshot-attributes:project_id | f31311f2b8e943c7ac0d9ffde3da301e | | properties | | | size | 1 | | status | error | | updated_at | 2019-01-23T08:59:03.000000 | | volume_id | 12bc6248-b624-4fb0-81f2-c1f986c4697c | +--------------------------------------------+--------------------------------------+ Any help, please ? Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgiusti at gmail.com Wed Jan 23 15:03:23 2019 From: kgiusti at gmail.com (Ken Giusti) Date: Wed, 23 Jan 2019 10:03:23 -0500 Subject: oslo.messaging, OpenSSL 1.1.1a, Eventlet and Python 3.7 In-Reply-To: References: Message-ID: Thomas - thanks for looking into this. Opened a couple bugs against the oslo.messaging library to track this: https://bugs.launchpad.net/oslo.messaging/+bug/1813029 https://bugs.launchpad.net/oslo.messaging/+bug/1813030 On Wed, Jan 23, 2019 at 6:30 AM Thomas Goirand wrote: > On 1/23/19 8:51 AM, Thomas Goirand wrote: > > Hi there! > > > > Just a quick return from my experience. > > > > While testing all what's in the subject in Debian Buster/Sid (testing a > > full deployment), I noticed that most (if not all) components wouldn't > > connect to RabbitMQ. 
The crash was like this: > > http://paste.openstack.org/show/743089/ > > > > This happened in: > > - cinder-scheduler > > - nova-conductor > > - neutron-api > > - neutron-rpc-server > > > > and probably many more. > > While at it, python3-amqp needs to be 2.4.0, otherwise, there's other > types of failures with SSL and Python 3. Looks like 2.4.0 deals well > with Rocky btw. > > Cheers, > > Thomas Goirand (zigo) > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mordred at inaugust.com Wed Jan 23 15:21:17 2019 From: mordred at inaugust.com (Monty Taylor) Date: Wed, 23 Jan 2019 15:21:17 +0000 Subject: Scuba Diving at the Denver Summit Message-ID: <1a05abf3-edb5-c9c9-708e-b24e12cab8be@inaugust.com> Hey everybody! tl;dr - Let me know if you'd like to go diving in the Rocky Mountains. Of all of the places we've had a Summit where one might reasonably expect there to be a possibility of Scuba Diving, the Denver Summit is probably REALLY low on the list. But it turns out that there is a Dive Shop that does dives at the Denver Aquarium [0] (with all the fish and sharks and whatnot) The Denver Aquarium is also very close to the Convention Center. I'm going to see about putting together a private group dive event for those of us who think that sounds like a fun idea. Due to cross-contamination concerns for the health and safety of the fish, the dive shop provides all equipment - so you don't have to bring gear. They will allow bringing your own mask - and I'm assuming if a well-scrubbed mask is ok then a well-scrubbed dive computer would be ok - but I'll be checking with them when I call to arrange things. If you are Open Water certified and would like to join, please send me a response back (reply directly to me rather than the list is fine) so I can judge general interest level when talking to the lovely humans. 
If you are **NOT** Open Water certified but are thinking to yourself "wow, that sounds like fun" - the dive shop does certification dives at the aquarium on the weekends. That's a bit out of scope for me to arrange, but if you want to show up in Denver early and do an Open Water class then join us, that's awesome. You can also get certified at home before coming ... but you'll need an Open Water certification to join us. BUT WAIT - THERE'S MORE! There is another interesting diving opportunity in the general Mountain West area - although not immediately **close** to Denver - that Sandy and I are going to be going to before the Summit - Homestead Crater. We're arranging to meet a Scuba instructor there to do our second Altitude cert dive, and also to do a 2-dive "Hot Springs Diver" specialty. We'd be happy to have folks join us. Homestead Crater [1] is outside of Salt Lake City. It's a hot spring - the water is 96F/35.5C and it's also an altitude dive. If people are interested, we can meet at the crater on Thursday morning pre-summit, dive the crater, then drive to Denver on Friday/Saturday. ** If only a small number of people are interested in any of this - we'll likely just do the Denver dives the weekend before during one of the normally scheduled dive times instead of arranging a group outing, and shift Homestead, if anyone is interested, a few days earlier ** I will not have space enough in my car to get you from the crater to Denver, so if you want to fly in to SLC ahead of time and meet there, awesome. Unlike Denver, the crater will require you have equipment, including scuba tanks. For anyone interested in coming, there are dive shops in Salt Lake where you can rent gear. There is a dive shop at the crater to do air fills. 
Anyway - if anyone is interested in some or all of these things, let me know this week so I can get a headcount and then we can talk more off-list about logistics and figure out how to deal with things for people coming from far off (carpools, equipment, etc) Monty PS. If any of you aren't already using Subsurface for dive logging, I'll be happy to help you get set up. [0] https://divedowntown.com/ [1] https://homesteadresort.com/utah-resort-things-to-do/homestead-crater/ From sean.mcginnis at gmx.com Wed Jan 23 15:49:45 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Jan 2019 09:49:45 -0600 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> Message-ID: <20190123154945.GA1213@sm-workstation> > > Although it wouldn't help making a consistent snapshot of an instance > with multiple disks, for a single disk it shouldn't matter if the > guest is quiesced as long as the backend can make an instantaneous > snapshot. I'm pretty sure many (most?) backends would support that; > certainly plain old LVM does. Does cinder use this functionality where > it's available, and would that solve the problem you're trying to > address? > > Matt > That is a good point. If you snap all of the volumes used, it may not be quiesced and have all IO flushed, but it would at least be a crash consistent set of data. All of the volumes used by a VM would need to be added to a group: https://docs.openstack.org/cinder/latest/admin/blockstorage-groups.html You would then be able to create a group snapshot to have all volumes snapped together. 
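The group workflow Sean describes maps onto a handful of cinder CLI calls. A rough sketch follows; the group type, volume type, and volume names are placeholders, and the backend's volume type must actually support generic volume groups:

```shell
# Sketch only: my-group-type, my-volume-type, vm1-volumes and the UUIDs
# are placeholders. Create a group type, then a group covering the
# volume types the VM uses:
cinder group-type-create my-group-type
cinder group-create --name vm1-volumes my-group-type my-volume-type

# Add the VM's existing volumes to the group:
cinder group-update vm1-volumes --add-volumes <vol-uuid-1>,<vol-uuid-2>

# Snapshot every volume in the group together (crash consistent as a set):
cinder group-snapshot-create --name vm1-snap vm1-volumes
```

The resulting group snapshot contains one snapshot per member volume, all taken as a unit rather than one at a time.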
From ignaziocassano at gmail.com Wed Jan 23 15:56:24 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 16:56:24 +0100 Subject: [Cinder][nova] queens backup In-Reply-To: <20190123154945.GA1213@sm-workstation> References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> Message-ID: Hello, I did not understand if you mean cinder snapshot pr netapp snapshot. Any case, why, we do not need to quiesce the instance ? Regards Ignazio Il giorno mer 23 gen 2019 alle ore 16:49 Sean McGinnis < sean.mcginnis at gmx.com> ha scritto: > > > > Although it wouldn't help making a consistent snapshot of an instance > > with multiple disks, for a single disk it shouldn't matter if the > > guest is quiesced as long as the backend can make an instantaneous > > snapshot. I'm pretty sure many (most?) backends would support that; > > certainly plain old LVM does. Does cinder use this functionality where > > it's available, and would that solve the problem you're trying to > > address? > > > > Matt > > > > That is a good point. If you snap all of the volumes used, it may not be > quiesced and have all IO flushed, but it would at least be a crash > consistent > set of data. > > All of the volumes used by a VM would need to be added to a group: > > https://docs.openstack.org/cinder/latest/admin/blockstorage-groups.html > > You would then be able to create a group snapshot to have all volumes > snapped > together. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed Jan 23 15:58:46 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Jan 2019 09:58:46 -0600 Subject: [cinder] volume snapshot on queens not working as in ocata In-Reply-To: References: Message-ID: <20190123155845.GB1213@sm-workstation> On Wed, Jan 23, 2019 at 03:55:17PM +0100, Ignazio Cassano wrote: > Hello All, > I have tow different openstack installations: ocata and queens. 
> On both I have the same cinder sorage based on netapp via nfs. > If I run the following steps on ocata, it works fine: > > > | status | > error | > | updated_at | > 2019-01-23T08:59:03.000000 | > | volume_id | > 12bc6248-b624-4fb0-81f2-c1f986c4697c | > +--------------------------------------------+--------------------------------------+ > > Any help, please ? > > Ignazio There is most likely an issue between Cinder and the backend storage device. You may be able to get some details by using the 'cinder message-list' and 'cinder message-show' command line: https://docs.openstack.org/python-cinderclient/latest/cli/details.html#cinder-message-list If that does not provide a reasonable message at least pointing to where the problem is (which we should probably see if we can improve) then you will likely need to go to the cinder-volume log files. If you search in there, there will probably be traceback messages or other errors indicating why the snapshot failed. Sean From sean.mcginnis at gmx.com Wed Jan 23 16:01:53 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Jan 2019 10:01:53 -0600 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> Message-ID: <20190123160152.GC1213@sm-workstation> On Wed, Jan 23, 2019 at 04:56:24PM +0100, Ignazio Cassano wrote: > Hello, I did not understand if you mean cinder snapshot pr netapp snapshot. > Any case, why, we do not need to quiesce the instance ? > Regards > Ignazio > If being crash consistent is good enough for your needs, then you don't. I know some do prefer the coordinated quiescing of IO in the instance to make sure any in-flight transactions are flushed out and application data is more likely to be in a good consistent state. 
Depending on your application running in the instance, things like databases are pretty good at rolling back incomplete transactions, so it's just a matter of whether you can allow the possibility that something that was successful in the milliseconds before the snap was created to now be rolled back when the application restarts. From doug at doughellmann.com Wed Jan 23 16:07:38 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 23 Jan 2019 11:07:38 -0500 Subject: [docs] Nominating Alex Settle for openstack-doc-core In-Reply-To: <20190118144233.132eb0e427389da15e725141@redhat.com> References: <20190118144233.132eb0e427389da15e725141@redhat.com> Message-ID: Petr Kovar writes: > Hi all, > > Alex Settle recently re-joined the Documentation Project after a few-month > break. It's great to have her back and I want to formally nominate her for > membership in the openstack-doc-core team, to follow the formal process for > cores. > > Please let the ML know should you have any objections. > > Thanks, > pk I'm late to the party, but want to register my +1 for welcoming Alex back to the team. -- Doug From ignaziocassano at gmail.com Wed Jan 23 17:02:56 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 18:02:56 +0100 Subject: [Cinder][nova] queens backup In-Reply-To: <20190123160152.GC1213@sm-workstation> References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> <20190123160152.GC1213@sm-workstation> Message-ID: Manu thanks. I read a blueprint for providing quiesce function to nova api but I cannot find it. Must I talk directly with libvirt api? Ignazio Il giorno Mer 23 Gen 2019 17:01 Sean McGinnis ha scritto: > On Wed, Jan 23, 2019 at 04:56:24PM +0100, Ignazio Cassano wrote: > > Hello, I did not understand if you mean cinder snapshot pr netapp > snapshot. > > Any case, why, we do not need to quiesce the instance ? 
> > Regards > > Ignazio > > > > If being crash consistent is good enough for your needs, then you don't. I > know > some do prefer the coordinated quiescing of IO in the instance to make > sure any > in-flight transactions are flushed out and application data is more likely > to > be in a good consistent state. > > Depending on your application running in the instance, things like > databases > are pretty good at rolling back incomplete transactions, so it's just a > matter > of whether you can allow the possibility that something that was > successful in > the milliseconds before the snap was created to now be rolled back when the > application restarts. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 23 17:05:58 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 18:05:58 +0100 Subject: [cinder] volume snapshot on queens not working as in ocata In-Reply-To: <20190123155845.GB1213@sm-workstation> References: <20190123155845.GB1213@sm-workstation> Message-ID: Thanks. I'll send logs asap. Regards Ignazio Il giorno Mer 23 Gen 2019 16:58 Sean McGinnis ha scritto: > On Wed, Jan 23, 2019 at 03:55:17PM +0100, Ignazio Cassano wrote: > > Hello All, > > I have tow different openstack installations: ocata and queens. > > On both I have the same cinder sorage based on netapp via nfs. > > If I run the following steps on ocata, it works fine: > > > > > > > | status | > > error | > > | updated_at | > > 2019-01-23T08:59:03.000000 | > > | volume_id | > > 12bc6248-b624-4fb0-81f2-c1f986c4697c | > > > +--------------------------------------------+--------------------------------------+ > > > > Any help, please ? > > > > Ignazio > > There is most likely an issue between Cinder and the backend storage > device. 
> > You may be able to get some details by using the 'cinder message-list' and > 'cinder message-show' command line: > > > https://docs.openstack.org/python-cinderclient/latest/cli/details.html#cinder-message-list > > If that does not provide a reasonable message at least pointing to where > the > problem is (which we should probably see if we can improve) then you will > likely need to go to the cinder-volume log files. If you search in there, > there > will probably be traceback messages or other errors indicating why the > snapshot > failed. > > Sean > -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Wed Jan 23 17:18:56 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 23 Jan 2019 10:18:56 -0700 Subject: [openstack-dev] [tripleo] reducing our upstream CI footprint In-Reply-To: References: Message-ID: On Thu, Nov 1, 2018 at 5:47 AM Derek Higgins wrote: > On Wed, 31 Oct 2018 at 17:22, Alex Schultz wrote: > > > > Hey everyone, > > > > Based on previous emails around this[0][1], I have proposed a possible > > reducing in our usage by switching the scenario001--011 jobs to > > non-voting and removing them from the gate[2]. This will reduce the > > likelihood of causing gate resets and hopefully allow us to land > > corrective patches sooner. In terms of risks, there is a risk that we > > might introduce breaking changes in the scenarios because they are > > officially non-voting, and we will still be gating promotions on these > > scenarios. This means that if they are broken, they will need the > > same attention and care to fix them so we should be vigilant when the > > jobs are failing. > > > > The hope is that we can switch these scenarios out with voting > > standalone versions in the next few weeks, but until that I think we > > should proceed by removing them from the gate. 
I know this is less > > than ideal but as most failures with these jobs in the gate are either > > timeouts or unrelated to the changes (or gate queue), they are more of > > hindrance than a help at this point. > > > > Thanks, > > -Alex > > While on the topic of reducing the CI footprint > > something worth considering when pushing up a string of patches would > be to remove a bunch of the check jobs at the start of the patch set. > > e.g. If I'm working on t-h-t and have a series of 10 patches, while > looking for feedback I could remove most of the jobs from > zuul.d/layout.yaml in patch 1 so all 10 patches don't run the entire > suite of CI jobs. Once it becomes clear that the patchset is nearly > ready to merge, I change patch 1 leave zuul.d/layout.yaml as is. > > I'm not suggesting everybody does this but anybody who tends to push > up multiple patch sets together could consider it to not tie up > resources for hours. > > > > > [0] > http://lists.openstack.org/pipermail/openstack-dev/2018-October/136141.html > > [1] > http://lists.openstack.org/pipermail/openstack-dev/2018-October/135396.html > > [2] > https://review.openstack.org/#/q/topic:reduce-tripleo-usage+(status:open+OR+status:merged) > > > > > __________________________________________________________________________ > > OpenStack Development Mailing List (not for usage questions) > > Unsubscribe: > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Greetings, Just a quick update.. The TripleO CI team is just about done migrating multinode scenario 1-4 jobs to the SingleNode job. 
This update and a few other minor changes have moved the needle with regards to TripleO's upstream resource consumption. In October of 2017 we had the following footprint.. tripleo: 111256883.96s, 52.45% [1] Today our footprint is now.. tripleo: 313097590.30s, 36.70% [2] We are still working the issue and we should see further improvement over the next couple months. I'll update the list again at the end of Stein. Thanks to Clark, Doug, Alex, Emilien and Juan for the work to make this happen!! Also thank you to the folks on the TripleO-CI team, you know who you are :) [1] http://paste.openstack.org/show/733644/ [2] http://paste.openstack.org/show/743188/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Jan 23 18:29:39 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 23 Jan 2019 10:29:39 -0800 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> <20190123160152.GC1213@sm-workstation> Message-ID: On Wed, 23 Jan 2019 18:02:56 +0100, Ignazio Cassano wrote: > Manu thanks. > I read a blueprint for providing quiesce function to nova api but I > cannot find it. > Must I talk directly with libvirt api? Quiesce was never added to the nova API as a separate function and a spec proposal to add it was last reviewed in Newton [1]. At the time of review, only one virt driver, libvirt, supported quiesce and the justification to add a new REST API that all but one driver could not support, was not compelling enough. AFAIK the libvirt driver is still the only one that supports quiesce. There were other concerns beyond that though, and they are detailed in the review. As Matt Riedemann mentioned in his earlier reply on this thread [2], a quiesce step is integrated into the nova snapshot API, if the driver supports it (only libvirt). This is the only way you can quiesce an instance today. 
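For reference, since quiesce is only reachable through the snapshot path melanie describes, the trigger is the ordinary image-create call. A sketch with placeholder names; the image properties shown are the ones nova's libvirt driver consults before attempting to quiesce via the qemu guest agent:

```shell
# Sketch: "myserver" is a placeholder. Boot the instance from an image
# carrying hw_qemu_guest_agent=yes (and optionally os_require_quiesce=yes
# to fail the snapshot rather than silently skip quiescing):
openstack image set --property hw_qemu_guest_agent=yes \
    --property os_require_quiesce=yes my-guest-image

# The snapshot then freezes/thaws the guest filesystem around the capture:
openstack server image create --name myserver-snap --wait myserver
```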
Cheers, -melanie [1] https://review.openstack.org/295595 [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001984.html > Il giorno Mer 23 Gen 2019 17:01 Sean McGinnis > ha scritto: > > On Wed, Jan 23, 2019 at 04:56:24PM +0100, Ignazio Cassano wrote: > > Hello, I did not understand if you mean cinder snapshot pr netapp > snapshot. > > Any case, why, we do not need to quiesce the instance ? > > Regards > > Ignazio > > > > If being crash consistent is good enough for your needs, then you > don't. I know > some do prefer the coordinated quiescing of IO in the instance to > make sure any > in-flight transactions are flushed out and application data is more > likely to > be in a good consistent state. > > Depending on your application running in the instance, things like > databases > are pretty good at rolling back incomplete transactions, so it's > just a matter > of whether you can allow the possibility that something that was > successful in > the milliseconds before the snap was created to now be rolled back > when the > application restarts. > From ignaziocassano at gmail.com Wed Jan 23 18:34:13 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 23 Jan 2019 19:34:13 +0100 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> <20190123160152.GC1213@sm-workstation> Message-ID: Thanks for the info. Ignazio Il giorno Mer 23 Gen 2019 19:29 melanie witt ha scritto: > On Wed, 23 Jan 2019 18:02:56 +0100, Ignazio Cassano > wrote: > > Manu thanks. > > I read a blueprint for providing quiesce function to nova api but I > > cannot find it. > > Must I talk directly with libvirt api? > > Quiesce was never added to the nova API as a separate function and a > spec proposal to add it was last reviewed in Newton [1]. 
At the time of > review, only one virt driver, libvirt, supported quiesce and the > justification to add a new REST API that all but one driver could not > support, was not compelling enough. AFAIK the libvirt driver is still > the only one that supports quiesce. There were other concerns beyond > that though, and they are detailed in the review. > > As Matt Riedemann mentioned in his earlier reply on this thread [2], a > quiesce step is integrated into the nova snapshot API, if the driver > supports it (only libvirt). This is the only way you can quiesce an > instance today. > > Cheers, > -melanie > > [1] https://review.openstack.org/295595 > [2] > > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001984.html > > > Il giorno Mer 23 Gen 2019 17:01 Sean McGinnis > > ha scritto: > > > > On Wed, Jan 23, 2019 at 04:56:24PM +0100, Ignazio Cassano wrote: > > > Hello, I did not understand if you mean cinder snapshot pr netapp > > snapshot. > > > Any case, why, we do not need to quiesce the instance ? > > > Regards > > > Ignazio > > > > > > > If being crash consistent is good enough for your needs, then you > > don't. I know > > some do prefer the coordinated quiescing of IO in the instance to > > make sure any > > in-flight transactions are flushed out and application data is more > > likely to > > be in a good consistent state. > > > > Depending on your application running in the instance, things like > > databases > > are pretty good at rolling back incomplete transactions, so it's > > just a matter > > of whether you can allow the possibility that something that was > > successful in > > the milliseconds before the snap was created to now be rolled back > > when the > > application restarts. > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Wed Jan 23 19:08:14 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 23 Jan 2019 19:08:14 +0000 Subject: [Cinder][nova] queens backup In-Reply-To: References: <20190122184544.GA1219@sm-workstation> <20190123154945.GA1213@sm-workstation> <20190123160152.GC1213@sm-workstation> Message-ID: On Wed, 2019-01-23 at 19:34 +0100, Ignazio Cassano wrote: > Thanks for the info. > Ignazio > > Il giorno Mer 23 Gen 2019 19:29 melanie witt ha scritto: > > On Wed, 23 Jan 2019 18:02:56 +0100, Ignazio Cassano > > wrote: > > > Manu thanks. > > > I read a blueprint for providing quiesce function to nova api but I > > > cannot find it. > > > Must I talk directly with libvirt api? > > > > Quiesce was never added to the nova API as a separate function and a > > spec proposal to add it was last reviewed in Newton [1]. At the time of > > review, only one virt driver, libvirt, supported quiesce and the > > justification to add a new REST API that all but one driver could not > > support, was not compelling enough. AFAIK the libvirt driver is still > > the only one that supports quiesce. There were other concerns beyond > > that though, and they are detailed in the review. > > > > As Matt Riedemann mentioned in his earlier reply on this thread [2], a > > quiesce step is integrated into the nova snapshot API, if the driver > > supports it (only libvirt). This is the only way you can quiesce an > > instance today. The closest semi-portable API call is pause, but unlike quiesce, pause will also stop the execution of the VM. I say it's semi-portable as drivers are not required to implement it. It does have broader support: https://docs.openstack.org/nova/latest/user/support-matrix.html#operation_pause vmware, powervm and ironic are the main virt drivers missing support. Calling pause, however, makes for a disruptive backup and would not be suitable in many cases.
It is also overkill in most cases, and it won't guarantee that the I/O buffers are flushed, just that no new data is written to the disks by the paused instance while the backup is done. > > > > Cheers, > > -melanie > > > > [1] https://review.openstack.org/295595 > > [2] > > http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001984.html > > > > > Il giorno Mer 23 Gen 2019 17:01 Sean McGinnis > > > ha scritto: > > > > > > On Wed, Jan 23, 2019 at 04:56:24PM +0100, Ignazio Cassano wrote: > > > > Hello, I did not understand if you mean cinder snapshot pr netapp > > > snapshot. > > > > Any case, why, we do not need to quiesce the instance ? > > > > Regards > > > > Ignazio > > > > > > > > > > If being crash consistent is good enough for your needs, then you > > > don't. I know > > > some do prefer the coordinated quiescing of IO in the instance to > > > make sure any > > > in-flight transactions are flushed out and application data is more > > > likely to > > > be in a good consistent state. > > > > > > Depending on your application running in the instance, things like > > > databases > > > are pretty good at rolling back incomplete transactions, so it's > > > just a matter > > > of whether you can allow the possibility that something that was > > > successful in > > > the milliseconds before the snap was created to now be rolled back > > > when the > > > application restarts. > > > > > > > > > > > From haleyb.dev at gmail.com Wed Jan 23 20:46:11 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Wed, 23 Jan 2019 15:46:11 -0500 Subject: [openstack-dev] [neutron] In-Reply-To: <1488344635.1835770.1548183390982@mail.yahoo.com> References: <1488344635.1835770.1548183390982.ref@mail.yahoo.com> <1488344635.1835770.1548183390982@mail.yahoo.com> Message-ID: <9cece8b7-d501-d97e-f689-ddd6f07b5e9b@gmail.com> On 1/22/19 1:56 PM, Farhad Sunavala wrote: > Hi, > > > I am open to suggestions.
> We have a need to switch traffic from our project to other projects > without first getting out > on the internet, floating IPs, etc. > > The other projects will be sharing their networks with our project. > As shown in figure below, the orange network belongs to our project > (10.0.0.0/26) > > The green network (172.31.0.0/24) belongs to another project > and > has an overlapping network with the red tenant (172.31.0.0/16) > > For now, the solution is to create VMs in our project and make sure none > of the interfaces > having overlapping CIDRs.  Thus, there is a VM attached to the 'orange' > and 'red' nets > and another VM attached to the 'orange' and 'green' nets. > > Problem: Too much resources (VMs) will need to be created if we have 100 > tenants with overlapping networks. > > Solution: > Is there a way I can minimize VM resource in our project by not > allocating a separate VM > for shared networks with overlapping CIDRs? Have you tried setting allow_overlapping_ips=False in neutron.conf and restarting the server? -Brian From smooney at redhat.com Wed Jan 23 21:20:52 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 23 Jan 2019 21:20:52 +0000 Subject: [openstack-dev] [neutron] In-Reply-To: <9cece8b7-d501-d97e-f689-ddd6f07b5e9b@gmail.com> References: <1488344635.1835770.1548183390982.ref@mail.yahoo.com> <1488344635.1835770.1548183390982@mail.yahoo.com> <9cece8b7-d501-d97e-f689-ddd6f07b5e9b@gmail.com> Message-ID: <58d441a312fc6c813b67c2f46674ee1961c92ae7.camel@redhat.com> On Wed, 2019-01-23 at 15:46 -0500, Brian Haley wrote: > On 1/22/19 1:56 PM, Farhad Sunavala wrote: > > Hi, > > > > > > I am open to suggestions. > > We have a need to switch traffic from our project to other projects > > without first getting out > > on the internet, floating IPs, etc. > > > > The other projects will be sharing their networks with our project. 
> > As shown in figure below, the orange network belongs to our project > > (10.0.0.0/26) > > > > The green network (172.31.0.0/24) belongs to another project > > and > > has an overlapping network with the red tenant (172.31.0.0/16) > > > > For now, the solution is to create VMs in our project and make sure none > > of the interfaces > > having overlapping CIDRs. Thus, there is a VM attached to the 'orange' > > and 'red' nets > > and another VM attached to the 'orange' and 'green' nets. > > > > Problem: Too much resources (VMs) will need to be created if we have 100 > > tenants with overlapping networks. > > > > Solution: > > Is there a way I can minimize VM resource in our project by not > > allocating a separate VM > > for shared networks with overlapping CIDRs? > > Have you tried setting allow_overlapping_ips=False in neutron.conf and restarting the server? Correct me if I'm wrong, but setting allow_overlapping_ips=false would effectively prevent overlapping CIDRs: https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT.allow_overlapping_ips You would generally only do that if you were using routed networks or didn't want tenants to have overlapping CIDRs for their networks. If we removed the requirement of allowing overlapping CIDRs, then setting allow_overlapping_ips=false and configuring a default subnet pool, so that tenant networks automatically got issued non-overlapping subnets, would work, but that is not what the original question was. > > -Brian >
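As an aside, the collision between the CIDRs quoted in this thread is easy to verify with Python's standard-library ipaddress module; a minimal sketch using the subnets from the original question:

```python
# Check whether the thread's subnets collide, using only the stdlib.
# overlaps() is symmetric and also covers full containment.
from ipaddress import ip_network

orange = ip_network("10.0.0.0/26")    # our project
green = ip_network("172.31.0.0/24")   # other project
red = ip_network("172.31.0.0/16")     # red tenant

print(green.overlaps(red))    # True: the /24 sits inside the /16
print(orange.overlaps(green)) # False
print(orange.overlaps(red))   # False
```

This is why a single VM cannot be attached to both the green and red networks: the kernel routing table cannot disambiguate the nested 172.31.0.0 prefixes.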
You can see the logstash here where it has failed 330 times in the last week: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22AssertionError%3A%20%5B%5D%20is%20not%20true%20%3A%20No%20IPv4%20addresses%20found%20in%5C%22 The bug has been around since 2017, and there are earlier reports of it than that. The bug happens in some projects outside of networking-ovn as well. At the core of the issue is that _get_server_port_id_and_ip4 loops through server ports to return ones that are ACTIVE, but there is a race where a port could become temporarily inactive if the ml2 driver continually monitors the actual port status. In the case we hit, os-vif started recreating the ovs port during an operation, so we would detect the status of the port as down and change the status, and then when the port is recreated we set the port status back to up. If the check happens while the port is down, the test fails. There have been comments that the port status shouldn't flip w/o any user request that would cause it, but that would mean that a plugin/driver would have to ignore the actual status of a port and that seems wrong. External things can affect what state a port is in. https://review.openstack.org/#/c/449695/7/tempest/scenario/manager.py adds a wait mechanism to checking the port status so that momentary flips of port status will not cause the test to inadvertently fail. The patch currently has 10 +1s. We really need to get this fixed. Thanks! Terry From sean.mcginnis at gmx.com Wed Jan 23 23:07:43 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Jan 2019 17:07:43 -0600 Subject: [Release-job-failures] Release of openstack/sqlalchemy-migrate failed Message-ID: <20190123230742.GA13386@sm-workstation> There were release job failures with the latest sqlalchemy-migrate release. It appears it is another one of the issues with the readthedocs configuration not being correct. 
Everything else appears to have succeeded, so the new release is available. It just might need the published documentation manually updated. ----- Forwarded message from zuul at openstack.org ----- Date: Wed, 23 Jan 2019 22:54:02 +0000 From: zuul at openstack.org To: release-job-failures at lists.openstack.org Subject: [Release-job-failures] Release of openstack/sqlalchemy-migrate failed Reply-To: openstack-discuss at lists.openstack.org Build failed. - trigger-readthedocs-webhook http://logs.openstack.org/af/af6733baf027501fc2ad8e2e6b0dbae1989658e5/release/trigger-readthedocs-webhook/c2c2350/ : FAILURE in 2m 04s - release-openstack-python http://logs.openstack.org/af/af6733baf027501fc2ad8e2e6b0dbae1989658e5/release/release-openstack-python/bed935c/ : SUCCESS in 4m 09s - announce-release http://logs.openstack.org/af/af6733baf027501fc2ad8e2e6b0dbae1989658e5/release/announce-release/a6d8057/ : SUCCESS in 5m 51s - propose-update-constraints http://logs.openstack.org/af/af6733baf027501fc2ad8e2e6b0dbae1989658e5/release/propose-update-constraints/b689276/ : SUCCESS in 3m 44s _______________________________________________ Release-job-failures mailing list Release-job-failures at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures ----- End forwarded message ----- From cboylan at sapwetik.org Wed Jan 23 23:11:20 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 23 Jan 2019 15:11:20 -0800 Subject: [Release-job-failures] Release of openstack/sqlalchemy-migrate failed In-Reply-To: <20190123230742.GA13386@sm-workstation> References: <20190123230742.GA13386@sm-workstation> Message-ID: <1548285080.2749149.1642133272.1B8BD8D9@webmail.messagingengine.com> On Wed, Jan 23, 2019, at 3:07 PM, Sean McGinnis wrote: > There were release job failures with the latest sqlalchemy-migrate release. It > appears it is another one of the issues with the readthedocs configuration not > being correct. 
> > Everything else appears to have succeeded, so the new release is available. It > just might need the published documentation manually updated. > There is a known bug [0] with RTD's API that prevents our jobs from updating the docs on their end. I believe they poll our git repos though and the docs should auto update within some number of hours or days. [0] https://github.com/rtfd/readthedocs.org/issues/4986 Clark From corvus at inaugust.com Wed Jan 23 23:28:55 2019 From: corvus at inaugust.com (James E. Blair) Date: Wed, 23 Jan 2019 15:28:55 -0800 Subject: [infra][dev] Zuul promote pipeline for container publishing Message-ID: <877eev9cnc.fsf@meyer.lemoncheese.net> Hi, We recently added a new pipeline to OpenStack's Zuul, and three new jobs and roles to the zuul-jobs repo to support container publishing. If your project publishes containers, or may do so in the future, read on. The new pipeline is the "promote" pipeline. It runs jobs on changes after they merge, but unlike the "post" pipeline, it does not run with the resulting branch state after the merge -- it runs in the context of the change. This means that it's not suitable for building release artifacts, but that's not its purpose. Its purpose is to promote artifacts previously built in the gate pipeline to production. That is safe to do because anything built in the gate pipeline is built with the full future state of in-flight changes to all related projects and branches (even if it's built before they land). We build a lot of artifacts in the gate pipeline now, and then throw them away only to build them again in post. The promote pipeline is an attempt to avoid that waste and instead preserve artifacts built in gate and directly publish them -- but only if and when the changes for which they were built are merged. The first functional use of this system is in building container images and publishing them on Docker Hub. 
We have begun using this in openstack-infra to build container images of third-party applications, but this will work just as well and as easily for building images of applications in our infrastructure (for example, we are also starting to use this to publish Zuul images). To implement this, we wrote three jobs which are available now from zuul-jobs (and therefore can be used in any project or job in the system): * build-docker-image [1] * upload-docker-image [2] * promote-docker-image [3] The first job should run in the check pipeline and simply runs "docker build" on the arguments passed to it. It is used to verify that the build succeeds. The second job should run in the gate pipeline. It uploads the image to Docker Hub into the final repository location, but it only does so with a single tag with the form "change_123456" where '123456' is the Gerrit change number for the change under consideration. This will mean that docker pulls won't accidentally fetch the image yet. But the image is staged in Docker Hub and is ready to be promoted if the change passes all its tests and merges. The third job runs in the promote pipeline. This job, unlike the others, only performs Docker Hub API calls and requires no build resources. Therefore it is a zero-node job in Zuul -- a job which only runs on the Zuul executor. This job finds a previously uploaded image with a "change_" tag, and re-tags it with any specified tags ("latest" by default, but you can also tag version numbers or anything else). It also cleans up unused "change_" tags so they don't clutter up the repository. The nice thing about zero-node jobs is that they start and finish very quickly (since they don't have to deal with resource allocation and contention). With this system, an image will be fully published within about a minute after the change merges. 
To use this system, you will need to create (if you haven't done so
already) a credential which is permitted to upload to an organization on
Docker Hub, and you will need to add a secret [4] to Zuul for it.

A complete example .zuul.yaml might look like this:

- secret:
    name: myproject-dockerhub
    data:
      username: dockerhubuser
      password: !encrypted/pkcs1-oaep
        - DFlbrDM5eUMptMGIVMXV1g455xOJLi92UYF08Z2/JlIGu3t6v052o9FKlVyj1ZmpXs5+2
          JTa5jHkLTvTsYs9fCaNcQc2nmViCyWNlbOMzjB17uiZOaYFNs1sMqZcUZbGEz7Y8ds6Qq
          NBXI10jWFPTah4QxUuBvUbT3vmjnUToCzexl5ZGhKgijcnROWfUsnlCdugpgoNIcPsUki
          zty5FotDihnrC8n8vIomVK6EClY38ty97pLrADzFDd+Cos/OUlvi2xooUhzx8Bn020rJA
          lqEU5v8LGXp5QkHx0MSDx6JY6KppJ/4p/yM+4By6l+A20zdcimxmgiNc9rMWPwDj7xsao
          m7NAZWmWqOO0Xkhgt6WOfugwgt9X46sgs2+yDEfbnI5ok8uRbAB/4FWj/KdpyXwhcf+O2
          wEfhxLwDbAoGONQPjb4YcZmCXtmR7Qe5t+n2jyczWXvrbaBDUQP5a+YtVNN/xhmQ7D740
          POlxv7bLxJAixzqaQ3d8Rz9ZEv6zzRuhWph32UQtZ1JxSNww+EvmXm2eEi2Q2z6pT1Cx/
          j2OrFyA2GL/UJOVb15VHKF6bgHPHWJtpjPFhqdcvBhVute4BWB+KPcWH+y+apHN1enK3H
          tNJO9iqm34nKwSuj5ExmFw50LtwR5/9FyRuRPq/vBL+8y82v8FDmeYsBeobn5M=

- job:
    name: myproject-build-image
    parent: build-docker-image
    description: Build Docker images.
    vars: &image_vars
      docker_images:
        - context: .
          repository: myproject/imagename

- job:
    name: myproject-upload-image
    parent: upload-docker-image
    description: Build Docker images and upload to Docker Hub.
    secrets:
      - name: docker_credentials
        secret: myproject-dockerhub
        pass-to-parent: true
    vars: *image_vars

- job:
    name: myproject-promote-image
    parent: promote-docker-image
    description: Promote previously uploaded Docker images.
    secrets:
      - name: docker_credentials
        secret: myproject-dockerhub
        pass-to-parent: true
    vars: *image_vars

- project:
    check:
      jobs:
        - myproject-build-image
    gate:
      jobs:
        - myproject-upload-image
    promote:
      jobs:
        - myproject-promote-image

If you find that you need jobs which behave slightly differently, you may
be able to inherit from these jobs and add pre or post playbooks.
Or if you need something particularly complicated, you can re-implement
the jobs using the underlying roles in zuul-jobs, which have the same
names as the jobs which invoke them.

[1] https://zuul-ci.org/docs/zuul-jobs/jobs.html#job-build-docker-image
[2] https://zuul-ci.org/docs/zuul-jobs/jobs.html#job-upload-docker-image
[3] https://zuul-ci.org/docs/zuul-jobs/jobs.html#job-promote-docker-image
[4] https://zuul-ci.org/docs/zuul/user/encryption.html

From corvus at inaugust.com Wed Jan 23 23:46:01 2019
From: corvus at inaugust.com (James E. Blair)
Date: Wed, 23 Jan 2019 15:46:01 -0800
Subject: [infra][tc] Container images in openstack/ on Docker Hub
Message-ID: <87bm477xae.fsf@meyer.lemoncheese.net>

Hi,

As part of the recent infrastructure work described in
http://lists.openstack.org/pipermail/openstack-discuss/2019-January/002026.html
we now have the ability to fairly easily support uploading of container
images to the "openstack/" namespace on Docker Hub.

The Infrastructure team does have an account on Docker Hub with ownership
rights to this space. It is now fairly simple for us to allow any
OpenStack project to upload to openstack/$short_name. As a (perhaps
unlikely, but simple) example, Nova could upload images to
"openstack/nova", including suffixed images, such as
"openstack/nova-scheduler".

The system that would enable this is described in this proposed change:
https://review.openstack.org/632818

I believe it's within the TC's purview to decide whether this should
happen, and if so, what policies should govern it (i.e., what projects
are entitled to upload to openstack/).

It's possible that the status quo where deployment projects upload to
their own namespaces (e.g., loci/) while openstack/ remains empty is
desirable. However, since we recently gained the technical ability to
handle this, I thought it worth bringing up.

Personally, I don't presently advocate one way or the other.
-Jim From mriedemos at gmail.com Wed Jan 23 23:56:09 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 23 Jan 2019 17:56:09 -0600 Subject: [nova] [placement] [packaging] placement extraction check in meeting In-Reply-To: References: Message-ID: On 1/16/2019 1:29 PM, Matt Riedemann wrote: > Nested providers / reshaper / VGPU: > > * Matt (me!) has agreed to rebase and address the comments on the > libvirt patch [3] to try and push that forward. Done. https://review.openstack.org/#/c/599208/ -- Thanks, Matt From sean.mcginnis at gmx.com Thu Jan 24 01:12:18 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 23 Jan 2019 19:12:18 -0600 Subject: [Release-job-failures] Release of openstack/sqlalchemy-migrate failed In-Reply-To: <1548285080.2749149.1642133272.1B8BD8D9@webmail.messagingengine.com> References: <20190123230742.GA13386@sm-workstation> <1548285080.2749149.1642133272.1B8BD8D9@webmail.messagingengine.com> Message-ID: <20190124011218.GA16536@sm-workstation> On Wed, Jan 23, 2019 at 03:11:20PM -0800, Clark Boylan wrote: > On Wed, Jan 23, 2019, at 3:07 PM, Sean McGinnis wrote: > > There were release job failures with the latest sqlalchemy-migrate release. It > > appears it is another one of the issues with the readthedocs configuration not > > being correct. > > > > Everything else appears to have succeeded, so the new release is available. It > > just might need the published documentation manually updated. > > > > There is a known bug [0] with RTD's API that prevents our jobs from updating the docs on their end. I believe they poll our git repos though and the docs should auto update within some number of hours or days. > > [0] https://github.com/rtfd/readthedocs.org/issues/4986 > > Clark > Thanks Clark. Any idea if the polling is a long term plan for them? I'm wondering if we should just remove our readthedocs jobs and just rely on that mechanism for the doc publishing. 
From fungi at yuggoth.org Thu Jan 24 01:54:46 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 24 Jan 2019 01:54:46 +0000 Subject: [Release-job-failures] Release of openstack/sqlalchemy-migrate failed In-Reply-To: <20190124011218.GA16536@sm-workstation> References: <20190123230742.GA13386@sm-workstation> <1548285080.2749149.1642133272.1B8BD8D9@webmail.messagingengine.com> <20190124011218.GA16536@sm-workstation> Message-ID: <20190124015446.g3a7ftu6mstey3ny@yuggoth.org> On 2019-01-23 19:12:18 -0600 (-0600), Sean McGinnis wrote: [...] > Thanks Clark. Any idea if the polling is a long term plan for them? I'm > wondering if we should just remove our readthedocs jobs and just rely on that > mechanism for the doc publishing. It seems like the expectation is this will be fixed when they (eventually) update the RTD codebase, as best I've been able to follow the discussion there. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Jan 24 02:01:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 24 Jan 2019 02:01:29 +0000 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: <87bm477xae.fsf@meyer.lemoncheese.net> References: <87bm477xae.fsf@meyer.lemoncheese.net> Message-ID: <20190124020129.fefo7kbvv7svilj5@yuggoth.org> On 2019-01-23 15:46:01 -0800 (-0800), James E. Blair wrote: [...] > It's possible that the status quo where deployment projects upload to > their own namespaces (e.g., loci/) while openstack/ remains empty is > desirable. However, since we recently gained the technical ability to > handle this, I thought it worth bringing up. > > Personally, I don't presently advocate one way or the other. 
If nothing else, it's a great opportunity to revisit our decision in
https://governance.openstack.org/tc/resolutions/20170530-binary-artifacts.html
and make sure it's still relevant for the present situation.
--
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL:

From christian.zunker at codecentric.cloud Thu Jan 24 06:36:13 2019
From: christian.zunker at codecentric.cloud (Christian Zunker)
Date: Thu, 24 Jan 2019 07:36:13 +0100
Subject: [magnum] Persistent Volume Claim with cinder backend
Message-ID:

Hi,

we are running Magnum Rocky. I tried to create a persistent volume claim
and got it working with provisioner: kubernetes.io/cinder
But it failed with provisioner: openstack.org/standalone-cinder

The docs state kubernetes.io/cinder is deprecated:
https://kubernetes.io/docs/concepts/storage/storage-classes/#openstack-cinder
Which one should be used in Rocky?

This is our complete config for this case:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cinder
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
provisioner: openstack.org/standalone-cinder
parameters:
  type: volumes_hdd
  availability: cinderAZ_ceph
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 42Gi
  storageClassName: cinder

regards
Christian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chason.chan at foxmail.com Thu Jan 24 07:51:40 2019
From: chason.chan at foxmail.com (Chason)
Date: Thu, 24 Jan 2019 15:51:40 +0800
Subject: [docs] Nominating Alex Settle for openstack-doc-core
In-Reply-To:
References:
Message-ID: <5FC39307-6065-4FD6-942C-F7F5C4A22EA5@foxmail.com>

So glad to see you back to us, Alex!
\o/ Absolutely agree with this nomination.

> Date: Fri, 18 Jan 2019 14:42:33 +0100
> From: Petr Kovar
> To: openstack-discuss at lists.openstack.org
> Cc: Alex Settle
> Subject: [docs] Nominating Alex Settle for openstack-doc-core
> Message-ID: <20190118144233.132eb0e427389da15e725141 at redhat.com>
> Content-Type: text/plain; charset=US-ASCII
>
> Hi all,
>
> Alex Settle recently re-joined the Documentation Project after a few-month
> break. It's great to have her back and I want to formally nominate her for
> membership in the openstack-doc-core team, to follow the formal process for
> cores.
>
> Please let the ML know should you have any objections.
>
> Thanks,
> pk
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> openstack-discuss mailing list
> openstack-discuss at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss
>
> ------------------------------
>
> End of openstack-discuss Digest, Vol 3, Issue 110
> *************************************************

From ignaziocassano at gmail.com Thu Jan 24 08:21:48 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Thu, 24 Jan 2019 09:21:48 +0100
Subject: [cinder] volume snapshot on queens not working as in ocata
In-Reply-To:
References: <20190123155845.GB1213@sm-workstation>
Message-ID:

Sorry, it works fine. The snapshot is created fine. I made a mistake.
Ignazio

On Wed, 23 Jan 2019 at 18:05, Ignazio Cassano <ignaziocassano at gmail.com> wrote:

> Thanks.
> I'll send logs asap.
> Regards
> Ignazio
>
> On Wed, 23 Jan 2019 at 16:58, Sean McGinnis wrote:
>
>> On Wed, Jan 23, 2019 at 03:55:17PM +0100, Ignazio Cassano wrote:
>> > Hello All,
>> > I have two different openstack installations: ocata and queens.
>> > On both I have the same cinder storage based on netapp via nfs.
>> > If I run the following steps on ocata, it works fine:
>> >
>> > | status     | error                                |
>> > | updated_at | 2019-01-23T08:59:03.000000           |
>> > | volume_id  | 12bc6248-b624-4fb0-81f2-c1f986c4697c |
>> > +--------------------------------------------+--------------------------------------+
>> >
>> > Any help, please?
>> >
>> > Ignazio
>>
>> There is most likely an issue between Cinder and the backend storage
>> device.
>>
>> You may be able to get some details by using the 'cinder message-list' and
>> 'cinder message-show' command line:
>>
>> https://docs.openstack.org/python-cinderclient/latest/cli/details.html#cinder-message-list
>>
>> If that does not provide a reasonable message at least pointing to where
>> the problem is (which we should probably see if we can improve) then you
>> will likely need to go to the cinder-volume log files. If you search in
>> there, there will probably be traceback messages or other errors
>> indicating why the snapshot failed.
>>
>> Sean
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jakub.sliva at ultimum.io Thu Jan 24 14:18:01 2019
From: jakub.sliva at ultimum.io (Jakub Slíva)
Date: Thu, 24 Jan 2019 15:18:01 +0100
Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation
In-Reply-To:
References: <20190110130557.q3fgchx3uot6aupj@yuggoth.org>
Message-ID:

On Thu, 10 Jan 2019 at 14:49, Doug Hellmann wrote:

> Jeremy Stanley writes:
>
> > On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote:
> >> our company created a little plugin to Horizon and we would like to
> >> share it with the community in a bit more official way. So I created
> >> change request (https://review.openstack.org/#/c/619235/) in order to
> >> create official repository under project Telemetry. However, PTL
> >> recommended me to put this new repository under OpenStack without any
> >> project - i.e. make it unofficial.
> >> > >> I have also discussed this with Horizon team during their meeting > >> ( > http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31 > ) > >> and now I am bit stuck because I do not know how to proceed next. > >> Could you, please, advise me? > > > > It looks like much of this confusion stemmed from recommendation by > > project-config-core reviewers, unfortunately. We too often see > > people from official teams in OpenStack request new Git repositories > > for work their team will be performing, but who forget to also > > record them in the appropriate governance lists. As a result, if a > > proposed repository looks closely-related to the work of an existing > > team (in this case possibly either Horizon or Telemetry) we usually > > assume this was the case and recommend during the review process > > that they file a corresponding change to the OpenStack TC's > > governance repository. Given this is an independent group's work for > > which neither the Horizon nor Telemetry teams have expressed an > > interest in adopting responsibility, it's perfectly acceptable to > > have it operate as an unofficial project or to apply for status as > > another official project team within OpenStack. > > > > The main differences between the two options are that contributors > > to official OpenStack project teams gain the ability to vote in > > Technical Committee elections, their repositories can publish > > documentation on the https://docs.openstack.org/ Web site, they're > > able to reserve space for team-specific discussions and working > > sessions at OSF Project Teams Gathering meetings (such as the one > > coming up in Denver immediately following the Open Infrastructure > > Summit)... 
but official project teams are also expected to hold team
> > lead elections twice a year, participate in OpenStack release
> > processes, follow up on implementing cycle goals, and otherwise meet
> > the requirements laid out in our
> > https://governance.openstack.org/tc/reference/new-projects-requirements.html
> > document.
> > --
> > Jeremy Stanley
>
> Jakub, thank you for starting this thread. As you can see from Jeremy's
> response, you have a couple of options. You had previously told me you
> wanted the repository to be "official", and since the existing teams do
> not want to manage it I think that it is likely that you will want to
> create a new team for it. However, since that path does introduce some
> obligations, before you go ahead it would be good to understand what
> benefits you are seeking by joining an official team. Can you fill in
> some background for us, so we can offer the best guidance?
>
> --
> Doug

Thank you all for the information. However, after a long internal
discussion we decided not to undergo all the obligatory steps and create a
new team for such a small plugin. Therefore, we will abandon both changes
in Gerrit.

Jakub Sliva

Ultimum Technologies s.r.o.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic
http://ultimum.io

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From e0ne at e0ne.info Thu Jan 24 15:01:46 2019
From: e0ne at e0ne.info (Ivan Kolodyazhny)
Date: Thu, 24 Jan 2019 17:01:46 +0200
Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation
In-Reply-To:
References: <20190110130557.q3fgchx3uot6aupj@yuggoth.org>
Message-ID:

Jakub,

Please notify me, or create a patch to the 'Horizon Plugin Registry' [1],
once the project is created.

[1] https://github.com/openstack/horizon/blob/master/doc/source/install/plugin-registry.rst

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Thu, Jan 24, 2019 at 4:19 PM Jakub Slíva wrote:

>
> čt 10. 1.
2019 v 14:49 odesílatel Doug Hellmann > napsal: > >> Jeremy Stanley writes: >> >> > On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote: >> >> our company created a little plugin to Horizon and we would like to >> >> share it with the community in a bit more official way. So I created >> >> change request (https://review.openstack.org/#/c/619235/) in order to >> >> create official repository under project Telemetry. However, PTL >> >> recommended me to put this new repository under OpenStack without any >> >> project - i.e. make it unofficial. >> >> >> >> I have also discussed this with Horizon team during their meeting >> >> ( >> http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31 >> ) >> >> and now I am bit stuck because I do not know how to proceed next. >> >> Could you, please, advise me? >> > >> > It looks like much of this confusion stemmed from recommendation by >> > project-config-core reviewers, unfortunately. We too often see >> > people from official teams in OpenStack request new Git repositories >> > for work their team will be performing, but who forget to also >> > record them in the appropriate governance lists. As a result, if a >> > proposed repository looks closely-related to the work of an existing >> > team (in this case possibly either Horizon or Telemetry) we usually >> > assume this was the case and recommend during the review process >> > that they file a corresponding change to the OpenStack TC's >> > governance repository. Given this is an independent group's work for >> > which neither the Horizon nor Telemetry teams have expressed an >> > interest in adopting responsibility, it's perfectly acceptable to >> > have it operate as an unofficial project or to apply for status as >> > another official project team within OpenStack. 
>> > >> > The main differences between the two options are that contributors >> > to official OpenStack project teams gain the ability to vote in >> > Technical Committee elections, their repositories can publish >> > documentation on the https://docs.openstack.org/ Web site, they're >> > able to reserve space for team-specific discussions and working >> > sessions at OSF Project Teams Gathering meetings (such as the one >> > coming up in Denver immediately following the Open Infrastructure >> > Summit)... but official project teams are also expected to hold team >> > lead elections twice a year, participate in OpenStack release >> > processes, follow up on implementing cycle goals, and otherwise meet >> > the requirements laid out in our >> > >> https://governance.openstack.org/tc/reference/new-projects-requirements.html >> > document. >> > -- >> > Jeremy Stanley >> >> Jakub, thank you for starting this thread. As you can see from Jeremy's >> response, you have a couple of options. You had previously told me you >> wanted the repository to be "official", and since the existing teams do >> not want to manage it I think that it is likely that you will want to >> create a new team for it. However, since that path does introduce some >> obligations, before you go ahead it would be good to understand what >> benefits you are seeking by joining an official team. Can you fill in >> some background for us, so we can offer the best guidance? >> >> -- >> Doug >> >> > Thank you all for the information. However, after long internal discussion > we decided not to undergo all the obligatory steps and create a new team > for such small plugin. Therefore, we will abandon both changes in Gerrit. > > Jakub Sliva > > Ultimum Technologies s.r.o. > Na Poříčí 1047/26, 11000 Praha 1 > Czech Republic > http://ultimum.io > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mriedemos at gmail.com Thu Jan 24 15:09:07 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 24 Jan 2019 09:09:07 -0600 Subject: [nova] Per-instance serial number implementation question Message-ID: The proposal from the spec for this feature was to add an image property (hw_unique_serial), flavor extra spec (hw:unique_serial) and new "unique" choice to the [libvirt]/sysinfo_serial config option. The image property and extra spec would be booleans but really only True values make sense and False would be more or less ignored. There were no plans to enforce strict checking of a boolean value, e.g. if the image property was True but the flavor extra spec was False, we would not raise an exception for incompatible values, we would just use OR logic and take the image property True value. The boolean usage proposed is a bit confusing, as can be seen from comments in the spec [1] and the proposed code change [2]. After thinking about this a bit, I'm now thinking maybe we should just use a single-value enum for the image property and flavor extra spec: image: hw_guest_serial=unique flavor: hw:guest_serial=unique If either are set, then we use a unique serial number for the guest. If neither are set, then the serial number is based on the host configuration as it is today. I think that's more clear usage, do others agree? Alex does. I can't think of any cases where users would want hw_unique_serial=False, so this removes that ability and confusion over whether or not to enforce mismatching booleans. 
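As an illustration of the semantics being proposed — the property names come from the spec still under review, not from a released Nova API — the OR-style resolution reduces to:

```python
def guest_serial(image_props, extra_specs, host_serial):
    """Sketch of the proposed enum-style resolution for the guest serial
    number: if either the image property or the flavor extra spec is set
    to 'unique', a per-instance serial is used; otherwise fall back to
    whatever the host's [libvirt]/sysinfo_serial configuration dictates.
    (hw_guest_serial / hw:guest_serial are hypothetical names from the
    proposal, not an existing interface.)"""
    if (image_props.get("hw_guest_serial") == "unique"
            or extra_specs.get("hw:guest_serial") == "unique"):
        return "unique"
    return host_serial
```

There is deliberately no "false" value to mismatch against, which is exactly the confusion the enum form avoids.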
[1] https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 [2] https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 -- Thanks, Matt From zh.f at outlook.com Thu Jan 24 15:48:31 2019 From: zh.f at outlook.com (Zhang Fan) Date: Thu, 24 Jan 2019 15:48:31 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: Yes, enum sounds much clearer to me. It is similar to hw:cpu_policy or hw_cpu_policy. So why not :) From Fan’s plastic iPhone > 在 2019年1月24日,23:13,Matt Riedemann 写道: > > The proposal from the spec for this feature was to add an image property (hw_unique_serial), flavor extra spec (hw:unique_serial) and new "unique" choice to the [libvirt]/sysinfo_serial config option. The image property and extra spec would be booleans but really only True values make sense and False would be more or less ignored. There were no plans to enforce strict checking of a boolean value, e.g. if the image property was True but the flavor extra spec was False, we would not raise an exception for incompatible values, we would just use OR logic and take the image property True value. > > The boolean usage proposed is a bit confusing, as can be seen from comments in the spec [1] and the proposed code change [2]. > > After thinking about this a bit, I'm now thinking maybe we should just use a single-value enum for the image property and flavor extra spec: > > image: hw_guest_serial=unique > flavor: hw:guest_serial=unique > > If either are set, then we use a unique serial number for the guest. If neither are set, then the serial number is based on the host configuration as it is today. > > I think that's more clear usage, do others agree? Alex does. I can't think of any cases where users would want hw_unique_serial=False, so this removes that ability and confusion over whether or not to enforce mismatching booleans. 
> > [1] https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 > [2] https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 > > -- > > Thanks, > > Matt > From melwittt at gmail.com Thu Jan 24 18:16:41 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 24 Jan 2019 10:16:41 -0800 Subject: [ptl][goals][python3] please update your team's status in the wiki In-Reply-To: References: Message-ID: <8b80b3bf-e842-3c44-d5e4-73279e4fe36d@gmail.com> On Thu, 20 Dec 2018 08:21:00 -0500, Doug Hellmann wrote: > > I noticed this morning that the last time the Python 3 status page [1] was > updated in the wiki was August. I hope we've had changes to the level > of support since then. > > Keeping that page up to date is part of fulfilling the goal this > cycle. Please take a few minutes to review the content today and update > it if necessary. > > Thanks! > > [1] https://wiki.openstack.org/wiki/Python3#Python_3_Status_of_OpenStack_projects Sorry for the late response to this, but I have a question. Looking at the wiki, I see that the previous status updates for nova were from before we started using mox3 in our unit tests. As of today, our unit tests pass under python3 because of mox3. Does that mean we are OK to mark our status as "Yes" for unit test support of python3 on the wiki? Some notes on mox3: we've worked on removing mox3 from our unit tests over the past several cycles and today it is only present in a few of our test files: $ grep -rIn mox3 nova nova/tests/unit/cells/test_cells_messaging.py:21:from mox3 import mox nova/tests/unit/network/test_neutronv2.py:24:from mox3 import mox nova/tests/unit/network/test_manager.py:20:from mox3 import mox We advised folks not to spend time removing mox3 from cells v1 and nova-network unit tests, as cells v1 and nova-network are slated for removal from the code base, as soon as we're able (test_cells_messaging.py and test_manager.py). 
That leaves test_neutronv2.py as the only remaining file using mox3. Cheers, -melanie From haleyb.dev at gmail.com Thu Jan 24 18:38:15 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Thu, 24 Jan 2019 13:38:15 -0500 Subject: [openstack-dev] [neutron] In-Reply-To: <58d441a312fc6c813b67c2f46674ee1961c92ae7.camel@redhat.com> References: <1488344635.1835770.1548183390982.ref@mail.yahoo.com> <1488344635.1835770.1548183390982@mail.yahoo.com> <9cece8b7-d501-d97e-f689-ddd6f07b5e9b@gmail.com> <58d441a312fc6c813b67c2f46674ee1961c92ae7.camel@redhat.com> Message-ID: On 1/23/19 4:20 PM, Sean Mooney wrote: > On Wed, 2019-01-23 at 15:46 -0500, Brian Haley wrote: >> On 1/22/19 1:56 PM, Farhad Sunavala wrote: >>> Hi, >>> >>> >>> I am open to suggestions. >>> We have a need to switch traffic from our project to other projects >>> without first getting out >>> on the internet, floating IPs, etc. >>> >>> The other projects will be sharing their networks with our project. >>> As shown in figure below, the orange network belongs to our project >>> (10.0.0.0/26) >>> >>> The green network (172.31.0.0/24) belongs to another project >>> and >>> has an overlapping network with the red tenant (172.31.0.0/16) >>> >>> For now, the solution is to create VMs in our project and make sure none >>> of the interfaces >>> having overlapping CIDRs. Thus, there is a VM attached to the 'orange' >>> and 'red' nets >>> and another VM attached to the 'orange' and 'green' nets. >>> >>> Problem: Too much resources (VMs) will need to be created if we have 100 >>> tenants with overlapping networks. >>> >>> Solution: >>> Is there a way I can minimize VM resource in our project by not >>> allocating a separate VM >>> for shared networks with overlapping CIDRs? >> >> Have you tried setting allow_overlapping_ips=False in neutron.conf and >> restarting the server? 
> correct me if im wrong but setting allow_overlapping_ips=false would effectivly prevent overlaping CIDRs > https://docs.openstack.org/neutron/latest/configuration/neutron.html#DEFAULT.allow_overlapping_ips > > you would generally only do that if you were using routed network or didnt want teanat to have overlapping CIDRs > for there networks. Right, I thought that's what his picture showed - two tenants with the same private subnet CIDR. > if we removed the requirement to allowing overlapping cidrs then setting > allow_overlapping_ips=false and configuring a default subnet pool so that tenant networks automatically got > issued non over lapping subnets that would work but that is not what the original question was. Yes, that would be the other (preferred) way, then tenants would only have to ask for a CIDR from the pool. -Brian From lbragstad at gmail.com Thu Jan 24 18:38:46 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Thu, 24 Jan 2019 12:38:46 -0600 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> Message-ID: Sending out a quick recap. It sounds like we have multiple champions, which is great, in addition to an understanding of how we can implement this. Is it fair to say that we're going to pursue the OSPurge approach* initially and follow up in subsequent releases with more details about service specific (system-scoped) purge APIs? If so, do we think we're ready to propose this and get it into review? * detailed at line 68 here - https://etherpad.openstack.org/p/community-goal-project-deletion On Tue, Jan 22, 2019 at 5:23 PM Adrian Turjak wrote: > Thanks for the input! I'm willing to bet there are many people excited > about this goal, or will be when they realise it exists! 
> > The 'dirty' state I think would be solved with a report API in each > service (tell me everything a given project has resource wise). Such an > API would be useful without needing to query each resource list, and > potentially could be an easy thing to implement to help a purge library > figure out what to delete. I know right now our method for checking if a > project is 'dirty' is part of our quota checking scripts, and it has to > query a lot of APIs per service to build an idea of what a project has. > > As for using existing code, OSPurge could well be a starting point, but > the major part of this goal has to be that each OpenStack service (that > creates resources owned by a project) takes ownership of their own > deletion logic. This is why a top level library for cross project logic, > with per service plugin libraries is possibly the best approach. Each > library would follow the same template and abstraction layers (as > inherited from the top level library), but how each service implements > their own deletion is up to them. I would also push for them using the > SDK only as their point of interaction with the APIs (lets set some hard > requirements and standards!), because that is the python library we > should be using going forward. In addition such an approach could mean > that anyone can write a plugin for the top level library (e.g. internal > company only services) which will automatically get picked up if installed. > > We would need robust and extensive testing for this, because deletion is > critical, and we need it to work, but also not cause damage in ways it > shouldn't. > > And you're right, purge tools purging outside of the scope asked for is > a worry. Our own internal logic actually works by having the triggering > admin user add itself to the project (and ensure no admin role), then > scope a token to just that project, and delete resources form the point > of view of a project user. 
That way it's kind of like a user deleting > their own resources, and in truth having a nicer way to even do that > (non-admin clearing of a project) would be amazing for a lot of people who > don't want to close their account or disable their project, but just > want to delete stray resources and not get charged. > > On 23/01/19 4:03 AM, Tobias Urdin wrote: > > Thanks for the thorough feedback Adrian. > > > > My opinion is also that Keystone should not be the actor in executing > > this functionality but somewhere else, > > whether that is Adjutant or any other form (application, library, CLI, > > etc). > > > > I would also like to bring up the point about knowing if a project is > > "dirty" (it has provisioned resources). > > This is something that I think all business logic would benefit from; > > we've had issues with knowing when > > resources should be deleted. Our solution is pretty much to look at > > metrics for the last X minutes, check if the project > > is disabled, and compare to business logic that says it should be deleted. > > > > While the above works, it kills some of the logical point of disabling a > > project, since the only thing that knows whether > > the project should be deleted or is actually disabled is the business > > logic application that says they clicked the > > delete button rather than disabled it. > > > > Most of the functionality you are mentioning is things that the > > ospurge project has been working to implement, and the > > maintainer even did a full rewrite which improved the dependency > > arrangement for resource removal. > > > > I think the biggest win for this community goal would be that the > > developers of the projects would be available for input regarding > > the project-specific code that does purging.
There have been some > > really nasty bugs in ospurge in the past where, if executed with the admin > > user, you would wipe everything and not only that project, which is > > probably an issue that makes people think twice about > > using a purging toolkit at all. > > > > We should carefully consider what parts of ospurge could be reused: > > concept, code, or anything in between that could help us derive > > what direction we want to push this goal. > > > > I'm excited :) > > > > Best regards > > Tobias > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Thu Jan 24 20:50:05 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 24 Jan 2019 15:50:05 -0500 Subject: [ptl][goals][python3] please update your team's status in the wiki In-Reply-To: <8b80b3bf-e842-3c44-d5e4-73279e4fe36d@gmail.com> References: <8b80b3bf-e842-3c44-d5e4-73279e4fe36d@gmail.com> Message-ID: <8949BD42-E43E-4D1C-A347-EC83D3C288F4@doughellmann.com> > On Jan 24, 2019, at 1:16 PM, melanie witt wrote: > >> On Thu, 20 Dec 2018 08:21:00 -0500, Doug Hellmann wrote: >> I noticed this morning that the last time the Python 3 status page [1] was >> updated in the wiki was August. I hope we've had changes to the level >> of support since then. >> Keeping that page up to date is part of fulfilling the goal this >> cycle. Please take a few minutes to review the content today and update >> it if necessary. >> Thanks! >> [1] https://wiki.openstack.org/wiki/Python3#Python_3_Status_of_OpenStack_projects > > Sorry for the late response to this, but I have a question. > > Looking at the wiki, I see that the previous status updates for nova were from before we started using mox3 in our unit tests. As of today, our unit tests pass under python3 because of mox3. Does that mean we are OK to mark our status as "Yes" for unit test support of python3 on the wiki? The point is to accurately communicate the status rather than try to hit an arbitrary measure.
So, if you are confident that the level of test coverage under python 3 is good, then go ahead and update the wiki to reflect that. > > Some notes on mox3: we've worked on removing mox3 from our unit tests over the past several cycles and today it is only present in a few of our test files: > > $ grep -rIn mox3 nova > nova/tests/unit/cells/test_cells_messaging.py:21:from mox3 import mox > nova/tests/unit/network/test_neutronv2.py:24:from mox3 import mox > nova/tests/unit/network/test_manager.py:20:from mox3 import mox > > We advised folks not to spend time removing mox3 from cells v1 and nova-network unit tests, as cells v1 and nova-network are slated for removal from the code base, as soon as we're able (test_cells_messaging.py and test_manager.py). That leaves test_neutronv2.py as the only remaining file using mox3. That sounds like a reasonable approach. I think I remember something about that when the mox goal was proposed, so it’s not a surprise. Thanks, Doug > > Cheers, > -melanie > > > From openstack at nemebean.com Thu Jan 24 22:17:25 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 24 Jan 2019 16:17:25 -0600 Subject: [oslo] Proposing Zane Bitter as general Oslo core Message-ID: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Hi all, Zane is already core on oslo.service, but he's been doing good stuff in adjacent projects as well. We could keep playing whack-a-mole with giving him +2 on more repos, but I trust his judgment so I'm proposing we just add him to the oslo-core group. If there are no objections in the next week I'll proceed with the addition. Thanks. -Ben From anlin.kong at gmail.com Thu Jan 24 22:39:56 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 25 Jan 2019 11:39:56 +1300 Subject: [magnum] Persistent Volume Claim with cinder backend In-Reply-To: References: Message-ID: Hi, I don't think you can use `openstack.org/standalone-cinder` without setting up a standalone cinder external provisioner[1].
Although kubernetes.io/cinder is deprecated, it just works, either for the in-tree openstack provider or openstack-cloud-controller-manager[2], which is supported in the Stein dev cycle and already backported to Rocky. Regardless of both of them, CSI is the future (I'm going to add that support in Magnum, too). [1]: https://github.com/kubernetes/cloud-provider-openstack/blob/f056677572b2635632abcc7dbde459cdfc4432b9/docs/using-cinder-standalone-provisioner.md [2]: https://review.openstack.org/#/q/6c61a1a949615f6dc1df36f3098cd97466ac7238 Cheers, Lingxian Kong On Thu, Jan 24, 2019 at 7:39 PM Christian Zunker wrote: > Hi, > > we are running Magnum Rocky. > I tried to create a persistent volume claim and got it working with > provisioner: kubernetes.io/cinder > But it failed with provisioner: openstack.org/standalone-cinder > > The docs state kubernetes.io/cinder is deprecated: > > https://kubernetes.io/docs/concepts/storage/storage-classes/#openstack-cinder > > Which one should be used in Rocky? > > > This is our complete config for this case: > apiVersion: storage.k8s.io/v1 > kind: StorageClass > metadata: > name: cinder > annotations: > storageclass.beta.kubernetes.io/is-default-class: "true" > labels: > kubernetes.io/cluster-service: "true" > addonmanager.kubernetes.io/mode: EnsureExists > provisioner: openstack.org/standalone-cinder > parameters: > type: volumes_hdd > availability: cinderAZ_ceph > --- > kind: PersistentVolumeClaim > apiVersion: v1 > metadata: > name: myclaim > spec: > accessModes: > - ReadWriteOnce > resources: > requests: > storage: 42Gi > storageClassName: cinder > > regards > Christian > > > -------------- next part -------------- An HTML attachment was scrubbed...
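Following Lingxian's answer, that kubernetes.io/cinder still works on Rocky despite the deprecation, Christian's StorageClass would only need the provisioner line changed; a sketch (the type and availability values come from his original config and are specific to his cloud):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cinder
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/cinder   # deprecated, but functional on Rocky
parameters:
  type: volumes_hdd
  availability: cinderAZ_ceph
```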
URL: From doug at doughellmann.com Thu Jan 24 23:08:47 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 24 Jan 2019 18:08:47 -0500 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Message-ID: +1 > On Jan 24, 2019, at 5:17 PM, Ben Nemec wrote: > > Hi all, > > Zane is already core on oslo.service, but he's been doing good stuff in adjacent projects as well. We could keep playing whack-a-mole with giving him +2 on more repos, but I trust his judgment so I'm proposing we just add him to the oslo-core group. > > If there are no objections in the next week I'll proceed with the addition. > > Thanks. > > -Ben > From mihalis68 at gmail.com Thu Jan 24 23:27:00 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Thu, 24 Jan 2019 18:27:00 -0500 Subject: [ops] Berlin ops meet up march 2019 Message-ID: We (the ops meetups team) just heard that the city of Berlin has decided that March 8th 2019 is to be a public holiday. Our first reaction is to see if we can pull the ops meetup forward one day (making it the 6th and 7th instead of the 7th and the 8th). Please let us know ASAP if this would be a problem for you, for example if you have already booked flights. Since the event tickets have not been issued yet, we are hoping that's not the case for anyone. Moving to the 6th and 7th allows it to happen during normal business days for the host, Deutsche Telekom. I don't personally like the idea of trying to keep the 8th when it's a public holiday. Regards, Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gmann at ghanshyammann.com Fri Jan 25 01:27:41 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 25 Jan 2019 10:27:41 +0900 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: References: Message-ID: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> ---- On Thu, 24 Jan 2019 07:09:26 +0900 Terry Wilson wrote ---- > In the networking-ovn project, we hit this bug *very* often: > https://bugs.launchpad.net/tempest/+bug/1728600. You can see the > logstash here where it has failed 330 times in the last week: > http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22AssertionError%3A%20%5B%5D%20is%20not%20true%20%3A%20No%20IPv4%20addresses%20found%20in%5C%22 > > The bug has been around since 2017, and there are earlier reports of > it than that. The bug happens in some projects outside of > networking-ovn as well. > > At the core of the issue is that _get_server_port_id_and_ip4 loops > through server ports to return ones that are ACTIVE, but there is a > race where a port could become temporarily inactive if the ml2 driver > continually monitors the actual port status. In the case we hit, > os-vif started recreating the ovs port during an operation, so we > would detect the status of the port as down and change the status, and > then when the port is recreated we set the port status back to up. If > the check happens while the port is down, the test fails. But is it by design, or a bug, that an ACTIVE port on an ACTIVE VM can flip to DOWN? Waiting for an already active and bound port to become active again after we got the ACTIVE server is not the right thing to test. As Sean also pointed out in the patch, we should go for the approach of "making sure all interfaces attached to the server are active, and the server is sshable, before the server can be used in a test" [1]. This is something we agreed at the Denver PTG for afazekas' proposal[2].
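A wait mechanism of the kind the tempest patch adds, tolerating momentary DOWN flips by polling with a timeout instead of failing on the first non-ACTIVE read, can be sketched like this (illustrative only; get_port_status stands in for a real API call and is not tempest code):

```python
import time

def wait_for_port_active(get_port_status, port_id, timeout=60.0,
                         interval=1.0, clock=time.monotonic,
                         sleep=time.sleep):
    """Poll until the port reports ACTIVE, riding out momentary flips."""
    deadline = clock() + timeout
    while True:
        status = get_port_status(port_id)
        if status == "ACTIVE":
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"port {port_id} still {status} after {timeout}s")
        sleep(interval)

# Simulate a port that flips DOWN briefly while os-vif recreates it:
statuses = iter(["DOWN", "DOWN", "ACTIVE"])
result = wait_for_port_active(lambda pid: next(statuses), "port-1",
                              timeout=10, sleep=lambda s: None)
print(result)  # ACTIVE
```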
If we see this from the user perspective: a user can have an ACTIVE VM with an active port which can flip to DOWN in the middle of that port's usage. This seems like a bug to me. [1] - https://review.openstack.org/#/c/600046 [2] - https://etherpad.openstack.org/p/handling-of-interface-attach-detach-hotplug-unplug I have also commented on the patch; sorry for delaying the review on that. -gmann > > There have been comments that the port status shouldn't flip w/o any > user request that would cause it, but that would mean that a > plugin/driver would have to ignore the actual status of a port and > that seems wrong. External things can affect what state a port is in. > > https://review.openstack.org/#/c/449695/7/tempest/scenario/manager.py > adds a wait mechanism to checking the port status so that momentary > flips of port status will not cause the test to inadvertently fail. > The patch currently has 10 +1s. We really need to get this fixed. > > Thanks! > Terry > > From melwittt at gmail.com Fri Jan 25 02:20:01 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 24 Jan 2019 18:20:01 -0800 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: On Thu, 24 Jan 2019 09:09:07 -0600, Matt Riedemann wrote: > The proposal from the spec for this feature was to add an image property > (hw_unique_serial), flavor extra spec (hw:unique_serial) and new > "unique" choice to the [libvirt]/sysinfo_serial config option.
> > After thinking about this a bit, I'm now thinking maybe we should just > use a single-value enum for the image property and flavor extra spec: > > image: hw_guest_serial=unique > flavor: hw:guest_serial=unique > > If either are set, then we use a unique serial number for the guest. If > neither are set, then the serial number is based on the host > configuration as it is today. > > I think that's more clear usage, do others agree? Alex does. I can't > think of any cases where users would want hw_unique_serial=False, so > this removes that ability and confusion over whether or not to enforce > mismatching booleans. I think use of the enum makes sense and it happens to make it easier to reason about in conjunction with the [libvirt]/sysinfo_serial config option, at least for me. -melanie From melwittt at gmail.com Fri Jan 25 03:40:51 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 24 Jan 2019 19:40:51 -0800 Subject: [nova][dev] stein blueprint tracking status Message-ID: <64f5c762-b917-3aea-709a-8f32199d46c4@gmail.com> Hi all, I've updated our stein blueprint tracking status etherpad with all of our approved blueprints for the cycle and notes about their current status: https://etherpad.openstack.org/p/nova-stein-blueprint-status Feature freeze s-3 is fast approaching on March 7. Let's use this etherpad to help us focus our work and complete as much as we can in the next 6 weeks. At the top, I've collected status on our cycle themes. TL;DR is that some efforts have stalled and we probably need people to jump in and help. Some people are already helping, and that's awesome! Please use this etherpad as a review guide and a place to communicate notes about blueprint status and progress, anything that will help us all move forward on completing implementations and reviews. Communication will be key as we try to get things done by s-3. Use the etherpad, #openstack-nova on IRC, and the ML to ask questions, get unstuck, and give updates. 
Let me know if I've missed any blueprints and feel free to add notes about any items that need to be moved, etc. Cheers, -melanie From prsrivas at redhat.com Fri Jan 25 06:53:13 2019 From: prsrivas at redhat.com (Pritha Srivastava) Date: Fri, 25 Jan 2019 12:23:13 +0530 Subject: [Keystone] How to retrieve secret key for a given access key id Message-ID: Hi All, I have a scenario where I need to retrieve the secret key of a given access key id (of EC2 credentials) from Keystone. I know that this can be done by sending a GET request to the following URL: v3/users/{user_id}/credentials/OS-EC2/access_key_id I don't have the user_id required for the above request, but I have the admin username and admin password that were used to create the EC2 credentials. Is there a way to get the user_id, and then get the secret key from Keystone? Thanks, Pritha -------------- next part -------------- An HTML attachment was scrubbed... URL: From soumplis at admin.grnet.gr Fri Jan 25 08:19:44 2019 From: soumplis at admin.grnet.gr (Alexandros Soumplis) Date: Fri, 25 Jan 2019 08:19:44 +0000 Subject: Why COA exam is being retired? In-Reply-To: References: Message-ID: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> That's an interesting question, if anyone could elaborate. On 21/01/2019 14:24, Eduardo Gonzalez wrote: > Reading the info in the COA site [0] says the following: "The OpenStack > Foundation is winding down the administration of the COA exam". > Is there any reason for retiring the exam? I've tried to find a notice > in the mailing list but not found anything at all. > > [0] https://www.openstack.org/coa/ > > Regards -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
From yedhusastri at gmail.com Fri Jan 25 08:25:36 2019 From: yedhusastri at gmail.com (Yedhu Sastri) Date: Fri, 25 Jan 2019 09:25:36 +0100 Subject: Not getting full bandwidth VXLAN + DVR Message-ID: Hello, In our OpenStack environment (Newton) we are using a 10G network on all our nodes. We are using OVS bridging with VXLAN tunneling and DVR. We have enabled jumbo frames on the NICs and on the physical switches, and we have enabled VXLAN offloading on our NICs. irqbalance is running, which is supposed to distribute the network IRQs across all cores of the CPU. But unfortunately we are only getting below 1G bandwidth when communicating with our VMs via their floating IPs from the compute hosts. We tested it using iperf and the results are: Host to VM using floating IP - less than 1 Gbit/sec VM to VM using internal IP - ~2.5 Gbit/sec Any idea or solution to this issue is much appreciated.
-- With kind regards, Yedhu Sastri From smooney at redhat.com Fri Jan 25 11:11:12 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 11:11:12 +0000 Subject: Not getting full bandwidth VXLAN + DVR In-Reply-To: References: Message-ID: On Fri, 2019-01-25 at 10:14 +0100, Yedhu Sastri wrote: > Hello, > > In our OpenStack environment(Newton) we are using 10G network in all our nodes. We are using OVS bridging with VXLAN > tunneling and DVR. We also enabled Jumbo frames in NIC and also in physical switches. We also enabled VXLAN offloading > in our NIC. irqbalance is running which suppose to distribute the network irqs to all cores of the CPU. But > unfortunately we are only getting below 1G bandwidth when communicate with our VM's with floating IP's from compute > hosts. We tested it using iperf and results are like > > Host to VM using floating IP - less than 1Gbits/sec Sorry for the complexity of the diagram, but with DVR your networking will look something like this: https://docs.openstack.org/ocata/networking-guide/_images/deploy-ovs-ha-dvr-compconn1.png The diagram is actually incorrect in that there should not be a line between port interface 3 and interface 3, as that implies you add a physical NIC to the br-tun, which is not correct. When iperf connects via the internet, the ingress flow is described here: https://docs.openstack.org/ocata/networking-guide/deploy-ovs-ha-dvr.html but I will summarise below. Looking at the simplified diagram https://docs.openstack.org/ocata/networking-guide/_images/deploy-ovs-ha-dvr-flowns2.png a packet arrives on the datacenter WAN uplink and is switched to your WAN router. The WAN router, having connectivity to the subnet of your floating IP, generates an ARP request to discover the MAC of the floating IP.
The ARP request ingresses the compute node on interface 2, enters the br-provider bridge (usually called br-ex), and crosses the patch port to the br-int, where it exits via the fg port labeled 6 and enters the FIP namespace via the tap device created by OVS, labeled 7 in the diagram. This tap has the floating IP assigned, so it responds to the ARP, which, when received by your WAN router, triggers it to learn the dest MAC and route the TCP stream from your iperf client to that MAC. The iperf traffic takes the same path to the FIP namespace, where it is intercepted by an iptables DNAT rule which updates the destination IP to the private IP, and it is sent to the DVR namespace by a veth pair labeled 8 and 9 in the diagram. Once the packet is received in the DVR namespace, it is routed to OVS via interface 10, after doing a similar ARP request to learn the dest MAC for the private IP. The packet enters the br-int and, if you are using the iptables firewall driver, exits via another veth pair and enters the Linux bridge which the VM's tap device is connected to, and finally gets to the VM. Note: if you are using the conntrack or noop firewall driver, the VM tap is added to the br-int directly, so the qbr Linux bridge and the veth pair shown as 12 and 13 will not exist. Finally, the VM receives the iperf connection and the response packets are sent backward through the same path. I went through that flow for two reasons. First, in the north-south path the network encapsulation used for the tenant network (e.g. VXLAN) is irrelevant, as the packet is never VXLAN encapsulated. Second, there are several places where there could be bottlenecks. First, are you using 10G NICs for the br-ex/br-provider bridge? Second, is the local tunnel endpoint IP assigned to this bridge? The answer should be yes to both, and I will proceed as if the answer is yes. If you are not using a 10G NIC for the br-ex, then that is why you are seeing sub-1G speeds. Next, you mention you are using jumbo frames.
Assuming you are using a 9000-byte MTU, I would expect the MTU of the neutron VXLAN network to be 8950. In this case, looking at https://docs.openstack.org/ocata/networking-guide/_images/deploy-ovs-ha-dvr-flowns2.png again, you should check that the MTUs are set correctly at the following locations: interface 2 should be set to your physical network MTU, which I'm assuming is 9000 in this example; interfaces 15 (VM tap), 14 (qbr bridge), 13 (qvb veth interface) and 10 (qr port in the DVR namespace) should all have their MTU set to 8950; interfaces 9 (rfp), 8 (fpr) and 7 (fg) should be set to 9000. When you do your testing with iperf, you should be setting your MTU or packet size to 8950. If you use 9000, it will force the TCP packets to be segmented when routed from the rfp interface to the qr interface in the DVR namespace, which will require the VM to reassemble them later. This is the first bottleneck you will need to ensure is not present. When you are doing the VM-to-VM testing, it will use an MTU of 8950, as that is the MTU of the neutron network and is included in the DHCP reply. If you have validated that the MTUs are set correctly, the next step is to determine if packets are being dropped. To do this you need to check interface 16 (the VM interface in the VM), 15 (the VM tap on the host), 13/12 (the veth between OVS and the Linux bridge), 10 (the DVR interface), 9/8 (the veth between the FIP and DVR namespaces), 7 (the floating IP gateway port on OVS) and finally 2 (the uplink to the physical network). If you see packet loss on the VM on either port 16 or 15, you can try to enable multi-queue for the virtio interface; you do that by setting hw_vif_multiqueue_enabled=true in the image metadata and then enabling multiqueue in the guest with ethtool -L combined #num_of_queues. If the packet loss is observed on the veth between the Linux bridge and OVS (13/12), then you could change from the iptables firewall to the conntrack or noop firewall driver.
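The 8950 figure follows from the encapsulation overhead a VXLAN-over-IPv4 tunnel adds on top of the physical MTU: a 20-byte outer IP header, an 8-byte UDP header, an 8-byte VXLAN header, plus the 14-byte inner Ethernet header that must fit inside the outer frame. A quick sketch of that arithmetic (my reading of the overhead accounting, not neutron's code):

```python
# Per-packet overhead of VXLAN encapsulation over IPv4.
OUTER_IPV4 = 20   # outer IP header
OUTER_UDP = 8     # UDP header carrying the VXLAN payload
VXLAN_HDR = 8     # VXLAN header
INNER_ETH = 14    # inner Ethernet frame header

def vxlan_network_mtu(physical_mtu):
    """MTU a tenant VXLAN network can advertise for a given physical MTU."""
    return physical_mtu - (OUTER_IPV4 + OUTER_UDP + VXLAN_HDR + INNER_ETH)

print(vxlan_network_mtu(9000))  # 8950, matching the jumbo-frame case above
print(vxlan_network_mtu(1500))  # 1450, the familiar default-MTU case
```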
If the bottleneck is in the DVR router namespace between ports 10 and 9, and it's not caused by IP fragmentation, then you are hitting a kernel limitation and you will need to tune the kernel to improve routing performance. If the packet loss is between 8 and 7, you are hitting a Linux kernel DNAT bottleneck; again, some kernel options may be able to optimise this, but there is not much you can do. If the packet loss is in RX on interface 2, you need to ensure that receive-side scaling is enabled and the NIC is configured to use multiple queues (ethtool -L combined #num_of_queues). You should also ensure that offloads such as LRO are enabled, if available on your NIC. If none of the above help, then your only recourse is to evaluate other neutron networking solutions such as OVN, which implements DVR/FIP/NAT using OpenFlow rules in OVS. I hope this helps. Regards, Sean > > VM to VM using internal IP - ~2.5Gbits/sec > > Any idea or solutions to solve this issue is much appreciated. > > From cdent+os at anticdent.org Fri Jan 25 12:49:33 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 25 Jan 2019 12:49:33 +0000 (GMT) Subject: [placement] update 19-03 Message-ID: HTML: https://anticdent.org/placement-update-19-03.html Hello, here's a quick placement update. This will be just a brief summary rather than the usual tome. I'm in the midst of some other work. # Most Important Work to complete and review changes to deployment to support extracted placement is the main thing that matters. The next placement extraction status checkin will be 17.00 UTC, February 6th. # What's Changed * Changes to allow database status checks in `placement-status upgrade check` have either merged or will soon. These combine with online data migrations to ensure that the state of an upgraded installation has healthy consumers and resource providers. * libvirt vgpu reshaper [code](https://review.openstack.org/#/c/599208/) is ready for review and has an associated functional test.
When that stuff merges the main remaining extraction-related tasks are in the deployment tools. * [os-resource-classes](https://pypi.org/p/os-resource-classes) `0.2.0` was released, adding the `PCPU` class. # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 14. -1. * [In progress placement bugs](https://goo.gl/vzGGDQ) 16. Stable. # Main Themes ## Nested * * * ## Extraction * [etherpad](https://etherpad.openstack.org/p/placement-extract-stein-5) Deployment related changes: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) and [kolla upgrade](https://review.openstack.org/#/q/topic:upgrade-placement) [Delete placement from nova](https://review.openstack.org/618215). # Other Please refer to [last week](https://anticdent.org/placement-update-19-02.html) for lots of pending changes. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From jungleboyj at gmail.com Fri Jan 25 13:57:21 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 25 Jan 2019 07:57:21 -0600 Subject: Why COA exam is being retired? In-Reply-To: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> Message-ID: <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> Alexandros, I got this question from someone else the other day and reached out to the Foundation for answers. The response I got was that they had been providing this to support the community in the past but no longer felt it was necessary as other companies in the community had started their own offerings.  So, they were choosing to remove their COA exam given that there were now good alternatives available. Hope this information helps. Thanks! 
Jay (jungleboyj) On 1/25/2019 2:19 AM, Alexandros Soumplis wrote: > > That's an interesting question, if anyone could elaborate. > > > On 21/01/2019 14:24, Eduardo Gonzalez wrote: >> Reading the info in the COA site [0] says the following: "The >> OpenStack Foundation is winding down the administration of the COA >> exam". >> Is there any reason for retiring the exam? I've tried to find a >> notice in the mailing list but not found anything at all. >> >> [0] https://www.openstack.org/coa/ >> >> Regards > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Fri Jan 25 14:09:48 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Fri, 25 Jan 2019 09:09:48 -0500 Subject: Why COA exam is being retired? In-Reply-To: <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> Message-ID: On Fri, Jan 25, 2019, 8:58 AM Jay Bryant wrote: > Alexandros, > > I got this question from someone else the other day and reached out to the > Foundation for answers. > > The response I got was that they had been providing this to support the > community in the past but no longer felt it was necessary as other > companies in the community had started their own offerings. So, they were > choosing to remove their COA exam given that there were now good > alternatives available. > That's sad. I really appreciated having a non-vendory, unbiased, community-driven option. If a vendor folds or moves on from OpenStack, your certification becomes worthless. Presumably, so long as there is OpenStack, there will be the foundation at its core. I hope they might reconsider. -Erik > Hope this information helps. > > Thanks!
> > Jay > > (jungleboyj) > On 1/25/2019 2:19 AM, Alexandros Soumplis wrote: > > That's an interesting question if anyone could elaborate > > > On 21/01/2019 14:24, Eduardo Gonzalez wrote: > > Reading the info in the COA site [0] says the following* "The OpenStack > Foundation is winding down the administration of the COA exam".* > > Is there any reason for retiring the exam? I've tried to find a notice in > the mailing list but not found anything at all. > > [0] https://www.openstack.org/coa/ > > Regards > > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Fri Jan 25 14:54:16 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 25 Jan 2019 14:54:16 +0000 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Message-ID: <4461f5929edb4091d9d2bea084226c00a6a6631d.camel@redhat.com> I already thought he was one, heh. +1 from me. On Thu, 2019-01-24 at 16:17 -0600, Ben Nemec wrote: > Hi all, > > Zane is already core on oslo.service, but he's been doing good stuff > in > adjacent projects as well. We could keep playing whack-a-mole with > giving him +2 on more repos, but I trust his judgment so I'm > proposing > we just add him to the oslo-core group. > > If there are no objections in the next week I'll proceed with the > addition. > > Thanks. > > -Ben > From sfinucan at redhat.com Fri Jan 25 15:03:47 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 25 Jan 2019 15:03:47 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: On Thu, 2019-01-24 at 09:09 -0600, Matt Riedemann wrote: > The proposal from the spec for this feature was to add an image property > (hw_unique_serial), flavor extra spec (hw:unique_serial) and new > "unique" choice to the [libvirt]/sysinfo_serial config option. 
The image > property and extra spec would be booleans but really only True values > make sense and False would be more or less ignored. There were no plans > to enforce strict checking of a boolean value, e.g. if the image > property was True but the flavor extra spec was False, we would not > raise an exception for incompatible values, we would just use OR logic > and take the image property True value. > > The boolean usage proposed is a bit confusing, as can be seen from > comments in the spec [1] and the proposed code change [2]. > > After thinking about this a bit, I'm now thinking maybe we should just > use a single-value enum for the image property and flavor extra spec: > > image: hw_guest_serial=unique > flavor: hw:guest_serial=unique > > If either are set, then we use a unique serial number for the guest. If > neither are set, then the serial number is based on the host > configuration as it is today. > > I think that's more clear usage, do others agree? Alex does. I can't > think of any cases where users would want hw_unique_serial=False, so > this removes that ability and confusion over whether or not to enforce > mismatching booleans. Makes sense - we do that for 'hw_cpu_policy', for example (we have 'dedicated' and 'shared', but they're essentially booleans). However, the reason we added 'hw:cpu_policy=shared' (the flavor extra spec) was to provide a way for operators to prevent users requesting pinned CPUs via images metadata if they didn't want them in their cloud. At the risk of rehashing what's in the spec, if there aren't cases where users would want 'hw_unique_serial=False' then what is the upside of ever supporting non-unique serials going forward? Could we not just default to unique serials, avoiding these knobs?
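The OR semantics proposed for the two request sources — either the image property or the flavor extra spec asking for "unique" wins, with no strict-mismatch checking — can be sketched as follows. This is an illustrative sketch with a hypothetical helper name, not nova's actual implementation:

```python
# Hypothetical helper illustrating the proposed single-value enum:
# if either request source asks for "unique", the guest gets its own
# serial; otherwise the host-level [libvirt]/sysinfo_serial behaviour
# applies. Not nova's actual code.

def wants_unique_serial(image_props, extra_specs):
    """OR the two request sources, as proposed in the thread."""
    return (image_props.get("hw_guest_serial") == "unique"
            or extra_specs.get("hw:guest_serial") == "unique")


print(wants_unique_serial({"hw_guest_serial": "unique"}, {}))  # True
print(wants_unique_serial({}, {"hw:guest_serial": "unique"}))  # True
print(wants_unique_serial({}, {}))                             # False
```

A single-value enum like this sidesteps the question of what a True/False mismatch between image and flavor should mean, which is exactly the confusion the boolean form invited.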
Stephen > [1] https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 > [2] https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 From jaypipes at gmail.com Fri Jan 25 15:09:04 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 10:09:04 -0500 Subject: Why COA exam is being retired? In-Reply-To: References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> Message-ID: <5077d9dc-c4af-8736-0db3-2e05cbc1e992@gmail.com> On 01/25/2019 09:09 AM, Erik McCormick wrote: > On Fri, Jan 25, 2019, 8:58 AM Jay Bryant wrote: > > Alexandros, > > I got this question from someone else the other day and reached out > to the Foundation for answers. > > The response I got was that they had been providing this to support > the community in the past but no longer felt it was necessary as > other companies in the community had started their own offerings. > So, they were choosing to remove their COA exam given that there > were now good alternatives available. > > That's sad. I really appreciated having a non-vendory, unbiased, > community-driven option. +10 > If a vendor folds or moves on from OpenStack, > your certification becomes worthless. Presumably, so long as there is > OpenStack, there will be the foundation at its core. I hope they might > reconsider.
+100 -jay From jaypipes at gmail.com Fri Jan 25 15:18:36 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 10:18:36 -0500 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: <57165ee7-e38c-c316-3dca-899a381a1ed1@gmail.com> On 01/24/2019 09:20 PM, melanie witt wrote: > On Thu, 24 Jan 2019 09:09:07 -0600, Matt Riedemann > wrote: >> The proposal from the spec for this feature was to add an image property >> (hw_unique_serial), flavor extra spec (hw:unique_serial) and new >> "unique" choice to the [libvirt]/sysinfo_serial config option. The image >> property and extra spec would be booleans but really only True values >> make sense and False would be more or less ignored. There were no plans >> to enforce strict checking of a boolean value, e.g. if the image >> property was True but the flavor extra spec was False, we would not >> raise an exception for incompatible values, we would just use OR logic >> and take the image property True value. >> >> The boolean usage proposed is a bit confusing, as can be seen from >> comments in the spec [1] and the proposed code change [2]. >> >> After thinking about this a bit, I'm now thinking maybe we should just >> use a single-value enum for the image property and flavor extra spec: >> >> image: hw_guest_serial=unique >> flavor: hw:guest_serial=unique >> >> If either are set, then we use a unique serial number for the guest. If >> neither are set, then the serial number is based on the host >> configuration as it is today. >> >> I think that's more clear usage, do others agree? Alex does. I can't >> think of any cases where users would want hw_unique_serial=False, so >> this removes that ability and confusion over whether or not to enforce >> mismatching booleans. > > I think use of the enum makes sense and it happens to make it easier to > reason about in conjunction with the [libvirt]/sysinfo_serial config > option, at least for me. 
+1 Though as I've noted on the spec, I have no idea why we even need yet another extra-spec/image metadata item (particularly a libvirt-specific one leaking out of the API). Why can't we just always set the libvirt sysinfo_serial number to the instance's UUID and be done with it? Best, -jay From jon at csail.mit.edu Fri Jan 25 15:27:13 2019 From: jon at csail.mit.edu (Jonathan Proulx) Date: Fri, 25 Jan 2019 10:27:13 -0500 Subject: Why COA exam is being retired? In-Reply-To: <5077d9dc-c4af-8736-0db3-2e05cbc1e992@gmail.com> References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> <5077d9dc-c4af-8736-0db3-2e05cbc1e992@gmail.com> Message-ID: <20190125152713.dxbxgkzoevzw35f2@csail.mit.edu> On Fri, Jan 25, 2019 at 10:09:04AM -0500, Jay Pipes wrote: :On 01/25/2019 09:09 AM, Erik McCormick wrote: :> On Fri, Jan 25, 2019, 8:58 AM Jay Bryant That's sad. I really appreciated having a non-vendory, unbiased, :> community-driven option. : :+10 : :> If a vendor folds or moves on from OpenStack, your certification :> becomes worthless. Presumably, so long as there is OpenStack, there :> will be the foundation at its core. I hope they might reconsider. : :+100 So to clarify: is the COA certification going away, or is the Foundation just no longer administering the exam? It would be a shame to lose a standard unbiased certification, but if this is a transition away from directly providing the training and only providing the exam specification that may be reasonable. -Jon From kgiusti at gmail.com Fri Jan 25 15:36:27 2019 From: kgiusti at gmail.com (Ken Giusti) Date: Fri, 25 Jan 2019 10:36:27 -0500 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Message-ID: Heck yes - no objections here.
On 1/24/19, Ben Nemec wrote: > Hi all, > > Zane is already core on oslo.service, but he's been doing good stuff in > adjacent projects as well. We could keep playing whack-a-mole with > giving him +2 on more repos, but I trust his judgment so I'm proposing > we just add him to the oslo-core group. > > If there are no objections in the next week I'll proceed with the addition. > > Thanks. > > -Ben > > -- Ken Giusti (kgiusti at gmail.com) From jungleboyj at gmail.com Fri Jan 25 15:39:53 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 25 Jan 2019 09:39:53 -0600 Subject: Why COA exam is being retired? In-Reply-To: <20190125152713.dxbxgkzoevzw35f2@csail.mit.edu> References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> <5077d9dc-c4af-8736-0db3-2e05cbc1e992@gmail.com> <20190125152713.dxbxgkzoevzw35f2@csail.mit.edu> Message-ID: On 1/25/2019 9:27 AM, Jonathan Proulx wrote: > On Fri, Jan 25, 2019 at 10:09:04AM -0500, Jay Pipes wrote: > :On 01/25/2019 09:09 AM, Erik McCormick wrote: > :> On Fri, Jan 25, 2019, 8:58 AM Jay Bryant > :> That's sad. I really appreciated having a non-vendory, unbiased, > :> community-driven option. > : > :+10 > : > :> If a vendor folds or moves on from OpenStack, your certification > :> becomes worthless. Presumably, so long as there is OpenStack, there > :> will be the foundation at its core. I hope they might reconsider. > : > :+100 > > So to clarify: is the COA certification going away, or is the Foundation > just no longer administering the exam? > > It would be a shame to lose a standard unbiased certification, but if > this is a transition away from directly providing the training and > only providing the exam specification that may be reasonable. > > -Jon When Allison e-mailed me last week they said they were having meetings to figure out how to go forward with the COA.
The foundation's partners were going to be offering the exam through September and they were working on communicating the status of things to the community. So, probably best to not jump to conclusions and wait for the official word from the community. - Jay From mriedemos at gmail.com Fri Jan 25 16:09:11 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 25 Jan 2019 10:09:11 -0600 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> On 1/25/2019 9:03 AM, Stephen Finucane wrote: > At the > risk of rehashing what's in the spec, if there aren't cases where users > would want 'hw_unique_serial=False' then what is the upside of ever > supporting non-unique serials going forward? Could we not just default > to unique serials, avoiding these knobs? The answer to this, and Jay's same question elsewhere in this thread, is likely going to need to come from danpb who I think originally added the serial stuff in the libvirt driver. I could not find discussions about why the serial was per-host rather than per-guest in the original code changes. So besides "this is the way it's always been" and I'm risk averse to backward incompatible changes, I don't have a good answer. -- Thanks, Matt From jaypipes at gmail.com Fri Jan 25 16:11:59 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 11:11:59 -0500 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: <3203f03d-4107-1169-9333-9502bb518024@gmail.com> On 01/25/2019 10:03 AM, Stephen Finucane wrote: > On Thu, 2019-01-24 at 09:09 -0600, Matt Riedemann wrote: >> The proposal from the spec for this feature was to add an image property >> (hw_unique_serial), flavor extra spec (hw:unique_serial) and new >> "unique" choice to the [libvirt]/sysinfo_serial config option.
The image >> property and extra spec would be booleans but really only True values >> make sense and False would be more or less ignored. There were no plans >> to enforce strict checking of a boolean value, e.g. if the image >> property was True but the flavor extra spec was False, we would not >> raise an exception for incompatible values, we would just use OR logic >> and take the image property True value. >> >> The boolean usage proposed is a bit confusing, as can be seen from >> comments in the spec [1] and the proposed code change [2]. >> >> After thinking about this a bit, I'm now thinking maybe we should just >> use a single-value enum for the image property and flavor extra spec: >> >> image: hw_guest_serial=unique >> flavor: hw:guest_serial=unique >> >> If either are set, then we use a unique serial number for the guest. If >> neither are set, then the serial number is based on the host >> configuration as it is today. >> >> I think that's more clear usage, do others agree? Alex does. I can't >> think of any cases where users would want hw_unique_serial=False, so >> this removes that ability and confusion over whether or not to enforce >> mismatching booleans. > > Makes sense - we do that for 'hw_cpu_policy', for example (we have > 'dedicated' and 'shared', but they're essentially booleans). However, > the reason we added 'hw:cpu_policy=shared' (the flavor extra spec) was > to provide a way for operators to prevent users requesting pinned CPUs > via images metadata if they didn't want them in their cloud. At the > risk of rehashing what's in the spec, if there aren't cases where users > would want 'hw_unique_serial=False' then what is the upside of ever > supporting non-unique serials going forward? Could we not just default > to unique serials, avoiding these knobs? Right, which is precisely what I commented on the spec. :) I see no valid reason to not just always set the sysinfo_serial to the instance UUID and be done with it.
-jay From jaypipes at gmail.com Fri Jan 25 16:13:59 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 11:13:59 -0500 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Message-ID: +1 On 01/25/2019 10:36 AM, Ken Giusti wrote: > Heck yes - no objections here. > > > On 1/24/19, Ben Nemec wrote: >> Hi all, >> >> Zane is already core on oslo.service, but he's been doing good stuff in >> adjacent projects as well. We could keep playing whack-a-mole with >> giving him +2 on more repos, but I trust his judgment so I'm proposing >> we just add him to the oslo-core group. >> >> If there are no objections in the next week I'll proceed with the addition. >> >> Thanks. >> >> -Ben >> >> > > From juliaashleykreger at gmail.com Fri Jan 25 16:18:55 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 25 Jan 2019 08:18:55 -0800 Subject: Ironic ibmc driver for Huawei server In-Reply-To: References: Message-ID: Greetings, If your CI can vote on ci-sandbox, you should be able to vote on ironic. I don't remember having to grant permissions previously. If not, please let me know and I'll contact the Infrastructure team. -Julia On Thu, Jan 24, 2019 at 7:50 PM xmufive at qq.com wrote: > Hi Julia, > > It seems the basic features of our 3rd party CI are ready. It can now > verify +1 on openstack-dev/ci-sandbox[1]. > I am writing to ask for permission for > huawei-ironic-ci to review the ironic project.
> > The account used for huawei-ironic-ci is: > > Username > huawei-ironic-ci > Full Name Huawei Ironic CI > Email Address zhengxianhui at huawei.com > Registered Jan 14, 2019 8:00 PM > Account ID 29800 > > [1] https://review.openstack.org/#/c/630893/ > > > Thanks, > > ------------------ Original Message ------------------ > *From:* "Julia Kreger"; > *Sent:* Wednesday, Jan 9, 2019 10:47 PM > *To:* "xmufive at qq.com"; > *Cc:* "openstack-discuss"; > *Subject:* Re: Ironic ibmc driver for Huawei server > > Ironic does not have a deadline for merging specs. We will generally > avoid landing large features the closer we get to the end of the > cycle. If third party CI is up before the end of the cycle, I suspect > it would just be a matter of iterating the driver code through review. > You may wish to propose it sooner rather than later, and we can begin > to give you feedback from there. > > -Julia > > On Tue, Jan 8, 2019 at 11:21 PM xmufive at qq.com > wrote: > > > > Hi Julia, > > > > When is the deadline for approving specs? I am afraid that the huawei ibmc > spec will be put off until next release. > > > > Thanks > > Qianbiao NG > > > > > > ------------------ Original Message ------------------ > > From: "Julia Kreger"; > > Sent: Wednesday, Jan 9, 2019 2:26 AM > > To: "xmufive at qq.com"; > > Cc: "openstack-discuss"; > > Subject: Re: Ironic ibmc driver for Huawei server > > > > Greetings Qianbiao.NG, > > > > Welcome to Ironic! > > > > The purpose and requirement of Third Party CI is to test drivers are > > in working order with the current state of the code in Ironic and help > > prevent the community from accidentally breaking an in-tree vendor > > driver. Vendors do this by providing one or more physical systems in a > > pool of hardware that is managed by a Zuul v3 or Jenkins installation > > which installs ironic (typically in a virtual machine), and configures > > it to perform a deployment upon the physical bare metal node.
Upon > > failure or successful completion of the test, the results are posted > > back to OpenStack Gerrit. > > > > Ultimately this helps provide the community and the vendor with a > > level of assurance in what is released by the ironic community. The > > cinder project has a similar policy and I'll email you directly with > > the contacts at Huawei that work with the Cinder community, as they > > would be familiar with many of the aspects of operating third party > > CI. > > > > You can find additional information here on the requirement and the > > reasoning behind it: > > > > > https://specs.openstack.org/openstack/ironic-specs/specs/approved/third-party-ci.html > > > > We may also be able to put you in touch with some vendors that have > > recently worked on implementing third-party CI. I'm presently > > inquiring with others if that will be possible. If you are able to > > join Internet Relay Chat, our IRC channel (#openstack-ironic) has > > several individuals who have experience setting up and maintaining > > third-party CI for ironic. > > > > Thanks, > > > > -Julia > > > > On Tue, Jan 8, 2019 at 8:54 AM xmufive at qq.com > wrote: > > > > > > Hi julia, > > > > > > According to the comment of story< https://storyboard.openstack.org/#!/story/2004635 >, > > > 1. The spec for the huawei ibmc driver has been posted here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. > > > 2. About the third-party CI part, we provide mocked unittests for our > driver's code. Not sure what third-party CI works for in this case. What > else should we do? > > > > > > Thanks > > > Qianbiao.NG > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaypipes at gmail.com Fri Jan 25 16:27:11 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 11:27:11 -0500 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> Message-ID: <4c90c8e2-ff0f-9bf7-6dfe-1164871295ee@gmail.com> On 01/24/2019 08:27 PM, Ghanshyam Mann wrote: > If we see it from the user perspective, a user can have an Active VM with > an active port which can flip to down in the middle of that port's usage. This seems like a bug to me. Agreed, Ghanshyam. -jay From sfinucan at redhat.com Fri Jan 25 16:35:44 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 25 Jan 2019 16:35:44 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> Message-ID: <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> On Fri, 2019-01-25 at 10:09 -0600, Matt Riedemann wrote: > On 1/25/2019 9:03 AM, Stephen Finucane wrote: > > At the > > risk of rehashing what's in the spec, if there aren't cases where users > > would want 'hw_unique_serial=False' then what is the upside of ever > > supporting non-unique serials going forward? Could we not just default > > to unique serials, avoiding these knobs? > > The answer to this, and Jay's same question elsewhere in this thread, is > likely going to need to come from danpb who I think originally added the > serial stuff in the libvirt driver. I could not find discussions about > why the serial was per-host rather than per-guest in the original code > changes. So besides "this is the way it's always been" and I'm risk > averse to backward incompatible changes, I don't have a good answer. Chatted with Dan about this. In summary, this value is up to apps (i.e. nova) to populate.
That said, as noted in the spec, nova currently populates this using the value of the host OS' '/etc/machine-id' file. It is possible that operators/users are using this to determine if two guests are co-located. He noted that there would be a valid point in claiming the host OS identity should have been reported in 'chassis.serial' instead of 'system.serial' in the first place [1] but changing it now is definitely not zero risk. My personal take on that is that we can avoid the configurable option and it might be good to start reporting 'chassis.serial' in case anyone was doing the above (assuming we care about that possible use case). We'd just need to make sure the change in behaviour was fully documented by way of an 'upgrade' reno. That's just my take though. Stephen [1] https://libvirt.org/formatdomain.html#elementsSysinfo From colleen at gazlene.net Fri Jan 25 16:42:50 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 25 Jan 2019 17:42:50 +0100 Subject: [dev][keystone] Keystone Team Update - Week of 21 January 2019 Message-ID: <1548434570.1789898.1643516392.69EBB8C5@webmail.messagingengine.com> # Keystone Team Update - Week of 21 January 2019 ## News ### Technical Vision Statement From time to time we talk about writing a mission statement for keystone, and then the idea always loses steam due to unfocused motivation. Luckily the TC has now given us some excellent starting points[1] and has requested we publish a similar team statement and/or update the overall guiding document[2]. We started taking notes on how keystone measures up with the overall vision[3] and whoever finds time first will write up an addition to the keystone contributor guide, where we can further discuss on the review. 
[1] https://governance.openstack.org/tc/reference/technical-vision.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001417.html [3] https://etherpad.openstack.org/p/keystone-technical-vision-notes ### External and Tokenless Auth (What's old is new again) In working with the Edge computing group and the engineers at Oath we've been revisiting external authentication with X.509 as well as tokenless authentication[4][5], both ancient features in keystone. Though they are under-tested and have suffered mild bitrot they are still useful features. External authentication with mod_ssl can potentially be used as a drop-in replacement for the custom authentication plugin that Oath currently uses for their IdP Athenz. It does not on its own solve the Edge use case in which a client may be unable to connect to the keystone server for long periods of time, but the ideas can be used as a starting point for proper offline authentication. Tokenless authentication is closely tied to X.509 authentication and was a useful idea for starting to reduce the security impact of bearer tokens, but it was never fully implemented. We have been discussing revamping that feature and will be working on cleaning up the bitrot and the documentation around these features. [4] http://eavesdrop.openstack.org/meetings/keystone/2019/keystone.2019-01-22-16.00.log.html#l-227 [5] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-01-22.log.html#t2019-01-22T17:09:52 ## Open Specs Stein specs: https://bit.ly/2Pi6dGj Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 16 changes this week. ## Changes that need Attention Search query: https://bit.ly/2RLApdA There are 80 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 4 new bugs and closed 2. 
Bug #1813057 (keystone:Medium) opened by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1813057 Bug #1813085 (keystone:Undecided) opened by Tim Buckley https://bugs.launchpad.net/keystone/+bug/1813085 Bug #1813183 (keystone:Undecided) opened by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1813183 Bug #1813265 (keystone:Undecided) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1813265 Bugs fixed (2) Bug #1794864 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1794864 Bug #1804522 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1804522 ## Milestone Outlook https://releases.openstack.org/stein/schedule.html Feature proposal freeze happens next week, if you are working on a feature for Stein please have at least a WIP on Gerrit ASAP. The feature freeze is five weeks after that, so major features that are not already fairly far along may have to be pushed to Train. ## Shout-outs Thanks Guang for being super helpful with all of our X.509 and tokenless auth questions! 
## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter Dashboard generated using gerrit-dash-creator and https://gist.github.com/lbragstad/9b0477289177743d1ebfc276d1697b67 From smooney at redhat.com Fri Jan 25 16:47:46 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 16:47:46 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> Message-ID: <58b21f9476536a5e9f1fad6596e9882aa6ad7e1d.camel@redhat.com> On Fri, 2019-01-25 at 16:35 +0000, Stephen Finucane wrote: > On Fri, 2019-01-25 at 10:09 -0600, Matt Riedemann wrote: > > On 1/25/2019 9:03 AM, Stephen Finucane wrote: > > > At the > > > risk of rehasing what's in the spec, if there aren't cases where users > > > would want 'hw_unique_serial=False' then what is the upside of ever > > > supporting non-unique serials going forward? Could we not just default > > > to unique serials, avoiding these knobs? > > > > The answer to this, and Jay's same question elsewhere in this thread, is > > likely going to need to come from danpb who I think originally added the > > serial stuff in the libvirt driver. I could not find discussions about > > why the serial was per-host rather than per-guest in the original code > > changes. So besides "this is the way it's always been" and I'm risk > > averse to backward incompatible changes, I don't have a good answer. > > Chatted with Dan about this. In summary, this value is up to apps (i.e. > nova) to populate. That said, as noted in the spec, nova currently > populates this using the value of the host OS' '/etc/machine-id' file. > It is possible that operators/users are using this to determine if two > guests are co-located. 
You can get the hashed host-id from the instance metadata to determine if two instances are on the same host from a tenant perspective in a hypervisor-independent way, so I think that use case would still be supported. > He noted that there would be a valid point in > claiming the host OS identity should have been reported in > 'chassis.serial' instead of 'system.serial' in the first place [1] but > changing it now is definitely not zero risk. > > My personal take on that is that we can avoid the configurable option > and it might be good to start reporting 'chassis.serial' in case anyone > was doing the above (assuming we care about that possible use case). > We'd just need to make sure the change in behaviour was fully > documented by way of an 'upgrade' reno. That's just my take though. > > Stephen > > [1] https://libvirt.org/formatdomain.html#elementsSysinfo > > From sfinucan at redhat.com Fri Jan 25 16:55:14 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 25 Jan 2019 16:55:14 +0000 Subject: [dev] Upcoming git-review changes Message-ID: <911e155d0d84beeb1e7e9bc0b2e10266895ee9ed.camel@redhat.com> Happy Friday, Just a quick heads up that the upcoming release of git-review, likely 2.0, removes a feature. Currently, git-review will detect the presence of "bug", "lp", "blueprint" or "bp" in commit messages and, if found, set the Gerrit topic for that submitted review to something based on the blueprint/bug. As noted in the commit that removed this [1], this hadn't been updated to support StoryBoard and there was no desire to do so given the OpenStack-specific nature of this feature along with the fact that it often did the wrong thing. We (being the people who authored and reviewed the patch) think this is definitely a good thing to do, but if anyone has genuine concerns about the removal of this functionality, be sure to raise them here or on #openstack-infra before we push that release out next week. 
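For readers unfamiliar with the feature being removed, the commit-message scanning worked along these lines. This is an illustrative reconstruction — the regexes and topic format here are assumptions for the sketch, not git-review's actual code:

```python
# Illustrative sketch (not git-review's real implementation): scan a
# commit message for "bug"/"lp" or "blueprint"/"bp" references and
# derive a Gerrit topic from the first hit.
import re

TOPIC_RE = re.compile(
    r"\b(?:bug|lp)[\s#:]*(\d+)|\b(?:blueprint|bp)[\s#:]*([a-z0-9._-]+)",
    re.IGNORECASE,
)


def guess_topic(commit_message):
    """Return a topic like 'bug/NNN' or 'bp/slug', or None if no match."""
    m = TOPIC_RE.search(commit_message)
    if not m:
        return None
    if m.group(1):                       # bug/lp number
        return "bug/%s" % m.group(1)
    return "bp/%s" % m.group(2)          # blueprint slug


print(guess_topic("Fix the frobnicator\n\nCloses-Bug: #1813057"))
# bug/1813057
print(guess_topic("Implement foo\n\nblueprint per-instance-serial"))
# bp/per-instance-serial
```

It is easy to see how this kind of matching "often did the wrong thing": a bug number mentioned only in passing ("see bug 123 for background") would silently set the topic.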
Cheers, Stephen [1] https://github.com/openstack-infra/git-review/commit/03768832c4 From smooney at redhat.com Fri Jan 25 16:58:48 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 16:58:48 +0000 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: <4c90c8e2-ff0f-9bf7-6dfe-1164871295ee@gmail.com> References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> <4c90c8e2-ff0f-9bf7-6dfe-1164871295ee@gmail.com> Message-ID: <154c1b218be5778032b0893f92b0db1e25d02645.camel@redhat.com> On Fri, 2019-01-25 at 11:27 -0500, Jay Pipes wrote: > On 01/24/2019 08:27 PM, Ghanshyam Mann wrote: > > If we see it from the user perspective, a user can have an Active VM with > > an active port which can flip to down in the middle of that port's usage. This seems like a bug to me. > > Agreed, Ghanshyam. So, as this bug states https://bugs.launchpad.net/neutron/+bug/1672629 , if admin-state-up is False then the nova port status should be down even if the VM is active. It may also be true that if the Data-plane-status extension is used https://specs.openstack.org/openstack/neutron-specs/specs/pike/port-data-plane-status.html the port status might change to down if the data-plane status is marked as down, but I'm not sure about that. They are meant to be independent, but it's a little confusing. > > -jay > From twilson at redhat.com Fri Jan 25 17:04:12 2019 From: twilson at redhat.com (Terry Wilson) Date: Fri, 25 Jan 2019 11:04:12 -0600 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> Message-ID: On Thu, Jan 24, 2019 at 7:34 PM Ghanshyam Mann wrote: > As Sean also pointed that in patch that we should go for the approach of > "making sure all attached interfaces to the server are active and the server is sshable > before the server can be used in test" [1].
This is something we agreed > in Denver PTG for afazekas proposal[2]. > > If we see it from the user perspective, a user can have an Active VM with > an active port which can flip to down in the middle of that port's usage. This seems like a bug to me. To me, this ignores real-world situations where a port status *can* change w/o user interaction. It seems weird to ignore a status change if it is detected. In the case that we hit, it was a change to os-vif where it was recreating a port. But it could just as easily be some vendor-specific "that port just died" kind of thing. Why not update the status of the port if you know it has changed? Also, the patch itself (outside the ironic case) just adds a window for the status to bounce. From smooney at redhat.com Fri Jan 25 17:09:06 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 17:09:06 +0000 Subject: [dev] Upcoming git-review changes In-Reply-To: <911e155d0d84beeb1e7e9bc0b2e10266895ee9ed.camel@redhat.com> References: <911e155d0d84beeb1e7e9bc0b2e10266895ee9ed.camel@redhat.com> Message-ID: <63924e09c22eb64f826b989a24535781fd60537e.camel@redhat.com> On Fri, 2019-01-25 at 16:55 +0000, Stephen Finucane wrote: > Happy Friday, > > Just a quick heads up that the upcoming release of git-review, likely > 2.0, removes a feature. Currently, git-review will detect the presence > of "bug", "lp", "blueprint" or "bp" in commit messages and, if found, > set the Gerrit topic for that submitted review to something based on > the blueprint/bug. > > As noted in the commit that removed this [1], this hadn't been updated > to support StoryBoard and there was no desire to do so given the > OpenStack-specific nature of this feature along with the fact that it > often did the wrong thing. 
> > We (being the people who authored and reviewed the patch) think this is > definitely a good thing to do, but if anyone has genuine concerns about > the removal of this functionality, be sure to raise them here or on > #openstack-infra before we push that release out next week. Does it still automatically set the topic to the current git branch name? I never really use the feature that is being removed, but I always use the reverse, e.g. create a branch with the correct name and rely on git-review setting the Gerrit topic to match. > > Cheers, > Stephen > > [1] https://github.com/openstack-infra/git-review/commit/03768832c4 > > From mark at stackhpc.com Fri Jan 25 17:23:05 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 25 Jan 2019 17:23:05 +0000 Subject: [kayobe] Proposing Pierre Riteau for core Message-ID: Hi, I'd like to propose Pierre Riteau (priteau) for core. He has contributed a number of good patches and provided some thoughtful and useful reviews. Cores, please respond +1 or -1. Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaypipes at gmail.com Fri Jan 25 17:26:44 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 25 Jan 2019 12:26:44 -0500 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> Message-ID: <92c4b8e2-80b6-3964-f98b-f6a363bc8cd9@gmail.com> On 01/25/2019 12:04 PM, Terry Wilson wrote: > On Thu, Jan 24, 2019 at 7:34 PM Ghanshyam Mann wrote: > >> As Sean also pointed out in the patch, we should go for the approach of >> "making sure all attached interfaces to the server are active and the server is sshable >> before the server can be used in a test" [1]. This is something we agreed on >> in the Denver PTG for afazekas' proposal[2]. >> >> If we see this from the user perspective, a user can have an active VM with >> an active port which can flip to down in the middle of that port's usage.
This seems bug to me. > > To me, this ignores real-world situations where a port status *can* > change w/o user interaction. How is this ignoring that scenario? > It seems weird to ignore a status change > if it is detected. In the case that we hit, it was a change to os-vif > where it was recreating a port. Which was a bug, right? > But it could just as easily be some vendor-specific "that port just > died" kind of thing. In which case, the test waiting for SSH to be available would timeout because connectivity would be broken anyway, no? > Why not update the status of the port if you > know it has changed? Sorry, I don't see where anyone is suggesting not changing the status of the port if some non-bug real scenario changes the status of the port? > Also, the patch itself (outside the ironic case) just adds a window > for the status to bounce. Unless I'm mistaken, the patch is simply changing the condition that the tempest test uses to identify broken VM connectivity. It will use the SSH connectivity test instead of looking at the port status test. The SSH test was determined to be a more stable test of VM network connectivity than relying on the Neutron port status indicator which can be a little flaky. Or am I missing something? -jay From sfinucan at redhat.com Fri Jan 25 17:31:40 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 25 Jan 2019 17:31:40 +0000 Subject: [dev] Upcoming git-review changes In-Reply-To: <63924e09c22eb64f826b989a24535781fd60537e.camel@redhat.com> References: <911e155d0d84beeb1e7e9bc0b2e10266895ee9ed.camel@redhat.com> <63924e09c22eb64f826b989a24535781fd60537e.camel@redhat.com> Message-ID: On Fri, 2019-01-25 at 17:09 +0000, Sean Mooney wrote: > On Fri, 2019-01-25 at 16:55 +0000, Stephen Finucane wrote: > > Happy Friday, > > > > Just a quick heads up that the upcoming release of git-review, likely > > 2.0, removes a feature. 
Currently, git-review will detect the presence > > of "bug", "lp", "blueprint" or "bp" in commit messages and, if found, > > set the Gerrit topic for that submitted review to something based on > > the blueprint/bug. > > > > As noted in the commit that removed this [1], this hadn't been updated > > to support StoryBoard and there was no desire to do so given the > > OpenStack-specific nature of this feature along with the fact that it > > often did the wrong thing. > > > > We (being the people who authored and reviewed the patch) think this is > > definitely a good thing to do, but if anyone has genuine concerns about > > the removal of this functionality, be sure to raise them here or on > > #openstack-infra before we push that release out next week. > does it strill automatically set the topic to the current git branch name? > i never really use the feature that is being remvoed but i always use the reverse. > e.g. create a branch with the correct name and rely on git review setting the gerrit topic to match. No, this feature continues to work as is, and as always can be disabled with the '-T'/'--no-topic' flags. Stephen > > Cheers, > > Stephen > > > > [1] https://github.com/openstack-infra/git-review/commit/03768832c4 > > > > From chris.friesen at windriver.com Fri Jan 25 18:00:11 2019 From: chris.friesen at windriver.com (Chris Friesen) Date: Fri, 25 Jan 2019 12:00:11 -0600 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> Message-ID: On 1/25/2019 10:35 AM, Stephen Finucane wrote: > My personal take on that is that we can avoid the configurable option > and it might be good to start reporting 'chassis.serial' in case anyone > was doing the above (assuming we care about that possible use case). 
> We'd just need to make sure the change in behaviour was fully > documented by way of an 'upgrade' reno. That's just my take though. I'm on this page as well. No need for flags, just do it. Chris From smooney at redhat.com Fri Jan 25 18:17:44 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 18:17:44 +0000 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: <92c4b8e2-80b6-3964-f98b-f6a363bc8cd9@gmail.com> References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> <92c4b8e2-80b6-3964-f98b-f6a363bc8cd9@gmail.com> Message-ID: On Fri, 2019-01-25 at 12:26 -0500, Jay Pipes wrote: > On 01/25/2019 12:04 PM, Terry Wilson wrote: > > On Thu, Jan 24, 2019 at 7:34 PM Ghanshyam Mann wrote: > > > > > As Sean also pointed out in the patch, we should go for the approach of > > > "making sure all attached interfaces to the server are active and the server is sshable > > > before the server can be used in a test" [1]. This is something we agreed on > > > in the Denver PTG for afazekas' proposal[2]. > > > > > > If we see this from the user perspective, a user can have an active VM with > > > an active port which can flip to down in the middle of that port's usage. This seems like a bug to me. > > > > To me, this ignores real-world situations where a port status *can* > > change w/o user interaction. > > How is this ignoring that scenario? The only case I know of for certain would be if the admin state is down, which should not prevent the VM from booting, but neutron should not allow network connectivity in this case. > > > It seems weird to ignore a status change > > if it is detected. In the case that we hit, it was a change to os-vif > > where it was recreating a port. > > Which was a bug, right? Yes, kind of. We could have fixed it by merging the nova change I had or reverting the os-vif change. I reverted the os-vif change, as the nova change was hitting a different bug in neutron. But only one entity.
os-vif or the hypervisor should have been creating the port on OVS, so it was a bug when both were. > > > But it could just as easily be some vendor-specific "that port just > > died" kind of thing. > > In which case, the test waiting for SSH to be available would timeout > because connectivity would be broken anyway, no? If it did not recover, yes, it would. > > > > Why not update the status of the port if you > > know it has changed? > > Sorry, I don't see where anyone is suggesting not changing the status of > the port if some non-bug real scenario changes the status of the port? > > > Also, the patch itself (outside the ironic case) just adds a window > > for the status to bounce. > > Unless I'm mistaken, the patch is simply changing the condition that the > tempest test uses to identify broken VM connectivity. It will use the > SSH connectivity test instead of looking at the port status test. > > The SSH test was determined to be a more stable test of VM network > connectivity than relying on the Neutron port status indicator which can > be a little flaky. SSH is more reliable for hotplug, as we need to wait for the guest OS to process the hotplug event; waiting for the VM to be pingable or sshable is more reliable in that specific case. The port status being active simply means that the port is currently configured by neutron. That gives you no knowledge of whether the guest has processed the hotplug event. In general I'm not sure if SSH connectivity would be more reliable, but if that is what the test requires to work, it's better to explicitly validate it than to use the port status as a proxy. > > Or am I missing something? It's a valid question. I think port status and VM connectivity are two different things. If you are writing an API test, then port status should be sufficient. If you need to connect to the VM in any way, it becomes a scenario test, in which case waiting for sshable or pingable might be more suitable. Not sure if I answered your question, however.
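The "wait until the server is sshable" approach discussed in this thread can be sketched as a simple polling loop. This is an illustrative sketch, not tempest's actual validation code; the function name and timeout values are made up:

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=300, interval=5):
    """Poll until a TCP connection to the SSH port succeeds.

    A real scenario test would go further and authenticate/run a
    command; this sketch only waits for the port to accept
    connections, which already filters out ports that neutron
    reports as ACTIVE but that the guest is not yet serving.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # not reachable yet (refused, unreachable, or timed out)
            time.sleep(interval)
    return False
```

A test would call this after the server goes ACTIVE and only then proceed with attach/detach operations, rather than trusting the port status alone.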
> > -jay > > From smooney at redhat.com Fri Jan 25 18:19:08 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 25 Jan 2019 18:19:08 +0000 Subject: [dev] Upcoming git-review changes In-Reply-To: References: <911e155d0d84beeb1e7e9bc0b2e10266895ee9ed.camel@redhat.com> <63924e09c22eb64f826b989a24535781fd60537e.camel@redhat.com> Message-ID: On Fri, 2019-01-25 at 17:31 +0000, Stephen Finucane wrote: > On Fri, 2019-01-25 at 17:09 +0000, Sean Mooney wrote: > > On Fri, 2019-01-25 at 16:55 +0000, Stephen Finucane wrote: > > > Happy Friday, > > > > > > Just a quick heads up that the upcoming release of git-review, likely > > > 2.0, removes a feature. Currently, git-review will detect the presence > > > of "bug", "lp", "blueprint" or "bp" in commit messages and, if found, > > > set the Gerrit topic for that submitted review to something based on > > > the blueprint/bug. > > > > > > As noted in the commit that removed this [1], this hadn't been updated > > > to support StoryBoard and there was no desire to do so given the > > > OpenStack-specific nature of this feature along with the fact that it > > > often did the wrong thing. > > > > > > We (being the people who authored and reviewed the patch) think this is > > > definitely a good thing to do, but if anyone has genuine concerns about > > > the removal of this functionality, be sure to raise them here or on > > > #openstack-infra before we push that release out next week. > > > > does it strill automatically set the topic to the current git branch name? > > i never really use the feature that is being remvoed but i always use the reverse. > > e.g. create a branch with the correct name and rely on git review setting the gerrit topic to match. > > No, this feature continues to work as is, and as always can be disabled > with the '-T'/'--no-topic' flags. 
Cool, +1. I have been bitten more by the feature you're removing than it has helped me in the past, so no concern from me :) > > Stephen > > > > Cheers, > > > Stephen > > > > > > [1] https://github.com/openstack-infra/git-review/commit/03768832c4 > > > > > > > > From lauren at openstack.org Fri Jan 25 18:34:09 2019 From: lauren at openstack.org (Lauren Sell) Date: Fri, 25 Jan 2019 10:34:09 -0800 Subject: Why COA exam is being retired? In-Reply-To: <1688640cbe0.27a5.eb5fa01e01bf15c6e0d805bdb1ad935e@jbryce.com> References: <25c27f7e-80ec-2eb5-6b88-5627bc9f1f01@admin.grnet.gr> <16640d78-1124-a21d-8658-b7d9b2d50509@gmail.com> <5077d9dc-c4af-8736-0db3-2e05cbc1e992@gmail.com> <20190125152713.dxbxgkzoevzw35f2@csail.mit.edu> <1688640cbe0.27a5.eb5fa01e01bf15c6e0d805bdb1ad935e@jbryce.com> Message-ID: Thanks very much for the feedback. When we launched the COA, the commercial market for OpenStack was much more crowded (read: fragmented), and the availability of individuals with OpenStack experience was more scarce. That indicated a need for a vendor-neutral certification to test baseline OpenStack proficiency, and to help provide a target for training curriculum being developed by companies in the ecosystem. Three years on, the commercial ecosystem has become easier to navigate, and there are a few thousand professionals who have taken the COA and had on-the-job experience. As those conditions have changed, we've been trying to evaluate the best ways to use the Foundation's resources and time to support the current needs for education and certification. The COA in its current form is pretty resource intensive, because it’s a hands-on exam that runs in a virtual OpenStack environment. To maintain the exam (including keeping it current to OpenStack releases) would require a pretty significant investment in terms of time and money this year.
From the data and demand we’re seeing, the COA did not seem to be a top priority compared to our investments in programs that push knowledge and training into the ecosystem like Upstream Institute, supporting OpenStack training partners, mentoring, and sponsoring internship programs like Outreachy and Google Summer of Code. That said, we’ve honestly been surprised by the response from training partners and the community as plans have been trickling out these past few weeks, and are open to discussing it. If there are people and companies who are willing to invest time and resources into a neutral certification exam, we could investigate alternative paths. It's very helpful to hear which education activities you find most valuable, and if you'd like to have a deeper discussion or volunteer to help, let me know and we can schedule a community call next week. Regardless of the future of the COA exam, we will of course continue to maintain the training marketplace at openstack.org to promote commercial training partners and certifications. There are also some great books and resources developed by community members listed alongside the community training. > From: Jay Bryant jungleboyj at gmail.com > Date: January 25, 2019 07:42:55 > Subject: Re: Why COA exam is being retired? > To: openstack-discuss at lists.openstack.org > >> On 1/25/2019 9:27 AM, Jonathan Proulx wrote: >>> On Fri, Jan 25, 2019 at 10:09:04AM -0500, Jay Pipes wrote: >>> :On 01/25/2019 09:09 AM, Erik McCormick wrote: >>> :> On Fri, Jan 25, 2019, 8:58 AM Jay Bryant >> >>> :> That's sad. I really appreciated having a non-vendory, ubiased, >>> :> community-driven option. >>> : >>> :+10 >>> : >>> :> If a vendor folds or moves on from Openstack, your certification >>> :> becomes worthless. Presumably, so long as there is Openstack, there >>> :> will be the foundation at its core. I hope they might reconsider. 
>>> : >>> :+100 >>> >>> So to clarify, is the COA certification going away, or is the Foundation >>> just no longer administering the exam? >>> >>> It would be a shame to lose a standard unbiased certification, but if >>> this is a transition away from directly providing the training and >>> only providing the exam specification, that may be reasonable. >>> >>> -Jon >> >> When Allison e-mailed me last week they said they were having meetings >> to figure out how to go forward with the COA. The Foundation's partners >> were going to be offering the exam through September and they were >> working on communicating the status of things to the community. >> >> So, probably best to not jump to conclusions and wait for the official >> word from the community. >> >> - Jay > > > From senrique at redhat.com Fri Jan 25 18:45:02 2019 From: senrique at redhat.com (Sofia Enriquez) Date: Fri, 25 Jan 2019 15:45:02 -0300 Subject: [cinder] Proposing new Core Members ... In-Reply-To: <7c45be94-34c4-2555-9532-fb721ac783ed@gmail.com> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> <20190108223535.GA29520@sm-workstation> <7c45be94-34c4-2555-9532-fb721ac783ed@gmail.com> Message-ID: Congrats!! On Wed, Jan 23, 2019 at 11:14 Jay Bryant wrote: > All, > > There were no concerns with these nominations so I have added Yikun and > Rajat to the core list. > > Welcome to the Cinder Core team! > > Thank you for all the efforts and I look forward to working with you > both in the future! > > Jay > > (jungleboyj) > > > On 1/8/2019 4:35 PM, Sean McGinnis wrote: > > On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: > >> Team, > >> > >> I would like to propose two people who have been taking a more active role > in > >> Cinder reviews as Core Team Members: > >> > > > >> I think that both Rajat and Yikun will be welcome additions to help > replace > >> the cores that have recently been removed. > >> > > +1 from me.
Both have been doing a good job giving constructive feedback > on > > reviews and have been spending some time reviewing code other than their > own > > direct interests, so I think they would be welcome additions. > > > > Sean > > -- Sofia Enriquez Associate Software Engineer Red Hat PnT Ingeniero Butty 240, Piso 14 (C1001AFB) Buenos Aires - Argentina +541143297471 (8426471) senrique at redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Fri Jan 25 19:16:15 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Fri, 25 Jan 2019 13:16:15 -0600 Subject: [keystone] x509 authentication Message-ID: Hi all, We've been going over keystone gaps that need to be addressed for edge use cases every Tuesday. Since Berlin, Oath has open-sourced some of their custom authentication plugins for keystone that help them address these gaps. The basic idea is that users authenticate to some external identity provider (Athenz in Oath's case), and then present an Athenz token to keystone. The custom plugins decode the token from Athenz to determine the user, project, roles assignments, and other useful bits of information. After that, it creates any resources that don't exist in keystone already. Ultimately, a user can authenticate against a keystone node and have specific resources provisioned automatically. In Berlin, engineers from Oath were saying they'd like to move away from Athenz tokens altogether and use x509 certificates issued by Athenz instead. The auto-provisioning approach is very similar to a feature we have in keystone already. In Berlin, and shortly after, there was general agreement that if we could support x509 authentication with auto-provisioning via keystone federation, that would pretty much solve Oath's use case without having to maintain custom keystone plugins. Last week, Colleen started digging into keystone's existing x509 authentication support. 
I'll start with the good news, which is x509 authentication works, for the most part. It's been a feature in keystone for a long time, and it landed after we implemented federation support around the Kilo release. Chances are there won't be a need for a keystone specification like we were initially thinking in the edge meetings. Unfortunately, the implementation for x509 authentication has outdated documentation, is extremely fragile, hard to set up, and hasn't been updated with improvements we've made to the federation API since the original implementation (like shadow users or auto-provisioning, which work with other federated protocols like OpenID Connect and SAML). We've started tracking the gaps with bugs [0] so that we have things written down. I think the good thing is that once we get this cleaned up, we'll be able to re-use some of the newer federation features with x509 authentication/federation. These updates would make x509 a first-class federated protocol. The approach, pending the bug fixes, would remove the need for Oath's custom authentication plugins. It could be useful for edge deployments, or even deployments with many regions, by allowing users to be auto-provisioned in each region. Although, it doesn't necessarily solve the network partition issue. Now that we have an idea of where to start and some bug reports [0], I'm wondering if anyone is interested in helping with the update or refactor. Because this won't require a specification, we can get started on it sooner, instead of having to wait for Train development and a new specification. I'm also curious if anyone has comments or questions about the approach. Thanks, Lance [0] https://bugs.launchpad.net/keystone/+bugs?field.tag=x509 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Fri Jan 25 19:36:54 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 25 Jan 2019 13:36:54 -0600 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <4461f5929edb4091d9d2bea084226c00a6a6631d.camel@redhat.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> <4461f5929edb4091d9d2bea084226c00a6a6631d.camel@redhat.com> Message-ID: <78154144-48e8-8566-5f91-72cf4e957b28@nemebean.com> I will admit this is partially motivated by the fact I forgot he couldn't +2 my oslo.utils patch and was annoyed by that. ;-) On 1/25/19 8:54 AM, Stephen Finucane wrote: > I already thought he was one, heh. +1 from me. > > On Thu, 2019-01-24 at 16:17 -0600, Ben Nemec wrote: >> Hi all, >> >> Zane is already core on oslo.service, but he's been doing good stuff >> in >> adjacent projects as well. We could keep playing whack-a-mole with >> giving him +2 on more repos, but I trust his judgment so I'm >> proposing >> we just add him to the oslo-core group. >> >> If there are no objections in the next week I'll proceed with the >> addition. >> >> Thanks. >> >> -Ben >> > > From davidmnoriega at gmail.com Fri Jan 25 21:34:40 2019 From: davidmnoriega at gmail.com (David M Noriega) Date: Fri, 25 Jan 2019 13:34:40 -0800 Subject: [tripleo][ptl]What is the future of availability monitoring in TripleO? Message-ID: I've noticed that the availability monitoring integration in TripleO using Sensu is marked for deprecation post Rocky, but I do not see any blueprints detailing a replacement plan like there is for fluentd to rsyslog for the logging integration. Where can I find information on what the roadmap is? Is this a feature that will be dropped and left to operators to implement? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mihalis68 at gmail.com Fri Jan 25 21:47:18 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Fri, 25 Jan 2019 16:47:18 -0500 Subject: [ops] Berlin ops meet up march 2019 In-Reply-To: References: Message-ID: I haven't heard of any objections to moving the Berlin ops meetup to be the 6th and 7th of March, so let's aim for that. I'll ask the foundation folks doing the Eventbrite to make it for those dates. Chris On Thu, Jan 24, 2019 at 6:27 PM Chris Morgan wrote: > We (the ops meet ups team) just heard that the city of Berlin just decided > that March 8th 2019 is to be a public holiday. Our first reaction is to see > if we can just pull the ops meet up forward one day (making it the 6th and > 7th instead of the 7th and the 8th). Please let us know if this would be a > problem for you ASAP. A good example might be if you have already booked > flights. Since the event tickets have not been issued yet, we are hoping > that's not the case for anyone. > > Moving to the 6th and 7th allows it to happen during normal business days > for the host, Deutsche Telekom. I don't personally like the idea of trying > to keep the 8th when it's a public holiday. > > Regards, > > Chris > -- > Chris Morgan > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Fri Jan 25 22:44:34 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Fri, 25 Jan 2019 23:44:34 +0100 Subject: [tc][all][self-healing-sig] Service-side health checks community goal for Train cycle Message-ID: <158c354c1d7a3e6fb261202b34d4e3233d5f39bc.camel@evrard.me> Hello everyone, As you might have seen on the ML, two of the three top contenders for the Train cycle community goals got some traction. Let's talk here about the last one: the service-side health checks. While people were interested in this goal previously, nobody really came forward on the pre-work. Last week, I met a few of my colleagues to see what we can do together.
Matt (irc: mattoliverau), Adam (irc: aspiers), and I discussed the different ways to implement this new API, with the help of many in #openstack-sdk. Long story short, the current framework might be "good enough" for extension already, as we could have extra "backends" (basically "tests") to increase the coverage of this healthcheck endpoint. While the immediate next step would be to work on the v2 prototype that Graham started (see link [1], anyone is welcome to help there!), the next step would be far easier if it was crowd-sourced: We need to know which service is already using that oslo middleware, which service doesn't want to use it, and which service is already ready for healthchecks. Once we have the lay of the land, we'll know where the energy will be spent in this community goal: will it be bringing oslo.middleware to services, or bringing common "backends" that can be used by each service (like DB/MQ/cache checks)? I would be very happy if you could have a look at this ethercalc [2], and add/edit your project capabilities there. Thank you in advance. Jean-Philippe Evrard (evrardjp) [1]: https://review.openstack.org/#/c/617924/ [2]: https://ethercalc.openstack.org/di0mxkiepll8 From mriedemos at gmail.com Sat Jan 26 00:47:06 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 25 Jan 2019 18:47:06 -0600 Subject: G8 H8 Message-ID: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> Time for a quick update on gate status. * There were some shelve tests that were failing ssh pretty badly in the tempest-slow job due to a neutron issue: https://launchpad.net/bugs/1812552. It seems https://review.openstack.org/#/c/631944/ might have squashed that bug. * Probably our biggest issue right now is test_subnet_details failing: http://status.openstack.org/elastic-recheck/#1813198. I suspect that is somehow related to using cirros 0.4.0 in devstack as of Jan 20.
I have a tempest patch up for review to help debug that when it fails https://review.openstack.org/#/c/633225 since it seems we're not parsing nic names properly which is how we get the mangled udhcpc..pid file name. * Another nasty one that is affecting unit/functional tests (the bug is against nova but the query hits other projects as well) is http://status.openstack.org/elastic-recheck/#1813147 where subunit parsing fails. It seems cinder had to deal with something like this recently too so the nova team needs to figure out what cinder did to resolve this. I'm not sure if this is a recent regression or not, but the logstash trends start around Jan 17 so it could be recent. * https://bugs.launchpad.net/cinder/+bug/1810526 is a cinder bug related to etcd intermittently dropping connections and then cinder services hit ToozConnectionErrors which cause other things to fail, like volume status updates are lost during delete and then tempest times out waiting for the volume to be deleted. I have a fingerprint in the bug but it shows up in successful jobs too which is frustrating. I would expect that for grenade while services are being restarted (although do we restart etcd in grenade?) but it also shows up in non-grenade jobs. I believe cinder is just using tooz+etcd as a distributed lock manager so I'm not sure how valid it would be to add retries on that locking code or not when the service is unavailable. One suggestion in IRC was to not use tooz/etcd for DLM in single-node jobs but that kind of side-steps the issue - but if etcd is lagging because of lots of services eating up resources on the single node, it might not be a bad option. 
-- Thanks, Matt From mriedemos at gmail.com Sat Jan 26 00:50:30 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 25 Jan 2019 18:50:30 -0600 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <58b21f9476536a5e9f1fad6596e9882aa6ad7e1d.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> <58b21f9476536a5e9f1fad6596e9882aa6ad7e1d.camel@redhat.com> Message-ID: <70de7489-117e-4816-1db6-977bb3009935@gmail.com> On 1/25/2019 10:47 AM, Sean Mooney wrote: > you can get the hashed host-id from the instance metadata to determine if two > instances are on the same host from a tenant perspective in a hypervisor-independent > way, so I think that use case would still be supported. I was thinking the same thing. Not even just instance metadata (API) / config drive, it's right in the REST API for the server resource (the hostId field). -- Thanks, Matt From mriedemos at gmail.com Sat Jan 26 00:52:21 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 25 Jan 2019 18:52:21 -0600 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> Message-ID: On 1/25/2019 10:35 AM, Stephen Finucane wrote: > He noted that one would be a valid point in > claiming the host OS identity should have been reported in > 'chassis.serial' instead of 'system.serial' in the first place [1] but > changing it now is definitely not zero risk. If I'm reading those docs correctly, chassis.serial was new in libvirt 4.1.0 which is quite a bit newer than our minimum libvirt version support.
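The hashed host-id mentioned in this thread is what the server resource exposes as hostId. A rough sketch of how a tenant-scoped, hypervisor-independent identifier like this can be derived (the exact hashing scheme below is an assumption for illustration, not necessarily nova's verbatim implementation):

```python
import hashlib


def host_id(project_id: str, host: str) -> str:
    # Hash the project id together with the host name: instances belonging
    # to the same project on the same host share a value, while the real
    # hostname is never exposed, and different projects cannot correlate
    # their values with each other.
    return hashlib.sha224((project_id + host).encode("utf-8")).hexdigest()


# same project, same host -> identical ids
print(host_id("tenant-a", "compute-1") == host_id("tenant-a", "compute-1"))  # True
# different projects on the same host -> different ids
print(host_id("tenant-a", "compute-1") == host_id("tenant-b", "compute-1"))  # False
```

This is why the hostId field answers "are these two of my instances co-located?" without leaking the hypervisor's identity.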
-- Thanks, Matt From cyril at redhat.com Sat Jan 26 04:26:15 2019 From: cyril at redhat.com (Cyril Roelandt) Date: Sat, 26 Jan 2019 05:26:15 +0100 Subject: Adding type hints to OpenStack (starting with Oslo) Message-ID: <20190126042615.GF12721@debian> Hello, I have recently been looking at mypy[1], a static type checker for Python. It uses optional type hints[2] to perform type checking. Using mypy has many benefits: - it allows developers to find type-related bugs that could cause a program to crash or display an unexpected behaviour at runtime; - it replaces type information that is usually given in docstrings, and may be outdated; - mypy runs can be integrated into a CI, making sure there are no regressions; - mypy works with both Python 2 and 3; - adding type information may be done incrementally; - library developers may expose type information, helping their users. Type information is available for most of the standard Python library, and for the most popular libraries on PyPI, through typeshed[3]. It is therefore possible to write bad code such as: $ cat test.py import requests requests.get(123) And have mypy warn us that something is wrong: $ mypy test.py test.py:2: error: Argument 1 has incompatible type "int"; expected "Union[str, bytes]" I would like to add type hints to OpenStack, starting with a small project (because it would probably be quick and easy) that is used by a lot of other OpenStack projects (because they would benefit from the type hints as well). Oslo seems like a reasonable choice. I took the liberty of implementing a proof of concept on top of oslo.config[4]. 
Using this branch, you should be able to: 1) Run "tox -etype" to run mypy on the whole code base 2) Write a badly typed program such as this one: $ cat test.py from oslo_config import cfg common_opts = [ cfg.IntOpt('test', min='3'), cfg.HostAddressOpt('host', version='4'), ] And have mypy show you the type errors: $ mypy test.py test.py:4: error: Argument "min" to "IntOpt" has incompatible type "str"; expected "Optional[int]" test.py:5: error: Argument "version" to "HostAddressOpt" has incompatible type "str"; expected "Optional[int]" Does anyone know about mypy? Would anyone be interested in seeing type hints added to OpenStack? Looking forward to hearing your thoughts, Cyril [1] http://mypy-lang.org/ [2] https://www.python.org/dev/peps/pep-0484/ [3] https://github.com/python/typeshed [4] https://github.com/CyrilRoelandteNovance/oslo.config/tree/type From glongwave at gmail.com Sat Jan 26 08:41:57 2019 From: glongwave at gmail.com (ChangBo Guo) Date: Sat, 26 Jan 2019 16:41:57 +0800 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <78154144-48e8-8566-5f91-72cf4e957b28@nemebean.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> <4461f5929edb4091d9d2bea084226c00a6a6631d.camel@redhat.com> <78154144-48e8-8566-5f91-72cf4e957b28@nemebean.com> Message-ID: +1 Ben Nemec 于2019年1月26日周六 上午3:40写道: > I will admit this is partially motivated by the fact I forgot he > couldn't +2 my oslo.utils patch and was annoyed by that. ;-) > > On 1/25/19 8:54 AM, Stephen Finucane wrote: > > I already thought he was one, heh. +1 from me. > > > > On Thu, 2019-01-24 at 16:17 -0600, Ben Nemec wrote: > >> Hi all, > >> > >> Zane is already core on oslo.service, but he's been doing good stuff > >> in > >> adjacent projects as well. We could keep playing whack-a-mole with > >> giving him +2 on more repos, but I trust his judgment so I'm > >> proposing > >> we just add him to the oslo-core group. 
> >> > >> If there are no objections in the next week I'll proceed with the > >> addition. > >> > >> Thanks. > >> > >> -Ben > >> > > > > > > -- ChangBo Guo(gcb) Community Director @EasyStack -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Sat Jan 26 11:25:33 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Sat, 26 Jan 2019 11:25:33 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> Message-ID: <491f036c485b9eb7e72ef74d22755215a8994d99.camel@redhat.com> On Fri, 2019-01-25 at 18:52 -0600, Matt Riedemann wrote: > On 1/25/2019 10:35 AM, Stephen Finucane wrote: > > He noted that one would be a valid point in > > claiming the host OS identity should have been reported in > > 'chassis.serial' instead of 'system.serial' in the first place [1] but > > changing it now is definitely not zero risk. > > If I'm reading those docs correctly, chassis.serial was new in libvirt > 4.1.0 which is quite a bit newer than our minimum libvirt version support. Good point. Guess it doesn't matter though if we have the two alternatives you and Sean have suggested for figuring this stuff out? The important thing is that release note. Setting 'chassis.serial' would be a nice TODO if we have 4.1.0. Stephen From doug at doughellmann.com Sat Jan 26 14:02:53 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Sat, 26 Jan 2019 09:02:53 -0500 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <20190126042615.GF12721@debian> References: <20190126042615.GF12721@debian> Message-ID: Cyril Roelandt writes: > Hello, > > I have recently been looking at mypy[1], a static type checker for > Python. It uses optional type hints[2] to perform type checking. 
Using > mypy has many benefits: > > - it allows developers to find type-related bugs that could cause a > program to crash or display an unexpected behaviour at runtime; > - it replaces type information that is usually given in docstrings, and > may be outdated; > - mypy runs can be integrated into a CI, making sure there are no > regressions; > - mypy works with both Python 2 and 3; > - adding type information may be done incrementally; > - library developers may expose type information, helping their users. > > Type information is available for most of the standard Python library, > and for the most popular libraries on PyPI, through typeshed[3]. It is > therefore possible to write bad code such as: > > $ cat test.py > import requests > requests.get(123) > > And have mypy warn us that something is wrong: > > $ mypy test.py > test.py:2: error: Argument 1 has incompatible type "int"; expected "Union[str, bytes]" > > > I would like to add type hints to OpenStack, starting with a small > project (because it would probably be quick and easy) that is used by a > lot of other OpenStack projects (because they would benefit from the > type hints as well). Oslo seems like a reasonable choice. > > 1) Run "tox -etype" to run mypy on the whole code base > 2) Write a badly typed program such as this one: > > $ cat test.py > from oslo_config import cfg > > common_opts = [ > cfg.IntOpt('test', min='3'), > cfg.HostAddressOpt('host', version='4'), > ] > > And have mypy show you the type errors: > > $ mypy test.py > test.py:4: error: Argument "min" to "IntOpt" has incompatible type "str"; expected "Optional[int]" > test.py:5: error: Argument "version" to "HostAddressOpt" has incompatible type "str"; expected "Optional[int]" > > > Does anyone know about mypy? Would anyone be interested in seeing type hints > added to OpenStack? 
> > Looking forward to hearing your thoughts, > Cyril > > > [1] http://mypy-lang.org/ > [2] https://www.python.org/dev/peps/pep-0484/ > [3] https://github.com/python/typeshed > [4] https://github.com/CyrilRoelandteNovance/oslo.config/tree/type > We're still required to support python 2 through the beginning of the U cycle [5]. Is it possible to apply the type hints in a way that allows us to maintain that support? [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html -- Doug From cyril at redhat.com Sat Jan 26 15:05:24 2019 From: cyril at redhat.com (Cyril Roelandt) Date: Sat, 26 Jan 2019 16:05:24 +0100 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: References: <20190126042615.GF12721@debian> Message-ID: <20190126150524.GG12721@debian> Hello Doug, On 01/26/19 09:02, Doug Hellmann wrote: > We're still required to support python 2 through the beginning of the U > cycle [5]. Is it possible to apply the type hints in a way that allows > us to maintain that support? > > [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html There are two ways to apply type hints: 1) Use a new syntax that only works with Python 3 2) Use comments (see my github branch, where comments starting with "type:" can be found) I used the second approach in order to not break Python 2 compatibility. Regards, Cyril From sergey at vilgelm.info Sat Jan 26 17:06:12 2019 From: sergey at vilgelm.info (Sergey Vilgelm) Date: Sat, 26 Jan 2019 11:06:12 -0600 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <20190126150524.GG12721@debian> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> Message-ID: Cyril, We have lots of docstrings in rst format and many of them already have the `@param` and `@type` tags. Does mypy support this format, or should we rewrite the docstrings and add the `type:` tags just for mypy?
-- Sergey Vilgelm https://www.vilgelm.info On Jan 26, 2019, 9:07 AM -0600, Cyril Roelandt , wrote: > Hello Doug, > > On 01/26/19 09:02, Doug Hellmann wrote: > > We're still required to support python 2 through the beginning of the U > > cycle [5]. Is it possible to apply the type hints in a way that allows > > us to maintain that support? > > > > [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > There are two ways to apply type hints: > 1) Use a new syntax that only works with Python 3 > 2) Use comments (see my github branch, where comments starting with > "type:" can be found) > > I used the second approach in order to not break Python 2 compatibility. > > > Regards, > Cyril > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sat Jan 26 17:16:03 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Sat, 26 Jan 2019 18:16:03 +0100 Subject: G8 H8 In-Reply-To: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> References: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> Message-ID: Hi, > Wiadomość napisana przez Matt Riedemann w dniu 26.01.2019, o godz. 01:47: > > Time for a quick update on gate status. > > * There were some shelve tests that were failing ssh pretty badly in the tempest-slow job due to a neutron issue: https://launchpad.net/bugs/1812552. It seems https://review.openstack.org/#/c/631944/ might have squashed that bug. > > * Probably our biggest issue right now is test_subnet_details failing: http://status.openstack.org/elastic-recheck/#1813198. I suspect that is somehow related to using cirros 0.4.0 in devstack as of Jan 20. I have a tempest patch up for review to help debug that when it fails https://review.openstack.org/#/c/633225 since it seems we're not parsing nic names properly which is how we get the mangled udhcpc..pid file name. 
I was looking at the logs from a failed job [1], and what I noticed in the tempest log [2] is that a couple of times this command returned the proper „eth0” interface and then it once returned an empty string, which, looking at the command in the tempest test, means IMO that the IP address (10.1.0.3 in the above example) wasn’t configured on any interface. Maybe this interface is losing its IP address during the lease renewal process, and we should just make the tempest test more robust against such (temporary, I hope) issues. > > * Another nasty one that is affecting unit/functional tests (the bug is against nova but the query hits other projects as well) is http://status.openstack.org/elastic-recheck/#1813147 where subunit parsing fails. It seems cinder had to deal with something like this recently too so the nova team needs to figure out what cinder did to resolve this. I'm not sure if this is a recent regression or not, but the logstash trends start around Jan 17 so it could be recent. We have the same issue in the neutron-functional job on python 3. It is waiting for review in [3]. I recently talked about it with Matthew Treinish on IRC [4], and it looks like limiting the output of the pythonlogging stream did the trick, so we should finally be able to make it work. You will probably need to do something similar.
I believe cinder is just using tooz+etcd as a distributed lock manager so I'm not sure how valid it would be to add retries on that locking code or not when the service is unavailable. One suggestion in IRC was to not use tooz/etcd for DLM in single-node jobs but that kind of side-steps the issue - but if etcd is lagging because of lots of services eating up resources on the single node, it might not be a bad option. > > -- > > Thanks, > > Matt > [1] http://logs.openstack.org/78/570078/17/check/tempest-slow/161ea32/job-output.txt.gz#_2019-01-24_18_26_22_886987 [2] http://logs.openstack.org/78/570078/17/check/tempest-slow/161ea32/controller/logs/tempest_log.txt [3] https://review.openstack.org/#/c/577383/ [4] http://eavesdrop.openstack.org/irclogs/%23openstack-qa/%23openstack-qa.2019-01-23.log.html#t2019-01-23T21:52:34 — Slawek Kaplonski Senior software engineer Red Hat From doug at doughellmann.com Sat Jan 26 18:13:00 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Sat, 26 Jan 2019 13:13:00 -0500 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <20190126150524.GG12721@debian> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> Message-ID: <34669BF9-B85C-41DD-9EC6-DD455C00853C@doughellmann.com> > On Jan 26, 2019, at 10:05 AM, Cyril Roelandt wrote: > > Hello Doug, > >> On 01/26/19 09:02, Doug Hellmann wrote: >> We're still required to support python 2 through the beginning of the U >> cycle [5]. Is it possible to apply the type hints in a way that allows >> us to maintain that support? >> >> [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > There are two ways to apply type hints: > 1) Use a new syntax that only works with Python 3 > 2) Use comments (see my github branch, where comments starting with > "type:" can be found) > > I used the second approach in order to not break Python 2 compatibility. > > > Regards, > Cyril Ok, good. 
I don’t have any issue with experimenting with type hints, and they might be useful. Why don’t you go ahead and submit your patches to gerrit so we can see what they look like and review them there. Doug From morgan.fainberg at gmail.com Sat Jan 26 18:18:41 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Sat, 26 Jan 2019 10:18:41 -0800 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> Message-ID: Honestly, I would wait for this for the U release. We can then go to the annotation style and if needed update docstrings at the same time. My concern with pushing this now is a high churn of code that should be duplicated once we drop PY2 support. Other than my above concerns the type checking in mypy would be great to have. --Morgan On Sat, Jan 26, 2019, 09:14 Sergey Vilgelm Cyril, > > We have lots of doc strings in rst format and many of them already have > the `@param` and `@type` tags. Does the mypy support this format or we > should to rewrite doc string and add the `type:` tags just for mypy? > > -- > Sergey Vilgelm > https://www.vilgelm.info > On Jan 26, 2019, 9:07 AM -0600, Cyril Roelandt , wrote: > > Hello Doug, > > On 01/26/19 09:02, Doug Hellmann wrote: > > We're still required to support python 2 through the beginning of the U > cycle [5]. Is it possible to apply the type hints in a way that allows > us to maintain that support? > > [5] > https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > > There are two ways to apply type hints: > 1) Use a new syntax that only works with Python 3 > 2) Use comments (see my github branch, where comments starting with > "type:" can be found) > > I used the second approach in order to not break Python 2 compatibility. > > > Regards, > Cyril > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From doug at doughellmann.com Sat Jan 26 18:29:29 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Sat, 26 Jan 2019 13:29:29 -0500 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> Message-ID: <688957DD-31A3-42F0-A77A-3638171740AD@doughellmann.com> I agree with your concerns about code churn. I think it’s still useful to run a small experiment to demonstrate how we might actually benefit from them though, which applying them to one library would let us do. > On Jan 26, 2019, at 1:18 PM, Morgan Fainberg wrote: > > Honestly, I would wait for this for the U release. We can then go to the annotation style and if needed update docstrings at the same time. My concern with pushing this now is a high churn of code that should be duplicated once we drop PY2 support. > > Other than my above concerns the type checking in mypy would be great to have. > > --Morgan > >> On Sat, Jan 26, 2019, 09:14 Sergey Vilgelm > Cyril, >> >> We have lots of doc strings in rst format and many of them already have the `@param` and `@type` tags. Does the mypy support this format or we should to rewrite doc string and add the `type:` tags just for mypy? >> >> -- >> Sergey Vilgelm >> https://www.vilgelm.info >>> On Jan 26, 2019, 9:07 AM -0600, Cyril Roelandt , wrote: >>> Hello Doug, >>> >>>> On 01/26/19 09:02, Doug Hellmann wrote: >>>> We're still required to support python 2 through the beginning of the U >>>> cycle [5]. Is it possible to apply the type hints in a way that allows >>>> us to maintain that support? 
>>>> >>>> [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html >>> >>> There are two ways to apply type hints: >>> 1) Use a new syntax that only works with Python 3 >>> 2) Use comments (see my github branch, where comments starting with >>> "type:" can be found) >>> >>> I used the second approach in order to not break Python 2 compatibility. >>> >>> >>> Regards, >>> Cyril >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Sat Jan 26 23:27:17 2019 From: smooney at redhat.com (Sean Mooney) Date: Sat, 26 Jan 2019 23:27:17 +0000 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <688957DD-31A3-42F0-A77A-3638171740AD@doughellmann.com> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> <688957DD-31A3-42F0-A77A-3638171740AD@doughellmann.com> Message-ID: <852334fb1bbde9cf4a866db1df9c647bd54a9dfe.camel@redhat.com> On Sat, 2019-01-26 at 13:29 -0500, Doug Hellmann wrote: > I agree with your concerns about code churn. I think it’s still useful to run a small experiment to demonstrate how we > might actually benefit from them though, which applying them to one library would let us do. > > On Jan 26, 2019, at 1:18 PM, Morgan Fainberg wrote: > > > Honestly, I would wait for this for the U release. We can then go to the annotation style and if needed update > > docstrings at the same time. My concern with pushing this now is a high churn of code that should be duplicated once > > we drop PY2 support. > > > > Other than my above concerns the type checking in mypy would be great to have. > > > > --Morgan > > > > On Sat, Jan 26, 2019, 09:14 Sergey Vilgelm > > Cyril, > > > > > > We have lots of doc strings in rst format and many of them already have the `@param` and `@type` tags. Does the > > > mypy support this format or we should to rewrite doc string and add the `type:` tags just for mypy? 
> > > > > > -- > > > Sergey Vilgelm > > > https://www.vilgelm.info > > > On Jan 26, 2019, 9:07 AM -0600, Cyril Roelandt , wrote: > > > > Hello Doug, > > > > > > > > On 01/26/19 09:02, Doug Hellmann wrote: > > > > > We're still required to support python 2 through the beginning of the U > > > > > cycle [5]. Is it possible to apply the type hints in a way that allows > > > > > us to maintain that support? > > > > > > > > > > [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > > > > > > > There are two ways to apply type hints: > > > > 1) Use a new syntax that only works with Python 3 > > > > 2) Use comments (see my github branch, where comments starting with > > > > "type:" can be found) there is also a third, looping Stephen into the conversation. Stephen did some experiments in this regard in the past. The comment form is backwards compatible, but you can also put the type hints in a separate .pyi file. The .pyi file version has the advantage of also working for C modules and not modifying any of the existing code. We previously discussed the idea of using https://github.com/Instagram/MonkeyType to auto-discover the types of existing libs and generate the stub .pyi files. Ideally we should just be able to run our existing unit/functional tests under MonkeyType to discover the relevant types and create the .pyi files. I don't think Stephen or I had the chance to pursue that since we discussed it in Denver. > > > > > > > > I used the second approach in order to not break Python 2 compatibility.
> > > > > > > > > > > > Regards, > > > > Cyril > > > > From cyril at redhat.com Sun Jan 27 01:03:26 2019 From: cyril at redhat.com (Cyril Roelandt) Date: Sun, 27 Jan 2019 02:03:26 +0100 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <852334fb1bbde9cf4a866db1df9c647bd54a9dfe.camel@redhat.com> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> <688957DD-31A3-42F0-A77A-3638171740AD@doughellmann.com> <852334fb1bbde9cf4a866db1df9c647bd54a9dfe.camel@redhat.com> Message-ID: <20190127010326.GH12721@debian> Hello, On 01/26/19 23:27, Sean Mooney wrote: > there is also a third, looping stephen into the conversation. > stephen did some experiments in this regard in the past. > the commet form is backwards compatiabl but you can also put thetype hints in > a seperate .pyi file. The .pyi file version has the advantage of also working for c modules > and not modifying any of the existing code. Indeed, I forgot to mention this way of adding type hints. I must admit it is not my favourite, since it requires editing code in two different places. I did not know it worked with C modules as well, which is really nice. > > we previously disscused the idea of using > https://github.com/Instagram/MonkeyType > to auto discover the types of existing libs and generate the stub .pyi files. > ideally we should just be able to run our existing unit/functional test under mockeytype > to discover the relevent types and create the pyi files. > This seems interesting, but I think it requires the unit tests to be really thorough. I guess this would require careful human review anyway. 
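For reference, the stub-file approach Sean mentions keeps the hints out of the code entirely: a `.pyi` file repeats just the signatures, with `...` in place of bodies, and MonkeyType aims to generate such files from types observed at runtime. A hypothetical hand-written stub for a small slice of `oslo_config.cfg` might look like this (illustrative only — the real class hierarchy and parameters differ):

```python
# Hypothetical contents of a cfg.pyi stub. Only the signatures carry
# information; every body is the Ellipsis placeholder "...".
from typing import Optional

class Opt:
    def __init__(self, name: str, help: Optional[str] = ...) -> None: ...

class IntOpt(Opt):
    def __init__(self, name: str,
                 min: Optional[int] = ...,
                 max: Optional[int] = ...,
                 help: Optional[str] = ...) -> None: ...
```

mypy would then check calls like `IntOpt('workers', min='3')` against the stub rather than the implementation module, which is why this form also works for extension modules written in C.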
Regards, Cyril From cyril at redhat.com Sun Jan 27 01:39:58 2019 From: cyril at redhat.com (Cyril Roelandt) Date: Sun, 27 Jan 2019 02:39:58 +0100 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <34669BF9-B85C-41DD-9EC6-DD455C00853C@doughellmann.com> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> <34669BF9-B85C-41DD-9EC6-DD455C00853C@doughellmann.com> Message-ID: <20190127013958.GA22280@debian> Hello, On 01/26/19 13:13, Doug Hellmann wrote: > Ok, good. I don’t have any issue with experimenting with type hints, and they might be useful. Why don’t you go ahead and submit your patches to gerrit so we can see what they look like and review them there. > For anyone interested in taking a look at the patches, they are now available on Gerrit: https://review.openstack.org/#/c/633376/ https://review.openstack.org/#/c/633377/ Regards, Cyril From jpenick at gmail.com Fri Jan 25 21:01:52 2019 From: jpenick at gmail.com (James Penick) Date: Fri, 25 Jan 2019 13:01:52 -0800 Subject: [Edge-computing] [keystone] x509 authentication In-Reply-To: References: Message-ID: Hey Lance, We'd definitely be interested in helping with the work. I'll grab some volunteers from my team and get them in touch within the next few days. -James On Fri, Jan 25, 2019 at 11:16 AM Lance Bragstad wrote: > Hi all, > > We've been going over keystone gaps that need to be addressed for edge use > cases every Tuesday. Since Berlin, Oath has open-sourced some of their > custom authentication plugins for keystone that help them address these > gaps. > > The basic idea is that users authenticate to some external identity > provider (Athenz in Oath's case), and then present an Athenz token to > keystone. The custom plugins decode the token from Athenz to determine the > user, project, roles assignments, and other useful bits of information. > After that, it creates any resources that don't exist in keystone already. 
> Ultimately, a user can authenticate against a keystone node and have > specific resources provisioned automatically. In Berlin, engineers from > Oath were saying they'd like to move away from Athenz tokens altogether and > use x509 certificates issued by Athenz instead. The auto-provisioning > approach is very similar to a feature we have in keystone already. In > Berlin, and shortly after, there was general agreement that if we could > support x509 authentication with auto-provisioning via keystone federation, > that would pretty much solve Oath's use case without having to maintain > custom keystone plugins. > > Last week, Colleen started digging into keystone's existing x509 > authentication support. I'll start with the good news, which is x509 > authentication works, for the most part. It's been a feature in keystone > for a long time, and it landed after we implemented federation support > around the Kilo release. Chances are there won't be a need for a keystone > specification like we were initially thinking in the edge meetings. > Unfortunately, the implementation for x509 authentication has outdated > documentation, is extremely fragile, hard to set up, and hasn't been > updated with improvements we've made to the federation API since the > original implementation (like shadow users or auto-provisioning, which work > with other federated protocols like OpenID Connect and SAML). We've started > tracking the gaps with bugs [0] so that we have things written down. > > I think the good thing is that once we get this cleaned up, we'll be able > to re-use some of the newer federation features with x509 > authentication/federation. These updates would make x509 a first-class > federated protocol. The approach, pending the bug fixes, would remove the > need for Oath's custom authentication plugins. It could be useful for edge > deployments, or even deployments with many regions, by allowing users to be > auto-provisioned in each region. 
Although, it doesn't necessarily solve the > network partition issue. > > Now that we have an idea of where to start and some bug reports [0], I'm > wondering if anyone is interested in helping with the update or refactor. > Because this won't require a specification, we can get started on it > sooner, instead of having to wait for Train development and a new > specification. I'm also curious if anyone has comments or questions about > the approach. > > Thanks, > > Lance > > [0] https://bugs.launchpad.net/keystone/+bugs?field.tag=x509 > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing > -------------- next part -------------- An HTML attachment was scrubbed... URL: From moguimar at redhat.com Mon Jan 28 00:52:20 2019 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Mon, 28 Jan 2019 01:52:20 +0100 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> <4461f5929edb4091d9d2bea084226c00a6a6631d.camel@redhat.com> <78154144-48e8-8566-5f91-72cf4e957b28@nemebean.com> Message-ID: +2 :P Em sáb, 26 de jan de 2019 às 09:42, ChangBo Guo escreveu: > +1 > > Ben Nemec 于2019年1月26日周六 上午3:40写道: > >> I will admit this is partially motivated by the fact I forgot he >> couldn't +2 my oslo.utils patch and was annoyed by that. ;-) >> >> On 1/25/19 8:54 AM, Stephen Finucane wrote: >> > I already thought he was one, heh. +1 from me. >> > >> > On Thu, 2019-01-24 at 16:17 -0600, Ben Nemec wrote: >> >> Hi all, >> >> >> >> Zane is already core on oslo.service, but he's been doing good stuff >> >> in >> >> adjacent projects as well. We could keep playing whack-a-mole with >> >> giving him +2 on more repos, but I trust his judgment so I'm >> >> proposing >> >> we just add him to the oslo-core group. 
>> >> >> >> If there are no objections in the next week I'll proceed with the >> addition. >> >> >> >> Thanks. >> >> >> >> -Ben >> >> >> > >> > >> >> > > -- > ChangBo Guo(gcb) > Community Director @EasyStack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sindi at smartfundisolutions.com Sun Jan 27 22:34:44 2019 From: sindi at smartfundisolutions.com (Sindisiwe Chuma) Date: Mon, 28 Jan 2019 00:34:44 +0200 Subject: [publiccloud] New Contributor Joining In-Reply-To: References: Message-ID: Hi All, I am Sindi, a new member. I am interested in participating in the Public Cloud Operators Working Group. Are there any current projects or initiatives running, and documentation available, so I can familiarize myself with the work that has been done and is currently being done? Could you please refer me to resources containing this information. Kind Regards, Sindi +27 72 572 7757 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jan 28 02:06:17 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 28 Jan 2019 11:06:17 +0900 Subject: [Searchlight] Team meeting today at 13:30UTC Message-ID: Hi team, We will have the team meeting today at 13:30UTC. Join us at #openstack-searchlight Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gmann at ghanshyammann.com Mon Jan 28 07:05:57 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 28 Jan 2019 16:05:57 +0900 Subject: [qa][tempest] Waiting for interface status == ACTIVE before checking status In-Reply-To: References: <168829ea636.ac2416e683695.6782940432609689439@ghanshyammann.com> <92c4b8e2-80b6-3964-f98b-f6a363bc8cd9@gmail.com> Message-ID: <16893476a1e.f2f37fb8119707.7983113641299119992@ghanshyammann.com> ---- On Sat, 26 Jan 2019 03:17:44 +0900 Sean Mooney wrote ---- > On Fri, 2019-01-25 at 12:26 -0500, Jay Pipes wrote: > > On 01/25/2019 12:04 PM, Terry Wilson wrote: > > > On Thu, Jan 24, 2019 at 7:34 PM Ghanshyam Mann wrote: > > > > > > > As Sean also pointed out in the patch, we should go for the approach of > > > > "making sure all attached interfaces to the server are active and the server is sshable > > > > before the server can be used in a test" [1]. This is something we agreed > > > > in Denver PTG for afazekas proposal[2]. > > > > > > > > If we see it from the user perspective, a user can have an Active VM with an > > > > active port which can flip to down in between that port usage. This seems like a bug to me. > > > > > > To me, this ignores real-world situations where a port status *can* > > > change w/o user interaction. > > > > How is this ignoring that scenario? > the only case i know of for certain would be if the admin state is down > which should not prevent the vm from booting, but neutron should not allow network > connectivity in this case. Can it also happen in the middle of connectivity? I mean, when the VM is active and SSHable, can the admin state going down cause the port to become down?
> i revert the os-vif change as the nova change was hitting a different bug > in neutron. but only one entity. os-vif or the hyperviror should have been creating > the port on ovs. so it was a bug when both were. > > > > > But it could just as easily be some vendor-specific "that port just > > > died" kind of thing. > > > > In which case, the test waiting for SSH to be available would timeout > > because connectivity would be broken anyway, no? > if it did not recover yes it would. > > > > > > > Why not update the status of the port if you > > > know it has changed? > > > > Sorry, I don't see where anyone is suggesting not changing the status of > > the port if some non-bug real scenario changes the status of the port? > > > > > Also, the patch itself (outside the ironic case) just adds a window > > > for the status to bounce. > > > > Unless I'm mistaken, the patch is simply changing the condition that the > > tempest test uses to identify broken VM connectivity. It will use the > > SSH connectivity test instead of looking at the port status test. > > > > The SSH test was determined to be a more stable test of VM network > > connectivity than relying on the Neutron port status indicator which can > > be a little flaky. > ssh is more reliable for hotpug as we needed to wait for the guest os to > process the hotplug event. waithing for the vm to be pingable or sshable > is more reliable in that specific case. the port status being active simply > means that the port is curently configured by neutron. that gives you no knolage > of if the gust has processed the hotplug event. +1, I agree on hotplug event case and yes Tempest test should make test VM usable for test after sshable/pingable success. afazekas updated few test for that and it will be reasonable thing to do. > > in general im not sure if ssh connectivity would be more reliabel but if that > is what the test requires to work its better to expeclitly validate it then use > the port status as a proxy. 
> > > Or am I missing something? it's a valid question. i think port status and vm connectivity are two different things. > > if you are writing an api test then port status should be sufficient. > if you need to connect to the vm in any way it becomes a scenario test > in which case waiting for sshable or pingable might be more suitable. Yeah, scenario tests expect the end-to-end connectivity internal/external to tenants. Tempest API tests hardly check the ssh verification. -gmann > > not sure if i answered your question however. > > > > -jay > > > > > > > From skaplons at redhat.com Mon Jan 28 08:20:33 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 28 Jan 2019 09:20:33 +0100 Subject: [neutron] CI meeting on 29.01.2019 cancelled Message-ID: <31F11224-FE5E-4292-9F42-E6B3DA757E49@redhat.com> Hi, I will not be able to drive the Neutron CI meeting on Tuesday 29.01.2019, so let's cancel it for this week. — Slawek Kaplonski Senior software engineer Red Hat From lajos.katona at ericsson.com Mon Jan 28 08:51:17 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 28 Jan 2019 08:51:17 +0000 Subject: G8 H8 In-Reply-To: References: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> Message-ID: <2df964b1-000f-d85f-e8d9-a6998d02554c@ericsson.com> Hi, On 2019. 01. 26. 18:16, Slawomir Kaplonski wrote: > Hi, >> Message written by Matt Riedemann on 26.01.2019 at 01:47: >> >> Time for a quick update on gate status. >> >> * There were some shelve tests that were failing ssh pretty badly in the tempest-slow job due to a neutron issue: https://launchpad.net/bugs/1812552. It seems https://review.openstack.org/#/c/631944/ might have squashed that bug. >> >> * Probably our biggest issue right now is test_subnet_details failing: http://status.openstack.org/elastic-recheck/#1813198. I suspect that is somehow related to using cirros 0.4.0 in devstack as of Jan 20.
I have a tempest patch up for review to help debug that when it fails https://review.openstack.org/#/c/633225 since it seems we're not parsing nic names properly, which is how we get the mangled udhcpc..pid file name. > I was looking at logs from a failed job [1] and what I noticed in the tempest log [2] is the fact that a couple of times this command returned the proper „eth0” interface and then it once returned an empty string, which, looking at the command in the tempest test, means IMO that the IP address (10.1.0.3 in the above example) wasn’t configured on any interface. Maybe this interface is losing its IP address during the lease renewal process and we should just make the tempest test more robust against such (temporary, I hope) issues. I tried to do the same in a loop from cirros 0.4.0, but I can't remove the IP from the interface. Of course it's possible that something else happens there outside of the command executed from tempest. > >> * Another nasty one that is affecting unit/functional tests (the bug is against nova but the query hits other projects as well) is http://status.openstack.org/elastic-recheck/#1813147 where subunit parsing fails. It seems cinder had to deal with something like this recently too so the nova team needs to figure out what cinder did to resolve this. I'm not sure if this is a recent regression or not, but the logstash trends start around Jan 17 so it could be recent. > We have the same issue in the neutron-functional job on python 3. It is waiting for review in [3]. I recently talked about it with Matthew Treinish on IRC [4] and it looks like limiting the output on the pythonlogging stream did the trick and we should finally be able to make it work. > You will probably need to do something similar. 
> >> * https://bugs.launchpad.net/cinder/+bug/1810526 is a cinder bug related to etcd intermittently dropping connections and then cinder services hit ToozConnectionErrors, which cause other things to fail, like volume status updates being lost during delete, and then tempest times out waiting for the volume to be deleted. I have a fingerprint in the bug but it shows up in successful jobs too, which is frustrating. I would expect that for grenade while services are being restarted (although do we restart etcd in grenade?) but it also shows up in non-grenade jobs. I believe cinder is just using tooz+etcd as a distributed lock manager so I'm not sure how valid it would be to add retries on that locking code or not when the service is unavailable. One suggestion in IRC was to not use tooz/etcd for DLM in single-node jobs but that kind of side-steps the issue - but if etcd is lagging because of lots of services eating up resources on the single node, it might not be a bad option. >> >> -- >> >> Thanks, >> >> Matt >> > [1] http://logs.openstack.org/78/570078/17/check/tempest-slow/161ea32/job-output.txt.gz#_2019-01-24_18_26_22_886987 > [2] http://logs.openstack.org/78/570078/17/check/tempest-slow/161ea32/controller/logs/tempest_log.txt > [3] https://review.openstack.org/#/c/577383/ > [4] http://eavesdrop.openstack.org/irclogs/%23openstack-qa/%23openstack-qa.2019-01-23.log.html#t2019-01-23T21:52:34 > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > > From feilong at catalyst.net.nz Mon Jan 28 09:21:28 2019 From: feilong at catalyst.net.nz (feilong) Date: Mon, 28 Jan 2019 22:21:28 +1300 Subject: [magnum][queens] issues In-Reply-To: References: Message-ID: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> Hi Ignazio, I'm jumping in to help. But I'd like to understand the issue correctly. You mentioned that by explicitly using 'export' you could work around it before with old Magnum versions. Does it still work for you now? And what's your current Magnum version? Thanks. 
On 22/01/19 6:32 AM, Ignazio Cassano wrote: > I think that the script used to write /etc/sysconfig/heat-params should > insert an export for any variable it initializes. > In any case the resource group worked fine before applying the last patch. > What has changed? > Thanks in advance for any help. > Regards > Ignazio > > On Mon 21 Jan 2019 12:42 Ignazio Cassano > > wrote: > > I am trying the patches you just released for magnum (git fetch > git://git.openstack.org/openstack/magnum > > refs/changes/30/629130/9 && git checkout FETCH_HEAD) > I got the same issues with the proxy. In the old version I modified with the > help of spyros the scripts under > /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments > because PROXY variables are not inherited. > In /etc/sysconfig/heat-params the PROXY and NO_PROXY variables are > present but we must modify configure-kubernetes-master.sh to force > them: > . /etc/sysconfig/heat-params > echo "configuring kubernetes (master)" > _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} > export HTTP_PROXY=${HTTP_PROXY} > export HTTPS_PROXY=${HTTPS_PROXY} > export NO_PROXY=${NO_PROXY} > echo "HTTP_PROXY IS ${HTTP_PROXY}" > Exporting the above variables when the external network has a proxy, the master is installed but the stack hangs creating the kube master > Resource Group > > Regards > Ignazio > -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hberaud at redhat.com Mon Jan 28 09:42:28 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 28 Jan 2019 10:42:28 +0100 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <20190127013958.GA22280@debian> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> <34669BF9-B85C-41DD-9EC6-DD455C00853C@doughellmann.com> <20190127013958.GA22280@debian> Message-ID: Really interesting! I will take a look at your patches. I personally think it's a good idea to introduce these checks. +1 about code churn. On Sun, 27 Jan 2019 at 02:41, Cyril Roelandt wrote: > Hello, > > On 01/26/19 13:13, Doug Hellmann wrote: > > Ok, good. I don’t have any issue with experimenting with type hints, and > they might be useful. Why don’t you go ahead and submit your patches to > gerrit so we can see what they look like and review them there. > > > > For anyone interested in taking a look at the patches, they are now > available on Gerrit: > > https://review.openstack.org/#/c/633376/ > https://review.openstack.org/#/c/633377/ > > > Regards, > Cyril > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -------------- next 
part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jan 28 09:42:35 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 28 Jan 2019 10:42:35 +0100 Subject: [magnum][queens] issues In-Reply-To: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> References: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> Message-ID: Hello Feilong, I need the workaround on both the old and the patched versions of magnum. The old version I installed with apt is: root at tst2-magnum-ubu:~# dpkg -l|grep magnum ii magnum-api 6.1.0-0ubuntu1~cloud0 all OpenStack containers as a service ii magnum-common 6.1.0-0ubuntu1~cloud0 all OpenStack containers as a service - API server ii magnum-conductor 6.1.0-0ubuntu1~cloud0 all OpenStack containers as a service - conductor ii python-magnum 6.1.0-0ubuntu1~cloud0 all OpenStack containers as a service - Python library ii python-magnumclient 2.8.0-0ubuntu1~cloud0 all client library for Magnum API - Python 2.x Then I patched magnum with the following commands: cd /tmp git clone git://git.openstack.org/openstack/magnum cd magnum git fetch git://git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git checkout FETCH_HEAD mv /usr/lib/python2.7/dist-packages/magnum /usr/lib/python2.7/dist-packages/magnum.orig cp -rp magnum /usr/lib/python2.7/dist-packages/ Then I applied my workaround again because my external network used by magnum needs a proxy to access the internet. But on the pre-patched version magnum heat stacks work fine. Patched magnum stacks hang creating the kube master Resource Group. Thanks Ignazio On Mon 28 Jan 2019 at 10:25 feilong wrote: > Hi Ignazio, > > I'm jumping in to help. But I'd like to understand the issue correctly. > You mentioned that by explicitly using 'export' you could work around it before with > old Magnum versions. Does it still work for you now? And what's your > current Magnum version? Thanks. 
> > > On 22/01/19 6:32 AM, Ignazio Cassano wrote: > > I think that the script used to write /etc/sysconfig/heat-parms should > insert an export for any variable initialized. > Any case resourcegroup worked fine before applying last patch. > What is changed ? > Thanks in Advance for any help. > Regards > Ignazio > > Il giorno Lun 21 Gen 2019 12:42 Ignazio Cassano > ha scritto: > >> I am trying patches you just released for magnum (git fetch git:// >> git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git >> checkout FETCH_HEAD) >> I got same issues on proxy. In the old version I modified with the help >> of spyros the scripts under >> /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments >> because PROXY variables are not inherited >> in /etc/sysconfig/heat-params PROXY E NO PROXY variables are present but >> we must modify configure-kubernetes-master.sh to force >> them >> . /etc/sysconfig/heat-params >> echo "configuring kubernetes (master)" >> _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} >> export HTTP_PROXY=${HTTP_PROXY} >> export HTTPS_PROXY=${HTTPS_PROXY} >> export NO_PROXY=${NO_PROXY} >> echo "HTTP_PROXY IS ${HTTP_PROXY}" >> exporting the above variables when external network has a proxy, the >> master is installed but stack hangs creating kube master Resource Group >> >> Regards >> Ignazio >> >> -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From feilong at catalyst.net.nz Mon Jan 28 09:47:03 2019 From: feilong at catalyst.net.nz (feilong) Date: Mon, 28 Jan 2019 22:47:03 +1300 Subject: [magnum][queens] issues In-Reply-To: References: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> Message-ID: <21968eba-0ac6-2f10-d72a-1ccf387291f3@catalyst.net.nz> Is the heat stack creation stuck ? Can you see any error from Heat log? It would be nice if you can pop up into #openstack-containers IRC channel so that we can discuss more details?  Thanks. On 28/01/19 10:42 PM, Ignazio Cassano wrote: > Hello Feilong, I need the workaround on old and and patched version of > magnum. > > The old version I installed with apt is: > > oot at tst2-magnum-ubu:~# dpkg -l|grep magnum > ii  magnum-api                          > 6.1.0-0ubuntu1~cloud0                      all          OpenStack > containers as a service > ii  magnum-common                       > 6.1.0-0ubuntu1~cloud0                      all          OpenStack > containers as a service - API server > ii  magnum-conductor                    > 6.1.0-0ubuntu1~cloud0                      all          OpenStack > containers as a service - conductor > ii  python-magnum                       > 6.1.0-0ubuntu1~cloud0                      all          OpenStack > containers as a service - Python library > ii  python-magnumclient                 > 2.8.0-0ubuntu1~cloud0                      all          client library > for Magnum API - Python 2.x > > > > Then I pached magnum with the following commands: > > cd /tmp > git clone git://git.openstack.org/openstack/magnum > > cd magnum > git fetch git://git.openstack.org/openstack/magnum > refs/changes/30/629130/9 > && git checkout FETCH_HEAD > mv /usr/lib/python2.7/dist-packages/magnum > /usr/lib/python2.7/dist-packages/magnum.orig > cp -rp magnum /usr/lib/python2.7/dist-packages/ > > Then I applyed again my workaround because my external network used by > magnum needs a proxy for accessing internet. 
> But on pre-patched version magnum heat stacks work fine. > Magnum patched stacks hang creating kube master Resource Group. > Thanks > Ignazio > > > > > > Il giorno lun 28 gen 2019 alle ore 10:25 feilong > > ha scritto: > > Hi Ignazio, > > I'm jumping in to help. But I'd like to understand the issue > correctly. You mentioned using explicitly 'export' you can > workaround it before with old Magnum versions. Does it still work > for you now? And what's your current Magnum version? Thanks. > > > On 22/01/19 6:32 AM, Ignazio Cassano wrote: >> I think that the script used to write /etc/sysconfig/heat-parms >> should insert an export for any variable initialized. >> Any case resourcegroup worked fine before applying last patch. >> What is changed ? >> Thanks in Advance for any help. >> Regards  >> Ignazio >> >> Il giorno Lun 21 Gen 2019 12:42 Ignazio Cassano >> > ha >> scritto: >> >> I am trying patches you just released for magnum (git fetch >> git://git.openstack.org/openstack/magnum >> >> refs/changes/30/629130/9 && git checkout FETCH_HEAD) >> I got same issues on proxy. In the old version I modified >> with the help of spyros the scripts under >> /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments >> because PROXY variables are not inherited                   >> in /etc/sysconfig/heat-params PROXY E NO PROXY variables are >> present but we must modify configure-kubernetes-master.sh to >> force them                   >> . 
/etc/sysconfig/heat-params >> echo "configuring kubernetes (master)" >>  _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/ >> } >> export HTTP_PROXY=${HTTP_PROXY} >> export HTTPS_PROXY=${HTTPS_PROXY} >> export NO_PROXY=${NO_PROXY} >> echo "HTTP_PROXY IS ${HTTP_PROXY}" >> exporting the above variables when external network has a >> proxy, the master is installed but stack hangs creating kube >> master Resource Group >> >> Regards >> Ignazio >> > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jan 28 09:57:27 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 28 Jan 2019 10:57:27 +0100 Subject: [magnum][queens] issues In-Reply-To: <21968eba-0ac6-2f10-d72a-1ccf387291f3@catalyst.net.nz> References: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> <21968eba-0ac6-2f10-d72a-1ccf387291f3@catalyst.net.nz> Message-ID: OK Give me 5 minutes to reapply the patch and I'll connect to IRC Thanks Il giorno lun 28 gen 2019 alle ore 10:47 feilong ha scritto: > Is the heat stack creation stuck ? Can you see any error from Heat log? It > would be nice if you can pop up into #openstack-containers IRC channel so > that we can discuss more details? Thanks. 
> > > On 28/01/19 10:42 PM, Ignazio Cassano wrote: > > Hello Feilong, I need the workaround on old and and patched version of > magnum. > > The old version I installed with apt is: > > oot at tst2-magnum-ubu:~# dpkg -l|grep magnum > ii magnum-api > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service > ii magnum-common > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - API server > ii magnum-conductor > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - conductor > ii python-magnum > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - Python library > ii python-magnumclient > 2.8.0-0ubuntu1~cloud0 all client library for > Magnum API - Python 2.x > > > > Then I pached magnum with the following commands: > > cd /tmp > git clone git://git.openstack.org/openstack/magnum > cd magnum > git fetch git://git.openstack.org/openstack/magnum > refs/changes/30/629130/9 && git checkout FETCH_HEAD > mv /usr/lib/python2.7/dist-packages/magnum > /usr/lib/python2.7/dist-packages/magnum.orig > cp -rp magnum /usr/lib/python2.7/dist-packages/ > > Then I applyed again my workaround because my external network used by > magnum needs a proxy for accessing internet. > But on pre-patched version magnum heat stacks work fine. > Magnum patched stacks hang creating kube master Resource Group. > Thanks > Ignazio > > > > > > Il giorno lun 28 gen 2019 alle ore 10:25 feilong > ha scritto: > >> Hi Ignazio, >> >> I'm jumping in to help. But I'd like to understand the issue correctly. >> You mentioned using explicitly 'export' you can workaround it before with >> old Magnum versions. Does it still work for you now? And what's your >> current Magnum version? Thanks. >> >> >> On 22/01/19 6:32 AM, Ignazio Cassano wrote: >> >> I think that the script used to write /etc/sysconfig/heat-parms should >> insert an export for any variable initialized. >> Any case resourcegroup worked fine before applying last patch. >> What is changed ? 
>> Thanks in Advance for any help. >> Regards >> Ignazio >> >> Il giorno Lun 21 Gen 2019 12:42 Ignazio Cassano >> ha scritto: >> >>> I am trying patches you just released for magnum (git fetch git:// >>> git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git >>> checkout FETCH_HEAD) >>> I got same issues on proxy. In the old version I modified with the help >>> of spyros the scripts under >>> /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments >>> because PROXY variables are not inherited >>> in /etc/sysconfig/heat-params PROXY E NO PROXY variables are present but >>> we must modify configure-kubernetes-master.sh to force >>> them >>> . /etc/sysconfig/heat-params >>> echo "configuring kubernetes (master)" >>> _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} >>> export HTTP_PROXY=${HTTP_PROXY} >>> export HTTPS_PROXY=${HTTPS_PROXY} >>> export NO_PROXY=${NO_PROXY} >>> echo "HTTP_PROXY IS ${HTTP_PROXY}" >>> exporting the above variables when external network has a proxy, the >>> master is installed but stack hangs creating kube master Resource Group >>> >>> Regards >>> Ignazio >>> >>> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> ------------------------------------------------------ >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> ------------------------------------------------------ >> >> -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Mon Jan 28 10:04:47 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 28 Jan 2019 11:04:47 +0100 Subject: [magnum][queens] issues In-Reply-To: <21968eba-0ac6-2f10-d72a-1ccf387291f3@catalyst.net.nz> References: <2bc6bc9b-7e3b-5131-dd22-d1934cd9a107@catalyst.net.nz> <21968eba-0ac6-2f10-d72a-1ccf387291f3@catalyst.net.nz> Message-ID: I am on #openstack-containers IRC Il giorno lun 28 gen 2019 alle ore 10:47 feilong ha scritto: > Is the heat stack creation stuck ? Can you see any error from Heat log? It > would be nice if you can pop up into #openstack-containers IRC channel so > that we can discuss more details? Thanks. > > > On 28/01/19 10:42 PM, Ignazio Cassano wrote: > > Hello Feilong, I need the workaround on old and and patched version of > magnum. > > The old version I installed with apt is: > > oot at tst2-magnum-ubu:~# dpkg -l|grep magnum > ii magnum-api > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service > ii magnum-common > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - API server > ii magnum-conductor > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - conductor > ii python-magnum > 6.1.0-0ubuntu1~cloud0 all OpenStack > containers as a service - Python library > ii python-magnumclient > 2.8.0-0ubuntu1~cloud0 all client library for > Magnum API - Python 2.x > > > > Then I pached magnum with the following commands: > > cd /tmp > git clone git://git.openstack.org/openstack/magnum > cd magnum > git fetch git://git.openstack.org/openstack/magnum > refs/changes/30/629130/9 && git checkout FETCH_HEAD > mv /usr/lib/python2.7/dist-packages/magnum > /usr/lib/python2.7/dist-packages/magnum.orig > cp -rp magnum /usr/lib/python2.7/dist-packages/ > > Then I applyed again my workaround because my external network used by > magnum needs a proxy for accessing internet. > But on pre-patched version magnum heat stacks work fine. 
> Magnum patched stacks hang creating kube master Resource Group. > Thanks > Ignazio > > > > > > Il giorno lun 28 gen 2019 alle ore 10:25 feilong > ha scritto: > >> Hi Ignazio, >> >> I'm jumping in to help. But I'd like to understand the issue correctly. >> You mentioned using explicitly 'export' you can workaround it before with >> old Magnum versions. Does it still work for you now? And what's your >> current Magnum version? Thanks. >> >> >> On 22/01/19 6:32 AM, Ignazio Cassano wrote: >> >> I think that the script used to write /etc/sysconfig/heat-parms should >> insert an export for any variable initialized. >> Any case resourcegroup worked fine before applying last patch. >> What is changed ? >> Thanks in Advance for any help. >> Regards >> Ignazio >> >> Il giorno Lun 21 Gen 2019 12:42 Ignazio Cassano >> ha scritto: >> >>> I am trying patches you just released for magnum (git fetch git:// >>> git.openstack.org/openstack/magnum refs/changes/30/629130/9 && git >>> checkout FETCH_HEAD) >>> I got same issues on proxy. In the old version I modified with the help >>> of spyros the scripts under >>> /usr/lib/python2.7/dist-packages/magnum/drivers/common/templates/kubernetes/fragments >>> because PROXY variables are not inherited >>> in /etc/sysconfig/heat-params PROXY E NO PROXY variables are present but >>> we must modify configure-kubernetes-master.sh to force >>> them >>> . 
/etc/sysconfig/heat-params >>> echo "configuring kubernetes (master)" >>> _prefix=${CONTAINER_INFRA_PREFIX:-docker.io/openstackmagnum/} >>> export HTTP_PROXY=${HTTP_PROXY} >>> export HTTPS_PROXY=${HTTPS_PROXY} >>> export NO_PROXY=${NO_PROXY} >>> echo "HTTP_PROXY IS ${HTTP_PROXY}" >>> exporting the above variables when external network has a proxy, the >>> master is installed but stack hangs creating kube master Resource Group >>> >>> Regards >>> Ignazio >>> >>> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> ------------------------------------------------------ >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> ------------------------------------------------------ >> >> -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Jan 28 10:18:41 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 28 Jan 2019 10:18:41 +0000 Subject: Adding type hints to OpenStack (starting with Oslo) In-Reply-To: <852334fb1bbde9cf4a866db1df9c647bd54a9dfe.camel@redhat.com> References: <20190126042615.GF12721@debian> <20190126150524.GG12721@debian> <688957DD-31A3-42F0-A77A-3638171740AD@doughellmann.com> <852334fb1bbde9cf4a866db1df9c647bd54a9dfe.camel@redhat.com> Message-ID: On Sat, 2019-01-26 at 23:27 +0000, Sean Mooney wrote: > On Sat, 2019-01-26 at 13:29 -0500, Doug Hellmann wrote: > > I agree with your concerns about code churn. 
I think it’s still useful to run a small experiment to demonstrate how we > > might actually benefit from them though, which applying them to one library would let us do. > > > > On Jan 26, 2019, at 1:18 PM, Morgan Fainberg wrote: > > > > > Honestly, I would wait for this for the U release. We can then go to the annotation style and if needed update > > > docstrings at the same time. My concern with pushing this now is a high churn of code that should be duplicated once > > > we drop PY2 support. > > > > > > Other than my above concerns the type checking in mypy would be great to have. > > > > > > --Morgan > > > > > > On Sat, Jan 26, 2019, 09:14 Sergey Vilgelm > > > Cyril, > > > > > > > > We have lots of doc strings in rst format and many of them already have the `@param` and `@type` tags. Does > > > > mypy support this format or should we rewrite the doc strings and add the `type:` tags just for mypy? > > > > > > > > -- > > > > Sergey Vilgelm > > > > https://www.vilgelm.info > > > > On Jan 26, 2019, 9:07 AM -0600, Cyril Roelandt , wrote: > > > > > Hello Doug, > > > > > > > > > > On 01/26/19 09:02, Doug Hellmann wrote: > > > > > > We're still required to support python 2 through the beginning of the U > > > > > > cycle [5]. Is it possible to apply the type hints in a way that allows > > > > > > us to maintain that support? > > > > > > > > > > > > [5] https://governance.openstack.org/tc/resolutions/20180529-python2-deprecation-timeline.html > > > > > > > > > > There are two ways to apply type hints: > > > > > 1) Use a new syntax that only works with Python 3 > > > > > 2) Use comments (see my github branch, where comments starting with > > > > > "type:" can be found) > there is also a third option; looping stephen into the conversation. > stephen did some experiments in this regard in the past. > the comment form is backwards compatible but you can also put the type hints in > a separate .pyi file. 
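For reference, the comment-based style Cyril describes (option 2) looks like this on a hypothetical helper — the `# type:` comments are inert at runtime, so the same file runs on Python 2 and 3 while mypy still checks it:

```python
from typing import Dict, List  # real imports, consumed only by the type checker

def tally(events):
    # type: (List[str]) -> Dict[str, int]
    """Count occurrences of each event name."""
    counts = {}  # type: Dict[str, int]
    for name in events:
        counts[name] = counts.get(name, 0) + 1
    return counts

# The stub-file alternative keeps even these comments out of the source:
# a sibling tally.pyi would contain just
#   def tally(events: List[str]) -> Dict[str, int]: ...
```

The function itself is made up for illustration; the point is only the placement of the `# type:` comments versus a `.pyi` stub.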
The .pyi file version has the advantage of also working for c modules > and not modifying any of the existing code. > > we previously discussed the idea of using > https://github.com/Instagram/MonkeyType > to auto discover the types of existing libs and generate the stub .pyi files. > ideally we should just be able to run our existing unit/functional tests under monkeytype > to discover the relevant types and create the pyi files. > > i don't think stephen or i had the chance to pursue that since we discussed it in denver. Indeed, I had some patches proposed against nova using the comment-style syntax that were met with a resounding meh [1]. I also started working on oslo changes but got stuck with 'oslo.versionedobject', which does funky stuff with fields that seemed to break mypy's introspection (or whatever it's doing). If you were to follow through on this, I would personally start with that particular library (plus a consumer of said library) to see if you could figure out some of the quirks here. Personally, I've basically given up on all of this until U when I can use the Python 3.2+ annotation syntax. I would be happy to review any changes of yours though and I'm pretty sure we could script the conversion from comment-style annotations to native annotations (I think the Instagram or Dropbox devs published additional tools to do just this). Stephen [1] https://review.openstack.org/#/q/topic:bp/integrate-mypy-type-checking > > > > > I used the second approach in order to not break Python 2 compatibility. > > > > > > > > > > > > > > > Regards, > > > > > Cyril > > > > > From artem.goncharov at gmail.com Mon Jan 28 10:20:42 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Mon, 28 Jan 2019 11:20:42 +0100 Subject: [tc][all] Train Community Goal - CLI Message-ID: Hi everybody, One of the community goals for Train is to work on the deprecation of individual clients in favor of the unified OpenStackClient. 
This in turn consists (or might consist) of a few individual targets: - bringing the current state of CLI support for each service on par with the native service client (especially covering changes added with the latest microversions) - ensuring the SDK also supports every service/resource/attribute/method - switching each OSC service to the SDK as soon as it is on the same level as the native service client (someone might argue that this is a mandatory part of reaching the target, but I personally think it is just as important for reaching the goal, in addition to avoiding double work) - deprecating individual clients (when the prerequisites are fulfilled) In order to drive the whole goal a bit, I have started gathering the differences we have for services in OSC: https://etherpad.openstack.org/p/osc-gaps-analysis. I tried to analyze the current state going from the API side, where for each service we have a set of resources (with corresponding attributes) and methods on those. To achieve that I was "parsing" each service's documentation (do not kick me too hard for that ;-) and processing it further, leaving the overall info as "yaml" since there is lots of it and in the background there are still some source "documents". The resulting document in some form represents the current status and a todo list. I would be happy if the people who feel responsible for implementing the CLI would have a look at that and provide feedback on whether they find the whole analysis helpful or not (and if not, whether there are better ideas on how to track status and todos). There is still a lot to be done even to figure out the current status, but I feel we need to start moving if we want to achieve the target (even switching just a few services would already be an achievement). So, if you are willing to support the goal - please join me. Any help and work is welcome. Regards, Artem -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From colleen at gazlene.net Mon Jan 28 10:29:12 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 28 Jan 2019 11:29:12 +0100 Subject: [tc][all][self-healing-sig] Service-side health checks community goal for Train cycle In-Reply-To: <158c354c1d7a3e6fb261202b34d4e3233d5f39bc.camel@evrard.me> References: <158c354c1d7a3e6fb261202b34d4e3233d5f39bc.camel@evrard.me> Message-ID: <1548671352.507178.1645094472.39B42BCA@webmail.messagingengine.com> On Fri, Jan 25, 2019, at 11:44 PM, Jean-Philippe Evrard wrote: > Hello everyone, > > As you might have seen on the ML, two of the 3 top contenders for the > Train cycle community goals got some traction. Let's talk here about the last > one: the service-side health checks. > > While people were interested in this goal previously, nobody really > came forward on the pre-work. > > Last week, I met a few of my colleagues to see what we can do together. > Matt (irc: mattoliverau), Adam (irc: aspiers), and I discussed > the different ways to implement this new API, with the help of many in > #openstack-sdk. > > Long story short, the current framework might be "good enough" for > extension already, as we could have extra "backends" (basically > "tests") to increase the coverage of this healthcheck endpoint. > > While the immediate next step would be to work on the v2 prototype that > Graham started (see link [1], anyone is welcome to help there!), the > next step would be far easier if it were crowdsourced: We need to know > which service is already using that oslo middleware, which service > doesn't want to use it, and which service is already ready for > healthchecks. > > Once we have the lay of the land, we'll know where the energy will be > spent in this community goal: will that be bringing oslo.middleware to > services, or bringing common "backends" that can be > used by each service (like DB/MQ/cache checks)? 
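The "common backends" idea sketches out naturally as a healthcheck endpoint that aggregates pluggable probes. The following is not the oslo.middleware implementation — just an illustration of the shape, with made-up check names standing in for real DB/MQ/cache probes:

```python
import json

def healthcheck_app(environ, start_response, checks=None):
    """Tiny WSGI app: run each backend check, report 200 only if all pass."""
    if checks is None:
        checks = {
            "database": lambda: True,       # stand-in for a real DB probe
            "message_queue": lambda: True,  # stand-in for a real MQ probe
        }
    results = {name: bool(check()) for name, check in checks.items()}
    status = "200 OK" if all(results.values()) else "503 Service Unavailable"
    body = json.dumps(results).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]
```

A monitoring system only needs the 200-vs-503 status; the JSON body names which backend failed, which is roughly the per-backend reporting the thread is asking services to converge on.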
> > I would be very happy if you could have a look at this ethercal [2], > and add/edit your project capabilities there. > > Thank you in advance. > Jean-Philippe Evrard (evrardjp) > > [1]: https://review.openstack.org/#/c/617924/ > [2]: https://ethercalc.openstack.org/di0mxkiepll8 > > > I noticed the ethercalc has a column "Project has paste with healthcheck in pipelines in paste.ini". Is using Paste a requirement for this goal? If so, I think that's a non-starter. Keystone just removed Paste, but does already have support for healthchecks via oslo.middleware: https://docs.openstack.org/keystone/latest/admin/health-check-middleware.html Colleen From bence.romsics at gmail.com Mon Jan 28 10:31:14 2019 From: bence.romsics at gmail.com (Bence Romsics) Date: Mon, 28 Jan 2019 11:31:14 +0100 Subject: G8 H8 In-Reply-To: <2df964b1-000f-d85f-e8d9-a6998d02554c@ericsson.com> References: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> <2df964b1-000f-d85f-e8d9-a6998d02554c@ericsson.com> Message-ID: Hi, On Mon, Jan 28, 2019 at 9:53 AM Lajos Katona wrote: > On 2019. 01. 26. 18:16, Slawomir Kaplonski wrote: > >> Wiadomość napisana przez Matt Riedemann w dniu 26.01.2019, o godz. 01:47: > >> * Probably our biggest issue right now is test_subnet_details failing: http://status.openstack.org/elastic-recheck/#1813198. I suspect that is somehow related to using cirros 0.4.0 in devstack as of Jan 20. I have a tempest patch up for review to help debug that when it fails https://review.openstack.org/#/c/633225 since it seems we're not parsing nic names properly which is how we get the mangled udhcpc..pid file name. > > I was looking at logs from failed job [1] and what I noticed in tempest log [2] is fact that couple of times this command returned proper „eth0” interface and then it once return empty string which, looking at command in tempest test means IMO that IP address (10.1.0.3 in above example) wasn’t configured on any interface. 
Maybe this interface is losing its IP address during renew lease process and we just should make tempest test more proof for such (temporary I hope) issue.

> I tried to do the same in a loop from cirros 0.4.0, but I can't remove
> IP from interface. Of course it's possible that something else happens
> there out of the command executed from tempests.

For what it's worth I think the problem can be reproduced like this:

1) take a cirros image (either 0.3.5 or 0.4.0)
2) boot a vm with it (I booted it by libvirt, didn't even use openstack)
3) look up the current ip of eth0 manually: ip a (here: 100.109.0.64)
4) run this command once:

   sudo /bin/kill -USR1 $( cat /var/run/udhcpc.$( ip -o addr | awk '/100.109.0.64 / {print $2}' ).pid )

   There's no apparent error.
5) run the same command in a tight loop:

   while true ; do sudo /bin/kill -USR1 $( cat /var/run/udhcpc.$( ip -o addr | awk '/100.109.0.64/ {print $2}' ).pid ) ; done

This reliably produces error messages like:

cat: can't open '/var/run/udhcpc..pid': No such file or directory
kill: you need to specify whom to kill
cat: can't open '/var/run/udhcpc..pid': No such file or directory
kill: you need to specify whom to kill
cat: can't open '/var/run/udhcpc..pid': No such file or directory
kill: you need to specify whom to kill
cat: can't open '/var/run/udhcpc..pid': No such file or directory
kill: you need to specify whom to kill
cat: can't open '/var/run/udhcpc..pid': No such file or directory
kill: you need to specify whom to kill

That's how far I got in debugging this at the moment.
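The fragile part of the command above is that the awk lookup can come back empty while the lease is being renewed, which is what yields the bogus '/var/run/udhcpc..pid' path. A defensive variant (a sketch only, not the actual tempest fix; function names are invented) would refuse to call kill when the interface name is empty:

```shell
# Hypothetical hardening sketch for the renew-lease command in the test.
# iface_for_ip reads `ip -o addr` output on stdin and prints the interface
# carrying the given IP; it exits non-zero when nothing matched.  Note the
# IP is used as a regex, so dots match loosely -- fine for a sketch.
iface_for_ip() {
    awk -v ip="$1 " '$0 ~ ip {print $2; found=1} END {exit !found}'
}

renew_lease() {
    # bail out instead of building "/var/run/udhcpc..pid" from an empty name
    iface="$(ip -o addr | iface_for_ip "$1")" || return 1
    [ -n "$iface" ] || return 1
    sudo /bin/kill -USR1 "$(cat "/var/run/udhcpc.${iface}.pid")"
}
```

With this shape, a lookup that races with the DHCP renewal fails loudly (return 1) rather than producing the "can't open '/var/run/udhcpc..pid'" noise above.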
Cheers,
Bence

From ltoscano at redhat.com Mon Jan 28 10:58:07 2019
From: ltoscano at redhat.com (Luigi Toscano)
Date: Mon, 28 Jan 2019 11:58:07 +0100
Subject: [tc][all] Train Community Goal - CLI
In-Reply-To: 
References: 
Message-ID: <1956757.s3rWK0Cms4@whitebase.usersys.redhat.com>

On Monday, 28 January 2019 11:20:42 CET Artem Goncharov wrote:
> Hi everybody,
>
> One of the community goals for Train is to work on deprecation of
> individual clients in favor of unified OpenStackClient. This in turn
> consists (or might consist) from few individual targets:
> - bringing current state of the CLI support for each service on par with
> native service client (especially covering changes added with latest
> microversions)
> - ensuring SDK also supports all of the service/resource/attribute/method
> - switching OSC service to SDK, as soon as it is on the same level, as
> native service client (someone might argue, that this is a mandatory part
> for reaching the target, but I personally this should be as important for
> reaching the goal (in addition to avoid double work))
> - deprecating individual clients (when prerequisites are fulfilled)
>
> In order to drive a bit the whole goal I have started working on gathering
> differences we have for services in OSC:
> https://etherpad.openstack.org/p/osc-gaps-analysis.

An important detail about this email is that it focuses solely on the services handled in the core of openstackclient and openstacksdk (the Group Formerly Known As Core).

I have two points about this:
- switching to OSC, and thus fulfilling the goal, should not be tied to switching to the SDK - which is an important goal in itself, but I suspect that the effort may be higher than just adding the missing bits to the existing OSC clients;
- the entire analysis does not consider the OSC support for all the other projects, which may or may not have switched and which may support OSC more than the SDK, so it might be better to postpone the coordinated switch.
Ciao
--
Luigi

From artem.goncharov at gmail.com Mon Jan 28 11:04:18 2019
From: artem.goncharov at gmail.com (Artem Goncharov)
Date: Mon, 28 Jan 2019 12:04:18 +0100
Subject: [tc][all] Train Community Goal - CLI
In-Reply-To: <1956757.s3rWK0Cms4@whitebase.usersys.redhat.com>
References: <1956757.s3rWK0Cms4@whitebase.usersys.redhat.com>
Message-ID: 

On Mon, Jan 28, 2019 at 11:58 AM Luigi Toscano wrote:
> On Monday, 28 January 2019 11:20:42 CET Artem Goncharov wrote:
> > Hi everybody,
>
> An important detail about this email is that it focuses solely on the services
> handled in the core of openstackclient and openstacksdk (the Group Formerly
> Know As Core).

Right

> I have two points about this:
> - switching to OSC and this fullfilling the goal should not be connected to
> switching to SDK - which is an importang goal in itself, but I suspect that
> the effort may be higher than just adding the missing bits to the existing OSC
> clients;

Agreed that it must not be connected. While doing the analysis I thought it would be helpful to track the status of both at the same time. We might need to "push" on this as well.

> - the entire analysis does not consider the OSC support for all the other
> projects, which may or not may have switched and which may support OSC more
> than SDK, so that it would be better to postpone the coordinated switch.

Sure, it does not. I am not sure it would ever be feasible for a single person to do a complete analysis of all possible plugins. As you already mentioned, let's focus on the core services first. But if you are able to help extend the list to any other services - you are of course welcome.

> Ciao
> --
> Luigi

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From jean-philippe at evrard.me Mon Jan 28 11:11:17 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 28 Jan 2019 12:11:17 +0100 Subject: [tc][all][self-healing-sig] Service-side health checks community goal for Train cycle In-Reply-To: <1548671352.507178.1645094472.39B42BCA@webmail.messagingengine.com> References: <158c354c1d7a3e6fb261202b34d4e3233d5f39bc.camel@evrard.me> <1548671352.507178.1645094472.39B42BCA@webmail.messagingengine.com> Message-ID: <7cc5aa565a3a50a2d520d99e3ddcd6da5502e990.camel@evrard.me> On Mon, 2019-01-28 at 11:29 +0100, Colleen Murphy wrote: > On Fri, Jan 25, 2019, at 11:44 PM, Jean-Philippe Evrard wrote: > > Hello everyone, > > > > As you might have seen on the ML, two of the 3 top contenders for > > the > > Train cycle community got some traction. Let's here talk about the > > last > > one: The service-side health checks. > > > > While people were interested in this goal previously, nobody really > > came forward on the pre-work. > > > > Last week, I met a few of my colleagues to see what we can do > > together. > > Matt (irc: mattoliverau), Adam (irc: aspiers), and I discussed > > about > > the different ways to implement this new API, with the help of many > > in > > #openstack-sdk. > > > > Long story short, the current framework might be "good enough" for > > extension already, as we could have extra "backends" (basically > > "tests"), to increase the coverage of this healthcheck endpoint. > > > > While the immediate next step would be to work on the v2 prototype > > that > > Graham started (see link [1], anyone is welcome to help there!), > > the > > next step would be far easier if it was crowd sourced: We need to > > know > > which service is already using that oslo middleware, which service > > doesn't want to use it, and which service is already ready for > > healtchecks. 
> > When we'll have a lay of the land, we'll know where the energy will be
> > spent in this community goal: Would that be bringing oslo.middleware to
> > services or bringing common "backends" that can be
> > used by each service (like DB/MQ/cache checks).
> >
> > I would be very happy if you could have a look at this ethercalc [2],
> > and add/edit your project capabilities there.
> >
> > Thank you in advance.
> > Jean-Philippe Evrard (evrardjp)
> >
> > [1]: https://review.openstack.org/#/c/617924/
> > [2]: https://ethercalc.openstack.org/di0mxkiepll8
>
> I noticed the ethercalc has a column "Project has paste with
> healthcheck in pipelines in paste.ini". Is using Paste a requirement
> for this goal? If so, I think that's a non-starter.
>
> Keystone just removed Paste, but does already have support for
> healthchecks via oslo.middleware:
>
> https://docs.openstack.org/keystone/latest/admin/health-check-middleware.html
>
> Colleen

It is not a non-starter. I knew this would show up :) It's fine that some projects do things differently (for example, swift has different middleware, and keystone is not using paste). I also think it's too big a change to move everyone to one single technology in a cycle :) Instead, I want to focus on the real use case for people (bringing a common healthcheck "api" itself), which doesn't depend on the technology. But I still would like to crowdsource the information about how projects are doing things, as it would help us understand the complexity better.
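The "technology-neutral" point is easy to illustrate: a common healthcheck only needs the standard WSGI calling convention, regardless of how the pipeline is assembled. Below is a minimal, hypothetical sketch (all class and parameter names invented; this is not the oslo.middleware implementation) where pluggable "backends" play the role of the DB/MQ/cache checks mentioned above:

```python
# Hypothetical sketch of a technology-neutral healthcheck middleware.
# Plain WSGI, so it can sit in any pipeline regardless of whether the
# service uses paste, pastedeploy, or direct composition.  Each backend
# is a callable returning (name, ok) -- e.g. a DB, MQ or cache check.

class HealthcheckMiddleware:
    def __init__(self, app, backends=(), path='/healthcheck'):
        self.app = app
        self.backends = list(backends)
        self.path = path

    def __call__(self, environ, start_response):
        if environ.get('PATH_INFO') != self.path:
            # Not a healthcheck request: pass straight through.
            return self.app(environ, start_response)
        results = [backend() for backend in self.backends]
        ok = all(passed for _, passed in results)
        status = '200 OK' if ok else '503 Service Unavailable'
        body = '\n'.join('%s: %s' % (name, 'OK' if passed else 'FAIL')
                         for name, passed in results) or 'OK'
        body = body.encode('utf-8')
        start_response(status, [('Content-Type', 'text/plain'),
                                ('Content-Length', str(len(body)))])
        return [body]
```

Wrapping any WSGI app this way is exactly the "just work or be wrappable" property discussed in the thread; the interesting design question is only how the backends list is populated per service.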
Jean-Philippe Evrard (evrardjp) From cdent+os at anticdent.org Mon Jan 28 11:34:02 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 28 Jan 2019 11:34:02 +0000 (GMT) Subject: [tc][all][self-healing-sig] Service-side health checks community goal for Train cycle In-Reply-To: <7cc5aa565a3a50a2d520d99e3ddcd6da5502e990.camel@evrard.me> References: <158c354c1d7a3e6fb261202b34d4e3233d5f39bc.camel@evrard.me> <1548671352.507178.1645094472.39B42BCA@webmail.messagingengine.com> <7cc5aa565a3a50a2d520d99e3ddcd6da5502e990.camel@evrard.me> Message-ID: On Mon, 28 Jan 2019, Jean-Philippe Evrard wrote: > It is not a non-starter. I knew this would show up :) > It's fine that some projects do differently (for example swift has > different middleware, keystone is not using paste). Tangent so that people are clear on the state of Paste and PasteDeploy. I recommend projects move away from using either. Until recently both were abandonware, not receiving updates, and had issues working with Python3. I managed to locate maintainers from a few years ago, and negotiated to bring them under some level of maintenance, but in both cases the people involved are only interested in doing limited management to keep the projects barely alive. pastedeploy (the thing that is more often used in OpenStack, and is usually used to load the paste.ini file and doesn't have to have a dependency on paste itself) is now under the Pylons project: https://github.com/Pylons/pastedeploy Paste itself is with me: https://github.com/cdent/paste > I think it's also too big of a change to move everyone to one single > technology in a cycle :) Instead, I want to focus on the real use case > for people (bringing a common healthcheck "api" itself), which doesn't > matter on the technology. I agree that the healthcheck change can and should be completely separate from any question of what is used to load middleware. That's the great thing about WSGI. 
As long as the healthcheck tooling presents a "normal" WSGI interface it ought to either "just work" or be wrappable by other tooling, so I wouldn't spend too much time making a survey of how people are doing middleware. The tricky part (but not that tricky) will be managing how the "tests" are provided to the middleware.

-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent

From lajos.katona at ericsson.com Mon Jan 28 12:09:52 2019
From: lajos.katona at ericsson.com (Lajos Katona)
Date: Mon, 28 Jan 2019 12:09:52 +0000
Subject: [neutron] Neutron Bug Deputy report for week of Jan 21
Message-ID: <942c6a27-36a7-47d1-9080-9e60989cbd70@ericsson.com>

Hi Neutrinos,

Here is the summary of neutron bugs that came in last week (starting from Jan. 21st).

* Gate failures:
  * tempest-slow tests fail often (https://bugs.launchpad.net/neutron/+bug/1812552)
  * TestNetworkBasicOps:test_subnet_details intermittently fails with "cat: can't open '/var/run/udhcpc..pid': No such file or directory" (https://bugs.launchpad.net/neutron/+bug/1813198) => mriedem added extra logs to tempest (https://review.openstack.org/633225), let's see how that helps debugging.
  * neutron functional tests break with oslo.utils 3.39.1 and above (https://bugs.launchpad.net/neutron/+bug/1812922) => an oslo.utils patch is on its way; in requirements the wrong oslo.utils versions were blacklisted.
  * Functional test neutron.tests.functional.services.portforwarding.test_port_forwarding.PortForwardingTestCase.test_concurrent_create_port_forwarding_delete_port fails (https://bugs.launchpad.net/neutron/+bug/1813540)
  * Unit test neutron.tests.unit.services.revisions.test_revision_plugin.TestRevisionPlugin.test_port_ip_update_revises failing (https://bugs.launchpad.net/neutron/+bug/1813417)
* Medium:
  * L2 agent does not clear old QoS rules after restart (https://bugs.launchpad.net/neutron/+bug/1812576)
    * In progress (https://review.openstack.org/632014)
  * Neutron-server takes a long time when creating a port with multiple addresses at once (https://bugs.launchpad.net/neutron/+bug/1813253)
  * Race during adding and updating the same port in the L3 agent's info can generate a wrong radvd config file (https://bugs.launchpad.net/neutron/+bug/1813279). In progress: https://review.openstack.org/633236
  * an instance can see other instances' unicast packets when the security group firewall_driver is openvswitch (https://bugs.launchpad.net/neutron/+bug/1813439)
* Invalid:
  * Port with no active binding marked as dead (https://bugs.launchpad.net/neutron/+bug/1812788)
* DOC:
  * create vm failed, RequiredOptError: value required for option lock_path in group (https://bugs.launchpad.net/neutron/+bug/1812497)
    * Fix on master: https://review.openstack.org/632316
  * Networking Option 2: Self-service networks in neutron (https://bugs.launchpad.net/neutron/+bug/1812958)
    * duplicate of 1812497

Regards
Lajos Katona (lajoskatona)
-------------- next part -------------- An HTML attachment was scrubbed...
URL: 

From thierry at openstack.org Mon Jan 28 12:59:02 2019
From: thierry at openstack.org (Thierry Carrez)
Date: Mon, 28 Jan 2019 13:59:02 +0100
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: <87bm477xae.fsf@meyer.lemoncheese.net>
References: <87bm477xae.fsf@meyer.lemoncheese.net>
Message-ID: <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org>

James E. Blair wrote:
> As part of the recent infrastructure work described in
> http://lists.openstack.org/pipermail/openstack-discuss/2019-January/002026.html
> we now have the ability to fairly easily support uploading of container
> images to the "openstack/" namespace on Docker Hub.
> The Infrastructure
> team does have an account on Docker Hub with ownership rights to this
> space.
>
> It is now fairly simple for us to allow any OpenStack project to upload
> to openstack/$short_name. As a (perhaps unlikely, but simple) example,
> Nova could upload images to "openstack/nova", including suffixed images,
> such as "openstack/nova-scheduler".
> [...]
>
> I believe it's within the TC's purview to decide whether this should
> happen, and if so, what policies should govern it (i.e., what projects
> are entitled to upload to openstack/).
>
> It's possible that the status quo where deployment projects upload to
> their own namespaces (e.g., loci/) while openstack/ remains empty is
> desirable. However, since we recently gained the technical ability to
> handle this, I thought it worth bringing up.

Thanks for bringing this up. Each solution has its benefits, and I don't have a super-strong opinion on it.

I'm leaning toward status quo: unless we consistently publish containers for most (or even all) deliverables, we should keep them in separate namespaces. A centralized "openstack" namespace conveys some official-ness and completeness -- it would make sense if we published all deliverables as containers every cycle as part of the release management work, for example. If it only contains a few select containers published at different times under different rules, it's likely to be more confusing than helping...
--
Thierry Carrez (ttx)

From cdent+os at anticdent.org Mon Jan 28 13:06:22 2019
From: cdent+os at anticdent.org (Chris Dent)
Date: Mon, 28 Jan 2019 13:06:22 +0000 (GMT)
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org>
References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org>
Message-ID: 

On Mon, 28 Jan 2019, Thierry Carrez wrote:
> I'm leaning toward status quo: unless we consistently publish containers for
> most (or even all) deliverables, we should keep them in separate namespaces.

That makes a lot of sense but another way to look at it is:

If we start publishing some containers into a consistent namespace it might encourage projects to start owning "blessed" containers of themselves, which is probably a good thing.

And having a location with vacancies might encourage people to fill it, whereas otherwise the incentive is weak.

> A centralized "openstack" namespace conveys some official-ness and
> completeness -- it would make sense if we published all deliverablkes as
> containers every cycle as part of the release management work, for example.
> If it only contains a few select containers published at different times
> under different rules, it's likely to be more confusing than helping...

The current container situation is already pretty confusing...
--
Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent

From smooney at redhat.com Mon Jan 28 13:39:21 2019
From: smooney at redhat.com (Sean Mooney)
Date: Mon, 28 Jan 2019 13:39:21 +0000
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: 
References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org>
Message-ID: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com>

On Mon, 2019-01-28 at 13:06 +0000, Chris Dent wrote:
> On Mon, 28 Jan 2019, Thierry Carrez wrote:
>
> > I'm leaning toward status quo: unless we consistently publish containers for
> > most (or even all) deliverables, we should keep them in separate namespaces.
>
> That makes a lot of sense but another way to look at it is:
>
> If we start publishing some containers into a consistent namespace
> it might encourage projects to start owning "blessed" containers of
> themselves, which is probably a good thing.

well that raises the question of what type of container something like openstack/nova should be:
a kolla container
a loci container
an lxd container
a container built with pbr the way zuul is published
something else determined by the project?

having yet another way to build openstack containers is probably not a good thing.

even if a common way of building the container was agreed on, there is also the question of what base os it is derived from.

finding a vendor-neutral answer to the above that does not "play favorites" with projects, distros or technologies will be challenging.

> And having a location with vacancies might encourage people to fill it,
> whereas otherwise the incentive is weak.
there is already a pretty complete set of official containers from the kolla project on dockerhub here https://hub.docker.com/u/kolla/ and less so from loci here https://hub.docker.com/u/loci and https://hub.docker.com/u/gantry

> > A centralized "openstack" namespace conveys some official-ness and
> > completeness -- it would make sense if we published all deliverablkes as
> > containers every cycle as part of the release management work, for example.
> > If it only contains a few select containers published at different times
> > under different rules, it's likely to be more confusing than helping...
>
> The current container situation is already pretty confusing...

From bence.romsics at gmail.com Mon Jan 28 13:48:40 2019
From: bence.romsics at gmail.com (Bence Romsics)
Date: Mon, 28 Jan 2019 14:48:40 +0100
Subject: G8 H8
In-Reply-To: 
References: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> <2df964b1-000f-d85f-e8d9-a6998d02554c@ericsson.com>
Message-ID: 

I uploaded an attempted fix for #1813198 here: https://review.openstack.org/633502

We'll soon see from the test results whether it works or not.

Cheers,
Bence

From alfredo.deluca at gmail.com Mon Jan 28 14:24:42 2019
From: alfredo.deluca at gmail.com (Alfredo De Luca)
Date: Mon, 28 Jan 2019 15:24:42 +0100
Subject: [openstack-ansible][magnum]
Message-ID: 

Hi all. I finally installed openstack-ansible (Queens) successfully, but after creating a cluster template and then a k8s cluster, it gets stuck on:

kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changed

It stays in "create in progress", and after around an hour it times out. The k8s master seems to be up, at least as a VM. Any idea?

*Alfredo*
-------------- next part -------------- An HTML attachment was scrubbed...
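For a stuck kube_masters resource like the one above, the Heat stack behind the Magnum cluster is usually where the real error surfaces. A typical triage sequence might look like the following (a sketch only: the cluster name, stack id and login user are placeholders, and the exact log paths depend on the guest image):

```shell
# Find the Heat stack that backs the Magnum cluster (names are placeholders)
openstack coe cluster show my-k8s-cluster -c stack_id -c status_reason

# Walk the nested resources to see which one is stuck or failed
openstack stack resource list --nested-depth 2 <stack_id>
openstack stack failures list <stack_id>

# On the master VM itself, cloud-init output usually holds the real error
ssh <user>@<master-ip> sudo tail -n 50 /var/log/cloud-init-output.log
```

A kube_masters timeout with the VM up often means the in-guest bootstrap never signaled Heat back, which these logs should reveal.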
URL: From mnaser at vexxhost.com Mon Jan 28 15:24:21 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 28 Jan 2019 10:24:21 -0500 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> Message-ID: On Mon, Jan 28, 2019 at 8:41 AM Sean Mooney wrote: > > On Mon, 2019-01-28 at 13:06 +0000, Chris Dent wrote: > > On Mon, 28 Jan 2019, Thierry Carrez wrote: > > > > > I'm leaning toward status quo: unless we consistently publish containers for > > > most (or even all) deliverables, we should keep them in separate namespaces. > > > > That makes a lot of sense but another way to look at it is: > > > > If we start publishing some containers into a consistent namespace > > it might encourage projects to start owning "blessed" containers of > > themselves, which is probably a good thing. > well that raises the question of what type of containter someinthing like > opesntac/nova should be > > a kolla container > a loci container > lxd containers > a container build with pbr the way zuul is published. > someting else determined by the porject? > > having yet another way to build openstack container is proably > not a good thing. > > even if a common way of building the container was agreed on > there is also the question of what base os is it derived form. > > finding a vender neutral answer to the above that does not "play favorites" > with projects, distros or technologies will be challenging. > > > > And having a location with vacancies might encourage people to fill\ > > it, whereas otherwise the incentive is weak. 
> there are already pretty complete set of offical containers from the kolla
> project on dockerhub here https://hub.docker.com/u/kolla/ and less so from loci
> here https://hub.docker.com/u/loci and https://hub.docker.com/u/gantry
>
> > > A centralized "openstack" namespace conveys some official-ness and
> > > completeness -- it would make sense if we published all deliverablkes as
> > > containers every cycle as part of the release management work, for example.
> > > If it only contains a few select containers published at different times
> > > under different rules, it's likely to be more confusing than helping...
> >
> > The current container situation is already pretty confusing...

I think we should all agree on a common way to publish our Docker images, in the same sense that we have one way of publishing Python packages (i.e. for the most part using pbr, etc.). I know the Zuul team has done work around pbrx, and we also have a lot of domain knowledge in the Kolla and LOCI teams. I'm sure that by working together, we can come up with a well thought-out process for official image deliverables.

I would also be in favor of basing it on top of a simple Python base image (which I believe comes through Debian); however, the story of delivering something that includes binaries becomes interesting.

Perhaps we should start with the initial step of providing a common way of building images (so a user can clone a repo and do 'docker build .'), which will eliminate the obligation of having to deal with binaries, and then afterwards reconsider the ideal way of shipping those out.

--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W.
http://vexxhost.com

From jaypipes at gmail.com Mon Jan 28 15:39:28 2019
From: jaypipes at gmail.com (Jay Pipes)
Date: Mon, 28 Jan 2019 10:39:28 -0500
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: 
References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com>
Message-ID: 

On 01/28/2019 10:24 AM, Mohammed Naser wrote:
> Perhaps, we should come up with the first initial step of providing
> a common way of building images (so a use can clone a repo and do
> 'docker build .') which will eliminate the obligation of having to
> deal with binaries, and then afterwards reconsider the ideal way of
> shipping those out.

Isn't that precisely what LOCI offers, Mohammed?

Best,
-jay

From mnaser at vexxhost.com Mon Jan 28 15:43:34 2019
From: mnaser at vexxhost.com (Mohammed Naser)
Date: Mon, 28 Jan 2019 10:43:34 -0500
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: 
References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com>
Message-ID: 

On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote:
>
> On 01/28/2019 10:24 AM, Mohammed Naser wrote:
> > Perhaps, we should come up with the first initial step of providing
> > a common way of building images (so a use can clone a repo and do
> > 'docker build .') which will eliminate the obligation of having to
> > deal with binaries, and then afterwards reconsider the ideal way of
> > shipping those out.
>
> Isn't that precisely what LOCI offers, Mohammed?
>
> Best,
> -jay
>

I haven't studied LOCI that much; however, I think it would be good to look into bringing that approach in-repo rather than out-of-repo, so a user can simply git clone and docker build. I have to admit I'm not super familiar with LOCI, but as far as I know, that's indeed what it does.
-- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From jaypipes at gmail.com Mon Jan 28 15:58:24 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 28 Jan 2019 10:58:24 -0500 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> Message-ID: <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> On 01/28/2019 10:43 AM, Mohammed Naser wrote: > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: >> >> On 01/28/2019 10:24 AM, Mohammed Naser wrote: >>> Perhaps, we should come up with the first initial step of providing >>> a common way of building images (so a use can clone a repo and do >>> 'docker build .') which will eliminate the obligation of having to >>> deal with binaries, and then afterwards reconsider the ideal way of >>> shipping those out. >> >> Isn't that precisely what LOCI offers, Mohammed? >> >> Best, >> -jay >> > > I haven't studied LOCI as much however I think that it would be good to > perhaps look into bringing that approach in-repo rather than out-of-repo > so a user can simply git clone, docker build . > > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > indeed what I believe it does. Yes, that's what LOCI can do, kinda. :) Technically there's some Makefile foo that iterates over projects to build images for, but it's essentially what it does. Alternately, you don't even need to build locally. 
You can do: docker build https://git.openstack.org/openstack/loci.git \ --build-arg PROJECT=keystone \ --tag keystone:ubuntu IMHO, the real innovation that LOCI brings is the way that it builds wheel packages into an intermediary docker build container and then installs the service-specific Python code into a virtualenv inside the target project docker container after injecting the built wheels. That, and LOCI made a good (IMHO) decision to just focus on building the images and not deploying those images (using Ansible, Puppet, Chef, k8s, whatever). They kept the deployment concerns separate, which is a great decision since deployment tools are a complete dumpster fire (all of them). Best, -jay From mnaser at vexxhost.com Mon Jan 28 16:00:29 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 28 Jan 2019 11:00:29 -0500 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> Message-ID: On Mon, Jan 28, 2019 at 10:58 AM Jay Pipes wrote: > > On 01/28/2019 10:43 AM, Mohammed Naser wrote: > > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: > >> > >> On 01/28/2019 10:24 AM, Mohammed Naser wrote: > >>> Perhaps, we should come up with the first initial step of providing > >>> a common way of building images (so a use can clone a repo and do > >>> 'docker build .') which will eliminate the obligation of having to > >>> deal with binaries, and then afterwards reconsider the ideal way of > >>> shipping those out. > >> > >> Isn't that precisely what LOCI offers, Mohammed? 
> >> > >> Best, > >> -jay > >> > > > > I haven't studied LOCI as much however I think that it would be good to > > perhaps look into bringing that approach in-repo rather than out-of-repo > > so a user can simply git clone, docker build . > > > > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > > indeed what I believe it does. > > Yes, that's what LOCI can do, kinda. :) Technically there's some > Makefile foo that iterates over projects to build images for, but it's > essentially what it does. > > Alternately, you don't even need to build locally. You can do: > > docker build https://git.openstack.org/openstack/loci.git \ > --build-arg PROJECT=keystone \ > --tag keystone:ubuntu > > IMHO, the real innovation that LOCI brings is the way that it builds > wheel packages into an intermediary docker build container and then > installs the service-specific Python code into a virtualenv inside the > target project docker container after injecting the built wheels. > > That, and LOCI made a good (IMHO) decision to just focus on building the > images and not deploying those images (using Ansible, Puppet, Chef, k8s, > whatever). They kept the deployment concerns separate, which is a great > decision since deployment tools are a complete dumpster fire (all of them). Thanks for that, I didn't know about this, I'll do some more reading about LOCI and it how it goes about doing this. Thanks Jay. > Best, > -jay -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
http://vexxhost.com From jaypipes at gmail.com Mon Jan 28 16:18:55 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 28 Jan 2019 11:18:55 -0500 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> Message-ID: On 01/28/2019 11:00 AM, Mohammed Naser wrote: > On Mon, Jan 28, 2019 at 10:58 AM Jay Pipes wrote: >> >> On 01/28/2019 10:43 AM, Mohammed Naser wrote: >>> On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: >>>> >>>> On 01/28/2019 10:24 AM, Mohammed Naser wrote: >>>>> Perhaps, we should come up with the first initial step of providing >>>>> a common way of building images (so a use can clone a repo and do >>>>> 'docker build .') which will eliminate the obligation of having to >>>>> deal with binaries, and then afterwards reconsider the ideal way of >>>>> shipping those out. >>>> >>>> Isn't that precisely what LOCI offers, Mohammed? >>>> >>>> Best, >>>> -jay >>>> >>> >>> I haven't studied LOCI as much however I think that it would be good to >>> perhaps look into bringing that approach in-repo rather than out-of-repo >>> so a user can simply git clone, docker build . >>> >>> I have to admit, I'm not super familiar with LOCI but as far as I know, that's >>> indeed what I believe it does. >> >> Yes, that's what LOCI can do, kinda. :) Technically there's some >> Makefile foo that iterates over projects to build images for, but it's >> essentially what it does. >> >> Alternately, you don't even need to build locally. 
You can do: >> >> docker build https://git.openstack.org/openstack/loci.git \ >> --build-arg PROJECT=keystone \ >> --tag keystone:ubuntu >> >> IMHO, the real innovation that LOCI brings is the way that it builds >> wheel packages into an intermediary docker build container and then >> installs the service-specific Python code into a virtualenv inside the >> target project docker container after injecting the built wheels. >> >> That, and LOCI made a good (IMHO) decision to just focus on building the >> images and not deploying those images (using Ansible, Puppet, Chef, k8s, >> whatever). They kept the deployment concerns separate, which is a great >> decision since deployment tools are a complete dumpster fire (all of them). > > Thanks for that, I didn't know about this, I'll do some more reading about LOCI > and it how it goes about doing this. > > Thanks Jay. No problem. Also a good thing to keep in mind is that kolla-ansible is able to deploy LOCI images, AFAIK, instead of the "normal" Kolla images. I have not tried this myself, however, so perhaps someone with experience in this might chime in. 
Best, -jay From smooney at redhat.com Mon Jan 28 16:29:17 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 28 Jan 2019 16:29:17 +0000 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> Message-ID: <9a16957adab5264d16e16eaf583a1e44ae49152f.camel@redhat.com> On Mon, 2019-01-28 at 10:43 -0500, Mohammed Naser wrote: > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: > > > > On 01/28/2019 10:24 AM, Mohammed Naser wrote: > > > Perhaps, we should come up with the first initial step of providing > > > a common way of building images (so a use can clone a repo and do > > > 'docker build .') which will eliminate the obligation of having to > > > deal with binaries, and then afterwards reconsider the ideal way of > > > shipping those out. > > > > Isn't that precisely what LOCI offers, Mohammed? > > > > Best, > > -jay > > The problem with that approach is we have to bless a specific base image, which effectively means it is unlikely that this would form the basis of a vendor product. If that is not the goal, e.g. supporting a common set of images that can be used in vendor distributions, and the target is instead developers, testing and roll-your-own deployments that don't use a downstream vendor distribution, that is fine. If we did want to support vendor distributions we would likely have to do one of the following: dynamically generate the Dockerfile from a template like kolla does, so we can set the base image. That could be as simple as "tox -e container-build -- base_image=ubuntu:latest" > I haven't studied LOCI as much however I think that it would be good to > perhaps look into bringing that approach in-repo rather than out-of-repo > so a user can simply git clone, docker build .
Well, I'm not sure if you have noticed, but a lot of people can't even agree on "docker build" lately. Personally, I like the idea of a tiny base image with Python 3, pip and a compiler, and a workflow similar to "tox -e container-build" which would create the root filesystem image by just pip-installing the current project. If that just does docker build, fine by me. I personally don't like the push to abandon all things Docker Inc. and create new tools for exactly the same thing. The one thing I would urge, however, regardless of what we decide to do or not do: let's not package our dependencies under openstack/. E.g. we should not aim to have openstack/mysql or openstack/rabbitmq, but I would also argue we should not aim to have a nova-libvirt container or a neutron-ovs container either. If the goal was to have in-repo definitions of the services' own blessed containers, there is no repo that these dependencies would naturally fit with, and I'm sure the MySQL community can probably do a better job of containerising it than us. > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > indeed what I believe it does. LOCI basically does this, but unlike kolla it does not define a common ABI for how the containers are to be run. That has pros and cons, but it does mean that every deployment tool that consumes the LOCI images basically has to invent it itself, which is kind of wasteful. When containerising OpenStack, Neutron and Horizon are often the elephants in the room, as managing the installation of vendor/service-specific plugins or Neutron extensions is a pain. E.g. how do you build an image with networking-ovn and vpnaas and use the same mechanism to build ml2/ovs with networking-sfc? You either end up installing them all or choosing a subset, and people end up building their own images.
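One way to make the plugin/base-image build matrix being discussed concrete is a template-driven Dockerfile generator, roughly the idea behind kolla's templating. The sketch below is illustrative only: the template text, variable names and plugin lists are all invented for the example (kolla's real templates are Jinja2 and far richer, and a real image needs much more than a pip install).

```python
from string import Template

# Invented template: one parameterized source covers many
# base-image/plugin combinations instead of blessing one base image.
DOCKERFILE_TEMPLATE = Template(
    "FROM $base_image\n"
    "RUN pip install $project $plugins\n"
    'CMD ["$project"]\n'
)

def render_dockerfile(project, base_image, plugins=()):
    """Render a Dockerfile for one service, base image and plugin set."""
    return DOCKERFILE_TEMPLATE.substitute(
        project=project,
        base_image=base_image,
        plugins=" ".join(plugins),
    )

# The same service built two different ways, without blessing a base image:
print(render_dockerfile("neutron", "ubuntu:18.04",
                        ["networking-ovn", "neutron-vpnaas"]))
print(render_dockerfile("neutron", "centos:7", ["networking-sfc"]))
```

A hypothetical "tox -e container-build -- base_image=..." entry point would just call something like this and feed the rendered text to docker build.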
Kolla addresses this by generating the Dockerfiles dynamically from a template, so that you can build with only the plugins you want, and using that same templating you can select source (git or tarball) or binary installs, plus the base distro and architecture. The last point is something that is often forgotten: we do most of our CI on x86, but I hear OpenStack works really well on ARM and Power too, so whatever images are produced should support those as well. LOCI does not, to my knowledge, have multi-distro or multi-arch support, but based on what little I know I don't think that is a fundamental limitation of how it works. From smooney at redhat.com Mon Jan 28 16:31:44 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 28 Jan 2019 16:31:44 +0000 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> Message-ID: On Mon, 2019-01-28 at 11:18 -0500, Jay Pipes wrote: > On 01/28/2019 11:00 AM, Mohammed Naser wrote: > > On Mon, Jan 28, 2019 at 10:58 AM Jay Pipes wrote: > > > > > > On 01/28/2019 10:43 AM, Mohammed Naser wrote: > > > > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: > > > > > > > > > > On 01/28/2019 10:24 AM, Mohammed Naser wrote: > > > > > > Perhaps, we should come up with the first initial step of providing > > > > > > a common way of building images (so a use can clone a repo and do > > > > > > 'docker build .') which will eliminate the obligation of having to > > > > > > deal with binaries, and then afterwards reconsider the ideal way of > > > > > > shipping those out. > > > > > > > > > > Isn't that precisely what LOCI offers, Mohammed?
> > > > > > > > > > Best, > > > > > -jay > > > > > > > > > > > > > I haven't studied LOCI as much however I think that it would be good to > > > > perhaps look into bringing that approach in-repo rather than out-of-repo > > > > so a user can simply git clone, docker build . > > > > > > > > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > > > > indeed what I believe it does. > > > > > > Yes, that's what LOCI can do, kinda. :) Technically there's some > > > Makefile foo that iterates over projects to build images for, but it's > > > essentially what it does. > > > > > > Alternately, you don't even need to build locally. You can do: > > > > > > docker build https://git.openstack.org/openstack/loci.git \ > > > --build-arg PROJECT=keystone \ > > > --tag keystone:ubuntu > > > > > > IMHO, the real innovation that LOCI brings is the way that it builds > > > wheel packages into an intermediary docker build container and then > > > installs the service-specific Python code into a virtualenv inside the > > > target project docker container after injecting the built wheels. > > > > > > That, and LOCI made a good (IMHO) decision to just focus on building the > > > images and not deploying those images (using Ansible, Puppet, Chef, k8s, > > > whatever). They kept the deployment concerns separate, which is a great > > > decision since deployment tools are a complete dumpster fire (all of them). > > > > Thanks for that, I didn't know about this, I'll do some more reading about LOCI > > and it how it goes about doing this. > > > > Thanks Jay. > > No problem. Also a good thing to keep in mind is that kolla-ansible is > able to deploy LOCI images, AFAIK, instead of the "normal" Kolla images. > I have not tried this myself, however, so perhaps someone with > experience in this might chime in. 
The LOCI images would have to conform to the kolla ABI, which requires a few files like kolla_start to exist, but in principle it could if that requirement was fulfilled. > Best, > -jay > > > From smooney at redhat.com Mon Jan 28 16:52:30 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 28 Jan 2019 16:52:30 +0000 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> Message-ID: On Mon, 2019-01-28 at 16:31 +0000, Sean Mooney wrote: > On Mon, 2019-01-28 at 11:18 -0500, Jay Pipes wrote: > > On 01/28/2019 11:00 AM, Mohammed Naser wrote: > > > On Mon, Jan 28, 2019 at 10:58 AM Jay Pipes wrote: > > > > > > > > On 01/28/2019 10:43 AM, Mohammed Naser wrote: > > > > > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote: > > > > > > > > > > > > On 01/28/2019 10:24 AM, Mohammed Naser wrote: > > > > > > > Perhaps, we should come up with the first initial step of providing > > > > > > > a common way of building images (so a use can clone a repo and do > > > > > > > 'docker build .') which will eliminate the obligation of having to > > > > > > > deal with binaries, and then afterwards reconsider the ideal way of > > > > > > > shipping those out. > > > > > > > > > > > > Isn't that precisely what LOCI offers, Mohammed? > > > > > > > > > > > > Best, > > > > > > -jay > > > > > > > > > > > > > > > > I haven't studied LOCI as much however I think that it would be good to > > > > > perhaps look into bringing that approach in-repo rather than out-of-repo > > > > > so a user can simply git clone, docker build . > > > > > > > > > > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > > > > > indeed what I believe it does. > > > > Yes, that's what LOCI can do, kinda.
:) Technically there's some > > > > Makefile foo that iterates over projects to build images for, but it's > > > > essentially what it does. > > > > > > > > Alternately, you don't even need to build locally. You can do: > > > > > > > > docker build https://git.openstack.org/openstack/loci.git \ > > > > --build-arg PROJECT=keystone \ > > > > --tag keystone:ubuntu > > > > > > > > IMHO, the real innovation that LOCI brings is the way that it builds > > > > wheel packages into an intermediary docker build container and then > > > > installs the service-specific Python code into a virtualenv inside the > > > > target project docker container after injecting the built wheels. > > > > > > > > That, and LOCI made a good (IMHO) decision to just focus on building the > > > > images and not deploying those images (using Ansible, Puppet, Chef, k8s, > > > > whatever). They kept the deployment concerns separate, which is a great > > > > decision since deployment tools are a complete dumpster fire (all of them). > > > > > > Thanks for that, I didn't know about this, I'll do some more reading about LOCI > > > and it how it goes about doing this. > > > > > > Thanks Jay. > > > > No problem. Also a good thing to keep in mind is that kolla-ansible is > > able to deploy LOCI images, AFAIK, instead of the "normal" Kolla images. > > I have not tried this myself, however, so perhaps someone with > > experience in this might chime in. > > the loci images would have to conform to the kolla abit which requires a few files > like kolla_start to existit but it principal it could if that requirement was fulfilled. This is the kolla image API for reference: https://docs.openstack.org/kolla/latest/admin/kolla_api.html https://github.com/openstack/kolla/blob/master/doc/source/admin/kolla_api.rst All kolla images share that external-facing API, so if you use LOCI to build an image and then inject the required API shim as a layer, it would work.
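As a rough illustration of that contract, a config.json in the shape the kolla docs describe can be generated like this. The service name, file paths, owner and permissions below are invented for the example; the authoritative field list is in the kolla_api documentation linked above.

```python
import json

def kolla_config(command, config_files):
    """Build a kolla-style config.json payload: the command to exec and
    the config files to copy into place before it runs."""
    return {
        "command": command,
        "config_files": [
            {"source": src, "dest": dest, "owner": owner, "perm": perm}
            for src, dest, owner, perm in config_files
        ],
    }

# Hypothetical nova-compute example: copy one config file, then exec.
config = kolla_config(
    "nova-compute",
    [("/var/lib/kolla/config_files/nova.conf",
      "/etc/nova/nova.conf", "nova", "0600")],
)
print(json.dumps(config, indent=2))
```

The resulting file is what -e KOLLA_CONFIG_FILE=/config.json points the container at in the docker run invocation shown below.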
You can also use the image manually the same way, by defining the relevant env variables or mounting configs: docker run -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS \ -e KOLLA_CONFIG_FILE=/config.json \ -v /path/to/config.json:/config.json kolla-image Of course you can also bypass it and execute a command directly in the container, e.g. just start nova-compute. The point was to define a common way to inject configuration, including what command to run, externally after the image was built, so that the images could be reused by different deployment tools like kolla-k8s, tripleo or just a bunch of bash commands. The workflow is the same: prepare a directory with a bunch of config files for the service, spawn the container with that directory bind-mounted into it, and set an env var to point at the kolla config.json that specifies where the config should be copied, with what ownership/permissions, and what command to run. I'm not sure if this is a good or a bad thing, but any tool that supported the kolla image API should be able to use LOCI-built images if those images support it too. > > > Best, > > -jay > > > > > > > > From bdobreli at redhat.com Mon Jan 28 17:12:43 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Mon, 28 Jan 2019 18:12:43 +0100 Subject: [tripleo][ptl]What is the future of availability monitoring in TripleO? In-Reply-To: References: Message-ID: On 25.01.2019 22:34, David M Noriega wrote: > I've noticed that the availability monitoring integration in TripleO > using Sensu is marked for deprecation post Rocky, but I do not see any > blueprints detailing a replacement plan like there is for fluentd to > rsyslog for the logging integration. Where can I find information on > what the roadmap is? Is this a feature that will be dropped and left to > operators to implement? I know there is a spec [0], but it seems it needs some refreshing, like moving the target release.
[0] https://review.openstack.org/#/c/523493/ -- Best regards, Bogdan Dobrelya, Irc #bogdando From mriedemos at gmail.com Mon Jan 28 17:35:19 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 28 Jan 2019 11:35:19 -0600 Subject: G8 H8 In-Reply-To: References: <9891003e-9777-b684-05c9-8dfc22363e07@gmail.com> <2df964b1-000f-d85f-e8d9-a6998d02554c@ericsson.com> Message-ID: On 1/28/2019 7:48 AM, Bence Romsics wrote: > I uploaded a try to fix #1813198 here: https://review.openstack.org/633502 > > We'll see some test results soon if it works or not. Thanks Bence. I've proposed a change to skip the test until it's fixed: https://review.openstack.org/#/c/633566/1 And rebased your change on top in case we want to recheck that a few times. -- Thanks, Matt From mrhillsman at gmail.com Mon Jan 28 18:46:04 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 28 Jan 2019 12:46:04 -0600 Subject: Fwd: [all] [uc] OpenStack UC Meeting @ 1900 UTC In-Reply-To: References: Message-ID: Hi everyone, Just a reminder that the UC meeting will be in #openstack-uc in about 15 minutes from now. Please feel empowered to add to the agenda here - https://etherpad.openstack.org/p/uc - and we hope to see you there! -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From moshele at mellanox.com Mon Jan 28 19:54:27 2019 From: moshele at mellanox.com (Moshe Levi) Date: Mon, 28 Jan 2019 19:54:27 +0000 Subject: [neutron][taas] taas OVS Driver Implementation Message-ID: Hi Guys, I was looking to contribute to taas to allow OVS hardware offload [1]. Does anyone have some documentation of the OVS pipeline? I see this weird behavior with local mirroring: if the monitor port and the VM production port are on the same server, the traffic for the monitor port is tagged with the local vlan of the port.
I am not sure why it happens, because the monitor port is configured to be in access mode on the local vlan. I proposed the following fix [2] and I wonder if someone can review it. [1] - https://github.com/openstack/neutron/blob/master/doc/source/admin/conig-ovs-offload.rst [2] - https://review.openstack.org/#/c/632679/ Thanks, Moshe -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Mon Jan 28 20:57:52 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 28 Jan 2019 15:57:52 -0500 Subject: [tc] Technical Committee status update Message-ID: This is a summary of work being done by the Technical Committee members. The full list of active items is managed in the wiki: https://wiki.openstack.org/wiki/Technical_Committee_Tracker We also track TC objectives for the cycle using StoryBoard at: https://storyboard.openstack.org/#!/project/923 == Recent Activity == Project updates: * Renat Akhmerov has replaced Dougal Matthews as Mistral PTL ** https://review.openstack.org/#/c/625537/ * The Sahara team added repositories for extracting their plugins ** https://review.openstack.org/#/c/628210/ * The murano-deployment repository was retired ** https://review.openstack.org/#/c/628860/ * The assert:supports-upgrade tag was added to watcher ** https://review.openstack.org/#/c/619470/ * The openstack-virtual-baremetal repository was added to the TripleO team ** https://review.openstack.org/#/c/631028/ * The charm-interface-cinder-backend repository was added to the OpenStack Charms project ** https://review.openstack.org/#/c/631251/ * The puppet-stackalytics repository has been retired ** https://review.openstack.org/#/c/631820/ Other updates: * Zane drafted a resolution about how we should choose which Python versions to track ** https://review.openstack.org/#/c/613145/ * Sean updated the PTI to reflect the runtimes supported for Stein ** https://review.openstack.org/#/c/611080/ * Thierry and Chris drafted a description
of the role of the TC ** https://review.openstack.org/#/c/622400/ * I added a "house rule" for approving simple documentation changes without requiring a full TC vote ** https://review.openstack.org/#/c/625005/ * Jeremy updated the technical vision to cover hiding implementation details ** https://review.openstack.org/#/c/628181/ == TC Meetings == The most recent TC meeting was on 3 January. Logs were sent to the mailing list after the meeting. * http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001376.html The next TC meeting will be 7 February @ 1400 UTC in #openstack-tc. See http://eavesdrop.openstack.org/#Technical_Committee_Meeting for details == Ongoing Discussions == Frank has proposed a list of translators as ATCs for the i18n team. * https://review.openstack.org/#/c/633398/ Ganshyam is working on updating the technical vision to cover feature discovery. * https://review.openstack.org/#/c/621516/ Jean-Philippe has proposed adding openSUSE to the list of supported platforms in the PTI. * https://review.openstack.org/#/c/633460/ == Contacting the TC == The Technical Committee uses a series of weekly "office hour" time slots for synchronous communication. We hope that by having several such times scheduled, we will have more opportunities to engage with members of the community from different timezones. Office hour times in #openstack-tc: - 09:00 UTC on Tuesdays - 01:00 UTC on Wednesdays - 15:00 UTC on Thursdays If you have something you would like the TC to discuss, you can add it to our office hour conversation starter etherpad at: https://etherpad.openstack.org/p/tc-office-hour-conversation-starters Many of us also run IRC bouncers which stay in #openstack-tc most of the time, so please do not feel that you need to wait for an office hour time to pose a question or offer a suggestion. You can use the string "tc-members" to alert the members to your question. 
You will find channel logs with past conversations at http://eavesdrop.openstack.org/irclogs/%23openstack-tc/ If you expect your topic to require significant discussion or to need input from members of the community other than the TC, please start a mailing list discussion on openstack-discuss at lists.openstack.org and use the subject tag "[tc]" to bring it to the attention of TC members. -- Doug From mriedemos at gmail.com Mon Jan 28 22:37:58 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 28 Jan 2019 16:37:58 -0600 Subject: [placement] update 19-03 In-Reply-To: References: Message-ID: <4923a703-b4de-5e84-ad2c-7fc0680cbb72@gmail.com> On 1/25/2019 6:49 AM, Chris Dent wrote: > ## Nested > > > * I did a review push on this today and the bottom half dozen or so patches are ready for another core to review so we can hopefully shorten that series a bit. There are nits throughout several of the changes but gibi can collect those and address in a follow up patch. -- Thanks, Matt From hongbin034 at gmail.com Tue Jan 29 02:22:33 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Mon, 28 Jan 2019 21:22:33 -0500 Subject: [Zun] No team meeting at Jan 29 and Feb 05 In-Reply-To: <154872833040.25837.13043020739376004643.launchpad@chaenomeles.canonical.com> References: <154872833040.25837.13043020739376004643.launchpad@chaenomeles.canonical.com> Message-ID: FYI ---------- Forwarded message --------- From: Ji.Wei <17366060309 at 163.com> Date: Mon, Jan 28, 2019 at 9:18 PM Subject: Zun irc meeting suspension notice To: hongbin Hello everyone: Zun's irc meeting will be suspended during the Chinese traditional festival, Chinese New Year (29 January 2019 and 05 February 2019). If you have any questions, please contact me (17366060309 at 163.com). 
Thank you for your contribution to Zun in the past year :) -- This message was sent from Launchpad by Ji.Wei (https://launchpad.net/~jiwei) to each member of the zun-drivers team using the "Contact this team" link on the zun-drivers team page (https://launchpad.net/~zun-drivers). For more information see https://help.launchpad.net/YourAccount/ContactingPeople -------------- next part -------------- An HTML attachment was scrubbed... URL: From liliueecg at gmail.com Tue Jan 29 03:52:28 2019 From: liliueecg at gmail.com (Li Liu) Date: Mon, 28 Jan 2019 22:52:28 -0500 Subject: [Cyborg] No IRC meeting on Jan 29 and Feb 05 Message-ID: Hi Team, The IRC meetings for the next couple weeks will be canceled due to Chinese New Year and resumed on the week of Feb 12th. Happy Chinese New Year! 新年快乐! -- Thank you Regards Li Liu -------------- next part -------------- An HTML attachment was scrubbed... URL: From igor.duarte.cardoso at intel.com Tue Jan 29 07:25:58 2019 From: igor.duarte.cardoso at intel.com (Duarte Cardoso, Igor) Date: Tue, 29 Jan 2019 07:25:58 +0000 Subject: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode Message-ID: Hi Neutron, I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer. I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3. 1. Should OVS support also be added to the legacy router? 
And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations? 2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently). 3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)? We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``. We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck. [1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr [2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536 [3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:merged) [4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29 [5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17 Thank you! Best regards, Igor D.C. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chkumar246 at gmail.com Tue Jan 29 07:42:13 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Tue, 29 Jan 2019 13:12:13 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update VIII - Jan 29, 2019 Message-ID: Hello, Here is the 8 th update (Jan 22 to Jan 29, 2019) on collaboration on os_tempest[1] role between TripleO and OpenStack-Ansible projects. 
Things got merged: openstack-ansible-tests: * Setup clouds.yaml on tempest node - https://review.openstack.org/631794 * Gather all facts while preparing hosts - https://review.openstack.org/632609 * Gather different port status on different hosts - https://review.openstack.org/633179 TripleO: * Added requirements for integrating os_tempest role - https://review.openstack.org/628421 * Use os_tempest for running tempest on standalone - https://review.openstack.org/628415 Summary: We have almost completed the work of integrating os_tempest with tripleo-ci, The final patch is about to land by EOD or by tomorrow and here is the latest result: http://logs.openstack.org/00/627500/76/check/tripleo-ci-centos-7-standalone-os-tempest/099a747/logs/stestr_results.html Things In-Progress: os_tempest: * Added tempest.conf for heat_plugin - https://review.openstack.org/632021 * Use tempest_cloud_name in tempestconf - https://review.openstack.org/631708 * Always generate stackviz irrespective of tests pass or fail - https://review.openstack.org/631967 * Update cirros image from 3.5 to 3.6 - https://review.openstack.org/633208 * Added dependencies of os_tempest role - https://review.openstack.org/632726 * Only init a workspace if doesn't exists - https://review.openstack.org/633549 * Add support for aarch64 images - https://review.openstack.org/620032 * Adds tempest run command with --test-list option - https://review.openstack.org/631351 * Add telemetry distro plugin install for aodh - https://review.openstack.org/632125 TripleO: * Set keystone specific vars for os_tempest from clouds.yaml - https://review.openstack.org/633185 * Run tempest using os_tempest role in standalone job - https://review.openstack.org/627500 Summary: Currently, os_tempest tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basic_ops tests are failing while ssh into cirros image with ssh timeout only on CentOS7. It might be a floating Ip/networking issue. Any help on this appreciated. 
Failure: http://logs.openstack.org/08/633208/8/check/openstack-ansible-functional-centos-7/a8dcd57/logs/openstack/tempest1/stestr_results.html We also found other issues in the CentOS 7 jobs; work is in progress to fix them: * Use venv_packages_to_symlink to symlink to import libvirt-python - https://review.openstack.org/633474 * Ensure selinux bindings are linked into the venv - https://review.openstack.org/#/c/633513/ Goal of this week: We will be working on fixing the os_tempest gate and getting the above patches merged. Thanks to jrosser and odyssey4me for helping us debug the CentOS 7 issues while trying to reproduce the issue locally. Here is the 7th update [2]. Have queries? Feel free to ping us on the #tripleo or #openstack-ansible channel. Links: [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest [2.] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001946.html Thanks, Chandan Kumar From eumel at arcor.de Tue Jan 29 07:51:54 2019 From: eumel at arcor.de (Frank Kloeker) Date: Tue, 29 Jan 2019 08:51:54 +0100 Subject: [all] Is the Denver Summit safe? Message-ID: Good morning, just a question, out of curiosity: Is the Denver Summit safe? I mean, we received a lot of breaking news this morning that Huawei has been charged in court and is accused of doing a lot of bad things (e.g. stealing a robotic arm). I don't want to bring in any political discussions. But as you surely know, Huawei is one of the top contributors to our Open Source project. They work hard and do not need to steal things. Is there any chance that one of our friends will be caught next? At the Summit, between political fronts? I feel a little uncomfortable with it. We are an open community and should clarify this inconsistency.
Kind regards, Frank From skaplons at redhat.com Tue Jan 29 07:52:06 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 29 Jan 2019 08:52:06 +0100 Subject: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode In-Reply-To: References: Message-ID: <89C38C03-E9FF-4DD3-9264-17AB77F80353@redhat.com> Hi, > Message written by Duarte Cardoso, Igor on 29.01.2019 at 08:25: > > Hi Neutron, > > I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS for longer. > > I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. This can impact how long it's going to take to complete the work and whether it can make it to stein-3. > > 1. Should OVS support also be added to the legacy router? > And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS or kernel) instead of creating more combinations? IMHO a new config option could be better. Then you can have agent_mode like it is now and a new "switch" to change between the OVS and kernel backends. We can of course forbid some combinations at the beginning and add support for them later if that would be necessary. > > 2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the unit tests? (since the l3_agent.ini needs to be configured differently). I think that we should propose a new neutron-tempest-plugin scenario job (probably based on neutron-tempest-plugin-dvr-multinode-scenario) but with DVR configured in this new way.
That should be enough for the beginning IMO. Of course some unit/functional tests should also be added to your patch :) > > 3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending on whether they were created as ``distributed``)? > We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only kernel-based RouterInfos depending on the ``agent_mode``. Please keep in mind that there is a spec about refactoring RouterInfo to make it less coupled with the L3 agent's code. It's in [1]. Maybe you can work on this together :) > > We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor one should be able to pass Zuul after a recheck. > > [1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr > [2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536 > [3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:merged) > [4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29 > [5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17 > > Thank you! > > Best regards, > Igor D.C. [1] https://review.openstack.org/#/c/625647/ — Slawek Kaplonski Senior software engineer Red Hat From jean-philippe at evrard.me Tue Jan 29 08:22:15 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 29 Jan 2019 09:22:15 +0100 Subject: [openstack-helm] Support for Docker Registry with authentication turned on ? In-Reply-To: <9ACD444D-18D0-48FA-803A-38FF561DA32C@windriver.com> References: <9ACD444D-18D0-48FA-803A-38FF561DA32C@windriver.com> Message-ID: On Tue, 2019-01-22 at 12:35 +0000, Waines, Greg wrote: > Hey ... We’re relatively new to openstack-helm. 
> > We are trying to use the openstack-helm charts with a Docker Registry > that has token authentication turned on. > With the current charts, there does not seem to be a way to do this. > I.e. there is not an ‘imagePullSecrets’ in the defined > pods/containers or in the defined serviceAccounts . > Our thinking would be to add a default imagePullSecret to all of the > serviceAccounts defined in the openstack-helm serviceaccount > template. > > OR is there another way to use openstack-helm charts with a Docker > Registry with authentication turned on ? > > Any info is appreciated, > Greg / Angie / Jerry. Hello, Did you get an answer there? Could you post it to the ML, please? Regards, Jean-Philippe Evrard (evrardjp) From alfredo.deluca at gmail.com Tue Jan 29 10:12:53 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 29 Jan 2019 11:12:53 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Hi all. Anyone got the same issue with Magnum? Cheers On Mon, Jan 28, 2019 at 3:24 PM Alfredo De Luca wrote: > Hi all. > I finally instaledl successufully openstack ansible (queens) but, after > creating a cluster template I create k8s cluster, it stuck on > > > kube_masters > > b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 > > OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate > in progress....and after around an hour it says...time out. k8s master > seems to be up.....at least as VM. > > any idea? > > > > > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemens.hardewig at crandale.de Tue Jan 29 10:34:08 2019 From: clemens.hardewig at crandale.de (=?utf-8?Q?Clemens_Hardewig?=) Date: Tue, 29 Jan 2019 10:34:08 +0000 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Yes, you should check the cloud-init logs of your master. 
Without having seen them, I would guess a network issue, or perhaps you have selected a flavor using swap for your minion nodes ... So, log files are the first step you could dig into... Br c Sent from my iPhone On 28.01.2019 at 15:34, Alfredo De Luca wrote: Hi all. I finally installed openstack ansible (queens) successfully but, after creating a cluster template and creating a k8s cluster, it is stuck on  kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changed create in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM.  any idea?   Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Tue Jan 29 11:07:01 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 29 Jan 2019 12:07:01 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: thanks Clemens. I looked at the cloud-init-output.log on the master... and at the moment is doing the following.... ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 Network ....could be but not sure where to look at On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < clemens.hardewig at crandale.de> wrote: > Yes, you should check the cloud-init logs of your master. Without having > seen them, I would guess a network issue, or perhaps you have selected a > flavor using swap for your minion nodes ... > So, log files are the first step you could dig into... > Br c > Sent from my iPhone > > On 28.01.2019 at 15:34, Alfredo De Luca wrote: > > Hi all. 
> I finally instaledl successufully openstack ansible (queens) but, after > creating a cluster template I create k8s cluster, it stuck on > > > kube_masters > > b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 > > OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate > in progress....and after around an hour it says...time out. k8s master > seems to be up.....at least as VM. > > any idea? > > > > > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at stackhpc.com Tue Jan 29 11:14:06 2019 From: doug at stackhpc.com (Doug Szumski) Date: Tue, 29 Jan 2019 11:14:06 +0000 Subject: [kayobe] Proposing Pierre Riteau for core In-Reply-To: References: Message-ID: <62dda16e-0581-2de3-8f45-3b30510f0a28@stackhpc.com> On 25/01/2019 17:23, Mark Goddard wrote: > Hi, > > I'd like to propose Pierre Riteau (priteau) for core. He has > contributed a number of good patches and provided some thoughtful and > useful reviews. > > Cores, please respond +1 or -1. +1 for Pierre, it will be great to have him on the team. > > Mark From clemens.hardewig at crandale.de Tue Jan 29 11:16:38 2019 From: clemens.hardewig at crandale.de (=?utf-8?Q?Clemens_Hardewig?=) Date: Tue, 29 Jan 2019 11:16:38 +0000 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Yea, this means waiting for something... it will continue forever .... look to the last messages before this log sequence starts ... Von meinem iPhone gesendet Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >: thanks Clemens. I looked at the cloud-init-output.log  on the master... and at the moment is doing the following.... 
++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 Network ....could be but not sure where to look at On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig > wrote: Yes, you should check the cloud-init logs of your master. Without having seen them, I would guess a network issue or you have selected for your minion nodes a flavor using swap perhaps ... So, log files are the first step you could dig into... Br c Von meinem iPhone gesendet Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >: Hi all. I finally instaledl successufully openstack ansible (queens) but, after creating a cluster template I create k8s cluster, it stuck on  kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM.  any idea?   Alfredo -- Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: From clemens.hardewig at crandale.de Tue Jan 29 11:23:09 2019 From: clemens.hardewig at crandale.de (=?utf-8?Q?Clemens_Hardewig?=) Date: Tue, 29 Jan 2019 11:23:09 +0000 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: At least on fedora there is a second cloud Init log as far as I remember-Look into both  Br c Von meinem iPhone gesendet Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >: thanks Clemens. I looked at the cloud-init-output.log  on the master... and at the moment is doing the following.... 
++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 Network ....could be but not sure where to look at On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig > wrote: Yes, you should check the cloud-init logs of your master. Without having seen them, I would guess a network issue or you have selected for your minion nodes a flavor using swap perhaps ... So, log files are the first step you could dig into... Br c Von meinem iPhone gesendet Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >: Hi all. I finally instaledl successufully openstack ansible (queens) but, after creating a cluster template I create k8s cluster, it stuck on  kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM.  any idea?   Alfredo -- Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 29 12:14:38 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 29 Jan 2019 12:14:38 +0000 Subject: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode In-Reply-To: <89C38C03-E9FF-4DD3-9264-17AB77F80353@redhat.com> References: <89C38C03-E9FF-4DD3-9264-17AB77F80353@redhat.com> Message-ID: On Tue, 2019-01-29 at 08:52 +0100, Slawomir Kaplonski wrote: > Hi, > > > Wiadomość napisana przez Duarte Cardoso, Igor w dniu 29.01.2019, o godz. 
08:25: > > > > Hi Neutron, > > > > I've been internally collaborating on the ``dvr_bridge`` L3 agent mode [1][2][3] work (David Shaughnessy, Xubo > > Zhang), which allows the L3 agent to make use of Open vSwitch / OpenFlow to implement ``distributed`` IPv4 Routers > > thus bypassing kernel namespaces and iptables and opening the door for higher performance by keeping packets in OVS > > for longer. > > > > I want to share a few questions in order to gather feedback from you. I understand parts of these questions may have > > been answered in the past before my involvement, but I believe it's still important to revisit and clarify them. > > This can impact how long it's going to take to complete the work and whether it can make it to stein-3. > > > > 1. Should OVS support also be added to the legacy router? > > And if so, would it make more sense to have a new variable (not ``agent_mode``) to specify what backend to use (OVS > > or kernel) instead of creating more combinations? > > IMHO a new config option could be better. Then you can have agent_mode like it is now and a new "switch" to change between > OVS and kernel backends. We can of course forbid some combinations at the beginning and add support for them > later if that would be necessary. I would like to see it implemented in the legacy router case too. There will be little extra code required to do so, and it will make testing the shared code simpler. When this feature was first conceived for Icehouse it was targeting replacing the legacy router, and it was later extended to be an alternative to DVR. But as suggested above, a new config option sounds like a good way to go. > > > > > 2. What is expected in terms of CI for this? Regarding testing, what should this first patch include apart from the > > unit tests? (since the l3_agent.ini needs to be configured differently). 
> > I think that we should propose a new neutron-tempest-plugin scenario job (based on neutron-tempest-plugin-dvr-multinode-scenario probably) but with DVR configured in this new way. That should be enough for the beginning IMO. > Of course some unit/functional tests should also be added to your patch :) When this was proposed a few cycles ago the expectation in testing was fullstack tests + unit and functional, the intent being to not need another job in the gate for a different routing mode. However, if the neutron team is open to adding a tempest job for this configuration then that is obviously better. From my understanding of the feature this can be tested entirely upstream, but it may be nice to add testing with DPDK via the Intel NFV CI, which I believe still runs on neutron changes. It should be relatively simple to change the agent mode to dvr_bridge, or whatever the new option is, for the existing job. I am creating a personal replacement for the NFV CI for nova that will be doing some OVS-DPDK testing also, so I can look into enabling this feature to get indirect testing, depending on capacity. > > > > > 3. What problems can be anticipated by having the same agent managing both kernel and OVS powered routers (depending > > on whether they were created as ``distributed``)? > > We are experimenting with different ways of decoupling RouterInfo (mainly as part of the L3 agent refactor patch) > > and haven't been able to find the right balance yet. On one end we have an agent that is still coupled with kernel-based RouterInfo, and on the other end we have an agent that either only accepts OVS-based RouterInfos or only > > kernel-based RouterInfos depending on the ``agent_mode``. > > Please keep in mind that there is a spec about refactoring RouterInfo to make it less coupled with the L3 agent’s code. It’s in > [1]. Maybe you can work on this together :) > > > > > We'd also appreciate reviews on the 2 patches [4][5]. 
The L3 refactor one should be able to pass Zuul after a recheck. > > > > [1] Spec: https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr > > [2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536 > > [3] Gerrit topic: https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:merged) > > [4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29 > > [5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17 > > > > Thank you! > > > > Best regards, > > Igor D.C. > > [1] https://review.openstack.org/#/c/625647/ > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > From ignaziocassano at gmail.com Tue Jan 29 13:26:06 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 29 Jan 2019 14:26:06 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Hello Alfredo, is your external network using a proxy? If you are using a proxy and you configured it in the cluster template, you must set no proxy for 127.0.0.1. Ignazio On Tue, Jan 29, 2019 at 12:26 Clemens Hardewig < clemens.hardewig at crandale.de> wrote: > At least on fedora there is a second cloud-init log as far as I > remember - look into both > > Br c > > Sent from my iPhone > > On 29.01.2019 at 12:08, Alfredo De Luca wrote: > > thanks Clemens. > I looked at the cloud-init-output.log on the master... and at the moment > is doing the following.... > > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > > Network ....could be but not sure where to look at > > > On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < > clemens.hardewig at crandale.de> wrote: > >> Yes, you should check the cloud-init logs of your master. 
Without having >> seen them, I would guess a network issue or you have selected for your >> minion nodes a flavor using swap perhaps ... >> So, log files are the first step you could dig into... >> Br c >> Von meinem iPhone gesendet >> >> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca > >: >> >> Hi all. >> I finally instaledl successufully openstack ansible (queens) but, after >> creating a cluster template I create k8s cluster, it stuck on >> >> >> kube_masters >> >> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >> >> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >> in progress....and after around an hour it says...time out. k8s master >> seems to be up.....at least as VM. >> >> any idea? >> >> >> >> >> *Alfredo* >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Tue Jan 29 14:03:04 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Tue, 29 Jan 2019 14:03:04 +0000 Subject: [placement] update 19-03 In-Reply-To: References: Message-ID: <20190129140304.pnd6wuqw3wvr2kxr@lyarwood.usersys.redhat.com> On 25-01-19 12:49:33, Chris Dent wrote: > > [..] > > Deployment related changes: > > * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) Just to add some more colour here, the following tripleo-heat-templates change that is ultimately responsible for deploying an extracted Placement service in TripleO is now passing CI and ready for serious review: placement: Extract the service from Nova https://review.openstack.org/#/c/630644/ I still have to find time to dig into the RDO failures but I assume this is due to the outstanding tripleo-common and puppet-tripleo changes not being used by these jobs. 
On the puppet side, I'm reworking the puppet-nova removal change to use puppet-placement, fixing the currently broken puppet-tripleo tests and almost ready to land the puppet-openstack-integration change now the required pyvers changes have mostly merged. I'm very confident at this point of having this all landed by the next check-in meeting and should also be able to make a start with the upgrade tasks later this week ahead of schedule. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: From Greg.Waines at windriver.com Tue Jan 29 14:05:22 2019 From: Greg.Waines at windriver.com (Waines, Greg) Date: Tue, 29 Jan 2019 14:05:22 +0000 Subject: [Edge-computing] [keystone] x509 authentication In-Reply-To: References: Message-ID: <1677A2C3-F648-4611-8332-474938ECC677@windriver.com> Hey Lance, I like the plan. Just a clarifying question on “Although, it doesn't necessarily solve the network partition issue.” . * I’m assuming this is in a scenario where after the network partition the Edge Cloud and local client(s) do not have access to the Identity Provider ? * And in this case, it doesn’t work because ? * For a new local client (without any cached tokens), even if there are local shadow users already configured, the authentication still requires communication with the Identity Provider ? Is this correct ? Greg. From: Lance Bragstad Date: Friday, January 25, 2019 at 2:16 PM To: "edge-computing at lists.openstack.org" , "openstack-discuss at lists.openstack.org" Subject: [Edge-computing] [keystone] x509 authentication Hi all, We've been going over keystone gaps that need to be addressed for edge use cases every Tuesday. Since Berlin, Oath has open-sourced some of their custom authentication plugins for keystone that help them address these gaps. 
The basic idea is that users authenticate to some external identity provider (Athenz in Oath's case), and then present an Athenz token to keystone. The custom plugins decode the token from Athenz to determine the user, project, roles assignments, and other useful bits of information. After that, it creates any resources that don't exist in keystone already. Ultimately, a user can authenticate against a keystone node and have specific resources provisioned automatically. In Berlin, engineers from Oath were saying they'd like to move away from Athenz tokens altogether and use x509 certificates issued by Athenz instead. The auto-provisioning approach is very similar to a feature we have in keystone already. In Berlin, and shortly after, there was general agreement that if we could support x509 authentication with auto-provisioning via keystone federation, that would pretty much solve Oath's use case without having to maintain custom keystone plugins. Last week, Colleen started digging into keystone's existing x509 authentication support. I'll start with the good news, which is x509 authentication works, for the most part. It's been a feature in keystone for a long time, and it landed after we implemented federation support around the Kilo release. Chances are there won't be a need for a keystone specification like we were initially thinking in the edge meetings. Unfortunately, the implementation for x509 authentication has outdated documentation, is extremely fragile, hard to set up, and hasn't been updated with improvements we've made to the federation API since the original implementation (like shadow users or auto-provisioning, which work with other federated protocols like OpenID Connect and SAML). We've started tracking the gaps with bugs [0] so that we have things written down. I think the good thing is that once we get this cleaned up, we'll be able to re-use some of the newer federation features with x509 authentication/federation. 
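As a rough illustration of the auto-provisioning idea described above (this is only a sketch, not an excerpt from keystone's documentation: the remote attribute names such as SSL_CLIENT_S_DN_CN depend on how the web server exposes the client certificate, and the role name is hypothetical), a federation mapping for x509 might look like:

```json
[
    {
        "local": [
            {"user": {"name": "{0}"}},
            {
                "projects": [
                    {"name": "{1}", "roles": [{"name": "member"}]}
                ]
            }
        ],
        "remote": [
            {"type": "SSL_CLIENT_S_DN_CN"},
            {"type": "SSL_CLIENT_S_DN_O"}
        ]
    }
]
```

Here {0} and {1} would substitute the certificate's CN and O attributes, so a user presenting a certificate issued by the identity provider would be shadowed on first login and given a role on a project named after their organization. 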
These updates would make x509 a first-class federated protocol. The approach, pending the bug fixes, would remove the need for Oath's custom authentication plugins. It could be useful for edge deployments, or even deployments with many regions, by allowing users to be auto-provisioned in each region. Although, it doesn't necessarily solve the network partition issue. Now that we have an idea of where to start and some bug reports [0], I'm wondering if anyone is interested in helping with the update or refactor. Because this won't require a specification, we can get started on it sooner, instead of having to wait for Train development and a new specification. I'm also curious if anyone has comments or questions about the approach. Thanks, Lance [0] https://bugs.launchpad.net/keystone/+bugs?field.tag=x509 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue Jan 29 14:21:14 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 29 Jan 2019 09:21:14 -0500 Subject: [ops] ops meetups team meeting 2019-1-22 minutes Message-ID: Meeting ended Tue Jan 22 15:48:39 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 10:48 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-22-15.09.html 10:48 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-22-15.09.txt 10:48 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-01-22-15.09.log.html Next meeting in 40 minutes on #openstack-operators see you there! Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From d.lake at surrey.ac.uk Mon Jan 28 13:17:22 2019 From: d.lake at surrey.ac.uk (David Lake) Date: Mon, 28 Jan 2019 13:17:22 +0000 Subject: Issue with launching instance with OVS-DPDK Message-ID: Hello I've built an Openstack all-in-one using OVS-DPDK via Devstack. I can launch instances which use the "m1.small" flavour (which I have modified to include the hw:mem_size large as per the DPDK instructions) but as soon as I try to launch anything more than m1.small, I get this error: Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6-41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 25cfee28-08e9-419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu unexpectedly closed the monitor: 2019-01-28T12:56:48.127594Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: info: QEMU waiting for connection on: disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-01-28T12:56:49.251071Z qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/4-instance-00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM\n']#033[00m#033[00m My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. Build is the latest git clone of Devstack. Thanks David -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From naftinajeh94 at gmail.com Mon Jan 28 13:42:29 2019 From: naftinajeh94 at gmail.com (Najeh Nafti) Date: Mon, 28 Jan 2019 14:42:29 +0100 Subject: openstack-summit-Denver-2019 Message-ID: Dear OpenStack Project team, My name is Najeh Nafti. I am a master's degree student from Tunisia, North Africa. I'm currently working on a thesis project based on OpenStack, but I didn't find the resources I need to realize my project. I would like to request your approval to attend OpenStack-summit-Denver-2019 with your financial support. This event is a unique opportunity for me to gain the knowledge and insight I need to solve daily challenges and to help me contribute to the overall goals of my study. Although I understand there might be financial constraints in sending me to this event, I believe it will be an investment that results in immediate and longer-term benefits. My objectives in attending this event are: - To increase knowledge in my discipline area. - To network with my peers from all over the world to share information, learn how they are solving similar problems, and collaborate to find innovative approaches. OpenStack-summit-Denver-2019 would be a valuable experience for me and one that would benefit my thesis. Thank you for your consideration. Looking forward to hearing from you. Najeh. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Waines at windriver.com Tue Jan 29 13:49:36 2019 From: Greg.Waines at windriver.com (Waines, Greg) Date: Tue, 29 Jan 2019 13:49:36 +0000 Subject: [openstack-helm] Support for Docker Registry with authentication turned on ? In-Reply-To: References: <9ACD444D-18D0-48FA-803A-38FF561DA32C@windriver.com> Message-ID: <524FB672-6962-4860-B857-7E931F9BACAF@windriver.com> I had the following discussion with the openstack-helm guys on their IRC channel during their ‘office hours’. Our plan is to write up a SPEC for this in openstack-helm. [10:48:56] hey there ... 
general question on the topic of interworking with a Docker Registry with authentication turned on [10:49:07] Has anyone looked at how to extend the helm-toolkit function to support docker registry credentials ? [10:49:22] e.g. we were thinking of adding an optional imagePullSecret entry in the serviceAccount template ? [10:49:31] Although don't understand how we could put this in an 'optional' manner ? [10:49:37] Any thoughts ? [11:30:29] hey GregWaines -- it could be handled as optional by wrapping that section of the template in a conditional. we do that for other optional fields, like tolerations on daemonsets [11:30:33] let me grab a link [11:31:10] https://github.com/openstack/openstack-helm-infra/blob/master/fluent-logging/templates/daemonset-fluent-bit.yaml#L96-L98 [11:33:22] the other option we just experimented with .... [11:33:49] if you ALWAYS put in the ImagePullSecret in the serviceAccount template ... with a well-known secret name [11:34:18] then it appears that this STILL works with a Registry with noauth ....if the secret does not exist or even if the secret exists [11:34:40] ... and then would also work with a Registry with auth turned on ... as long as the secret exists with the proper credentials [11:35:08] would that be acceptable upstream ? [11:35:37] i.e. would require no change to upstream operational model if using noauth Registry [11:36:04] but if using a tokenAuth Registry ... would require that user first create that secret and then apply the helm charts [11:51:18] srwilkers: we looked at doing something similar to your example .... but in the serviceAccount template, I think the only env variables that can be checked are from the specific helm chart ... and there really isn't a variable common across all helm charts that we could use [11:55:59] GregWaines: well, this would require adding something common across all charts to take advantage of. 
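To make the conditional idea from the log above concrete (purely a sketch, not current chart content: the values key images.pull_secret and the name variable are assumptions for illustration), an optional imagePullSecrets entry on the serviceAccount template could look roughly like:

```yaml
# Hypothetical sketch of an optional imagePullSecrets entry.
# images.pull_secret is an assumed values key, not part of the charts today;
# when it is unset, the rendered ServiceAccount is unchanged.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ $saName }}
{{- if .Values.images.pull_secret }}
imagePullSecrets:
  - name: {{ .Values.images.pull_secret }}
{{- end }}
```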
ideally, this would start small (ie, create a helm-toolkit function, then added it to a chart as a RFC upstream), then once proved out it could be rolled out across the rest of the charts [11:56:10] preferably, something under the current images: key in the charts probably [11:59:06] srwilkers: k, thanks for your input ... we'll probably work on suggesting something upstream in a SPEC in the near future [11:59:26] i think that might be the best way forward GregWaines :) [11:59:43] let me know when you're ready to throw a spec up and want some eyes on it [12:47:25] srwilkers: will do. Greg. From: Jean-Philippe Evrard Date: Tuesday, January 29, 2019 at 3:22 AM To: Greg Waines , "openstack-discuss at lists.openstack.org" Cc: "Wang, Jing (Angie)" Subject: Re: [openstack-helm] Support for Docker Registry with authentication turned on ? On Tue, 2019-01-22 at 12:35 +0000, Waines, Greg wrote: Hey ... We’re relatively new to openstack-helm. We are trying to use the openstack-helm charts with a Docker Registry that has token authentication turned on. With the current charts, there does not seem to be a way to do this. I.e. there is not an ‘imagePullSecrets’ in the defined pods/containers or in the defined serviceAccounts . Our thinking would be to add a default imagePullSecret to all of the serviceAccounts defined in the openstack-helm serviceaccount template. OR is there another way to use openstack-helm charts with a Docker Registry with authentication turned on ? Any info is appreciated, Greg / Angie / Jerry. Hello, Did you get an answer there? Could you post it to the ML, please? Regards, Jean-Philippe Evrard (evrardjp) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mrhillsman at gmail.com Tue Jan 29 14:49:45 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Tue, 29 Jan 2019 08:49:45 -0600 Subject: openstack-summit-Denver-2019 In-Reply-To: References: Message-ID: Hi Najeh, Glad to see your interest in OpenStack and I hope you get to the summit. You can apply for travel support here - https://openstackfoundation.formstack.com/forms/travelsupportdenver On Tue, Jan 29, 2019 at 8:42 AM Najeh Nafti wrote: > Dear OpenStack Project team, > > My name is Najeh Nafti. I am a master degree student from Tunisia North > Africa. > I'm currently working for a thesis project based on OpenStack, but i > didn't find the need resources that can help me to realize my project. > > I would like to request your approval to attend > OpenStack-summit-Denver-2019 with your financial support. This event is a > unique opportunity for me to gain knowledge and insight I need to solve > daily challenges and to help me contribute to the overall goals of my study. > > Although I understand there might be financial constraints in sending me > to this event, I believe it will be an investment that results in immediate > and longer term benefits. My objectives in attending this event are: > > - > > To increase knowledge in my discipline area. > > > - > > To network with my peers from all over the world to share information, > learn how they are solving similar problems, and collaborate to find > innovative approaches. > > > OpenStack-summit-Denver-2019 would be a valuable experience for me and > one that would benefit my thesis. > > > Thank you for your consideration. > > Looking forward to hearing from you. > Najeh. > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Tue Jan 29 14:55:09 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 29 Jan 2019 14:55:09 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: Message-ID: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > Hello > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > I can launch instances which use the “m1.small” flavour (which I have modified to include the hw:mem_size large as per > the DPDK instructions) but as soon as I try to launch anything more than m1.small, I get this error: > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 25cfee28-08e9- > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu unexpectedly closed the monitor: 2019-01- > 28T12:56:48.127594Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: info: QEMU > waiting for connection on: disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-01-28T12:56:49.251071Z > qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/4-instance- > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages > available to allocate guest RAM\n']#033[00m#033[00m > > > My Hypervisor is reporting 
510.7GB of RAM and 61 vCPUs. How much of that RAM did you allocate as hugepages? Can you provide the output of cat /proc/meminfo? > > Build is the latest git clone of Devstack. > > Thanks > > David From alfredo.deluca at gmail.com Tue Jan 29 15:07:55 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 29 Jan 2019 16:07:55 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Hi Ignazio and Clemens. I haven't configured the proxy, and all the logs on the kube master keep saying the following: + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished [+]poststarthook/extensions/third-party-resources ok [-]poststarthook/rbac/bootstrap-roles failed: not finished healthz check failed' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished [+]poststarthook/extensions/third-party-resources ok [-]poststarthook/rbac/bootstrap-roles failed: not finished healthz check failed' ']' + sleep 5 Not sure what to do. My configuration is ... eth0 - 10.1.8.113 But the OpenStack configuration in terms of networking is the default from openstack-ansible, which is 172.29.236.100/22 Maybe that's the problem? On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano wrote: > Hello Alfredo, > your external network is using proxy ? > If you using a proxy, and yuo configured it in cluster template, you must > setup no proxy for 127.0.0.1 > Ignazio > > Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig < > clemens.hardewig at crandale.de> ha scritto: > >> At least on fedora there is a second cloud Init log as far as I >> remember-Look into both >> >> Br c >> >> Von meinem iPhone gesendet >> >> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca > >: >> >> thanks Clemens. >> I looked at the cloud-init-output.log on the master... and at the moment >> is doing the following....
>> >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> >> Network ....could be but not sure where to look at >> >> >> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < >> clemens.hardewig at crandale.de> wrote: >> >>> Yes, you should check the cloud-init logs of your master. Without having >>> seen them, I would guess a network issue or you have selected for your >>> minion nodes a flavor using swap perhaps ... >>> So, log files are the first step you could dig into... >>> Br c >>> Von meinem iPhone gesendet >>> >>> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >> >: >>> >>> Hi all. >>> I finally instaledl successufully openstack ansible (queens) but, after >>> creating a cluster template I create k8s cluster, it stuck on >>> >>> >>> kube_masters >>> >>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>> >>> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >>> in progress....and after around an hour it says...time out. k8s master >>> seems to be up.....at least as VM. >>> >>> any idea? >>> >>> >>> >>> >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From will at stackhpc.com Tue Jan 29 15:15:59 2019 From: will at stackhpc.com (William Szumski) Date: Tue, 29 Jan 2019 15:15:59 +0000 Subject: [kayobe] Proposing Pierre Riteau for core In-Reply-To: <62dda16e-0581-2de3-8f45-3b30510f0a28@stackhpc.com> References: <62dda16e-0581-2de3-8f45-3b30510f0a28@stackhpc.com> Message-ID: I'm also +1 for this. He's been active in reviews and patches for some time now. On Tue, 29 Jan 2019 at 11:19, Doug Szumski wrote: > > On 25/01/2019 17:23, Mark Goddard wrote: > > Hi, > > > > I'd like to propose Pierre Riteau (priteau) for core. 
He has > > contributed a number of good patches and provided some thoughtful and > > useful reviews. > > > > Cores, please respond +1 or -1. > +1 for Pierre, it will be great to have him on the team. > > > > Mark > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 29 15:25:04 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 29 Jan 2019 16:25:04 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: Hello, in openstack there are a lot of newtroks . Your 172.29 blabla bla network is probably the network where openstack endpoint are exposed , right ? If yes, that is not the network where virtual machine are attached. In your openstack ym must have also networks for virtual machines. When you create a magnum cluster, yum must specify an external netowrok used by virtual machine for download packages from internet and to be contacted . Magnum create private netowrk (probablly your 10.1.8 network) which is connected to the external network by a virtual router created by magnum heat template. Try to see your network topology in openstack dashboard. Ignazio Il giorno mar 29 gen 2019 alle ore 16:08 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Ignazio and Clemens. 
I haven\t configure the proxy and all the logs on > the kube master keep saying the following > > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > > Not sure what to do. > My configuration is ... > eth0 - 10.1.8.113 > > But the openstack configration in terms of networkin is the default from > ansible-openstack which is 172.29.236.100/22 > > Maybe that's the problem? > > > > > > > On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano > wrote: > >> Hello Alfredo, >> your external network is using proxy ? >> If you using a proxy, and yuo configured it in cluster template, you must >> setup no proxy for 127.0.0.1 >> Ignazio >> >> Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig < >> clemens.hardewig at crandale.de> ha scritto: >> >>> At least on fedora there is a second cloud Init log as far as I >>> remember-Look into both >>> >>> Br c >>> >>> Von meinem iPhone gesendet >>> >>> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >> >: >>> >>> thanks Clemens. >>> I looked at the cloud-init-output.log on the master... and at the >>> moment is doing the following.... 
>>> >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> >>> Network ....could be but not sure where to look at >>> >>> >>> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < >>> clemens.hardewig at crandale.de> wrote: >>> >>>> Yes, you should check the cloud-init logs of your master. Without >>>> having seen them, I would guess a network issue or you have selected for >>>> your minion nodes a flavor using swap perhaps ... >>>> So, log files are the first step you could dig into... >>>> Br c >>>> Von meinem iPhone gesendet >>>> >>>> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca < >>>> alfredo.deluca at gmail.com>: >>>> >>>> Hi all. >>>> I finally instaledl successufully openstack ansible (queens) but, after >>>> creating a cluster template I create k8s cluster, it stuck on >>>> >>>> >>>> kube_masters >>>> >>>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>>> >>>> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >>>> in progress....and after around an hour it says...time out. k8s master >>>> seems to be up.....at least as VM. >>>> >>>> any idea? >>>> >>>> >>>> >>>> >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pabelanger at redhat.com Tue Jan 29 15:53:26 2019 From: pabelanger at redhat.com (Paul Belanger) Date: Tue, 29 Jan 2019 10:53:26 -0500 Subject: [infra] fedora-29 node failures Message-ID: <20190129155326.GA23049@localhost.localdomain> Morning, Jobs depending on fedora-latest (fedora-29 in this case) look to be in a broken state currently. Since this past weekend, we seem to be hitting an issue with NetworkManager and D-Bus.
From what I can see, we are only currently launching fedora-29 nodes in limestone (resulting in long wait times), and when they come online, they seem to fail quickly with DNS-related issues. If you are interested in these jobs, please join #openstack-infra and aid in debugging them. Thanks, Paul From mihalis68 at gmail.com Tue Jan 29 15:56:07 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 29 Jan 2019 10:56:07 -0500 Subject: [all] Is the Denver Summit save? In-Reply-To: References: Message-ID: Frank, just to clarify, is your question "is it safe for us to go to denver?" i.e. Deutsche Telekom employees, given that the US just made very serious allegations against a fellow OpenStack-contributing telecom company? Chris On Tue, Jan 29, 2019 at 2:56 AM Frank Kloeker wrote: > Good morning, > > just a question, out of curiosity: Is the Denver Summit save? I mean, we > received a lot of breaking news in the morning, that Huawei is charged > in court and is doing a lot of bad things (i.e. stolen a robotics arm). > I don't want to bring in any political discussions. But surely as you > know, Huawei is one of the top contributor to our Open Source project. > They work hard and do not need to steal things. Is there any chance that > one of our friends will be caught next? At the summit between political > fronts? I feel a little uncomfortable with it. > We are an open community and should clarify this inconsistency. > > kind regards > > Frank > > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Tue Jan 29 16:09:47 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 29 Jan 2019 11:09:47 -0500 Subject: [openstack-ansible] bug squash day! Message-ID: Hi team, As you may have noticed, bug triage during our meetings has been something that has kinda killed attendance (really, no one seems to enjoy it, believe it or not!)
I wanted to propose that we take a day to go through as many bugs as possible, triaging and fixing as much as we can. It'd be a fun day and we can also hop on a higher-bandwidth channel to talk about this stuff while we grind through it all. Is this something that people are interested in? If so, are there any times/days that work better in the week to organize? Thanks! Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From chris at openstack.org Tue Jan 29 17:03:50 2019 From: chris at openstack.org (Chris Hoge) Date: Tue, 29 Jan 2019 09:03:50 -0800 Subject: [baremetal-sig][ironic] Proposing Formation of Bare Metal SIG Message-ID: <4191B2EA-A6F0-4183-B0EF-C5C013E3A982@openstack.org> This morning at the OpenStack Foundation board meeting we announced that we will be focusing more of our open infrastructure messaging efforts on bare metal, and on the promotion of Ironic in particular. To support these efforts, I'd like to form a Bare Metal SIG to bring together community members to collaborate on this work. The purpose of the Bare Metal SIG will be to promote the development and use of Ironic and other OpenStack bare-metal software. This will include marketing efforts like case studies of Ironic clusters in industry and academia, supporting integration of Ironic with projects like Airship and the Kubernetes Cluster API, coordinating presentations for industry events, developing documentation and tutorials, gathering feedback from the community on usage and feature gaps, and other broader community-facing efforts to encourage the adoption of Ironic as a bare-metal management tool. If you would like to participate, please indicate your interest in the linked planning etherpad. Ideally we would like to have broad engagement from across the community, from developers and practitioners alike.
We'd like to highlight all of the efforts and usage across our community and communicate how powerful Ironic is for hardware management. https://etherpad.openstack.org/p/bare-metal-sig Thanks in advance to everyone. I’ve been using Ironic to manage my home cluster for quite a while now, and I’m really excited to be working on the SIG and supporting the efforts of the Ironic team and the users who are running Ironic in production. Chris Hoge Strategic Program Manager OpenStack Foundation From anteaya at anteaya.info Tue Jan 29 17:51:31 2019 From: anteaya at anteaya.info (Anita Kuno) Date: Tue, 29 Jan 2019 12:51:31 -0500 Subject: [all] Is the Denver Summit save? In-Reply-To: References: Message-ID: <6166c276-7986-c857-2adc-9d66a8f38a60@anteaya.info> On 2019-01-29 10:56 a.m., Chris Morgan wrote: > Frank, just to clarify, is your question "is it safe for us to go to > denver?" i.e. Deutsch Telekom employees, given that the US just made very > serious allegations against a fellow open stack contributing telecom > company? > > Chris > > On Tue, Jan 29, 2019 at 2:56 AM Frank Kloeker wrote: > >> Good morning, >> >> just a question, out of curiosity: Is the Denver Summit save? I mean, we >> received a lot of breaking news in the morning, that Huawei is charged >> in court and is doing a lot of bad things (i.e. stolen a robotics arm). >> I don't want to bring in any political discussions. But surely as you >> know, Huawei is one of the top contributor to our Open Source project. >> They work hard and do not need to steal things. Is there any chance that >> one of our friends will be caught next? At the summit between political >> fronts? I feel a little uncomfortable with it. >> We are an open community and should clarify this inconsistency. >> >> kind regards >> >> Frank >> >> > One of the things I think we can agree on in an open source community is the importance of making decisions about our own behaviour as individuals, based on facts.
I don't have an answer to the question posed. I do have some suggestions for those interested on how to access facts in this matter. I find that reading multiple news sources on a given issue to be very helpful as I try to understand the full picture. In this matter, I find that reading Canadian news sources, cbc.ca/news, thestar.com (you will be prompted to subscribe, you don't have to subscribe), and globeandmail.com (some articles are for subscribers only) to be very useful. The Canadian news site nationalpost.com used to be a source I read often, but now I believe all articles are for subscribers only. Internationally I find bbc.co.uk helpful. Thank you, Anita From lbragstad at gmail.com Tue Jan 29 17:55:00 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 29 Jan 2019 11:55:00 -0600 Subject: [Edge-computing] [keystone] x509 authentication In-Reply-To: <1677A2C3-F648-4611-8332-474938ECC677@windriver.com> References: <1677A2C3-F648-4611-8332-474938ECC677@windriver.com> Message-ID: On Tue, Jan 29, 2019 at 8:06 AM Waines, Greg wrote: > Hey Lance, > > I like the plan. > > > > Just a clarifying question on *“Although, it doesn't necessarily solve > the network partition issue.”* . > > - I’m assuming this is in a scenario where after the network partition > the Edge Cloud and local client(s) do not have access to the Identity > Provider ? > > Correct. Using x509 certificates to authenticate to keystone still requires access to keystone. Keystone doesn't necessarily need a link to the "identity provider" in this case, since the authentication path doesn't require online validation. This is different from using SAML assertions, where the entire flow establishes a connection between keystone acting as the service provider (at the edge) and an identity provider somewhere authenticating the user. > > - > - And in this case, it doesn’t work because ? 
> - For a new local client (without any cached tokens), > even if there are local shadow users already configured, > the authentication still requires communication with the Identity > Provider ? > I guess it depends on the architecture you're considering [0]. There isn't a hard requirement to talk to the identity provider of an x509 certificate, but token validation still needs to work. If you're deploying the architecture with the distributed control plane, you can authenticate with an x509 certificate against any keystone. For example, only using a centralized data center and medium/large edge cluster would make keystone available everywhere, so a network partition might not be an issue. Conversely, if you authenticate for a token with an x509 certificate and use it to spin up compute resources in a small edge cluster, which doesn't have a keystone deployment, the network partition is going to make online token validation impossible, if you're calling APIs in the small edge directly. The same issue is going to be present for deployments following the centralized control plane architecture since keystone is only available in the central data center and isn't available at the large, medium, or small edge sites. Validating the token online from edge sites to the centralized data center is going to be susceptible to network partitions. In my opinion, the big difference between x509 and other federated protocols is that it doesn't really have a hard requirement on linking back to the identity provider. In the case of SAML, the identity provider is anything that has the ability to issue SAML assertions proving the identity of its users (e.g., keystone acting as an identity provider, ADFS, etc.) With x509 certificates, the identity provider is a certificate authority that issues and signs user certificates. Keystone needs to be configured to "trust" certificates signed by that certificate authority.
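The trust relationship described here, where validating a user certificate needs only a local copy of the CA certificate, can be demonstrated with plain openssl. This is a standalone illustration with throwaway names (test-ca, alice), not keystone's actual configuration; in the thread's scenario the CA role is played by Athenz or whichever authority keystone is configured to trust:

```shell
# Create a throwaway certificate authority (self-signed root)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout ca.key -out ca.pem -subj "/CN=test-ca"

# Issue a user certificate signed by that CA
openssl req -newkey rsa:2048 -nodes \
    -keyout alice.key -out alice.csr -subj "/CN=alice"
openssl x509 -req -days 1 -in alice.csr \
    -CA ca.pem -CAkey ca.key -CAcreateserial -out alice.pem

# Validation needs only the local ca.pem; no call back to the issuer
# is made, which is the "offline" property discussed in this thread.
openssl verify -CAfile ca.pem alice.pem   # prints: alice.pem: OK
```

Roughly speaking, keystone's x509 path layers federation mapping on top of this same verification step, which the web server performs during the TLS client-auth handshake.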
When a user authenticates, keystone relies on SSL plugin libraries to validate the certificate against the root certificate authority, but this is done offline since the SSL configuration has a copy of the root certificates. From there, the plan is to treat each trusted certificate authority as its own identity provider, so all users with certificates signed by the same authority get mapped into the same namespace/identity provider. Once the users have their signed certificate, they can authenticate for tokens without a link being established between keystone and whatever certificate authority issued the certificate. The downside is that certificate revocation and certificate distribution are now things you need to worry about. James might have more input there since it sounds like this is the approach they are shooting for with Athenz (which is the certificate authority in their case.) I hope that helps clear things up? [0] https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures > > - > - > > Is this correct ? > > > > Greg. > > > > > > *From: *Lance Bragstad > *Date: *Friday, January 25, 2019 at 2:16 PM > *To: *"edge-computing at lists.openstack.org" < > edge-computing at lists.openstack.org>, " > openstack-discuss at lists.openstack.org" < > openstack-discuss at lists.openstack.org> > *Subject: *[Edge-computing] [keystone] x509 authentication > > > > Hi all, > > > > We've been going over keystone gaps that need to be addressed for edge use > cases every Tuesday. Since Berlin, Oath has open-sourced some of their > custom authentication plugins for keystone that help them address these > gaps. > > > > The basic idea is that users authenticate to some external identity > provider (Athenz in Oath's case), and then present an Athenz token to > keystone. The custom plugins decode the token from Athenz to determine the > user, project, roles assignments, and other useful bits of information.
> After that, it creates any resources that don't exist in keystone already. > Ultimately, a user can authenticate against a keystone node and have > specific resources provisioned automatically. In Berlin, engineers from > Oath were saying they'd like to move away from Athenz tokens altogether and > use x509 certificates issued by Athenz instead. The auto-provisioning > approach is very similar to a feature we have in keystone already. In > Berlin, and shortly after, there was general agreement that if we could > support x509 authentication with auto-provisioning via keystone federation, > that would pretty much solve Oath's use case without having to maintain > custom keystone plugins. > > > > Last week, Colleen started digging into keystone's existing x509 > authentication support. I'll start with the good news, which is x509 > authentication works, for the most part. It's been a feature in keystone > for a long time, and it landed after we implemented federation support > around the Kilo release. Chances are there won't be a need for a keystone > specification like we were initially thinking in the edge meetings. > Unfortunately, the implementation for x509 authentication has outdated > documentation, is extremely fragile, hard to set up, and hasn't been > updated with improvements we've made to the federation API since the > original implementation (like shadow users or auto-provisioning, which work > with other federated protocols like OpenID Connect and SAML). We've started > tracking the gaps with bugs [0] so that we have things written down. > > > > I think the good thing is that once we get this cleaned up, we'll be able > to re-use some of the newer federation features with x509 > authentication/federation. These updates would make x509 a first-class > federated protocol. The approach, pending the bug fixes, would remove the > need for Oath's custom authentication plugins. 
It could be useful for edge > deployments, or even deployments with many regions, by allowing users to be > auto-provisioned in each region. Although, it doesn't necessarily solve the > network partition issue. > > Now that we have an idea of where to start and some bug reports [0], I'm > wondering if anyone is interested in helping with the update or refactor. > Because this won't require a specification, we can get started on it > sooner, instead of having to wait for Train development and a new > specification. I'm also curious if anyone has comments or questions about > the approach. > > Thanks, > > Lance > > [0] https://bugs.launchpad.net/keystone/+bugs?field.tag=x509 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Tue Jan 29 17:55:09 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 29 Jan 2019 11:55:09 -0600 Subject: [Edge-computing] [keystone] x509 authentication In-Reply-To: References: Message-ID: On Fri, Jan 25, 2019 at 3:02 PM James Penick wrote: > Hey Lance, > We'd definitely be interested in helping with the work. I'll grab some > volunteers from my team and get them in touch within the next few days. > > Awesome, that sounds great! I'm open to using this thread for more technical communication if needed. Otherwise, #openstack-keystone is always open for folks to swing by if they want to discuss things there. FWIW - we brought this up in the keystone meeting today and there are several other people interested in this work. There is probably going to be an opportunity to break the work up a bit.
>> >> The basic idea is that users authenticate to some external identity >> provider (Athenz in Oath's case), and then present an Athenz token to >> keystone. The custom plugins decode the token from Athenz to determine the >> user, project, roles assignments, and other useful bits of information. >> After that, it creates any resources that don't exist in keystone already. >> Ultimately, a user can authenticate against a keystone node and have >> specific resources provisioned automatically. In Berlin, engineers from >> Oath were saying they'd like to move away from Athenz tokens altogether and >> use x509 certificates issued by Athenz instead. The auto-provisioning >> approach is very similar to a feature we have in keystone already. In >> Berlin, and shortly after, there was general agreement that if we could >> support x509 authentication with auto-provisioning via keystone federation, >> that would pretty much solve Oath's use case without having to maintain >> custom keystone plugins. >> >> Last week, Colleen started digging into keystone's existing x509 >> authentication support. I'll start with the good news, which is x509 >> authentication works, for the most part. It's been a feature in keystone >> for a long time, and it landed after we implemented federation support >> around the Kilo release. Chances are there won't be a need for a keystone >> specification like we were initially thinking in the edge meetings. >> Unfortunately, the implementation for x509 authentication has outdated >> documentation, is extremely fragile, hard to set up, and hasn't been >> updated with improvements we've made to the federation API since the >> original implementation (like shadow users or auto-provisioning, which work >> with other federated protocols like OpenID Connect and SAML). We've started >> tracking the gaps with bugs [0] so that we have things written down. 
>> >> I think the good thing is that once we get this cleaned up, we'll be able >> to re-use some of the newer federation features with x509 >> authentication/federation. These updates would make x509 a first-class >> federated protocol. The approach, pending the bug fixes, would remove the >> need for Oath's custom authentication plugins. It could be useful for edge >> deployments, or even deployments with many regions, by allowing users to be >> auto-provisioned in each region. Although, it doesn't necessarily solve the >> network partition issue. >> >> Now that we have an idea of where to start and some bug reports [0], I'm >> wondering if anyone is interested in helping with the update or refactor. >> Because this won't require a specification, we can get started on it >> sooner, instead of having to wait for Train development and a new >> specification. I'm also curious if anyone has comments or questions about >> the approach. >> >> Thanks, >> >> Lance >> >> [0] https://bugs.launchpad.net/keystone/+bugs?field.tag=x509 >> _______________________________________________ >> Edge-computing mailing list >> Edge-computing at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Jan 29 18:23:01 2019 From: amy at demarco.com (Amy) Date: Tue, 29 Jan 2019 12:23:01 -0600 Subject: Re: [openstack-ansible] bug squash day! In-Reply-To: References: Message-ID: <79F15A86-0C6C-465C-B56D-593BE0FB5E0B@demarco.com> I like bug squashes; I think they tend to be both productive and team-building, since folks work together. Amy (spots) Sent from my iPhone > On Jan 29, 2019, at 10:09 AM, Mohammed Naser wrote: > > Hi team, > > As you may have noticed, bug triage during our meetings has been > something that has kinda killed attendance (really, no one seems to > enjoy it, believe it or not!)
> > I wanted to propose for us to take a day to go through as much bugs as > possible, triaging and fixing as much as we can. It'd be a fun day > and we can also hop on a more higher bandwidth way to talk about this > stuff while we grind through it all. > > Is this something that people are interested in, if so, is there any > times/days that work better in the week to organize? > > Thanks! > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > From mark at stackhpc.com Tue Jan 29 18:33:45 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 29 Jan 2019 18:33:45 +0000 Subject: [kayobe] Proposing Pierre Riteau for core In-Reply-To: References: Message-ID: Given the departure of yankcrime, we have quorum. Thanks, I'll add Pierre to the kayobe-core group. Mark On Fri, 25 Jan 2019 at 17:23, Mark Goddard wrote: > Hi, > > I'd like to propose Pierre Riteau (priteau) for core. He has contributed a > number of good patches and provided some thoughtful and useful reviews. > > Cores, please respond +1 or -1. > > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eumel at arcor.de Tue Jan 29 19:04:33 2019 From: eumel at arcor.de (Frank Kloeker) Date: Tue, 29 Jan 2019 20:04:33 +0100 Subject: [all] Is the Denver Summit save? In-Reply-To: References: Message-ID: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> Hi Chris, not sure. T-Mobile is also a foreign company in the US which conquered many market shares. Hopefully our name is not on the target list, yet. Frank Am 2019-01-29 16:56, schrieb Chris Morgan: > Frank, just to clarify, is your question "is it safe for us to go to > denver?" i.e. Deutsch Telekom employees, given that the US just made > very serious allegations against a fellow open stack contributing > telecom company? 
> > Chris > > On Tue, Jan 29, 2019 at 2:56 AM Frank Kloeker wrote: > >> Good morning, >> >> just a question, out of curiosity: Is the Denver Summit save? I >> mean, we >> received a lot of breaking news in the morning, that Huawei is >> charged >> in court and is doing a lot of bad things (i.e. stolen a robotics >> arm). >> I don't want to bring in any political discussions. But surely as >> you >> know, Huawei is one of the top contributor to our Open Source >> project. >> They work hard and do not need to steal things. Is there any chance >> that >> one of our friends will be caught next? At the summit between >> political >> fronts? I feel a little uncomfortable with it. >> We are an open community and should clarify this inconsistency. >> >> kind regards >> >> Frank > > -- > Chris Morgan From jon at csail.mit.edu Tue Jan 29 19:24:58 2019 From: jon at csail.mit.edu (Jonathan Proulx) Date: Tue, 29 Jan 2019 14:24:58 -0500 Subject: [all] Is the Denver Summit save? In-Reply-To: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> References: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> Message-ID: <20190129192458.sgjttd2rl56fgeqg@csail.mit.edu> I certainly don't have any special knowledge on this topic, but as an American it seems to me Huawei is the only company specifically targeted, though there may be reason for concern for other Chinese companies. Even then, only very high-level executives are likely to have cause for concern. If you are working for Deutsche Telekom I can't imagine a problem. If you're working for Huawei, unless you have significant influence over corporate decision making (in which case you would be asking corporate lawyers, not a mailing list, for a risk assessment), I also can't imagine a serious problem; at worst they may not issue visas, or may delay processing them, to effectively but not "officially" deny travel.
These are just my opinions from how I'm reading public news sources so take it for what it is -jon On Tue, Jan 29, 2019 at 08:04:33PM +0100, Frank Kloeker wrote: :Hi Chris, : :not sure. T-Mobile is also a foreign company in the US which conquered many :market shares. :Hopefully our name is not on the target list, yet. : :Frank : : :Am 2019-01-29 16:56, schrieb Chris Morgan: :> Frank, just to clarify, is your question "is it safe for us to go to :> denver?" i.e. Deutsch Telekom employees, given that the US just made :> very serious allegations against a fellow open stack contributing :> telecom company? :> :> Chris :> :> On Tue, Jan 29, 2019 at 2:56 AM Frank Kloeker wrote: :> :> > Good morning, :> > :> > just a question, out of curiosity: Is the Denver Summit save? I :> > mean, we :> > received a lot of breaking news in the morning, that Huawei is :> > charged :> > in court and is doing a lot of bad things (i.e. stolen a robotics :> > arm). :> > I don't want to bring in any political discussions. But surely as :> > you :> > know, Huawei is one of the top contributor to our Open Source :> > project. :> > They work hard and do not need to steal things. Is there any chance :> > that :> > one of our friends will be caught next? At the summit between :> > political :> > fronts? I feel a little uncomfortable with it. :> > We are an open community and should clarify this inconsistency. :> > :> > kind regards :> > :> > Frank :> :> -- :> Chris Morgan : : From bitskrieg at bitskrieg.net Tue Jan 29 19:26:11 2019 From: bitskrieg at bitskrieg.net (Chris Apsey) Date: Tue, 29 Jan 2019 14:26:11 -0500 Subject: [all] Is the Denver Summit save? In-Reply-To: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> References: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> Message-ID: <168701d4b808$81ac0310$85040930$@bitskrieg.net> Frank, Certain Huawei leadership entities are being targeted for very specific reasons. 
I would imagine that the everyday engineer is not of interest to the US Government (or any government for that matter) since they wouldn't really have any knowledge of any of the things that the US Government is alleging. We probably won't see any Huawei executives giving keynotes anytime soon, but I wouldn't expect any other impact beyond that. Chris -----Original Message----- From: Frank Kloeker Sent: Tuesday, January 29, 2019 2:05 PM To: Chris Morgan Cc: openstack-discuss at lists.openstack.org Subject: Re: [all] Is the Denver Summit save? Hi Chris, not sure. T-Mobile is also a foreign company in the US which conquered many market shares. Hopefully our name is not on the target list, yet. Frank Am 2019-01-29 16:56, schrieb Chris Morgan: > Frank, just to clarify, is your question "is it safe for us to go to > denver?" i.e. Deutsch Telekom employees, given that the US just made > very serious allegations against a fellow open stack contributing > telecom company? > > Chris > > On Tue, Jan 29, 2019 at 2:56 AM Frank Kloeker wrote: > >> Good morning, >> >> just a question, out of curiosity: Is the Denver Summit save? I mean, >> we received a lot of breaking news in the morning, that Huawei is >> charged in court and is doing a lot of bad things (i.e. stolen a >> robotics arm). >> I don't want to bring in any political discussions. But surely as you >> know, Huawei is one of the top contributor to our Open Source >> project. >> They work hard and do not need to steal things. Is there any chance >> that one of our friends will be caught next? At the summit between >> political fronts? I feel a little uncomfortable with it. >> We are an open community and should clarify this inconsistency. >> >> kind regards >> >> Frank > > -- > Chris Morgan From eumel at arcor.de Tue Jan 29 19:26:21 2019 From: eumel at arcor.de (Frank Kloeker) Date: Tue, 29 Jan 2019 20:26:21 +0100 Subject: [openstack-ansible] bug squash day! 
In-Reply-To: References: Message-ID: <717c065910a2365e8d9674f987227771@arcor.de> Am 2019-01-29 17:09, schrieb Mohammed Naser: > Hi team, > > As you may have noticed, bug triage during our meetings has been > something that has kinda killed attendance (really, no one seems to > enjoy it, believe it or not!) > > I wanted to propose for us to take a day to go through as much bugs as > possible, triaging and fixing as much as we can. It'd be a fun day > and we can also hop on a more higher bandwidth way to talk about this > stuff while we grind through it all. > > Is this something that people are interested in, if so, is there any > times/days that work better in the week to organize? Interesting. Something in EU timezone would be nice. Or what about: Bug around the clock? So 24 hours of bug triage :) kind regards Frank From anteaya at anteaya.info Tue Jan 29 19:39:11 2019 From: anteaya at anteaya.info (Anita Kuno) Date: Tue, 29 Jan 2019 14:39:11 -0500 Subject: [all] Is the Denver Summit save? In-Reply-To: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> References: <1cf03681ca2f30d6f64599f723d7f349@arcor.de> Message-ID: <496b3768-19c4-ee10-e1b4-1978f824cad3@anteaya.info> On 2019-01-29 2:04 p.m., Frank Kloeker wrote: > T-Mobile is also a foreign company in the US which conquered many market > shares. > Hopefully our name is not on the target list, yet. Did you personally violate trade sanctions? If your source of news has not seen fit to identify the cause of the issue as a violation of trade sanctions, I encourage a increase in scope in your news gathering. Thank you, Anita From aspiers at suse.com Tue Jan 29 19:40:40 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 29 Jan 2019 19:40:40 +0000 Subject: [all] Is the Denver Summit save? 
In-Reply-To: <6166c276-7986-c857-2adc-9d66a8f38a60@anteaya.info> References: <6166c276-7986-c857-2adc-9d66a8f38a60@anteaya.info> Message-ID: <20190129194040.4lq4ngg3hek5myrn@pacific.linksys.moosehall> Anita Kuno wrote: >One of the things I think we can agree on in an open source community >is the importance of making decisions about our own behaviour as >individuals, based on facts. Agreed. >I don't have an answer to the question posed. I do have some >suggestions for those interested on how to access facts in this >matter. > >I find that reading multiple news sources on a given issue to be very >helpful as I try to understand the full picture. This is a great suggestion. >In this matter, I find that reading Canadian news sources, >cbc.ca/news, thestar.com (you will be prompted to subscribe, you don't >have to subscribe), and globeandmail.com (some articles are for >subscribers only) to be very useful. The Canadian news site >nationalpost.com used to be a source I read often, but now I believe >all articles are for subscribers only. > >Internationally I find bbc.co.uk helpful. Yes, BBC News is probably still fine for most news outside the UK. Although FWIW (and at a risk of going off on a tangent) my personal opinion as a Brit is that whilst BBC has traditionally been a fairly reliable news source with a (global?) reputation for impartiality, sadly it is no longer in general quite what it used to be. In particular I would recommend taking any coverage in which UK politics has a stake with a pinch of salt. Here again Anita's advice to consult multiple sources from different viewpoints is excellent :-) From anteaya at anteaya.info Tue Jan 29 19:50:35 2019 From: anteaya at anteaya.info (Anita Kuno) Date: Tue, 29 Jan 2019 14:50:35 -0500 Subject: [all] Is the Denver Summit save? 
In-Reply-To: <20190129194040.4lq4ngg3hek5myrn@pacific.linksys.moosehall> References: <6166c276-7986-c857-2adc-9d66a8f38a60@anteaya.info> <20190129194040.4lq4ngg3hek5myrn@pacific.linksys.moosehall> Message-ID: <26c3bf66-3ed4-9caf-a3f0-a74640da39db@anteaya.info> On 2019-01-29 2:40 p.m., Adam Spiers wrote: > Anita Kuno wrote: >> One of the things I think we can agree on in an open source community >> is the importance of making decisions about our own behaviour as >> individuals, based on facts. > > Agreed. > >> I don't have an answer to the question posed. I do have some >> suggestions for those interested on how to access facts in this matter. >> >> I find that reading multiple news sources on a given issue to be very >> helpful as I try to understand the full picture. > > This is a great suggestion. >> In this matter, I find that reading Canadian news sources, >> cbc.ca/news, thestar.com (you will be prompted to subscribe, you don't >> have to subscribe), and globeandmail.com (some articles are for >> subscribers only) to be very useful. The Canadian news site >> nationalpost.com used to be a source I read often, but now I believe >> all articles are for subscribers only. >> Internationally I find bbc.co.uk helpful. > > Yes, BBC News is probably still fine for most news outside the UK. > Although FWIW (and at a risk of going off on a tangent) my personal > opinion as a Brit is that whilst BBC has traditionally been a fairly > reliable news source with a (global?) reputation for impartiality, sadly > it is no longer in general quite what it used to be.  In particular I > would recommend taking any coverage in which UK politics has a stake > with a pinch of salt. Thanks Adam, I'm always glad to have some commentary from folks who know the source well. I've reserved rather a muddy space in my mind for the current Brit political situation, I keep hoping today will be the day I can understand it. Sadly today is, again, not the day. 
If you have a suggestion for another international news source, I'm open to adding it to my list as well. This invitation is open to any reader of this email as well as Adam. Thank you, Anita > Here again Anita's advice to consult multiple > sources from different viewpoints is excellent :-) From clemens.hardewig at crandale.de Tue Jan 29 19:51:53 2019 From: clemens.hardewig at crandale.de (Clemens) Date: Tue, 29 Jan 2019 20:51:53 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: <9A793860-40C7-44ED-9286-DD0BB140C401@crandale.de> Well - as you only sent a small portion of the log, I can still only guess that the problem lies with your network config. As Ignazio said, the most straightforward way to install a K8s cluster based on the default template is to let magnum create a new network and router for you. This is achieved by leaving the private network field empty in horizon. It seems that your cluster has not started the basic Kubernetes services (etcd, kubernetes-apiserver, -controller-manager, …) successfully. I don't know how you have started your cluster (CLI or horizon), so perhaps you could share that. Br c > Am 29.01.2019 um 16:07 schrieb Alfredo De Luca : > > Hi Ignazio and Clemens. I haven't configure the proxy and all the logs on the kube master keep saying the following > > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > > Not sure what to do. > My configuration is ...
> eth0 - 10.1.8.113 > > But the openstack configration in terms of networkin is the default from ansible-openstack which is 172.29.236.100/22 > > Maybe that's the problem? > > > > > > > On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano > wrote: > Hello Alfredo, > your external network is using proxy ? > If you using a proxy, and yuo configured it in cluster template, you must setup no proxy for 127.0.0.1 > Ignazio > > Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig > ha scritto: > At least on fedora there is a second cloud Init log as far as I remember-Look into both > > Br c > > Von meinem iPhone gesendet > > Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >: > >> thanks Clemens. >> I looked at the cloud-init-output.log on the master... and at the moment is doing the following.... >> >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> >> Network ....could be but not sure where to look at >> >> >> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig > wrote: >> Yes, you should check the cloud-init logs of your master. Without having seen them, I would guess a network issue or you have selected for your minion nodes a flavor using swap perhaps ... >> So, log files are the first step you could dig into... >> Br c >> Von meinem iPhone gesendet >> >> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >: >> >>> Hi all. >>> I finally instaledl successufully openstack ansible (queens) but, after creating a cluster template I create k8s cluster, it stuck on >>> >>> >>> kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changed >>> create in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM. >>> >>> any idea? 
>>> >>> >>> >>> >>> Alfredo >>> >> >> >> -- >> Alfredo >> > > > -- > Alfredo > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3783 bytes Desc: not available URL: From mrunge at matthias-runge.de Tue Jan 29 19:55:28 2019 From: mrunge at matthias-runge.de (Matthias Runge) Date: Tue, 29 Jan 2019 20:55:28 +0100 Subject: [tripleo][ptl]What is the future of availability monitoring in TripleO? In-Reply-To: References: Message-ID: <20190129195528.GF8793@hilbert.berg.ol> On Fri, Jan 25, 2019 at 01:34:40PM -0800, David M Noriega wrote: > I've noticed that the availability monitoring integration in TripleO using > Sensu is marked for deprecation post Rocky, but I do not see any blueprints > detailing a replacement plan like there is for fluentd to rsyslog for the > logging integration. Where can I find information on what the roadmap is? > Is this a feature that will be dropped and left to operators to implement? David, Sensu is marked for deprecation, which means it won't get installed by default in the near future anymore. Sensu (and fluentd combined) have a long tail of dependent packages (~ 150, if I'm not mistaken). For the future, I would look at replacing it with a combination of collectd, Prometheus and Alertmanager. I'll submit a spec for discussion in some future. Matthias -- Matthias Runge From clemens.hardewig at crandale.de Tue Jan 29 20:16:22 2019 From: clemens.hardewig at crandale.de (Clemens) Date: Tue, 29 Jan 2019 21:16:22 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: Message-ID: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> … an more important: check the other log cloud-init.log for error messages (not only cloud-init-output.log) > Am 29.01.2019 um 16:07 schrieb Alfredo De Luca : > > Hi Ignazio and Clemens. 
I haven\t configure the proxy and all the logs on the kube master keep saying the following > > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > > Not sure what to do. > My configuration is ... > eth0 - 10.1.8.113 > > But the openstack configration in terms of networkin is the default from ansible-openstack which is 172.29.236.100/22 > > Maybe that's the problem? > > > > > > > On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano > wrote: > Hello Alfredo, > your external network is using proxy ? > If you using a proxy, and yuo configured it in cluster template, you must setup no proxy for 127.0.0.1 > Ignazio > > Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig > ha scritto: > At least on fedora there is a second cloud Init log as far as I remember-Look into both > > Br c > > Von meinem iPhone gesendet > > Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >: > >> thanks Clemens. >> I looked at the cloud-init-output.log on the master... and at the moment is doing the following.... >> >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> >> Network ....could be but not sure where to look at >> >> >> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig > wrote: >> Yes, you should check the cloud-init logs of your master. 
Without having seen them, I would guess a network issue or you have selected for your minion nodes a flavor using swap perhaps ... >> So, log files are the first step you could dig into... >> Br c >> Von meinem iPhone gesendet >> >> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >: >> >>> Hi all. >>> I finally instaledl successufully openstack ansible (queens) but, after creating a cluster template I create k8s cluster, it stuck on >>> >>> >>> kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changed >>> create in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM. >>> >>> any idea? >>> >>> >>> >>> >>> Alfredo >>> >> >> >> -- >> Alfredo >> > > > -- > Alfredo > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3783 bytes Desc: not available URL: From doug at doughellmann.com Tue Jan 29 21:43:41 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 29 Jan 2019 16:43:41 -0500 Subject: [Release-job-failures] Release of openstack/ovsdbapp failed In-Reply-To: References: Message-ID: zuul at openstack.org writes: > Build failed. > > - release-openstack-python http://logs.openstack.org/76/767f3ea3efcedb4c8df1dae1a6b40d0b0a230d3d/release/release-openstack-python/9701f8c/ : POST_FAILURE in 3m 17s > - announce-release announce-release : SKIPPED > - propose-update-constraints propose-update-constraints : SKIPPED > > _______________________________________________ > Release-job-failures mailing list > Release-job-failures at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures This release failed due to a known issue for which Sean has already proposed a fix. 
https://review.openstack.org/633829 -- Doug From smooney at redhat.com Tue Jan 29 21:46:15 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 29 Jan 2019 21:46:15 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: On Tue, 2019-01-29 at 18:05 +0000, David Lake wrote: > Answers
in-line
> > Thanks > > David > > -----Original Message----- > From: Sean Mooney > Sent: 29 January 2019 14:55 > To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org > Cc: Ge, Chang Dr (Elec Electronic Eng) > Subject: Re: Issue with launching instance with OVS-DPDK > > On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > > Hello > > > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > > > I can launch instances which use the “m1.small” flavour (which I have > > modified to include the hw:mem_size large as per the DPDK instructions) but as soon as I try to launch anything more > > than m1.small, I get this error: > > > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR > > nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] > > #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > > instance_uuid=instance.uuid, reason=six.text_type(e))\n', > > u'RescheduledException: Build of instance 25cfee28-08e9- > > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu > > unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z > > qemu-kvm: -chardev > > socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: > > info: QEMU waiting for connection on: > > disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-01- > > 28T12:56:49.251071Z > > qemu-kvm: -object > > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ > > libvirt/qemu/4-instance- > > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > > os_mem_prealloc: Insufficient free host memory pages 
available to > > allocate guest RAM\n']#033[00m#033[00m > > > > > > My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. > > how much of that ram did you allocate as hugepages. > >
OVS_NUM_HUGEPAGES=3072
OK, so you used networking-ovs-dpdk's ability to automatically allocate 2MB hugepages at runtime, so this should have allocated 6GB of hugepages per NUMA node. > > can you provide the output of cat /proc/meminfo > >
> > MemTotal: 526779552 kB > MemFree: 466555316 kB > MemAvailable: 487218548 kB > Buffers: 2308 kB > Cached: 22962972 kB > SwapCached: 0 kB > Active: 29493384 kB > Inactive: 13344640 kB > Active(anon): 20826364 kB > Inactive(anon): 522012 kB > Active(file): 8667020 kB > Inactive(file): 12822628 kB > Unevictable: 43636 kB > Mlocked: 47732 kB > SwapTotal: 4194300 kB > SwapFree: 4194300 kB > Dirty: 20 kB > Writeback: 0 kB > AnonPages: 19933028 kB > Mapped: 171680 kB > Shmem: 1450564 kB > Slab: 1224444 kB > SReclaimable: 827696 kB > SUnreclaim: 396748 kB > KernelStack: 69392 kB > PageTables: 181020 kB > NFS_Unstable: 0 kB > Bounce: 0 kB > WritebackTmp: 0 kB > CommitLimit: 261292620 kB > Committed_AS: 84420252 kB > VmallocTotal: 34359738367 kB > VmallocUsed: 1352128 kB > VmallocChunk: 34154915836 kB > HardwareCorrupted: 0 kB > AnonHugePages: 5365760 kB > CmaTotal: 0 kB > CmaFree: 0 kB > HugePages_Total: 6144 Since we have 6144 total and OVS_NUM_HUGEPAGES was set to 3072, this indicates the host has 2 NUMA nodes. > HugePages_Free: 2048 And you currently have 4G of 2MB hugepages free; however, this will also be split across NUMA nodes. The qemu command line you provided, which I have copied below, is trying to allocate 4G of hugepage memory from a single host NUMA node: qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ libvirt/qemu/4-instance- 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM\n']#033[00m#033[00m As a result the VM is failing to boot, because nova cannot create the VM with a single NUMA node. If you set hw:numa_nodes=2 this VM would likely boot, but since you have a 512G host you should be able to increase OVS_NUM_HUGEPAGES to something like OVS_NUM_HUGEPAGES=14336. This will allocate 60G of 2MB hugepages total.
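The arithmetic behind this diagnosis can be sketched as a quick back-of-the-envelope check. This is a hypothetical helper for illustration only (not code from nova or networking-ovs-dpdk), and it assumes the free hugepages are split evenly across NUMA nodes, which is how the numbers above are being read:

```python
# Check whether a guest pinned to a single host NUMA node fits in that
# node's free hugepages. Hypothetical sketch; nova does this accounting
# internally when placing guests with hugepage-backed memory.

def fits_on_one_node(free_pages, page_size_mb, numa_nodes, guest_ram_mb):
    """Assume free hugepages are split evenly across NUMA nodes."""
    free_per_node_mb = (free_pages // numa_nodes) * page_size_mb
    return guest_ram_mb <= free_per_node_mb

# Numbers from the /proc/meminfo above: 2048 free 2MB pages, 2 NUMA
# nodes, and a guest requesting 4294967296 bytes (4096 MB) from node 0.
print(fits_on_one_node(2048, 2, 2, 4096))   # False: only 2048 MB free per node

# With OVS_NUM_HUGEPAGES=14336 per node (28672 MB of 2MB pages each),
# the same 4G guest fits comfortably.
print(fits_on_one_node(28672, 2, 2, 4096))  # True
```

The same shape of check explains the `os_mem_prealloc: Insufficient free host memory pages` failure in the qemu log: the pool looks big enough in total, but not on the single node `host-nodes=0` binds to.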
If you want to allocate more than about 96G of hugepages you should set OVS_ALLOCATE_HUGEPAGES=False and instead allocate the hugepages on the kernel command line using 1G hugepages, e.g. default_hugepagesz=1G hugepagesz=1G hugepages=480. This is because it takes a long time for ovs-dpdk to scan all the hugepages on start up. Setting default_hugepagesz=1G hugepagesz=1G hugepages=480 will leave 32G of RAM for the host. If it is a compute node and not a controller, you can safely reduce the free host RAM to 16G, e.g. default_hugepagesz=1G hugepagesz=1G hugepages=496. I would not advise allocating much more than 496G of hugepages, as the qemu emulator overhead can easily get into the tens of gigs if you have 50+ VMs running. > HugePages_Rsvd: 0 > HugePages_Surp: 0 > Hugepagesize: 2048 kB > DirectMap4k: 746304 kB > DirectMap2M: 34580480 kB > DirectMap1G: 502267904 kB > [stack at localhost devstack]$ > >
> > > > > Build is the latest git clone of Devstack. > > > > Thanks > > > > David > > From miguel at mlavalle.com Tue Jan 29 23:18:58 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Tue, 29 Jan 2019 17:18:58 -0600 Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core Message-ID: Hi Stackers, I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron core team. Liu started contributing to Neutron back in Mitaka, fixing bugs in HA routers. Since then, he has specialized in L3 networking, developing a deep knowledge of DVR. More recently, he single handedly implemented QoS for floating IPs with this series of patches: https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). He has also been very busy helping to improve the implementation of port forwardings and adding QoS to them. He also works for a large operator in China, which allows him to bring an important operational perspective from that part of the world to our project. The quality and number of his code reviews during the Stein cycle is on par with the leading members of the core team: https://www.stackalytics.com/?module=neutron-group. I will keep this nomination open for a week as customary. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From zh.f at outlook.com Wed Jan 30 01:51:48 2019 From: zh.f at outlook.com (Zhang Fan) Date: Wed, 30 Jan 2019 01:51:48 +0000 Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core In-Reply-To: References: Message-ID: Big +1 👍🏻. Yulong deserves to be part of the leading team :) Best Wishes. Fan Zhang On Jan 30, 2019, at 07:18, Miguel Lavalle > wrote: Hi Stackers, I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron core team. Liu started contributing to Neutron back in Mitaka, fixing bugs in HA routers. Since then, he has specialized in L3 networking, developing a deep knowledge of DVR. 
More recently, he single handedly implemented QoS for floating IPs with this series of patches: https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). He has also been very busy helping to improve the implementation of port forwardings and adding QoS to them. He also works for a large operator in China, which allows him to bring an important operational perspective from that part of the world to our project. The quality and number of his code reviews during the Stein cycle is on par with the leading members of the core team: https://www.stackalytics.com/?module=neutron-group. I will keep this nomination open for a week as customary. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From rony.khan at brilliant.com.bd Wed Jan 30 04:07:10 2019 From: rony.khan at brilliant.com.bd (Md. Farhad Hasan Khan) Date: Wed, 30 Jan 2019 10:07:10 +0600 Subject: [openstack] [Neutron] Instance automatically got shutdown Message-ID: <010801d4b851$46017770$d2046650$@brilliant.com.bd> Hi, we are seeing OpenStack instances automatically shut down. The instances are not being shut down from Horizon/CLI or from inside the instance. This is happening randomly in our environment. Please help us to solve this. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cat /etc/nova/nova.conf [DEFAULT] sync_power_state_interval=-1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Here is the log: [root at compute2 ~]# cat /var/log/nova/nova-compute.log |grep 4e143f50-49d4-40c9-b80e-c99dfe45d9d4 2019-01-30 08:06:18.750 417615 INFO nova.compute.manager [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] VM Stopped (Lifecycle Event) 2019-01-30 08:06:18.855 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4).
Updating power_state in the DB to match the hypervisor. 2019-01-30 08:06:19.027 417615 WARNING nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state: 1, current VM power_state: 4 2019-01-30 08:06:19.376 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance is already powered off in the hypervisor when stop is called. 2019-01-30 08:06:19.441 417615 INFO nova.virt.libvirt.driver [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance already shutdown. 2019-01-30 08:06:19.445 417615 INFO nova.virt.libvirt.driver [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance destroyed successfully. Thanks & B’Rgds, Rony -------------- next part -------------- An HTML attachment was scrubbed... URL: From bluejay.ahn at gmail.com Wed Jan 30 04:15:13 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Wed, 30 Jan 2019 13:15:13 +0900 Subject: [openstack-helm] would like to discuss review turnaround time Message-ID: Dear all, There have been several patch sets getting sparse reviews. Since some of the authors of these patch sets find it difficult to join the IRC meeting due to time and language constraints, I would like to pass along some of their feedback, and get more detailed feedback from core reviewers and other devs via the ML. I fully understand that core reviewers are quite busy, and I believe they are making their best effort. Period! However, I sometimes feel that the turnaround time for some patch sets is really long. I would like to hear opinions from others, and suggestions on how to improve this. It could be something each patch set owner needs to do more of, or it could be something we as an openstack-helm project can improve.
For instance, it could be influenced by time differences, lack of IRC presence, or anything else. I really would like to find out whether there is anything we can improve together. I would like to get advice on the following. - Sometimes it is really difficult to get core reviewers' comments or reviews. I routinely put the list of patch sets on the IRC meeting agenda; however, there is still a long turnaround time between comments. As a result, it usually takes a long time to process a patch set, which sometimes causes rebases as well. - Having said that, I would like advice on what we need to do more: for instance, do we need to be on IRC asking core reviewers directly about each patch set? Do we need to add core reviewers' names when we push a patch set? - Some patch sets are reviewed and merged quickly, and some are not. I would like to know what makes the difference, so that I can tell my developers how to do a better job writing and communicating patch sets. Here are some example patch sets currently under review. 1. https://review.openstack.org/#/c/603971/ >> this ps has been discussed for its contents and scope. Could you please add if there is anything else we need to do other than rewording some of the commit message? 2. https://review.openstack.org/#/c/633456/ >> this is a simple fix. How can we make core reviewers notice this patch set so that they can review it quickly? 3. https://review.openstack.org/#/c/625803/ >> we have been getting feedback and questions on this patch set, which has been good, but the round-trip time for recent comments is a week or more. Because of that delay, the owner of this patch set has needed to rebase it often. Would this kind of case improve if the author engaged more on the IRC channel or via the mailing list to get feedback, rather than relying on Gerrit reviews? Frankly speaking, I don't know if this is a real issue or just the way it is.
I just want to pass on some of the voices of our developers, and really would like to hear what others think and find a better way to communicate. Thank you. -- *Jaesuk Ahn*, Ph.D. Software Labs, SK Telecom -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Jan 30 06:28:28 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 30 Jan 2019 15:28:28 +0900 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> Message-ID: <1689d71d0ef.ef1d5f8d185664.5395252099905607931@ghanshyammann.com> ---- On Wed, 23 Jan 2019 08:21:27 +0900 Adrian Turjak wrote ---- > Thanks for the input! I'm willing to bet there are many people excited > about this goal, or will be when they realise it exists! > > The 'dirty' state I think would be solved with a report API in each > service (tell me everything a given project has resource wise). Such an > API would be useful without needing to query each resource list, and > potentially could be an easy thing to implement to help a purge library > figure out what to delete. I know right now our method for checking if a > project is 'dirty' is part of our quota checking scripts, and it has to > query a lot of APIs per service to build an idea of what a project has. > > As for using existing code, OSPurge could well be a starting point, but > the major part of this goal has to be that each OpenStack service (that > creates resources owned by a project) takes ownership of their own > deletion logic. This is why a top level library for cross project logic, > with per service plugin libraries is possibly the best approach.
Each > library would follow the same template and abstraction layers (as > inherited from the top level library), but how each service implements > their own deletion is up to them. I would also push for them using the > SDK only as their point of interaction with the APIs (lets set some hard > requirements and standards!), because that is the python library we > should be using going forward. In addition such an approach could mean > that anyone can write a plugin for the top level library (e.g. internal > company only services) which will automatically get picked up if installed. +100 for not making Keystone the actor. Leaving purge responsibility on the service side is the best way, without any doubt. Instead of accepting purge APIs from each service, I think we should also consider another approach: a pluggable one. We can expose a plugin interface from the purge library/tool, and each service implements that interface with its purge functionality (script or command etc.). On discovery of each service's purge plugin, the purge library/tool will start the deletion in the required order. This gives two simple benefits: 1. No need to detect service availability before requesting that services purge their resources. I am not sure whether OSPurge checks the availability of services or not, but in the plugin approach that will not be required. For example, if Congress is not installed in my env, then Congress's purge plugin will not be discovered, so there is no need to check Congress service availability. 2. A purge-all-resources interface will not be exposed to anyone except the purge library/tool. In the case of an API, we are exposing the interface to a user (admin/system scoped etc.) who can delete all the resources of that service, which may be a bit of a security issue. This can be argued against the existing delete APIs, but those are per resource, not all.
On the other side, we can say those can be taken care of by RBAC, but IMO exposing anything that can destroy the env to even a permitted user (especially a human) is not a good idea when the only right user of that interface is something else (the purge library/tool in this case). The pluggable approach can also have its cons, but let's first discuss all those possibilities. -gmann > > We would need robust and extensive testing for this, because deletion is > critical, and we need it to work, but also not cause damage in ways it > shouldn't. > > And you're right, purge tools purging outside of the scope asked for is > a worry. Our own internal logic actually works by having the triggering > admin user add itself to the project (and ensure no admin role), then > scope a token to just that project, and delete resources from the point > of view of a project user. That way it's kind of like a user deleting > their own resources, and in truth having a nicer way to even do that > (non-admin clearing of project) would be amazing for a lot of people who > don't want to close their account or disable their project, but just > want to delete stray resources and not get charged. > > On 23/01/19 4:03 AM, Tobias Urdin wrote: > > Thanks for the thorough feedback Adrian. > > > > My opinion is also that Keystone should not be the actor in executing > > this functionality but somewhere else > > whether that is Adjutant or any other form (application, library, CLI > > etc). > > > > I would also like to bring up the point about knowing if a project is > > "dirty" (it has provisioned resources). > > This is something that I think all business logic would benefit from, > > we've had issue with knowing when > > resources should be deleted, our solution is pretty much look at > > metrics the last X minutes, check if project > > is disabled and compare to business logic that says it should be deleted.
> > > > While the above works it kills some of logical points of disabling a > > project since the only thing that knows if > > the project should be deleted or is actually disabled is the business > > logic application that says they clicked the > > deleted button and not disabled. > > > > Most of the functionality you are mentioning is things that the > > ospurge project has been working to implement and the > > maintainer even did a full rewrite which improved the dependency > > arrangement for resource removal. > > > > I think the biggest win for this community goal would be the > > developers of the projects would be available for input regarding > > the project specific code that does purging. There has been some > > really nasty bugs in ospurge in the past that if executed with the admin > > user you would wipe everything and not only that project, which is > > probably a issue that makes people think twice about > > using a purging toolkit at all. > > > > We should carefully consider what parts of ospurge could be reused, > > concept, code or anything in between that could help derive > > what direction we wan't to push this goal. > > > > I'm excited :) > > > > Best regards > > Tobias > > > > From gmann at ghanshyammann.com Wed Jan 30 06:39:50 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 30 Jan 2019 15:39:50 +0900 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> Message-ID: <1689d7c39b6.10fd89fc2185781.5856400223771308734@ghanshyammann.com> ---- On Tue, 22 Jan 2019 10:14:50 +0900 Adrian Turjak wrote ---- > I've expanded on the notes in the etherpad about why Keystone isn't the actor. > > At the summit we discussed this option, and all the people familiar with Keystone who were in the room (or in some later discussions), agreed that making Keystone the actor is a BAD idea. 
> > Keystone does not currently do any orchestration or workflow of this nature, making it do that adds a lot of extra logic which it just shouldn't need. After a project delete it would need to call all the APIs, and then confirm they succeeded, and maybe retry. This would have to be done asynchronously since waiting and confirming the deletion would take longer than a single API call to delete a project in Keystone should take. That kind of logic doesn't fit in Keystone. Not to mention there are issues on how Keystone would know which services support such an API, and where exactly it might be (although catalog + consistent API placement or discovery could solve that). > > Essentially, going down the route of "make this Keystone's problem" is in my opinion a hard NO, but I'll let the Keystone devs weigh in on that before we make that a very firm hard NO. > > As for solutions. Ideally we do implement the APIs per service (that's the end goal), but we ALSO make libraries that do deletion of resource using the existing APIs. If the library sees that a service version is one with the purge API it uses it, otherwise it has a fallback for less efficient deletion. This has the major benefit of working for all existing deployments, and ones stuck on older OpenStack versions. This is a universal problem and we need to solve it backwards AND forwards. > > By doing both (with a first step focus on the libraries) we can actually give projects more time to build the purge API, and maybe have the API portion of the goal extend into another cycle if needed. > > Essentially, we'd make a purge library that uses the SDK to delete resources. If a service has a purge endpoint, then the library (via the SDK) uses that. The specifics of how the library purges, or if the library will be split into multiple libraries (one top level, and then one per service) is to be decided. > > A rough look at what a deletion process might looks like: > 1. 
Disable project in Keystone (so no new resources can be created or modified), or clear all role assignments (and api-keys) from project. > 2. Purge platform orchestration services (Magnum, Sahara) > 3. Purge Heat (Heat after Magnum, because magnum and such use Heat, and deleting Heat stacks without deleting the 'resource' which uses that stack can leave a mess) > 4. Purge everything left (order to be decided or potentially dynamically chosen). > 5. Delete or Disable Keystone project (disable is enough really). One important thing we need to discuss is rollback. If one or more services are not able to delete their resources, what should the purge library do? Error and roll back? Succeed with non-deleted resources left behind? Error with a list of the non-deleted resources and hold the project deletion until they are gone? Or it could allow multiple deletion runs but keep the project in the disabled state until all resources are gone. Because this library is going to provide the functionality to clean up everything, a half-cleaned project deletion can be another issue. IMO the project can stay in the disabled state until the user is able to delete all the resources with the library we provide. -gmann > > The actor is then first a CLI built into the purge library as an OSClient command, then secondly maybe an API or two in Adjutant which will use this library. Or anyone can use the library and make anything they want an actor. > > Ideally if we can even make the library allow selectively choosing which services to purge (conditional on dependency chain), that could be useful for cases where a user wants to delete everything except maybe what's in Swift or Cinder. > > > This is in many ways a HUGE goal, but one that we really need to accomplish. We've lived with this problem too long and the longer we leave it unsolved, the harder it becomes.
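[Editor's note: the plugin-discovery and ordered-deletion flow discussed in this thread could be sketched roughly as below. This is a hypothetical illustration only: the `PurgePlugin` interface, the priorities, the service classes, and the leftover-reporting behavior are all invented for the example, and entry-point discovery is reduced to a plain list.]

```python
from abc import ABC, abstractmethod


class PurgePlugin(ABC):
    """One plugin per service; discovered plugins run in priority order."""
    name = "base"
    priority = 100  # lower runs first (orchestration layers before raw resources)

    @abstractmethod
    def purge(self, project_id):
        """Delete this service's resources; return a list of resources left behind."""


class HeatPurge(PurgePlugin):
    name, priority = "heat", 20

    def purge(self, project_id):
        return []  # pretend all stacks were deleted


class NovaPurge(PurgePlugin):
    name, priority = "nova", 50

    def purge(self, project_id):
        return ["server:1234"]  # pretend one server could not be deleted


def run_purge(project_id, plugins):
    """Run discovered plugins in order; report leftovers instead of rolling back.

    Matching the rollback discussion above: the project would stay disabled,
    and a later run retries until nothing is reported left behind.
    """
    leftovers = {}
    for plugin in sorted(plugins, key=lambda p: p.priority):
        remaining = plugin.purge(project_id)
        if remaining:
            leftovers[plugin.name] = remaining
    return leftovers


# Only installed services register a plugin, so no availability check is needed.
print(run_purge("demo-project", [NovaPurge(), HeatPurge()]))
# → {'nova': ['server:1234']}
```

A non-empty result would mean: keep the project disabled and run the purge again later, which is the multiple-run alternative to rollback raised above.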
> > > On 22/01/19 9:30 AM, Lance Bragstad wrote: > > > On Mon, Jan 21, 2019 at 2:18 PM Ed Leafe wrote: > On Jan 21, 2019, at 1:55 PM, Lance Bragstad wrote: > > > > Are you referring to the system scope approach detailed on line 38, here [0]? > > Yes. > > > I might be misunderstanding something, but I didn't think keystone was going to iterate all available services and call clean-up APIs. I think it was just that services would be able to expose an endpoint that cleans up resources without a project scoped token (e.g., it would be system scoped [1]). > > > > [0] https://etherpad.openstack.org/p/community-goal-project-deletion > > [1] https://docs.openstack.org/keystone/latest/admin/tokens-overview.html#system-scoped-tokens > > It is more likely that I’m misunderstanding. Reading that etherpad, it appeared that it was indeed the goal to have project deletion in Keystone cascade to all the services, but I guess I missed line 19. > > So if it isn’t Keystone calling this API on all the services, what would be the appropriate actor? > > The actor could still be something like os-purge or adjutant [0]. Depending on how the implementation shakes out in each service, the implementation in the actor could be an interation of all services calling the same API for each one. I guess the benefit is that the actor doesn't need to manage the deletion order based on the dependencies of the resources (internal or external to a service). > Adrian, and others, have given this a bunch more thought than I have. So I'm curious to hear if what I'm saying is in line with how they've envisioned things. I'm recalling most of this from Berlin. 
> [0] https://adjutant.readthedocs.io/en/latest/ > > -- Ed Leafe > > > > > > From alfredo.deluca at gmail.com Wed Jan 30 08:17:38 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 30 Jan 2019 09:17:38 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> References: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> Message-ID: hi Clemens and Ignazio. thanks for your support. it must be network related but I don't do something special apparently to create a simple k8s cluster. I ll post later on configurations and logs as you Clemens suggested. Cheers On Tue, Jan 29, 2019 at 9:16 PM Clemens wrote: > … an more important: check the other log cloud-init.log for error messages > (not only cloud-init-output.log) > > Am 29.01.2019 um 16:07 schrieb Alfredo De Luca : > > Hi Ignazio and Clemens. I haven\t configure the proxy and all the logs on > the kube master keep saying the following > > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > ++ curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished > [+]poststarthook/extensions/third-party-resources ok > [-]poststarthook/rbac/bootstrap-roles failed: not finished > healthz check failed' ']' > + sleep 5 > > Not sure what to do. > My configuration is ... > eth0 - 10.1.8.113 > > But the openstack configration in terms of networkin is the default from > ansible-openstack which is 172.29.236.100/22 > > Maybe that's the problem? > > > > > > > On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano > wrote: > >> Hello Alfredo, >> your external network is using proxy ? 
>> If you using a proxy, and yuo configured it in cluster template, you must >> setup no proxy for 127.0.0.1 >> Ignazio >> >> Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig < >> clemens.hardewig at crandale.de> ha scritto: >> >>> At least on fedora there is a second cloud Init log as far as I >>> remember-Look into both >>> >>> Br c >>> >>> Von meinem iPhone gesendet >>> >>> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >> >: >>> >>> thanks Clemens. >>> I looked at the cloud-init-output.log on the master... and at the >>> moment is doing the following.... >>> >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> >>> Network ....could be but not sure where to look at >>> >>> >>> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < >>> clemens.hardewig at crandale.de> wrote: >>> >>>> Yes, you should check the cloud-init logs of your master. Without >>>> having seen them, I would guess a network issue or you have selected for >>>> your minion nodes a flavor using swap perhaps ... >>>> So, log files are the first step you could dig into... >>>> Br c >>>> Von meinem iPhone gesendet >>>> >>>> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca < >>>> alfredo.deluca at gmail.com>: >>>> >>>> Hi all. >>>> I finally instaledl successufully openstack ansible (queens) but, after >>>> creating a cluster template I create k8s cluster, it stuck on >>>> >>>> >>>> kube_masters >>>> >>>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>>> >>>> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >>>> in progress....and after around an hour it says...time out. k8s master >>>> seems to be up.....at least as VM. >>>> >>>> any idea? 
>>>> >>>> >>>> >>>> >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Wed Jan 30 08:39:07 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Wed, 30 Jan 2019 09:39:07 +0100 Subject: [tc][all] Project deletion community goal for Train cycle In-Reply-To: <1689d71d0ef.ef1d5f8d185664.5395252099905607931@ghanshyammann.com> References: <8d25cbc43d4fc43f8a98de37992d5531c8662cdc.camel@evrard.me> <47F67A8C-8C89-4B0A-BCF3-7F3100D2A1B7@leafe.com> <86ed4afc-056e-602a-e30c-08a51c2a2080@catalyst.net.nz> <1689d71d0ef.ef1d5f8d185664.5395252099905607931@ghanshyammann.com> Message-ID: Regarding gmann's #1. Existing OSPurge doesn't have any specific logic per-se but would just return no resources based on the has_service() method which I would assume is checking endpoints. [1] +1 On having a pluggable approach and to Adrian's feedback on having a strict policy on how they should be implemented. Best regards [1] https://review.openstack.org/#/c/600919/1/ospurge/resources/heat.py On 01/30/2019 07:36 AM, Ghanshyam Mann wrote: > ---- On Wed, 23 Jan 2019 08:21:27 +0900 Adrian Turjak wrote ---- > > Thanks for the input! I'm willing to bet there are many people excited > > about this goal, or will be when they realise it exists! > > > > The 'dirty' state I think would be solved with a report API in each > > service (tell me everything a given project has resource wise). Such an > > API would be useful without needing to query each resource list, and > > potentially could be an easy thing to implement to help a purge library > > figure out what to delete. I know right now our method for checking if a > > project is 'dirty' is part of our quota checking scripts, and it has to > > query a lot of APIs per service to build an idea of what a project has. 
> > > > As for using existing code, OSPurge could well be a starting point, but > > the major part of this goal has to be that each OpenStack service (that > > creates resources owned by a project) takes ownership of their own > > deletion logic. This is why a top level library for cross project logic, > > with per service plugin libraries is possibly the best approach. Each > > library would follow the same template and abstraction layers (as > > inherited from the top level library), but how each service implements > > their own deletion is up to them. I would also push for them using the > > SDK only as their point of interaction with the APIs (lets set some hard > > requirements and standards!), because that is the python library we > > should be using going forward. In addition such an approach could mean > > that anyone can write a plugin for the top level library (e.g. internal > > company only services) which will automatically get picked up if installed. > > +100 for not making keystone as Actor. Leaving purge responsibility to service > side is the best way without any doubt. > > Instead of accepting Purge APIs from each service, I am thinking > we should consider another approach also which can be the plugin-able approach. > Ewe can expose the plugin interface from purge library/tool. Each service implements > the interface with purge functionality(script or command etc). > On discovery of each service's purge plugin, purge library/tool will start the deletion > in required order etc. > > This can give 2 simple benefits > 1. No need to detect the service availability before requesting them to purge the resources. > I am not sure OSpurge check the availability of services or not. But in plugin approach case, > that will not be required. For example, if Congress is not installed in my env then, > congress's purge plugin will not be discovered so no need to check Congress service availability. > > 2. 
purge all resources interface will not be exposed to anyone except the Purge library/tool. > In case of API, we are exposing the interface to user(admin/system scopped etc) which can > delete all the resources of that service which is little security issue may be. This can be argued > with existing delete API but those are per resource not all. Other side we can say those can be > taken care by RBAC but still IMO exposing anything to even permissiable user(especially human) > which can destruct the env is not a good idea where only right usage of that interface is something > else (Purge library/tool in this case). > > Plugin-able can also have its cons but Let's first discuss all those possibilities. > > -gmann > > > > > We would need robust and extensive testing for this, because deletion is > > critical, and we need it to work, but also not cause damage in ways it > > shouldn't. > > > > And you're right, purge tools purging outside of the scope asked for is > > a worry. Our own internal logic actually works by having the triggering > > admin user add itself to the project (and ensure no admin role), then > > scope a token to just that project, and delete resources form the point > > of view of a project user. That way it's kind of like a user deleting > > their own resources, and in truth having a nicer way to even do that > > (non-admin clearing of project) would be amazing for a lot of people who > > don't want to close their account or disable their project, but just > > want to delete stray resources and not get charged. > > > > On 23/01/19 4:03 AM, Tobias Urdin wrote: > > > Thanks for the thorough feedback Adrian. > > > > > > My opinion is also that Keystone should not be the actor in executing > > > this functionality but somewhere else > > > whether that is Adjutant or any other form (application, library, CLI > > > etc). > > > > > > I would also like to bring up the point about knowing if a project is > > > "dirty" (it has provisioned resources). 
> > > This is something that I think all business logic would benefit from, > > > we've had issue with knowing when > > > resources should be deleted, our solution is pretty much look at > > > metrics the last X minutes, check if project > > > is disabled and compare to business logic that says it should be deleted. > > > > > > While the above works it kills some of logical points of disabling a > > > project since the only thing that knows if > > > the project should be deleted or is actually disabled is the business > > > logic application that says they clicked the > > > deleted button and not disabled. > > > > > > Most of the functionality you are mentioning is things that the > > > ospurge project has been working to implement and the > > > maintainer even did a full rewrite which improved the dependency > > > arrangement for resource removal. > > > > > > I think the biggest win for this community goal would be the > > > developers of the projects would be available for input regarding > > > the project specific code that does purging. There has been some > > > really nasty bugs in ospurge in the past that if executed with the admin > > > user you would wipe everything and not only that project, which is > > > probably a issue that makes people think twice about > > > using a purging toolkit at all. > > > > > > We should carefully consider what parts of ospurge could be reused, > > > concept, code or anything in between that could help derive > > > what direction we wan't to push this goal. 
> > > > > > I'm excited :) > > > > > > Best regards > > > Tobias > > > > > > > > > > > From skaplons at redhat.com Wed Jan 30 08:55:01 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 30 Jan 2019 09:55:01 +0100 Subject: [openstack] [Neutron] Instance automatically got shutdown In-Reply-To: <010801d4b851$46017770$d2046650$@brilliant.com.bd> References: <010801d4b851$46017770$d2046650$@brilliant.com.bd> Message-ID: <7CBF045A-84E5-457D-8742-A572893F6797@redhat.com> Hi, Why You tagged „Neutron” in topic? IMO it’s some issue related to Nova rather than Neutron. > Wiadomość napisana przez Md. Farhad Hasan Khan w dniu 30.01.2019, o godz. 05:07: > > Hi, > we are getting Openstack instance automatically shut down. Instance not shut down from horizon/cli & from inside instance. This is happening randomly in our environment. Please help us to solve this. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > cat /etc/nova/nova.conf > > [DEFAULT] > sync_power_state_interval=-1 > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > Here is log: > > [root at compute2 ~]# cat /var/log/nova/nova-compute.log |grep 4e143f50-49d4-40c9-b80e-c99dfe45d9d4 > 2019-01-30 08:06:18.750 417615 INFO nova.compute.manager [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] VM Stopped (Lifecycle Event) > 2019-01-30 08:06:18.855 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). Updating power_state in the DB to match the hypervisor. > 2019-01-30 08:06:19.027 417615 WARNING nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance shutdown by itself. Calling the stop API. 
Current vm_state: active, current task_state: None, original DB power_state: 1, current VM power_state: 4 > 2019-01-30 08:06:19.376 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance is already powered off in the hypervisor when stop is called. > 2019-01-30 08:06:19.441 417615 INFO nova.virt.libvirt.driver [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance already shutdown. > 2019-01-30 08:06:19.445 417615 INFO nova.virt.libvirt.driver [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance destroyed successfully. > > > Thanks & B’Rgds, > Rony — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Wed Jan 30 09:21:57 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 30 Jan 2019 10:21:57 +0100 Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core In-Reply-To: References: Message-ID: <9F0BB17B-826D-4576-AF20-776C1195BA9C@redhat.com> +1 from me. Congratulations Liu :) > Wiadomość napisana przez Miguel Lavalle w dniu 30.01.2019, o godz. 00:18: > > Hi Stackers, > > I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron core team. Liu started contributing to Neutron back in Mitaka, fixing bugs in HA routers. Since then, he has specialized in L3 networking, developing a deep knowledge of DVR. More recently, he single handedly implemented QoS for floating IPs with this series of patches: https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). He has also been very busy helping to improve the implementation of port forwardings and adding QoS to them. He also works for a large operator in China, which allows him to bring an important operational perspective from that part of the world to our project. 
The quality and number of his code reviews during the Stein cycle is on par with the leading members of the core team: https://www.stackalytics.com/?module=neutron-group. > > I will keep this nomination open for a week as customary. > > Best regards > > Miguel — Slawek Kaplonski Senior software engineer Red Hat From openstack at nemebean.com Wed Jan 30 09:56:11 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 30 Jan 2019 10:56:11 +0100 Subject: [oslo] Proposing Zane Bitter as general Oslo core In-Reply-To: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> References: <55995180-3faf-f5f2-59df-2a0983e9370e@nemebean.com> Message-ID: <8ea3ebbe-02dd-9365-21ad-4d16c3707a2b@nemebean.com> All of the active cores are +1 and it's been approximately a week, so I made this official. Thanks for all your good work, Zane! (now go +2 my oslo.utils patch ;-) -Ben On 1/24/19 11:17 PM, Ben Nemec wrote: > Hi all, > > Zane is already core on oslo.service, but he's been doing good stuff in > adjacent projects as well. We could keep playing whack-a-mole with > giving him +2 on more repos, but I trust his judgment so I'm proposing > we just add him to the oslo-core group. > > If there are no objections in the next week I'll proceed with the addition. > > Thanks. > > -Ben > From stig.openstack at telfer.org Wed Jan 30 10:20:28 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Wed, 30 Jan 2019 10:20:28 +0000 Subject: [scientific-sig] IRC meeting today 1100 UTC: Baremetal SIG, GPFS+Manila, Open Infra Summit Message-ID: <4FCBCDAE-6298-4A03-AEB7-C91FC85D62C8@telfer.org> Hi All - We have a Scientific SIG IRC meeting at 1100 UTC (about 40 minutes time) in channel #openstack-meeting. Everyone is welcome. The agenda is online here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_30th_2019 Today we’d like to discuss the formation of the new baremetal SIG, do a little planning on participation at the Denver Summit, and share experiences on GPFS with Manila. 
Cheers, Stig From openstack at nemebean.com Wed Jan 30 10:29:20 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 30 Jan 2019 11:29:20 +0100 Subject: [oslo] oslo.utils EventletEvent changes Message-ID: Hey, We have a bit of an issue in the EventletEvent class in oslo.utils. We had to blacklist the versions that had the race fix because it turned out that it reintroduced the double-set bug. So the current version is subtly broken, and we released the oslo.service change that started to use it (which is how we found the double-set bug). There are actually two patches proposed right now related to this.[1][2] The first simply fixes the bug, the second optimizes it so we aren't doing unnecessary event replacement. What I would like to do is merge the first patch, release the library with just that, then merge the second patch and release again. That way if we're wrong about the optimization patch being correct (I don't think we are, but this is concurrency, so...) we have the safer fix available in a release. The reason I'm emailing instead of just doing this is that I'm going to be unavailable after today until the 11th, and it would be nice to get at least the first release done before then. I can maybe get that done today, but in case I don't I wanted to get this on record. Thanks. -Ben 1: https://review.openstack.org/#/c/632758 2: https://review.openstack.org/#/c/633053 From clemens.hardewig at crandale.de Wed Jan 30 10:43:41 2019 From: clemens.hardewig at crandale.de (=?utf-8?Q?Clemens_Hardewig?=) Date: Wed, 30 Jan 2019 10:43:41 +0000 Subject: [openstack-ansible][magnum] In-Reply-To: References: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> Message-ID: Read the cloud-Init.log! There you can see that your /var/lib/.../part-011 part of the config script finishes with error. Check why. Von meinem iPhone gesendet Am 30.01.2019 um 10:11 schrieb Alfredo De Luca >: here are also the logs for the cloud init logs from the k8s master.... 
On Wed, Jan 30, 2019 at 9:30 AM Alfredo De Luca > wrote: In the meantime this is my cluster  template On Wed, Jan 30, 2019 at 9:17 AM Alfredo De Luca > wrote: hi Clemens and Ignazio. thanks for your support. it must be network related but I don't do something special apparently to create a simple k8s cluster.  I ll post later on configurations and logs as you Clemens suggested.  Cheers On Tue, Jan 29, 2019 at 9:16 PM Clemens > wrote: … an more important: check the other log cloud-init.log for error messages (not only cloud-init-output.log) Am 29.01.2019 um 16:07 schrieb Alfredo De Luca >: Hi Ignazio and Clemens. I haven\t configure the proxy  and all the logs on the kube master keep saying the following + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished [+]poststarthook/extensions/third-party-resources ok [-]poststarthook/rbac/bootstrap-roles failed: not finished healthz check failed' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished [+]poststarthook/extensions/third-party-resources ok [-]poststarthook/rbac/bootstrap-roles failed: not finished healthz check failed' ']' + sleep 5 Not sure what to do.  My configuration is ...  eth0 - 10.1.8.113 But the openstack configration in terms of networkin is the default from  ansible-openstack which is 172.29.236.100/22 Maybe that's the problem? On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano > wrote: Hello Alfredo, your external network is using proxy ? If you using a proxy, and yuo configured it in cluster template, you must setup no proxy for 127.0.0.1 Ignazio Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig > ha scritto: At least on fedora there is a second cloud Init log as far as I remember-Look into both  Br c Von meinem iPhone gesendet Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >: thanks Clemens. 
I looked at the cloud-init-output.log  on the master... and at the moment is doing the following.... ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 ++ curl --silent http://127.0.0.1:8080/healthz + '[' ok = '' ']' + sleep 5 Network ....could be but not sure where to look at On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig > wrote: Yes, you should check the cloud-init logs of your master. Without having seen them, I would guess a network issue or you have selected for your minion nodes a flavor using swap perhaps ... So, log files are the first step you could dig into... Br c Von meinem iPhone gesendet Am 28.01.2019 um 15:34 schrieb Alfredo De Luca >: Hi all. I finally instaledl successufully openstack ansible (queens) but, after creating a cluster template I create k8s cluster, it stuck on  kube_masters b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate in progress....and after around an hour it says...time out. k8s master seems to be up.....at least as VM.  any idea?   Alfredo -- Alfredo -- Alfredo -- Alfredo -- Alfredo -- Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaosorior at redhat.com Wed Jan 30 10:49:26 2019 From: jaosorior at redhat.com (Juan Antonio Osorio Robles) Date: Wed, 30 Jan 2019 12:49:26 +0200 Subject: [TripleO] containers logging to stdout Message-ID: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> Hello! In Queens, the a spec to provide the option to make containers log to standard output was proposed [1] [2]. Some work was done on that side, but due to the lack of traction, it wasn't completed. With the Train release coming, I think it would be a good idea to revive this effort, but make logging to stdout the default in that release. 
This would allow several benefits: * All logging from the containers would end up in journald; this would make it easier for us to forward the logs, instead of having to keep track of the different directories in /var/log/containers * The journald driver would add metadata to the logs about the container (we would automatically get what container ID issued the logs). * This would also simplify the stacks (removing the Logging nested stack which is present in several templates). * Finally... if at some point we move towards kubernetes (or something in between), managing our containers, it would work with their logging tooling as well. Any thoughts? [1] https://specs.openstack.org/openstack/tripleo-specs/specs/queens/logging-stdout.html [2] https://blueprints.launchpad.net/tripleo/+spec/logging-stdout-rsyslog From jean-philippe at evrard.me Wed Jan 30 12:26:42 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 30 Jan 2019 13:26:42 +0100 Subject: [openstack-helm] would like to discuss review turnaround time In-Reply-To: References: Message-ID: Hello, Thank you for bringing that topic. Let me answer inline. Please note, this is my personal opinion. (No company or TC hat here. I realise that, as one of the TC members following the health of the osh project, this is a concerning mail, and I will report appropriately if further steps need to be taken). On Wed, 2019-01-30 at 13:15 +0900, Jaesuk Ahn wrote: > Dear all, > > There has been several patch sets getting sparse reviews. > Since some of authors wrote these patch sets are difficult to join > IRC > meeting due to time and language constraints, I would like to pass > some of > their voice, and get more detail feedback from core reviewers and > other > devs via ML. > > I fully understand core reviewers are quite busy and believe they are > doing > their best efforts. period! We can only hope for best effort of everyone :) I have no doubt here. I also believe the team is very busy. 
So here is my opinion: Any review is valuable. Core reviewers should not be the only ones to review patches. The more people will review in all of the involved companies, the more they will get trusted in their reviews. That follows up with earned trust by the core reviewers, which eventually leads to becoming core reviewer. I believe we can make a difference by reviewing more, so that the existing core team could get extended. Just a highlight: at the moment, more than 90% of reviews are AT&T sponsored (counting independents working for at&t. See also https://www.stackalytics.com/?module=openstack-helm-group). That's very high. I believe extending the core team geographically/with different companies is a solution for the listed pain points. > However, I sometimes feel that turnaround time for some of patch sets > are > really long. I would like to hear opinion from others and suggestions > on > how to improve this. It can be either/both something each patch set > owner > need to do more, or/and it could be something we as a openstack-helm > project can improve. For instance, it could be influenced by time > differences, lack of irc presence, or anything else. etc. I really > would > like to find out there are anything we can improve together. I had the same impression myself: the turnaround time is big for a deployment project. The problem is not simple, and here are a few explanations I could think of: 1) most core reviewers are from a single company, and emergencies in their company are most likely to get prioritized over the community work. That leaves some reviews pending. 2) most core reviewers are from the same timezone in US, which means, in the best case, an asian contributor will have to wait a full day before seeing his work merged. If a core reviewer doesn't review this on his day work due to an emergency, you're putting the turnaround to two days at best. 
3) most core reviewers are working in the same location: it's maybe hard for them to scale the conversation from their internal habits to a community driven project. Communication is a very important part of a community, and if that doesn't work, it is _very_ concerning to me. We raised the points of lack of (IRC presence|reviews) in previous community meetings. > > I would like to get any kind of advise on the following. > - sometimes, it is really difficult to get core reviewers' comments > or > reviews. I routinely put the list of patch sets on irc meeting > agenda, > however, there still be a long turnaround time between comments. As a > result, it usually takes a long time to process a patch set, does > sometimes > cause rebase as well. I thank our testing system auto rebases a lot :) The bigger problem is when you're working on something which eventually conflicts with some AT&T work that was prioritized internally. For that, I asked a clear list of what the priorities are. ( https://storyboard.openstack.org/#!/worklist/341 ) Anything outside that should IMO raise a little flag in our heads :) But it's up to the core reviewers to work with this in focus, and to the PTL to give directions. > - Having said that, I would like to have any advise on what we need > to do > more, for instance, do we need to be in irc directly asking each > patch set > to core reviewers? do we need to put core reviewers' name when we > push > patch set? etc. I believe that we should leverage IRC more for reviews. We are doing it in OSA, and it works fine. Of course core developers have their habits and a review dashboard, but fast/emergency reviews need to be socialized to get prioritized. There are other attempts in the community (like have a review priority in gerrit), but I am not entirely sold on bringing a technical solution to something that should be solved with more communication. 
> - Some of patch sets are being reviewed and merged quickly, and some > of > patch sets are not. I would like to know what makes this difference > so that > I can tell my developers how to do better job writing and > communicating > patch sets. > > There are just some example patch sets currently under review stage. > > 1. https://review.openstack.org/#/c/603971/ >> this ps has been > discussed > for its contents and scope. Cloud you please add if there is anything > else > we need to do other than wrapping some of commit message? > > 2. https://review.openstack.org/#/c/633456/ >> this is simple fix. > how can > we make core reviewer notice this patch set so that they can quickly > view? > > 3. https://review.openstack.org/#/c/625803/ >> we have been getting > feedbacks and questions on this patch set, that has been good. but > round-trip time for the recent comments takes a week or more. because > of > that delay (?), the owner of this patch set needed to rebase this one > often. Will this kind of case be improved if author engages more on > irc > channel or via mailing list to get feedback rather than relying on > gerrit > reviews? To me, the last one is more controversial than others (I don't believe we should give the opportunity to do that myself until we've done a security impact analysis). This change is also bigger than others, which is harder to both write and review. As far as I know, there was no spec that preceeded this work, so we couldn't discuss the approach before the code was written. I don't mind not having specs for changes to be honest, but it makes sense to have one if the subject is more controversial/harder, because people will have a tendency to put hard job aside. This review is the typical review that needs to be discussed in the community meeting, advocating for or against it until a decision is taken (merge or abandon). > > Frankly speaking, I don't know if this is a real issue or just way it > is. 
I > just want to pass some of voice from our developers, and really would > like > to hear what others think and find a better way to communicate. It doesn't matter if "it's a real issue" or "just the way it is". If there is a feeling of burden/pain, we should tackle the issue. So, yes, it's very important to raise the issue you feel! If you don't do it, nothing will change, the morale of developers will fall, and the health of the project will suffer. Transparency is key here. Thanks for voicing your opinion. > > > Thanks you. > > I would say my key take-aways are: 1) We need to review more 2) We need to communicate/socialize more on patchsets and issues. Let's be more active on IRC outside meetings. 3) The priority list needs to be updated to be accurate. I am not sure this list is complete (there is no mention of docs image building there). 4) We need to extend the core team in different geographical regions and companies as soon as possible But of course it's only my analysis. I would be happy to see Pete answer here. Regards, Jean-Philippe Evrard (evrardjp) From bdobreli at redhat.com Wed Jan 30 12:28:25 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 30 Jan 2019 13:28:25 +0100 Subject: [TripleO] containers logging to stdout In-Reply-To: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> References: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> Message-ID: <9cae2f96-5bc5-9206-f691-714cc18ee2c9@redhat.com> On 30.01.2019 11:49, Juan Antonio Osorio Robles wrote: > Hello! > > > In Queens, the a spec to provide the option to make containers log to > standard output was proposed [1] [2]. Some work was done on that side, > but due to the lack of traction, it wasn't completed. With the Train > release coming, I think it would be a good idea to revive this effort, > but make logging to stdout the default in that release. 
+1 > > This would allow several benefits: > > * All logging from the containers would en up in journald; this would > make it easier for us to forward the logs, instead of having to keep > track of the different directories in /var/log/containers and simplifies logs rotation a lot! > > * The journald driver would add metadata to the logs about the container > (we would automatically get what container ID issued the logs). > > * This wouldo also simplify the stacks (removing the Logging nested > stack which is present in several templates). > > * Finally... if at some point we move towards kubernetes (or something > in between), managing our containers, it would work with their logging > tooling as well. > > > Any thoughts? > > > [1] > https://specs.openstack.org/openstack/tripleo-specs/specs/queens/logging-stdout.html > > [2] https://blueprints.launchpad.net/tripleo/+spec/logging-stdout-rsyslog > > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From emilien at redhat.com Wed Jan 30 12:37:21 2019 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 30 Jan 2019 07:37:21 -0500 Subject: [TripleO] containers logging to stdout In-Reply-To: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> References: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> Message-ID: On Wed, Jan 30, 2019 at 5:53 AM Juan Antonio Osorio Robles < jaosorior at redhat.com> wrote: > Hello! > > > In Queens, the a spec to provide the option to make containers log to > standard output was proposed [1] [2]. Some work was done on that side, > but due to the lack of traction, it wasn't completed. With the Train > release coming, I think it would be a good idea to revive this effort, > but make logging to stdout the default in that release. 
> > This would allow several benefits: > > * All logging from the containers would en up in journald; this would > make it easier for us to forward the logs, instead of having to keep > track of the different directories in /var/log/containers > > * The journald driver would add metadata to the logs about the container > (we would automatically get what container ID issued the logs). > > * This wouldo also simplify the stacks (removing the Logging nested > stack which is present in several templates). > > * Finally... if at some point we move towards kubernetes (or something > in between), managing our containers, it would work with their logging > tooling as well Also, I would add that it'll be aligned with what we did for Paunch-managed containers (with Podman backend) where each ("long life") container has its own SystemD service (+ SystemD timer sometimes); so using journald makes total sense to me. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From naftinajeh94 at gmail.com Tue Jan 29 16:26:34 2019 From: naftinajeh94 at gmail.com (Najeh Nafti) Date: Tue, 29 Jan 2019 17:26:34 +0100 Subject: openstack-summit-Denver-2019 In-Reply-To: References: Message-ID: Dear sir, Thank you for the information! Regards. Najeh. On Tue, Jan 29, 2019, 3:50 PM Melvin Hillsman Hi Najeh, > > Glad to see your interest in OpenStack and I hope you get to the summit. > You can apply for travel support here - > https://openstackfoundation.formstack.com/forms/travelsupportdenver > > On Tue, Jan 29, 2019 at 8:42 AM Najeh Nafti > wrote: > >> Dear OpenStack Project team, >> >> My name is Najeh Nafti. I am a master degree student from Tunisia North >> Africa. >> I'm currently working for a thesis project based on OpenStack, but i >> didn't find the need resources that can help me to realize my project. >> >> I would like to request your approval to attend >> OpenStack-summit-Denver-2019 with your financial support. 
This event is >> a unique opportunity for me to gain knowledge and insight I need to solve >> daily challenges and to help me contribute to the overall goals of my study. >> >> Although I understand there might be financial constraints in sending me >> to this event, I believe it will be an investment that results in immediate >> and longer term benefits. My objectives in attending this event are: >> >> - >> >> To increase knowledge in my discipline area. >> >> >> - >> >> To network with my peers from all over the world to share >> information, learn how they are solving similar problems, and collaborate >> to find innovative approaches. >> >> >> OpenStack-summit-Denver-2019 would be a valuable experience for me and >> one that would benefit my thesis. >> >> >> Thank you for your consideration. >> >> Looking forward to hearing from you. >> Najeh. >> > > > -- > Kind regards, > > Melvin Hillsman > mrhillsman at gmail.com > mobile: (832) 264-2646 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.lake at surrey.ac.uk Tue Jan 29 18:05:41 2019 From: d.lake at surrey.ac.uk (David Lake) Date: Tue, 29 Jan 2019 18:05:41 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: Answers
in-line
Thanks David -----Original Message----- From: Sean Mooney Sent: 29 January 2019 14:55 To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org Cc: Ge, Chang Dr (Elec Electronic Eng) Subject: Re: Issue with launching instance with OVS-DPDK On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > Hello > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > I can launch instances which use the “m1.small” flavour (which I have > modified to include the hw:mem_size large as per the DPDK instructions) but as soon as I try to launch anything more than m1.small, I get this error: > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR > nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] > #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > instance_uuid=instance.uuid, reason=six.text_type(e))\n', > u'RescheduledException: Build of instance 25cfee28-08e9- > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu > unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z > qemu-kvm: -chardev > socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: > info: QEMU waiting for connection on: > disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-01- > 28T12:56:49.251071Z > qemu-kvm: -object > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ > libvirt/qemu/4-instance- > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > os_mem_prealloc: Insufficient free host memory pages available to > allocate guest RAM\n']#033[00m#033[00m > > > My Hypervisor is reporting 
510.7GB of RAM and 61 vCPUs. How much of that RAM did you allocate as hugepages?
OVS_NUM_HUGEPAGES=3072
can you provide the output of cat /proc/meminfo
MemTotal: 526779552 kB MemFree: 466555316 kB MemAvailable: 487218548 kB Buffers: 2308 kB Cached: 22962972 kB SwapCached: 0 kB Active: 29493384 kB Inactive: 13344640 kB Active(anon): 20826364 kB Inactive(anon): 522012 kB Active(file): 8667020 kB Inactive(file): 12822628 kB Unevictable: 43636 kB Mlocked: 47732 kB SwapTotal: 4194300 kB SwapFree: 4194300 kB Dirty: 20 kB Writeback: 0 kB AnonPages: 19933028 kB Mapped: 171680 kB Shmem: 1450564 kB Slab: 1224444 kB SReclaimable: 827696 kB SUnreclaim: 396748 kB KernelStack: 69392 kB PageTables: 181020 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 261292620 kB Committed_AS: 84420252 kB VmallocTotal: 34359738367 kB VmallocUsed: 1352128 kB VmallocChunk: 34154915836 kB HardwareCorrupted: 0 kB AnonHugePages: 5365760 kB CmaTotal: 0 kB CmaFree: 0 kB HugePages_Total: 6144 HugePages_Free: 2048 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 746304 kB DirectMap2M: 34580480 kB DirectMap1G: 502267904 kB [stack at localhost devstack]$
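A quick arithmetic check on the meminfo output above, sketched in Python: HugePages_Free (2048) times Hugepagesize (2048 kB) is exactly the 4 GiB the guest requests, leaving zero headroom; and because the qemu command line pins the backing memory with `policy=bind,host-nodes=0`, any of those free pages sitting on another NUMA node do not count, which is one plausible reading of the "Insufficient free host memory pages" error. The snapshot string below is copied from the output above; on a live host you would read `/proc/meminfo` instead.

```python
# Sketch: compute free hugepage memory from a /proc/meminfo snapshot
# and compare it with the guest's request. Values copied from the
# meminfo output quoted above.
meminfo = """\
HugePages_Total:    6144
HugePages_Free:     2048
Hugepagesize:       2048 kB
"""

fields = {}
for line in meminfo.splitlines():
    key, _, rest = line.partition(':')
    fields[key] = int(rest.split()[0])  # drop the optional "kB" suffix

free_bytes = fields['HugePages_Free'] * fields['Hugepagesize'] * 1024
guest_request = 4294967296  # size=... from the qemu error above (4 GiB)

print(free_bytes)                    # 4294967296: exactly the request
print(free_bytes >= guest_request)   # True in total, but the pages must
                                     # also be free on the NUMA node the
                                     # guest is bound to (host-nodes=0)
```

Per-node availability can be checked under `/sys/devices/system/node/node*/hugepages/` rather than the global `/proc/meminfo` counters.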
> > Build is the latest git clone of Devstack. > > Thanks > > David From alfredo.deluca at gmail.com Wed Jan 30 08:30:42 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 30 Jan 2019 09:30:42 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> Message-ID: [image: image.png] In the meantime this is my cluster template On Wed, Jan 30, 2019 at 9:17 AM Alfredo De Luca wrote: > hi Clemens and Ignazio. thanks for your support. > it must be network related but I don't do something special apparently to > create a simple k8s cluster. > I ll post later on configurations and logs as you Clemens suggested. > > > Cheers > > > > On Tue, Jan 29, 2019 at 9:16 PM Clemens > wrote: > >> … an more important: check the other log cloud-init.log for error >> messages (not only cloud-init-output.log) >> >> Am 29.01.2019 um 16:07 schrieb Alfredo De Luca > >: >> >> Hi Ignazio and Clemens. I haven\t configure the proxy and all the logs >> on the kube master keep saying the following >> >> + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished >> [+]poststarthook/extensions/third-party-resources ok >> [-]poststarthook/rbac/bootstrap-roles failed: not finished >> healthz check failed' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> ++ curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished >> [+]poststarthook/extensions/third-party-resources ok >> [-]poststarthook/rbac/bootstrap-roles failed: not finished >> healthz check failed' ']' >> + sleep 5 >> >> Not sure what to do. >> My configuration is ... >> eth0 - 10.1.8.113 >> >> But the openstack configration in terms of networkin is the default from >> ansible-openstack which is 172.29.236.100/22 >> >> Maybe that's the problem? 
>> >> >> >> >> >> >> On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano >> wrote: >> >>> Hello Alfredo, >>> your external network is using proxy ? >>> If you using a proxy, and yuo configured it in cluster template, you >>> must setup no proxy for 127.0.0.1 >>> Ignazio >>> >>> Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig < >>> clemens.hardewig at crandale.de> ha scritto: >>> >>>> At least on fedora there is a second cloud Init log as far as I >>>> remember-Look into both >>>> >>>> Br c >>>> >>>> Von meinem iPhone gesendet >>>> >>>> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca < >>>> alfredo.deluca at gmail.com>: >>>> >>>> thanks Clemens. >>>> I looked at the cloud-init-output.log on the master... and at the >>>> moment is doing the following.... >>>> >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> >>>> Network ....could be but not sure where to look at >>>> >>>> >>>> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < >>>> clemens.hardewig at crandale.de> wrote: >>>> >>>>> Yes, you should check the cloud-init logs of your master. Without >>>>> having seen them, I would guess a network issue or you have selected for >>>>> your minion nodes a flavor using swap perhaps ... >>>>> So, log files are the first step you could dig into... >>>>> Br c >>>>> Von meinem iPhone gesendet >>>>> >>>>> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca < >>>>> alfredo.deluca at gmail.com>: >>>>> >>>>> Hi all. 
>>>>> I finally instaledl successufully openstack ansible (queens) but, >>>>> after creating a cluster template I create k8s cluster, it stuck on >>>>> >>>>> >>>>> kube_masters >>>>> >>>>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>>>> >>>>> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >>>>> in progress....and after around an hour it says...time out. k8s master >>>>> seems to be up.....at least as VM. >>>>> >>>>> any idea? >>>>> >>>>> >>>>> >>>>> >>>>> *Alfredo* >>>>> >>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> >> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 94962 bytes Desc: not available URL: From alfredo.deluca at gmail.com Wed Jan 30 09:08:41 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 30 Jan 2019 10:08:41 +0100 Subject: [openstack-ansible][magnum] In-Reply-To: References: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> Message-ID: here are also the logs for the cloud init logs from the k8s master.... On Wed, Jan 30, 2019 at 9:30 AM Alfredo De Luca wrote: > [image: image.png] > In the meantime this is my cluster > template > > > > On Wed, Jan 30, 2019 at 9:17 AM Alfredo De Luca > wrote: > >> hi Clemens and Ignazio. thanks for your support. >> it must be network related but I don't do something special apparently to >> create a simple k8s cluster. >> I ll post later on configurations and logs as you Clemens suggested. >> >> >> Cheers >> >> >> >> On Tue, Jan 29, 2019 at 9:16 PM Clemens >> wrote: >> >>> … an more important: check the other log cloud-init.log for error >>> messages (not only cloud-init-output.log) >>> >>> Am 29.01.2019 um 16:07 schrieb Alfredo De Luca >> >: >>> >>> Hi Ignazio and Clemens. 
I haven\t configure the proxy and all the logs >>> on the kube master keep saying the following >>> >>> + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished >>> [+]poststarthook/extensions/third-party-resources ok >>> [-]poststarthook/rbac/bootstrap-roles failed: not finished >>> healthz check failed' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '' ']' >>> + sleep 5 >>> ++ curl --silent http://127.0.0.1:8080/healthz >>> + '[' ok = '[-]poststarthook/bootstrap-controller failed: not finished >>> [+]poststarthook/extensions/third-party-resources ok >>> [-]poststarthook/rbac/bootstrap-roles failed: not finished >>> healthz check failed' ']' >>> + sleep 5 >>> >>> Not sure what to do. >>> My configuration is ... >>> eth0 - 10.1.8.113 >>> >>> But the openstack configration in terms of networkin is the default >>> from ansible-openstack which is 172.29.236.100/22 >>> >>> Maybe that's the problem? >>> >>> >>> >>> >>> >>> >>> On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Hello Alfredo, >>>> your external network is using proxy ? >>>> If you using a proxy, and yuo configured it in cluster template, you >>>> must setup no proxy for 127.0.0.1 >>>> Ignazio >>>> >>>> Il giorno mar 29 gen 2019 alle ore 12:26 Clemens Hardewig < >>>> clemens.hardewig at crandale.de> ha scritto: >>>> >>>>> At least on fedora there is a second cloud Init log as far as I >>>>> remember-Look into both >>>>> >>>>> Br c >>>>> >>>>> Von meinem iPhone gesendet >>>>> >>>>> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca < >>>>> alfredo.deluca at gmail.com>: >>>>> >>>>> thanks Clemens. >>>>> I looked at the cloud-init-output.log on the master... and at the >>>>> moment is doing the following.... 
>>>>> >>>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>>> + '[' ok = '' ']' >>>>> + sleep 5 >>>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>>> + '[' ok = '' ']' >>>>> + sleep 5 >>>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>>> + '[' ok = '' ']' >>>>> + sleep 5 >>>>> >>>>> Network ....could be but not sure where to look at >>>>> >>>>> >>>>> On Tue, Jan 29, 2019 at 11:34 AM Clemens Hardewig < >>>>> clemens.hardewig at crandale.de> wrote: >>>>> >>>>>> Yes, you should check the cloud-init logs of your master. Without >>>>>> having seen them, I would guess a network issue or you have selected for >>>>>> your minion nodes a flavor using swap perhaps ... >>>>>> So, log files are the first step you could dig into... >>>>>> Br c >>>>>> Von meinem iPhone gesendet >>>>>> >>>>>> Am 28.01.2019 um 15:34 schrieb Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com>: >>>>>> >>>>>> Hi all. >>>>>> I finally instaledl successufully openstack ansible (queens) but, >>>>>> after creating a cluster template I create k8s cluster, it stuck on >>>>>> >>>>>> >>>>>> kube_masters >>>>>> >>>>>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>>>>> >>>>>> OS::Heat::ResourceGroup 16 minutes Create In Progress state changedcreate >>>>>> in progress....and after around an hour it says...time out. k8s master >>>>>> seems to be up.....at least as VM. >>>>>> >>>>>> any idea? >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> *Alfredo* >>>>>> >>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> >>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 94962 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cloud-init.log Type: text/x-log Size: 126606 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cloud-init-output.log Type: text/x-log Size: 44975 bytes Desc: not available URL: From rony.khan at brilliant.com.bd Wed Jan 30 13:20:49 2019 From: rony.khan at brilliant.com.bd (Md. Farhad Hasan Khan) Date: Wed, 30 Jan 2019 19:20:49 +0600 Subject: [openstack] [Nova] Instance automatically got shutdown Message-ID: <025401d4b89e$9dd60720$d9821560$@brilliant.com.bd> Hi, Sorry for mistake. I corrected the subject line. Thanks Rony -----Original Message----- From: Slawomir Kaplonski [mailto:skaplons at redhat.com] Sent: Wednesday, January 30, 2019 2:55 PM To: rony.khan at brilliant.com.bd Cc: openstack-discuss at lists.openstack.org Subject: Re: [openstack] [Neutron] Instance automatically got shutdown Hi, Why You tagged „Neutron” in topic? IMO it’s some issue related to Nova rather than Neutron. > Wiadomość napisana przez Md. Farhad Hasan Khan w dniu 30.01.2019, o godz. 05:07: > > Hi, > we are getting Openstack instance automatically shut down. Instance not shut down from horizon/cli & from inside instance. This is happening randomly in our environment. Please help us to solve this. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > cat /etc/nova/nova.conf > > [DEFAULT] > sync_power_state_interval=-1 > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > Here is log: > > [root at compute2 ~]# cat /var/log/nova/nova-compute.log |grep 4e143f50-49d4-40c9-b80e-c99dfe45d9d4 > 2019-01-30 08:06:18.750 417615 INFO nova.compute.manager [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] VM Stopped (Lifecycle Event) > 2019-01-30 08:06:18.855 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). 
> Updating power_state in the DB to match the hypervisor.
> 2019-01-30 08:06:19.027 417615 WARNING nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state: 1, current VM power_state: 4
> 2019-01-30 08:06:19.376 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance is already powered off in the hypervisor when stop is called.
> 2019-01-30 08:06:19.441 417615 INFO nova.virt.libvirt.driver [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance already shutdown.
> 2019-01-30 08:06:19.445 417615 INFO nova.virt.libvirt.driver [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance destroyed successfully.
>
> Thanks & B'Rgds,
> Rony

— 
Slawek Kaplonski
Senior software engineer
Red Hat

From smooney at redhat.com Wed Jan 30 13:39:39 2019
From: smooney at redhat.com (Sean Mooney)
Date: Wed, 30 Jan 2019 13:39:39 +0000
Subject: [openstack] [Nova] Instance automatically got shutdown
In-Reply-To: <025401d4b89e$9dd60720$d9821560$@brilliant.com.bd>
References: <025401d4b89e$9dd60720$d9821560$@brilliant.com.bd>
Message-ID: 

On Wed, 2019-01-30 at 19:20 +0600, Md. Farhad Hasan Khan wrote:
> Hi,
> Sorry for mistake. I corrected the subject line.

don't worry about it :) have you checked that the instance was not killed by the kernel OOM reaper? the log snippet shows that nova received an instance lifecycle event from libvirt stating the vm was stopped, so it just updated the db. the other way this could happen is if the guest just ran sudo poweroff.
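To act on that suggestion, a quick way to look for OOM-killer activity is to scan the kernel log on the compute node. A minimal sketch — the `oom_hits` helper is hypothetical, not an existing tool:

```shell
#!/bin/sh
# oom_hits: count lines that look like OOM-killer activity in kernel log
# text read from stdin. On a real compute node you would pipe `dmesg`
# (or `journalctl -k`) into it, then grep for the qemu process of the
# affected instance.
oom_hits() {
  grep -Eci 'out of memory|oom-?kill'
}

# Hypothetical example log line:
printf 'Out of memory: Kill process 4211 (qemu-kvm) score 912\n' | oom_hits
# prints 1
```

A non-zero count around the timestamp of the libvirt "VM Stopped" event would point at the host OOM reaper rather than a guest-initiated poweroff.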
> > Thanks > Rony > > -----Original Message----- > From: Slawomir Kaplonski [mailto:skaplons at redhat.com] > Sent: Wednesday, January 30, 2019 2:55 PM > To: rony.khan at brilliant.com.bd > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [openstack] [Neutron] Instance automatically got shutdown > > Hi, > > Why You tagged „Neutron” in topic? IMO it’s some issue related to Nova rather than Neutron. > > > Wiadomość napisana przez Md. Farhad Hasan Khan w dniu 30.01.2019, o godz. 05:07: > > > > Hi, > > we are getting Openstack instance automatically shut down. Instance not shut down from horizon/cli & from inside > > instance. This is happening randomly in our environment. Please help us to solve this. > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > cat /etc/nova/nova.conf > > > > [DEFAULT] > > sync_power_state_interval=-1 > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > > > Here is log: > > > > [root at compute2 ~]# cat /var/log/nova/nova-compute.log |grep 4e143f50-49d4-40c9-b80e-c99dfe45d9d4 > > 2019-01-30 08:06:18.750 417615 INFO nova.compute.manager [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] VM > > Stopped (Lifecycle Event) > > 2019-01-30 08:06:18.855 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] > > [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] During _sync_instance_power_state the DB power_state (1) does not > > match the vm_power_state from the hypervisor (4). Updating power_state in the DB to match the hypervisor. > > 2019-01-30 08:06:19.027 417615 WARNING nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -] > > [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance shutdown by itself. Calling the stop API. 
Current
> > vm_state: active, current task_state: None, original DB power_state: 1, current VM power_state: 4
> > 2019-01-30 08:06:19.376 417615 INFO nova.compute.manager [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -]
> > [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance is already powered off in the hypervisor when stop is
> > called.
> > 2019-01-30 08:06:19.441 417615 INFO nova.virt.libvirt.driver [req-c2b7b111-e124-4aac-8971-59c3f840bacc - - - - -]
> > [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4] Instance already shutdown.
> > 2019-01-30 08:06:19.445 417615 INFO nova.virt.libvirt.driver [-] [instance: 4e143f50-49d4-40c9-b80e-c99dfe45d9d4]
> > Instance destroyed successfully.
> >
> > Thanks & B'Rgds,
> > Rony

> — 
> Slawek Kaplonski
> Senior software engineer
> Red Hat

From smooney at redhat.com Wed Jan 30 13:57:03 2019
From: smooney at redhat.com (Sean Mooney)
Date: Wed, 30 Jan 2019 13:57:03 +0000
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: 
References: <87bm477xae.fsf@meyer.lemoncheese.net> <5a383fb2-cf13-c492-1c63-b61ed6442600@openstack.org> <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com>
Message-ID: <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com>

by the way, in case it was not clear, i am actually in favor of having vendor-independent containers for openstack.

i would recommend basing such a container on the official python:3-alpine image, as it is only 30MB and has everything we should need to just pip install the project. it has python 3.7.2 currently, but the 3-alpine tag tracks both the latest release of alpine and the latest release of python that alpine supports. in some rare cases we might need to also install bindep packages, but i would hope that between bindep and pip we could build small images from source fairly simply and leave the orchestration of those images to the end user.
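The multi-stage build pattern implied above can be sketched as follows. This is a rough, untested illustration only: the choice of keystone and the apk package list are assumptions, and (as Jeremy notes elsewhere in the thread) C extensions have to be compiled against musl in the build stage rather than pulled as manylinux1 wheels:

```dockerfile
# Hypothetical sketch, not a maintained build: compile wheels (including
# any C extensions) in a throwaway Alpine stage, then install them into
# a small runtime image so the toolchain never ships.
FROM python:3-alpine AS build
# Toolchain for building C extensions against musl.
RUN apk add --no-cache gcc musl-dev libffi-dev openssl-dev
RUN pip wheel --wheel-dir /wheels keystone

FROM python:3-alpine
COPY --from=build /wheels /wheels
RUN pip install --no-index --find-links=/wheels keystone
```

The second stage installs only from the locally built wheels (`--no-index`), which keeps the runtime image close to the 30MB base the thread is aiming for.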
as i said before, if we choose to go down this route i would strongly encourage not packaging any of our third-party dependencies like libvirt, mysql, rabbitmq or ovs, and deploying every service api that can run under uwsgi with it instead of apache, again to keep the images as small as possible. that said, loci and kolla both already do reasonably good jobs at this, so if we stay with the status quo then i think that is fine too.

perhaps this would be a good topic for the forum/summit/ptg? i would see this kind of like a community goal if it was something we chose to do, so it would be good to get feedback/input from those who might not have engaged on the thread so far. there is also the TC question from a policy perspective, ignoring the technical aspects above.

On Mon, 2019-01-28 at 16:52 +0000, Sean Mooney wrote:
> On Mon, 2019-01-28 at 16:31 +0000, Sean Mooney wrote:
> > On Mon, 2019-01-28 at 11:18 -0500, Jay Pipes wrote:
> > > On 01/28/2019 11:00 AM, Mohammed Naser wrote:
> > > > On Mon, Jan 28, 2019 at 10:58 AM Jay Pipes wrote:
> > > > >
> > > > > On 01/28/2019 10:43 AM, Mohammed Naser wrote:
> > > > > > On Mon, Jan 28, 2019 at 10:41 AM Jay Pipes wrote:
> > > > > > >
> > > > > > > On 01/28/2019 10:24 AM, Mohammed Naser wrote:
> > > > > > > > Perhaps, we should come up with the first initial step of providing
> > > > > > > > a common way of building images (so a use can clone a repo and do
> > > > > > > > 'docker build .') which will eliminate the obligation of having to
> > > > > > > > deal with binaries, and then afterwards reconsider the ideal way of
> > > > > > > > shipping those out.
> > > > > > > >
> > > > > > > > > > > > > > Best, > > > > > > > -jay > > > > > > > > > > > > > > > > > > > I haven't studied LOCI as much however I think that it would be good to > > > > > > perhaps look into bringing that approach in-repo rather than out-of-repo > > > > > > so a user can simply git clone, docker build . > > > > > > > > > > > > I have to admit, I'm not super familiar with LOCI but as far as I know, that's > > > > > > indeed what I believe it does. > > > > > > > > > > Yes, that's what LOCI can do, kinda. :) Technically there's some > > > > > Makefile foo that iterates over projects to build images for, but it's > > > > > essentially what it does. > > > > > > > > > > Alternately, you don't even need to build locally. You can do: > > > > > > > > > > docker build https://git.openstack.org/openstack/loci.git \ > > > > > --build-arg PROJECT=keystone \ > > > > > --tag keystone:ubuntu > > > > > > > > > > IMHO, the real innovation that LOCI brings is the way that it builds > > > > > wheel packages into an intermediary docker build container and then > > > > > installs the service-specific Python code into a virtualenv inside the > > > > > target project docker container after injecting the built wheels. > > > > > > > > > > That, and LOCI made a good (IMHO) decision to just focus on building the > > > > > images and not deploying those images (using Ansible, Puppet, Chef, k8s, > > > > > whatever). They kept the deployment concerns separate, which is a great > > > > > decision since deployment tools are a complete dumpster fire (all of them). > > > > > > > > Thanks for that, I didn't know about this, I'll do some more reading about LOCI > > > > and it how it goes about doing this. > > > > > > > > Thanks Jay. > > > > > > No problem. Also a good thing to keep in mind is that kolla-ansible is > > > able to deploy LOCI images, AFAIK, instead of the "normal" Kolla images. > > > I have not tried this myself, however, so perhaps someone with > > > experience in this might chime in. 
> > the loci images would have to conform to the kolla api, which requires a few files
> > like kolla_start to exist, but in principle it could if that requirement was fulfilled.
> 
> this is the kolla image api for reference
> https://docs.openstack.org/kolla/latest/admin/kolla_api.html
> https://github.com/openstack/kolla/blob/master/doc/source/admin/kolla_api.rst
> all kolla images share that external-facing api,
> so if you use loci to build an image and then inject the required api shim as a layer
> it would work.
> 
> you can also use the image manually the same way by defining the relevant env variables
> or mounting configs.
> 
> docker run -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS \
>     -e KOLLA_CONFIG_FILE=/config.json \
>     -v /path/to/config.json:/config.json kolla-image
> 
> of course you can bypass it too and execute a command directly in the container, e.g. just start nova-compute.
> 
> the point was to define a common way to inject configuration, including what command to run,
> after the image was built, so that images could be reused by different deployment tools
> like kolla-k8s, tripleo, or just a bunch of bash commands.
> 
> the workflow is the same:
> prepare a directory with a bunch of config files for the service,
> spawn the container with that directory bind mounted into the container,
> and set an env var to point at the kolla config.json that specifies where
> the config should be copied, with what ownership/permissions, and what command to run.
> 
> i'm not sure if this is a good or a bad thing, but any tool that supports the kolla image
> api should be able to use loci-built images if those images support it too.
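As a concrete illustration of the workflow just described, a minimal config.json might look something like this (the command and file paths are illustrative, not taken from a real deployment):

```json
{
    "command": "nova-compute",
    "config_files": [
        {
            "source": "/var/lib/kolla/config_files/nova.conf",
            "dest": "/etc/nova/nova.conf",
            "owner": "nova",
            "perm": "0600"
        }
    ]
}
```

Bind-mounting a directory containing this file into the container and pointing KOLLA_CONFIG_FILE at it, as in the docker run invocation above, tells the image where to copy each config file, with what ownership and permissions, and which command to launch.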
> > > Best,
> > > -jay

From fungi at yuggoth.org Wed Jan 30 14:35:36 2019
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Wed, 30 Jan 2019 14:35:36 +0000
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com>
References: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com>
Message-ID: <20190130143535.jvsuxoyutumct3my@yuggoth.org>

On 2019-01-30 13:57:03 +0000 (+0000), Sean Mooney wrote:
[...]
> i would recommend basing such a container on the offical
> python:3-alpine image as it is only 30mb and has everything we
> should need to just pip install the project. it has python 3.7.2
> currently but the 3-alpine tag tracks both the latest release of
> alpine and the latest release of python alpine supports.

In a twist of irony, Python manylinux1 wheels assume glibc and so any with C extensions are unusable with Alpine's musl. As a result, we'll likely need to cross-compile any of our non-pure-Python dependencies from sdist/source with an appropriate toolchain and inject them into the image.

> in some rare cases we might need to also install bindeps but i
> would hope that between bindeps and pip we could build small
> images for source fairly simpely and leave the orchestration of
> those images to the enduser.
[...]

The bindep tool does at least have support for Alpine now, so as long as there are packages available for our system dependencies that should hopefully be a viable option.
-- 
Jeremy Stanley
From nate.johnston at redhat.com Wed Jan 30 15:16:00 2019
From: nate.johnston at redhat.com (Nate Johnston)
Date: Wed, 30 Jan 2019 10:16:00 -0500
Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core
In-Reply-To: 
References: 
Message-ID: <20190130151600.yjm5mx3wzoqrkelv@bishop>

On Tue, Jan 29, 2019 at 05:18:58PM -0600, Miguel Lavalle wrote:
> I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron
> core team.

+1 from me. Yulong always gives very good feedback in reviews, and has been a driving force in the community. Congrats!

Nate

From hongbin.lu at huawei.com Wed Jan 30 15:23:15 2019
From: hongbin.lu at huawei.com (Hongbin Lu)
Date: Wed, 30 Jan 2019 15:23:15 +0000
Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core
In-Reply-To: 
References: 
Message-ID: <0957CD8F4B55C0418161614FEC580D6B30915601@yyzeml704-chm.china.huawei.com>

+1

From: Miguel Lavalle [mailto:miguel at mlavalle.com]
Sent: January-29-19 6:19 PM
To: openstack-discuss at lists.openstack.org
Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core

Hi Stackers,

I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron core team. Liu started contributing to Neutron back in Mitaka, fixing bugs in HA routers. Since then, he has specialized in L3 networking, developing a deep knowledge of DVR. More recently, he single-handedly implemented QoS for floating IPs with this series of patches: https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). He has also been very busy helping to improve the implementation of port forwarding and adding QoS to it. He also works for a large operator in China, which allows him to bring an important operational perspective from that part of the world to our project.
The quality and number of his code reviews during the Stein cycle are on par with the leading members of the core team: https://www.stackalytics.com/?module=neutron-group.

I will keep this nomination open for a week, as customary.

Best regards

Miguel

From lars at redhat.com Wed Jan 30 15:26:04 2019
From: lars at redhat.com (Lars Kellogg-Stedman)
Date: Wed, 30 Jan 2019 10:26:04 -0500
Subject: [ironic] Hardware leasing with Ironic
Message-ID: <20190130152604.ik7zi2w7hrpabahd@redhat.com>

Howdy. I'm working with a group of people who are interested in enabling some form of baremetal leasing/reservations using Ironic. There are three key features we're looking for that aren't (maybe?) available right now:

- multi-tenancy: in addition to the ironic administrator, we need to be able to define a node "owner" (someone who controls a specific node) and a node "consumer" (someone who has been granted temporary access to a specific node). An "owner" always has the ability to control node power or access the console, can mark a node as available or not, and can set lease policies (such as a maximum lease lifetime) for a node. A "consumer" is granted access to power control and console only when they hold an active lease, and otherwise has no control over the node.

- leasing: a mechanism for marking nodes as available, requesting nodes for a specific length of time, and returning those nodes to the available pool when a lease has expired.

- hardware only: we'd like the ability to leave OS provisioning up to the "consumer". For example, after someone acquires a node via the leasing mechanism, they can use Foreman to provision an OS onto the node.

For example, a workflow might look something like this:

- The owner of a baremetal node makes the node part of a pool of available hardware. They set a maximum lease lifetime of 5 days.
- A consumer issues a lease request for "3 nodes with >= 48GB of memory and >= 1 GPU" and "1 node with >= 16GB of memory and >= 1TB of local disk", with a required lease time of 3 days.

- The leasing system finds available nodes matching the hardware requirements and with owner-set lease policies matching the lease lifetime requirements.

- The baremetal nodes are assigned to the consumer, who can then attach them to networks and make use of their own provisioning tools (which may be another Ironic instance?) to manage the hardware. The consumer is able to control power on these nodes and access the serial console.

- At the end of the lease, the nodes are wiped and returned to the pool of available hardware. The previous consumer no longer has any access to the nodes.

Our initial thought is to implement this as a service that sits in front of Ironic and provides the multi-tenancy and policy logic, while using Ironic to actually control the hardware.

Does this seem like a reasonable path forward? On paper there's a lot of overlap here between what we want and features provided by things like the Nova schedulers or the Placement api, but it's not clear we can leverage those at the baremetal layer.

Thanks for your thoughts,

-- 
Lars Kellogg-Stedman | larsks @ {irc,twitter,github}
http://blog.oddbit.com/ |

From James.Gauld at windriver.com Wed Jan 30 15:40:49 2019
From: James.Gauld at windriver.com (Gauld, James)
Date: Wed, 30 Jan 2019 15:40:49 +0000
Subject: [openstack-helm] How to specify nova override for multiple pci alias
Message-ID: <8E5740EC88EF3E4BA3196F2545DC8625BA1CD21F@ALA-MBD.corp.ad.wrs.com>

How can I specify a helm override to configure nova PCI alias when there are multiple aliases? I haven't been able to come up with a YAML-compliant specification for this. Are there other alternatives to be able to specify this as an override? I assume that a nova Chart change would be required to support this custom one-alias-entry-per-line formatting.
Any insights on how to achieve this in helm are welcomed.

Background:
There is a limitation in the nova.conf specification of PCI alias in that it does not allow multiple PCI aliases as a list. The code says "Supports multiple aliases by repeating the option (not by specifying a list value)". Basically, nova currently only supports a one-alias-entry-per-line format.

Ideally I would specify a global pci alias in a format similar to what can be achieved with PCI passthrough_whitelist, which can take a JSON list of dictionaries.

This is what I am trying to specify in nova.conf (i.e., for nova-api-osapi and nova-compute):

[pci]
alias = {dict 1}
alias = {dict 2}
. . .

The following nova configuration format is desired, but not as yet supported by nova:

[pci]
alias = [{dict 1}, {dict 2}]

The following snippet of YAML works for PCI passthrough_whitelist, where the value encoded is a JSON string:

conf:
  nova:
    overrides:
      nova_compute:
        hosts:
          - conf:
              nova:
                pci:
                  passthrough_whitelist: '[{"class_id": "030000", "address": "0000:00:02.0"}]'

Jim Gauld

From chris at openstack.org Wed Jan 30 15:56:06 2019
From: chris at openstack.org (Chris Hoge)
Date: Wed, 30 Jan 2019 07:56:06 -0800
Subject: [infra][tc] Container images in openstack/ on Docker Hub
In-Reply-To: <20190130143535.jvsuxoyutumct3my@yuggoth.org>
References: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com> <20190130143535.jvsuxoyutumct3my@yuggoth.org>
Message-ID: 

I want to clear up a few things about Loci images. To start with, I would not be comfortable publishing Loci images to the OpenStack namespace in Docker Hub because currently they have no functional testing. In several instances over the past couple of months we've sent up patches to fix images that just didn't work because of dependency issues.
We're working on a way to do functional testing, and once we're gating with functional testing on master and stable branches we can revisit the issue. Still, we assume that deployment tooling will want to modify images anyway, and we specifically designed the build system to accommodate injecting different binary and Python dependencies.

Also, Loci does not provide its own Makefile for building images. The Dockerfile and installation scripts use environment variables to control the entire build process, which makes it very easy to use tools like Make or Ansible to build the images.

Supporting multiple base operating systems is trivial with Loci and Docker image tagging. If we do push images to some central location, as a community we should think about adopting a common tagging strategy for consistency across all projects. For example, in my own little deployments I use a naming scheme that follows this pattern:

loci-<project>:<branch>-<os>

So Nova from master on Leap15 would be tagged as:

loci-nova:master-leap15

We should be listening to demand for such images, but for now I encourage people interested in Loci to build their own to suit their particular needs.

-Chris

> On Jan 30, 2019, at 6:35 AM, Jeremy Stanley wrote:
> 
> On 2019-01-30 13:57:03 +0000 (+0000), Sean Mooney wrote:
> [...]
>> i would recommend basing such a container on the offical
>> python:3-alpine image as it is only 30mb and has everything we
>> should need to just pip install the project. it has python 3.7.2
>> currently but the 3-alpine tag tracks both the latest release of
>> alpine and the latest release of python alpine supports.
> 
> In a twist of irony, Python manylinux1 wheels assume glibc and so
> any with C extensions are unusable with Alpine's musl. As a result,
> we'll likely need to cross-compile any of our non-pure-Python
> dependencies from sdist/source with an appropriate toolchain and
> inject them into image.
> >> in some rare cases we might need to also install bindeps but i >> would hope that between bindeps and pip we could build small >> images for source fairly simpely and leave the orchestration of >> those images to the enduser. > [...] > > The bindep tool does at least have support for Alpine now, so as > long as there are packages available for our system dependencies > that should hopefully be a viable option. > -- > Jeremy Stanley From Tim.Bell at cern.ch Wed Jan 30 16:14:48 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Wed, 30 Jan 2019 16:14:48 +0000 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: <20190130152604.ik7zi2w7hrpabahd@redhat.com> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> Message-ID: <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> Would Blazar provide much of this functionality? I think it only talks Nova at the moment. It doesn't quite cover the use case but one approach we have taken is to define resources which expire after a length of time. Details are in https://techblog.web.cern.ch/techblog/post/expiry-of-vms-in-cern-cloud/ and the Mistral workflows are at https://gitlab.cern.ch/cloud-infrastructure/mistral-workflows. Tim -----Original Message----- From: Lars Kellogg-Stedman Date: Wednesday, 30 January 2019 at 16:28 To: "openstack-discuss at lists.openstack.org" Cc: Tzu-Mainn Chen , "Ansari, Mohhamad Naved" , Kristi Nikolla , Julia Kreger , Ian Ballou Subject: [ironic] Hardware leasing with Ironic Howdy. I'm working with a group of people who are interested in enabling some form of baremetal leasing/reservations using Ironic. There are three key features we're looking for that aren't (maybe?) available right now: - multi-tenancy: in addition to the ironic administrator, we need to be able to define a node "owner" (someone who controls a specific node) and a node "consumer" (someone who has been granted temporary access to a specific node). 
An "owner" always has the ability to control node power or access the console, can mark a node as available or not, and can set lease policies (such as a maximum lease lifetime) for a node. A "consumer" is granted access to power control and console only when they hold an active lease, and otherwise has no control over the node. - leasing: a mechanism for marking nodes as available, requesting nodes for a specific length of time, and returning those nodes to the available pool when a lease has expired. - hardware only: we'd like the ability to leave os provisioning up to the "consumer". For example, after someone acquires a node via the leasing mechanism, they can use Foreman to provisioning an os onto the node. For example, a workflow might look something like this: - The owner of a baremetal node makes the node part of a pool of available hardware. They set a maximum lease lifetime of 5 days. - A consumer issues a lease request for "3 nodes with >= 48GB of memory and >= 1 GPU" and "1 node with >= 16GB of memory and >= 1TB of local disk", with a required lease time of 3 days. - The leasing system finds available nodes matching the hardware requirements and with owner-set lease policies matching the lease lifetime requirements. - The baremetal nodes are assigned to the consumer, who can then attach them to networks and make use of their own provisioning tools (which may be another Ironic instance?) to manage the hardware. The consumer is able to control power on these nodes and access the serial console. - At the end of the lease, the nodes are wiped and returned to the pool of available hardware. The previous consumer no longer has any access to the nodes. Our initial thought is to implement this as a service that sits in front of Ironic and provides the multi-tenancy and policy logic, while using Ironic to actually control the hardware. Does this seem like a reasonable path forward? 
On paper there's a lot of overlap here between what we want and features provided by things like the Nova schedulers or the Placement api, but it's not clear we can leverage those at the baremetal layer. Thanks for your thoughts, -- Lars Kellogg-Stedman | larsks @ {irc,twitter,github} http://blog.oddbit.com/ | From smooney at redhat.com Wed Jan 30 16:17:00 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Jan 2019 16:17:00 +0000 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: <20190130152604.ik7zi2w7hrpabahd@redhat.com> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> Message-ID: <97147bb32de48a2e9e0a6feeaac0b1053517c17e.camel@redhat.com> On Wed, 2019-01-30 at 10:26 -0500, Lars Kellogg-Stedman wrote: > Howdy. > > I'm working with a group of people who are interested in enabling some > form of baremetal leasing/reservations using Ironic. There are three > key features we're looking for that aren't (maybe?) available right > now: > > - multi-tenancy: in addition to the ironic administrator, we need to > be able to define a node "owner" (someone who controls a specific > node) and a node "consumer" (someone who has been granted temporary > access to a specific node). An "owner" always has the ability to > control node power or access the console, can mark a node as > available or not, and can set lease policies (such as a maximum > lease lifetime) for a node. A "consumer" is granted access to power > control and console only when they hold an active lease, and > otherwise has no control over the node. > > - leasing: a mechanism for marking nodes as available, requesting > nodes for a specific length of time, and returning those nodes to > the available pool when a lease has expired. > > - hardware only: we'd like the ability to leave os provisioning up to > the "consumer". For example, after someone acquires a node via the > leasing mechanism, they can use Foreman to provisioning an os onto > the node. 
> > For example, a workflow might look something like this: > > - The owner of a baremetal node makes the node part of a pool of > available hardware. They set a maximum lease lifetime of 5 days. > > - A consumer issues a lease request for "3 nodes with >= 48GB of > memory and >= 1 GPU" and "1 node with >= 16GB of memory and >= 1TB > of local disk", with a required lease time of 3 days. > > - The leasing system finds available nodes matching the hardware > requirements and with owner-set lease policies matching the lease > lifetime requirements. > > - The baremetal nodes are assigned to the consumer, who can then > attach them to networks and make use of their own provisioning tools > (which may be another Ironic instance?) to manage the hardware. The > consumer is able to control power on these nodes and access the > serial console. > > - At the end of the lease, the nodes are wiped and returned to the > pool of available hardware. The previous consumer no longer has any > access to the nodes. > > Our initial thought is to implement this as a service that sits in > front of Ironic and provides the multi-tenancy and policy logic, while > using Ironic to actually control the hardware. have you looked at blazar https://docs.openstack.org/blazar/queens/index.html it is basically desigedn to do this. > > Does this seem like a reasonable path forward? On paper there's a lot > of overlap here between what we want and features provided by things > like the Nova schedulers or the Placement api, but it's not clear > we can leverage those at the baremetal layer. 
> > Thanks for your thoughts, > From smooney at redhat.com Wed Jan 30 16:17:31 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Jan 2019 16:17:31 +0000 Subject: [ironic][blazar] Hardware leasing with Ironic In-Reply-To: <97147bb32de48a2e9e0a6feeaac0b1053517c17e.camel@redhat.com> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <97147bb32de48a2e9e0a6feeaac0b1053517c17e.camel@redhat.com> Message-ID: <8353856426a2733274069bfb5c2f7fb382da3889.camel@redhat.com> On Wed, 2019-01-30 at 16:17 +0000, Sean Mooney wrote: > On Wed, 2019-01-30 at 10:26 -0500, Lars Kellogg-Stedman wrote: > > Howdy. > > > > I'm working with a group of people who are interested in enabling some > > form of baremetal leasing/reservations using Ironic. There are three > > key features we're looking for that aren't (maybe?) available right > > now: > > > > - multi-tenancy: in addition to the ironic administrator, we need to > > be able to define a node "owner" (someone who controls a specific > > node) and a node "consumer" (someone who has been granted temporary > > access to a specific node). An "owner" always has the ability to > > control node power or access the console, can mark a node as > > available or not, and can set lease policies (such as a maximum > > lease lifetime) for a node. A "consumer" is granted access to power > > control and console only when they hold an active lease, and > > otherwise has no control over the node. > > > > - leasing: a mechanism for marking nodes as available, requesting > > nodes for a specific length of time, and returning those nodes to > > the available pool when a lease has expired. > > > > - hardware only: we'd like the ability to leave os provisioning up to > > the "consumer". For example, after someone acquires a node via the > > leasing mechanism, they can use Foreman to provisioning an os onto > > the node. 
> > > > For example, a workflow might look something like this: > > > > - The owner of a baremetal node makes the node part of a pool of > > available hardware. They set a maximum lease lifetime of 5 days. > > > > - A consumer issues a lease request for "3 nodes with >= 48GB of > > memory and >= 1 GPU" and "1 node with >= 16GB of memory and >= 1TB > > of local disk", with a required lease time of 3 days. > > > > - The leasing system finds available nodes matching the hardware > > requirements and with owner-set lease policies matching the lease > > lifetime requirements. > > > > - The baremetal nodes are assigned to the consumer, who can then > > attach them to networks and make use of their own provisioning tools > > (which may be another Ironic instance?) to manage the hardware. The > > consumer is able to control power on these nodes and access the > > serial console. > > > > - At the end of the lease, the nodes are wiped and returned to the > > pool of available hardware. The previous consumer no longer has any > > access to the nodes. > > > > Our initial thought is to implement this as a service that sits in > > front of Ironic and provides the multi-tenancy and policy logic, while > > using Ironic to actually control the hardware. > > have you looked at blazar > https://docs.openstack.org/blazar/queens/index.html > it is basically desigedn to do this. > > > > Does this seem like a reasonable path forward? On paper there's a lot > > of overlap here between what we want and features provided by things > > like the Nova schedulers or the Placement api, but it's not clear > > we can leverage those at the baremetal layer. 
> > > > Thanks for your thoughts, > > > > From smooney at redhat.com Wed Jan 30 16:23:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Jan 2019 16:23:54 +0000 Subject: [TripleO] containers logging to stdout In-Reply-To: References: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> Message-ID: On Wed, 2019-01-30 at 07:37 -0500, Emilien Macchi wrote: > > > On Wed, Jan 30, 2019 at 5:53 AM Juan Antonio Osorio Robles wrote: > > Hello! > > > > > > In Queens, a spec to provide the option to make containers log to > > standard output was proposed [1] [2]. Some work was done on that side, > > but due to the lack of traction, it wasn't completed. With the Train > > release coming, I think it would be a good idea to revive this effort, > > but make logging to stdout the default in that release. > > > > This would allow several benefits: > > > > * All logging from the containers would end up in journald; this would > > make it easier for us to forward the logs, instead of having to keep > > track of the different directories in /var/log/containers > > > > * The journald driver would add metadata to the logs about the container > > (we would automatically get what container ID issued the logs). > > > > * This would also simplify the stacks (removing the Logging nested > > stack which is present in several templates). > > > > * Finally... if at some point we move towards kubernetes (or something > > in between), managing our containers, it would work with their logging > > tooling as well > > Also, I would add that it'll be aligned with what we did for Paunch-managed containers (with Podman backend) where > each ("long life") container has its own SystemD service (+ SystemD timer sometimes); so using journald makes total > sense to me. one thing to keep in mind is that journald apparently has rate limiting, so if your containers are very verbose journald will actually slow down the execution of the container application as it slows down the rate at which it can log. 
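(As a point of reference for the rate limiting mentioned above: journald's limits live in /etc/systemd/journald.conf. The values below are purely illustrative, not a tuning recommendation.)

```ini
# /etc/systemd/journald.conf -- illustrative values only
[Journal]
# Messages allowed per interval, per service, before journald drops them.
# Raising the burst (or setting it to 0 to disable limiting) trades log
# completeness against the slowdown described in this thread.
RateLimitIntervalSec=30s
RateLimitBurst=10000
```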
this came from a downstream conversation on irc where they were recommending that such applications bypass journald and log to a file for best performance. > -- > Emilien Macchi From lars at redhat.com Wed Jan 30 16:24:50 2019 From: lars at redhat.com (Lars Kellogg-Stedman) Date: Wed, 30 Jan 2019 11:24:50 -0500 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> Message-ID: (sorry for the dupe, failed to reply all the first time around) On Wed, Jan 30, 2019 at 11:15 AM Tim Bell wrote: > Would Blazar provide much of this functionality? I think it only talks > Nova at the moment. > Thanks for the pointer. I'll take a closer look at Blazar, because in my head it was restricted to Nova resource reservations, but perhaps it can extend beyond that. From another perspective, if we can convince Nova to hand out access to unprovisioned baremetal hosts, that might make this more of an option. -- Lars Kellogg-Stedman -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Wed Jan 30 16:37:24 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 30 Jan 2019 11:37:24 -0500 Subject: [tc] agenda for upcoming TC meeting on 7 Feb Message-ID: TC Members, Our next meeting will be on Thursday, 7 Feb at 1400 UTC in #openstack-tc. This email contains the agenda for the meeting. If you will not be able to attend, please include your name in the "Apologies for Absence" section of the wiki page [0]. 
* corrections to TC member election section of bylaws are completed (fungi, dhellmann) * status update for project team evaluations based on technical vision (cdent, TheJulia) * defining the role of the TC (cdent, ttx) * keeping up with python 3 releases (dhellmann, gmann) * status update of Train cycle goals selection update (lbragstad, evrardjp) * TC governance resolution voting procedures (dhellmann) * upcoming TC election (dhellmann) * review proposed OIP acceptance criteria (dhellmann, wendar) * TC goals for Stein (dhellmann) [0] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee -- Doug From pierre at stackhpc.com Wed Jan 30 16:47:09 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 30 Jan 2019 16:47:09 +0000 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> Message-ID: On Wed, 30 Jan 2019 at 16:28, Lars Kellogg-Stedman wrote: > > (sorry for the dupe, failed to reply all the first time around) > > On Wed, Jan 30, 2019 at 11:15 AM Tim Bell wrote: >> >> Would Blazar provide much of this functionality? I think it only talks Nova at the moment. > > > Thanks for the pointer. I'll take a closer look at Blazar, because in my head it was restricted to Nova resource reservations, but perhaps it can extend beyond that. From another perspective, if we can convince Nova to hand out access to unprovisioned baremetal hosts, that might make this more of an option. Hi Lars, Blazar currently only supports reservation of nodes via Nova. It isn't yet compatible with Ironic nodes managed by Nova, because of the lack of support for host aggregates for Ironic. We have a plan to fix this using placement aggregates instead. However, Blazar is extendable, with a plugin architecture: a baremetal plugin could be developed that interacts directly with Ironic. This would allow leveraging the existing lease management code in Blazar. 
As an example, the Blazar project team has been busy this cycle implementing reservations of Neutron resources (floating IPs and network segments) [1]. Giving direct provisioning access to users means they will need BMC credentials and access to provisioning networks. If more isolation is required, you might want to take a look at HIL from the Mass Open Cloud [2]. I haven't used it but I have read one of their papers and it looks well-thought-out. Pierre [1] https://review.openstack.org/#/q/topic:bp/basic-network-plugin+(status:open+OR+status:merged) [2] https://massopen.cloud/blog/project-hil/ From Kevin.Fox at pnnl.gov Wed Jan 30 16:50:48 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Wed, 30 Jan 2019 16:50:48 +0000 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com> <20190130143535.jvsuxoyutumct3my@yuggoth.org>, Message-ID: <1A3C52DFCD06494D8528644858247BF01C28DBCF@EX10MBOX03.pnnl.gov> I performed a back to back upgrade of one of my kubernetes clusters across 2 separate major versions yesterday (1.11.x -> 1.13.x) in under 30 minutes. The prep time for it was about the same. I'm not writing this to sing k8s's praises and slam on OpenStack. I'm trying to help ensure folks have an understanding of OpenStack's continual situation here.... What OpenStack asks of Operators is a huge amount of work while similar software does not, while achieving very similar things. While it's good that you're not pushing folks to use untested stuff, that should be top priority to fix I think. One of the big reasons the k8s upgrade was so easy was not needing to rebuild the universe. The software deployed as part of the upgrade was 1) built upstream, 2) tested upstream, and 3) upgrade tested upstream. 
What I deployed was completely binary identical, all the way down to libc, to what they released. This ensured, to a high level of reliability, that upgrades would be smooth. I pushed for a while to get all of that workflow in kolla/kolla-kubernetes and infra just wasn't ready at the time. They are now, though, which is fantastic. Please seize this opportunity because it really has the potential to help OpenStack's Operators in a big way. There are a few other reasons the upgrade was so easy/quick. Those should be tackled by OpenStack too, but that's for another thread... Thanks, Kevin ________________________________________ From: Chris Hoge [chris at openstack.org] Sent: Wednesday, January 30, 2019 7:56 AM To: openstack-discuss at lists.openstack.org Subject: Re: [infra][tc] Container images in openstack/ on Docker Hub I want to clear up a few things about Loci images. To start with, I would not be comfortable publishing Loci images to the OpenStack namespace in Docker Hub because currently they have no functional testing. In several instances over the past couple of months we've sent up patches to fix images that just didn't work because of dependency issues. We're working on a way to do functional testing, and once we're gating with functional testing on master and stable branches we can revisit the issue. Still, we assume that deployment tooling will want to modify images anyway, and specifically designed the build system to accommodate injecting different binary and python dependencies. Also, Loci does not provide its own Makefile for building images. The Dockerfile and installation scripts use environment variables to control the entire build process, which makes it very easy to use tools like Make or Ansible to build the images. Supporting multiple base operating systems is trivial with Loci and Docker image tagging. 
If we do push images to some central location, as a community we should think about adopting a common tagging strategy for consistency across all projects. For example, in my own little deployments I use a naming scheme that follows this pattern: loci-<project>:<branch>-<os> So Nova from master on Leap15 would be tagged as: loci-nova:master-leap15 We should be listening to demand for such images, but for now I encourage people interested in Loci to build their own to suit their particular needs. -Chris > On Jan 30, 2019, at 6:35 AM, Jeremy Stanley wrote: > > On 2019-01-30 13:57:03 +0000 (+0000), Sean Mooney wrote: > [...] >> i would recommend basing such a container on the official >> python:3-alpine image as it is only 30mb and has everything we >> should need to just pip install the project. it has python 3.7.2 >> currently but the 3-alpine tag tracks both the latest release of >> alpine and the latest release of python alpine supports. > > In a twist of irony, Python manylinux1 wheels assume glibc and so > any with C extensions are unusable with Alpine's musl. As a result, > we'll likely need to cross-compile any of our non-pure-Python > dependencies from sdist/source with an appropriate toolchain and > inject them into the image. > >> in some rare cases we might need to also install bindeps but i >> would hope that between bindeps and pip we could build small >> images for source fairly simply and leave the orchestration of >> those images to the enduser. > [...] > > The bindep tool does at least have support for Alpine now, so as > long as there are packages available for our system dependencies > that should hopefully be a viable option. 
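(As an illustration of the bindep support Jeremy mentions: a bindep file can map one logical dependency to each distro's package name using platform profiles. The package entries below are hypothetical examples, not Loci's actual dependency list.)

```
# bindep.txt -- hypothetical entries
# One line per dependency; bracketed profiles select when it applies.
gcc [compile]
libffi-dev [platform:dpkg platform:apk]
libffi-devel [platform:rpm]
```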
> -- > Jeremy Stanley From openstack at nemebean.com Wed Jan 30 16:57:22 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 30 Jan 2019 17:57:22 +0100 Subject: [TripleO] containers logging to stdout In-Reply-To: References: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> Message-ID: <5d232d0e-fed2-fb77-9424-d54cbf3d9f55@nemebean.com> On 1/30/19 5:23 PM, Sean Mooney wrote: > On Wed, 2019-01-30 at 07:37 -0500, Emilien Macchi wrote: >> >> >> On Wed, Jan 30, 2019 at 5:53 AM Juan Antonio Osorio Robles wrote: >>> Hello! >>> >>> >>> In Queens, a spec to provide the option to make containers log to >>> standard output was proposed [1] [2]. Some work was done on that side, >>> but due to the lack of traction, it wasn't completed. With the Train >>> release coming, I think it would be a good idea to revive this effort, >>> but make logging to stdout the default in that release. >>> >>> This would allow several benefits: >>> >>> * All logging from the containers would end up in journald; this would >>> make it easier for us to forward the logs, instead of having to keep >>> track of the different directories in /var/log/containers >>> >>> * The journald driver would add metadata to the logs about the container >>> (we would automatically get what container ID issued the logs). >>> >>> * This would also simplify the stacks (removing the Logging nested >>> stack which is present in several templates). >>> >>> * Finally... if at some point we move towards kubernetes (or something >>> in between), managing our containers, it would work with their logging >>> tooling as well >> >> Also, I would add that it'll be aligned with what we did for Paunch-managed containers (with Podman backend) where >> each ("long life") container has its own SystemD service (+ SystemD timer sometimes); so using journald makes total >> sense to me. 
> one thing to keep in mind is that journald apparently has rate limiting, so if your containers are very verbose journald > will actually slow down the execution of the container application as it slows down the rate at which it can log. > this came from a downstream conversation on irc where they were recommending that such applications bypass journald and > log to a file for best performance. Another thing to check (if you haven't already) is what happens when journald restarts. We had an issue with os-collect-config where it died when journald was restarted because it started to get EPIPE responses when logging. I don't know if that would be an issue here, but it's something to check. >> -- >> Emilien Macchi > > From jaypipes at gmail.com Wed Jan 30 17:01:58 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 30 Jan 2019 12:01:58 -0500 Subject: [infra][tc] Container images in openstack/ on Docker Hub In-Reply-To: References: <6b7f89c24f48217f0e2b9c9c673246a366942513.camel@redhat.com> <8e98fa44-2ad6-7433-b0d8-699011d34430@gmail.com> <0bfbc77b51d93fd84206414686f111effeab7c79.camel@redhat.com> <20190130143535.jvsuxoyutumct3my@yuggoth.org> Message-ID: On 01/30/2019 10:56 AM, Chris Hoge wrote: > Also, Loci does not > provide its own Makefile for building images. The Dockerfile and > installation scripts use environment variables to control the entire > build process, which makes it very easy to use tools like Make or > Ansible to build the images. Apologies. I may have been remembering a script in openstack-helm-infra or elsewhere (maybe the Zuul gate jobs? [1]) that looped through projects, setting the $PROJECT environment variable, and executing docker build. Sorry for the bad info. 
Best, -jay [1] https://github.com/openstack/loci/tree/master/.zuul.d From lars at redhat.com Wed Jan 30 17:05:01 2019 From: lars at redhat.com (Lars Kellogg-Stedman) Date: Wed, 30 Jan 2019 12:05:01 -0500 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> Message-ID: <20190130170501.hs2vsmm7iqdhmftc@redhat.com> On Wed, Jan 30, 2019 at 04:47:09PM +0000, Pierre Riteau wrote: > However, Blazar is extendable, with a plugin architecture: a baremetal > plugin could be developed that interacts directly with Ironic. This would require Ironic to support multi-tenancy first, right? > Giving direct provisioning access to users means they will need BMC > credentials and access to provisioning networks. If more isolation is > required, you might want to take a look at HIL from the Mass Open > Cloud [2]. I haven't used it but I have read one of their paper and it > looks well-thought-out. Ironically (hah!), the group I am working with *is* the Massachusetts Open Cloud, and we're looking to implement the ideas explored in HIL/BMI on top of OpenStack services. -- Lars Kellogg-Stedman | larsks @ {irc,twitter,github} http://blog.oddbit.com/ | From mnaser at vexxhost.com Wed Jan 30 17:17:01 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 30 Jan 2019 12:17:01 -0500 Subject: [openstack-ansible] adding James Denton to os_neutron core Message-ID: Hi team, I'm happy to announce the addition of James Denton to the os_neutron Ansible role core team. James is one of the very knowledgeable OpenStack Networking experts (having authored books about it as well!) and we're happy to have him working on our team. 
James has contributed a lot of work to add and maintain SDN support in OpenStack Ansible, with many useful reviews across the os_neutron role (his most recent work, iterating on adding VPP support); he's also always available to help new users of our project in IRC. Thanks & welcome James! Regards, Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From Kevin.Fox at pnnl.gov Wed Jan 30 17:26:16 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Wed, 30 Jan 2019 17:26:16 +0000 Subject: [TripleO] containers logging to stdout In-Reply-To: References: <7cee5db5-f4cd-9e11-e0a3-7438154fb9af@redhat.com> , Message-ID: <1A3C52DFCD06494D8528644858247BF01C28DCCB@EX10MBOX03.pnnl.gov> k8s's official way of dealing with logs is to ensure use of the docker json logger, not the journald one. then all the k8s log shippers have a standard way to gather the logs. Docker supports log rotation and other options too. seems to work out pretty well in practice. log shipping with other cri drivers such as containerd seems to work well too. Thanks, Kevin ________________________________________ From: Sean Mooney [smooney at redhat.com] Sent: Wednesday, January 30, 2019 8:23 AM To: Emilien Macchi; Juan Antonio Osorio Robles Cc: openstack-discuss at lists.openstack.org Subject: Re: [TripleO] containers logging to stdout On Wed, 2019-01-30 at 07:37 -0500, Emilien Macchi wrote: > > > On Wed, Jan 30, 2019 at 5:53 AM Juan Antonio Osorio Robles wrote: > > Hello! > > > > > > In Queens, a spec to provide the option to make containers log to > > standard output was proposed [1] [2]. Some work was done on that side, > > but due to the lack of traction, it wasn't completed. With the Train > > release coming, I think it would be a good idea to revive this effort, > > but make logging to stdout the default in that release. 
> > > > This would allow several benefits: > > > > * All logging from the containers would end up in journald; this would > > make it easier for us to forward the logs, instead of having to keep > > track of the different directories in /var/log/containers > > > > * The journald driver would add metadata to the logs about the container > > (we would automatically get what container ID issued the logs). > > > > * This would also simplify the stacks (removing the Logging nested > > stack which is present in several templates). > > > > * Finally... if at some point we move towards kubernetes (or something > > in between), managing our containers, it would work with their logging > > tooling as well > > Also, I would add that it'll be aligned with what we did for Paunch-managed containers (with Podman backend) where > each ("long life") container has its own SystemD service (+ SystemD timer sometimes); so using journald makes total > sense to me. one thing to keep in mind is that journald apparently has rate limiting, so if your containers are very verbose journald will actually slow down the execution of the container application as it slows down the rate at which it can log. this came from a downstream conversation on irc where they were recommending that such applications bypass journald and log to a file for best performance. > -- > Emilien Macchi From haleyb.dev at gmail.com Wed Jan 30 19:43:04 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Wed, 30 Jan 2019 14:43:04 -0500 Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core In-Reply-To: References: Message-ID: <3eba9227-a61d-5fa4-df61-88027c3646a3@gmail.com> +1 from me, Liu has been doing a lot of great work and giving helpful review feedback :) -Brian On 1/29/19 6:18 PM, Miguel Lavalle wrote: > Hi Stackers, > > I want to nominate Liu Yulong (irc: liuyulong) as a member of the > Neutron core team. Liu started contributing to Neutron back in Mitaka, > fixing bugs in HA routers. 
Since then, he has specialized in L3 > networking, developing a deep knowledge of DVR. More recently, he single > handedly implemented QoS for floating IPs with this series of patches: > https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). > He has also been very busy helping to improve the implementation of port > forwardings and adding QoS to them. He also works for a large operator > in China, which allows him to bring an important operational perspective > from that part of the world to our project. The quality and number of > his code reviews during the Stein cycle is on par with the leading > members of the core team: > https://www.stackalytics.com/?module=neutron-group. > > I will keep this nomination open for a week as customary. > > Best regards > > Miguel From smooney at redhat.com Wed Jan 30 19:58:24 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 30 Jan 2019 19:58:24 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: On Wed, 2019-01-30 at 19:02 +0000, David Lake wrote: > Hi Sean > > I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install... that appears to be unrelated you could disable the installation of tempest as a workaround but my guess is that it is related to the pip 19.0 or 19.0.1 release that was done in the last few days https://pypi.org/project/pip/#history pip config was introduced in pip 10.0.0b1 https://pip.pypa.io/en/stable/news/#b1-2018-03-31 to disable tempest add "disable_service tempest" to your local.conf then unstack and stack. 
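(For anyone following along, Sean's workaround translates to a local.conf along these lines; a minimal sketch, assuming an otherwise standard DevStack setup:)

```ini
# local.conf -- minimal sketch of the tempest workaround
[[local|localrc]]
# Skip installing tempest entirely, sidestepping the tox/virtualenv
# failure triggered by the new pip release.
disable_service tempest
```

After editing local.conf, re-run ./unstack.sh and ./stack.sh as Sean describes.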
> > David > > full create: /opt/stack/tempest/.tox/tempest > ERROR: invocation failed (exit code 1), logfile: /opt/stack/tempest/.tox/tempest/log/full-0.log > ERROR: actionid: full > msg: getenv > cmdargs: '/usr/bin/python -m virtualenv --python /usr/bin/python tempest' > > Already using interpreter /usr/bin/python > New python executable in /opt/stack/tempest/.tox/tempest/bin/python > Complete output from command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list: > ERROR: unknown command "config" > ---------------------------------------- > Traceback (most recent call last): > File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main > "__main__", fname, loader, pkg_name) > File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code > exec code in run_globals > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 2502, in > main() > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 793, in main > symlink=options.symlink, > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1087, in create_environment > install_wheel(to_install, py_executable, search_dirs, download=download) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 935, in install_wheel > _install_wheel_with_search_dir(download, project_names, py_executable, search_dirs) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 964, in _install_wheel_with_search_dir > config = _pip_config(py_executable, python_path) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1038, in _pip_config > remove_from_env=["PIP_VERBOSE", "PIP_QUIET"], > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 886, in call_subprocess > raise OSError("Command {} failed with error code {}".format(cmd_desc, proc.returncode)) > OSError: Command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list failed with error code 1 > > ERROR: Error creating virtualenv. Note that some special characters (e.g. 
':' and unicode symbols) in paths are not > supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python /usr/bin/python > tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) > ___________________________________ summary ____________________________________ > ERROR: full: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths > are not supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python > /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) > > > -----Original Message----- > From: Sean Mooney > Sent: 29 January 2019 21:46 > To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org > Cc: Ge, Chang Dr (Elec Electronic Eng) > Subject: Re: Issue with launching instance with OVS-DPDK > > On Tue, 2019-01-29 at 18:05 +0000, David Lake wrote: > > Answers
in-line
> > > > Thanks > > > > David > > > > -----Original Message----- > > From: Sean Mooney > > Sent: 29 January 2019 14:55 > > To: Lake, David (PG/R - Elec Electronic Eng) ; > > openstack-dev at lists.openstack.org > > Cc: Ge, Chang Dr (Elec Electronic Eng) > > Subject: Re: Issue with launching instance with OVS-DPDK > > > > On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > > > Hello > > > > > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > > > > > I can launch instances which use the “m1.small” flavour (which I > > > have modified to include the hw:mem_size large as per the DPDK > > > instructions) but as soon as I try to launch anything more than m1.small, I get this error: > > > > > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR > > > nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > > > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] > > > #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > > > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > > > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, reason=six.text_type(e))\n', > > > u'RescheduledException: Build of instance 25cfee28-08e9- > > > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu > > > unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z > > > qemu-kvm: -chardev > > > socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: > > > info: QEMU waiting for connection on: > > > disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-0 > > > 1- > > > 28T12:56:49.251071Z > > > qemu-kvm: -object > > > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepage > > > s/ > > > libvirt/qemu/4-instance- > > > 
00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > > > os_mem_prealloc: Insufficient free host memory pages available to > > > allocate guest RAM\n']#033[00m#033[00m > > > > > > > > > My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. > > > > how much of that ram did you allocate as hugepages. > > > >
OVS_NUM_HUGEPAGES=3072
> > ok so you used networking-ovs-dpdk's ability to automatically allocate 2MB hugepages at runtime so this should have > allocated 6GB of hugepages per numa node. > > can you provide the output of cat /proc/meminfo > >
> > MemTotal: 526779552 kB > > MemFree: 466555316 kB > > MemAvailable: 487218548 kB > > Buffers: 2308 kB > > Cached: 22962972 kB > > SwapCached: 0 kB > > Active: 29493384 kB > > Inactive: 13344640 kB > > Active(anon): 20826364 kB > > Inactive(anon): 522012 kB > > Active(file): 8667020 kB > > Inactive(file): 12822628 kB > > Unevictable: 43636 kB > > Mlocked: 47732 kB > > SwapTotal: 4194300 kB > > SwapFree: 4194300 kB > > Dirty: 20 kB > > Writeback: 0 kB > > AnonPages: 19933028 kB > > Mapped: 171680 kB > > Shmem: 1450564 kB > > Slab: 1224444 kB > > SReclaimable: 827696 kB > > SUnreclaim: 396748 kB > > KernelStack: 69392 kB > > PageTables: 181020 kB > > NFS_Unstable: 0 kB > > Bounce: 0 kB > > WritebackTmp: 0 kB > > CommitLimit: 261292620 kB > > Committed_AS: 84420252 kB > > VmallocTotal: 34359738367 kB > > VmallocUsed: 1352128 kB > > VmallocChunk: 34154915836 kB > > HardwareCorrupted: 0 kB > > AnonHugePages: 5365760 kB > > CmaTotal: 0 kB > > CmaFree: 0 kB > > HugePages_Total: 6144 > > since we have 6144 total and OVS_NUM_HUGEPAGES was set to 3072 this indicates the host has 2 numa nodes > > HugePages_Free: 2048 > > and you currently have 4G of 2MB hugepages free. > however this will also be split across numa nodes. > > the qemu commandline you provided which i have copied below is trying to allocate 4G of hugepage memory from a single > host numa node > > qemu-kvm: -object > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ > libvirt/qemu/4-instance- > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM\n']#033[00m#033[00m > > as a result the vm is failing to boot because nova cannot create the vm with a single numa node. > > if you set hw:numa_nodes=2 this vm would likely boot but since you have a 512G host you should be able to increase > OVS_NUM_HUGEPAGES to something like OVS_NUM_HUGEPAGES=14336. > this will allocate 60G of 2MB hugepages total. 
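(The hugepage arithmetic above can be sanity-checked with a few lines of plain Python; the helper below is ours for illustration, not part of any OpenStack tooling.)

```python
# Numbers taken from the /proc/meminfo output quoted in this thread.
PAGE_SIZE_KB = 2048            # Hugepagesize: 2048 kB

def hugepage_gb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a hugepage count to gigabytes."""
    return pages * page_size_kb / (1024 * 1024)

total_pages = 6144             # HugePages_Total
free_pages = 2048              # HugePages_Free

# OVS_NUM_HUGEPAGES is applied per NUMA node, so 3072 on a
# two-node host accounts for all 6144 pages.
assert 3072 * 2 == total_pages

print(hugepage_gb(total_pages))  # 12.0 (GB allocated in total)
print(hugepage_gb(free_pages))   # 4.0  (GB still free, split across nodes)
```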
> > if you want to allocate more than about 96G of hugepages you should set OVS_ALLOCATE_HUGEPAGES=False and instead > allocate the hugepages on the kernel command line using 1G hugepages, > e.g. default_hugepagesz=1G hugepagesz=1G hugepages=480. This is because it takes a long time for ovs-dpdk to scan all the > hugepages on start up. > > setting default_hugepagesz=1G hugepagesz=1G hugepages=480 will leave 32G of ram for the host. > if it is a compute node and not a controller you can safely reduce the free host ram to 16G, e.g. default_hugepagesz=1G > hugepagesz=1G hugepages=496. i would not advise allocating much more than 496G of hugepages as the qemu emulator > overhead can easily get into the 10s of gigs if you have 50+ vms running. 
> > > > > > > > Build is the latest git clone of Devstack. > > > > > > Thanks > > > > > > David > > > > > > From mriedemos at gmail.com Wed Jan 30 22:26:09 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 30 Jan 2019 16:26:09 -0600 Subject: [openstack] [Nova] Instance automatically got shutdown In-Reply-To: References: <025401d4b89e$9dd60720$d9821560$@brilliant.com.bd> Message-ID: On 1/30/2019 7:39 AM, Sean Mooney wrote: > have you checked that the instace was not killed by the kernel > OOM reaper. > > the log snipit show that nova recived an instance lifecyle event > from libvirt statign the vm was stoped so it just updated the db. > > the other way this could happen if if the guest just ran > sudo poweroff. Good things to investigate. I see the sync_power_state_interval periodic is already disabled so it has to be something with life cycle events from libvirt. There is another configuration option if you really need to use it to disable handling of life cycle events from libvirt: https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.handle_virt_lifecycle_events -- Thanks, Matt From jaypipes at gmail.com Wed Jan 30 22:41:44 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 30 Jan 2019 17:41:44 -0500 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: <5ac63cf1-7216-fceb-a4a6-f172d9d51fe6@gmail.com> On 01/30/2019 02:58 PM, Sean Mooney wrote: > On Wed, 2019-01-30 at 19:02 +0000, David Lake wrote: >> Hi Sean >> >> I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install... 
> that appars to be unrelated > you could disable the installation of tempest as a workaround but > my guess is that it is related to the pip 19.0 or 19.0.1 relsase that was don in the last > few days https://pypi.org/project/pip/#history > > pip config was intoduced in pip 10.0.0b1 https://pip.pypa.io/en/stable/news/#b1-2018-03-31 https://bugs.launchpad.net/devstack/+bug/1813860 -jay From erin at openstack.org Thu Jan 31 00:01:15 2019 From: erin at openstack.org (Erin Disney) Date: Wed, 30 Jan 2019 18:01:15 -0600 Subject: Open Infrastructure Summit Shanghai - Dates and Location Message-ID: Hi everyone- We are excited to announce that after Denver, the Open Infrastructure Summit will be traveling to Shanghai, China the week of November 4, 2019. The Summit will be held at the Shanghai Expo Centre. Registration and sponsorship opportunities will be available soon. We’ll follow up on the mailing list, and you can check out the website for updates[1] and reach out to summit at openstack.org with any questions. Cheers, Erin [1] https://www.openstack.org/summit/shanghai-2019 Erin Disney OpenStack Marketing erin at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Thu Jan 31 00:53:55 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Wed, 30 Jan 2019 18:53:55 -0600 Subject: [openstack-ansible] adding James Denton to os_neutron core In-Reply-To: References: Message-ID: Hi, Great addition to the team. Congrats both to James for the appointment and to the team for the new great member! Regards On Wed, Jan 30, 2019 at 11:19 AM Mohammed Naser wrote: > Hi team, > > I'm happy to announce the addition of James Denton to the os_neutron > Ansible role core team. James is one of the very knowledgeable > OpenStack Networking experts (having authored books about it as well!) > and we're happy to have him working on our team. 
> > James has contributed a lot of work to add and maintain SDN support > in OpenStack Ansible, with many useful reviews across the os_neutron > role (his most recent work, iterating on adding VPP support); he's > also always available to help new users of our project in IRC. > > Thanks & welcome James! > > Regards, > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Thu Jan 31 01:04:53 2019 From: amy at demarco.com (Amy Marrich) Date: Wed, 30 Jan 2019 19:04:53 -0600 Subject: [openstack-ansible] adding James Denton to os_neutron core In-Reply-To: References: Message-ID: Congrats James! Amy(spotz) On Wed, Jan 30, 2019 at 11:24 AM Mohammed Naser wrote: > Hi team, > > I'm happy to announce the addition of James Denton to the os_neutron > Ansible role core team. James is one of the very knowledgeable > OpenStack Networking experts (having authored books about it as well!) > and we're happy to have him working on our team. > > James has contributed a lot of work to add and maintain SDN support > in OpenStack Ansible, with many useful reviews across the os_neutron > role (his most recent work, iterating on adding VPP support); he's > also always available to help new users of our project in IRC. > > Thanks & welcome James! > > Regards, > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From honjo.rikimaru at po.ntt-tx.co.jp Thu Jan 31 05:27:03 2019 From: honjo.rikimaru at po.ntt-tx.co.jp (Rikimaru Honjo) Date: Thu, 31 Jan 2019 14:27:03 +0900 Subject: [infra][zuul]Run only my 3rd party CI on my environment Message-ID: <17a356c2-9911-a4e9-43f3-6df04bf18a59@po.ntt-tx.co.jp> Hello, I have a question about Zuulv3. I'm preparing third-party CI for the openstack/masakari project. I'd like to run my CI with my Zuulv3 instance in my environment. As I understand it, I should add my pipeline to the project entry in the following .zuul.yaml for my purpose. https://github.com/openstack/masakari/blob/master/.zuul.yaml But, as a result, my Zuulv3 instance also runs the existing pipelines (check & gate). I want to run only my pipeline in my environment. (The existing pipelines will keep running in the openstack-infra environment.) How can I make my Zuulv3 instance ignore the other pipelines? Best regards, -- _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ Rikimaru Honjo E-mail:honjo.rikimaru at po.ntt-tx.co.jp From iwienand at redhat.com Thu Jan 31 07:30:38 2019 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 31 Jan 2019 18:30:38 +1100 Subject: [infra] fedora-29 node failures In-Reply-To: <20190129155326.GA23049@localhost.localdomain> References: <20190129155326.GA23049@localhost.localdomain> Message-ID: <20190131073038.GA20808@fedora19.localdomain> On Tue, Jan 29, 2019 at 10:53:26AM -0500, Paul Belanger wrote: > Jobs depending on fedora-latest (fedora-29 in this case) look to be in a > broken state currently. Since this past weekend, we seem to be hitting > an issue with network manager and dbus. I could replicate fedora-minimal F29 not coming up with networking due to having no dbus. I don't know what changed, but it seems that the introduction of dbus-broker and related changes to dbus-daemon has left us where we don't get *any* dbus service running.
I think that switching to dbus-broker is the correct solution https://review.openstack.org/#/c/634105/ -i From bluejay.ahn at gmail.com Thu Jan 31 07:35:16 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Thu, 31 Jan 2019 16:35:16 +0900 Subject: [openstack-helm] would like to discuss review turnaround time In-Reply-To: References: Message-ID: Thank you for the thoughtful reply. I was able to quickly add my opinion on some of your feedback, not all. Please see inline. I will get back with more thoughts and ideas. Please note that we have a big holiday next week (lunar new year holiday), therefore it might take some time. :) On Wed, Jan 30, 2019 at 9:26 PM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > Hello, > > Thank you for bringing that topic. Let me answer inline. > Please note, this is my personal opinion. > (No company or TC hat here. I realise that, as one of the TC members > following the health of the osh project, this is a concerning mail, and > I will report appropriately if further steps need to be taken). > > On Wed, 2019-01-30 at 13:15 +0900, Jaesuk Ahn wrote: > > Dear all, > > > > There have been several patch sets getting sparse reviews. > > Since some of the authors who wrote these patch sets find it difficult to join > > the IRC > > meeting due to time and language constraints, I would like to pass > > along some of > > their voices, and get more detailed feedback from core reviewers and > > other > > devs via the ML. > > > > I fully understand core reviewers are quite busy and believe they are > > doing > > their best efforts. period! > > We can only hope for best effort of everyone :) > I have no doubt here. I also believe the team is very busy. > > So here is my opinion: Any review is valuable. Core reviewers should > not be the only ones to review patches
That follows up with earned > trust by the core reviewers, which eventually leads to becoming a core > reviewer. > This is a very good point. I really need to encourage developers to at least cross-review each other's patch sets. I will discuss with other team members how we can achieve this; we might need to introduce a "half-a-day review only" schedule. My team once tried to review more in general; however, it failed because of the very limited time allowed to do so. At least, we can try to cross-review each other's patch sets, and explicitly assign time to do so. THIS will be our important homework to do. > > I believe we can make a difference by reviewing more, so that the > existing core team could get extended. Just a highlight: at the moment, > more than 90% of reviews are AT&T sponsored (counting independents > working for at&t. See also > https://www.stackalytics.com/?module=openstack-helm-group). That's very > high. > > I believe extending the core team geographically/with different > companies is a solution for the listed pain points. > I really would like to have that as well; however, the effort and time to become a candidate with a "good enough" history seem very demanding. Matching the level (or amount of work) of what the current core reviewers do is not an easy thing to achieve. Frankly speaking, motivating someone to put in that much effort is also challenging, especially given their reluctance (hesitance?) to do so due to language and time barriers. > > > However, I sometimes feel that turnaround time for some of the patch sets > > is > > really long. I would like to hear opinions from others and suggestions > > on > > how to improve this. It can be either/both something each patch set > > owner > > needs to do more, or/and it could be something we as an openstack-helm > > project can improve. For instance, it could be influenced by time > > differences, lack of irc presence, or anything else. etc.
I really > > would > > like to find out if there is anything we can improve together. > > I had the same impression myself: the turnaround time is big for a > deployment project. > > The problem is not simple, and here are a few explanations I could > think of: > 1) most core reviewers are from a single company, and emergencies in > their company are most likely to get prioritized over the community > work. That leaves some reviews pending. > 2) most core reviewers are from the same timezone in the US, which means, > in the best case, an Asian contributor will have to wait a full day > before seeing his work merged. If a core reviewer doesn't review this > during his workday due to an emergency, you're putting the turnaround at > two days at best. > 3) most core reviewers are working in the same location: it's maybe > hard for them to scale the conversation from their internal habits to a > community driven project. Communication is a very important part of a > community, and if that doesn't work, it is _very_ concerning to me. We > raised the points of lack of (IRC presence|reviews) in previous > community meetings. 2-1) other active developers are on the opposite side of the earth, which makes it more difficult to sync with core reviewers. No one wanted this, but it somehow creates an invisible barrier. I do agree that "Communication" is a very important part of a community. Language and time differences add more difficulties to this as well. I am trying my best to be a good liaison, but it is never enough. There will be no clear solution. However, I will have a discussion again with team members to gather some ideas. > > > > I would like to get any kind of advice on the following. > > - sometimes, it is really difficult to get core reviewers' comments > > or > > reviews. I routinely put the list of patch sets on the irc meeting > > agenda; > > however, there is still a long turnaround time between comments.
As a > > result, it usually takes a long time to process a patch set, and does > sometimes > cause rebases as well. > Thankfully our testing system auto-rebases a lot :) > The bigger problem is when you're working on something which eventually > conflicts with some AT&T work that was prioritized internally. > > For that, I asked for a clear list of what the priorities are. > ( https://storyboard.openstack.org/#!/worklist/341 ) > > Anything outside that should IMO raise a little flag in our heads :) > > But it's up to the core reviewers to work with this in focus, and to > the PTL to give directions. > > > > - Having said that, I would like to have any advice on what we need > > to do > > more, for instance, do we need to be on irc directly asking > > core reviewers > > about each patch set? do we need to put core reviewers' names when we > > push > > a patch set? etc. > > I believe that we should leverage IRC more for reviews. We are doing it > in OSA, and it works fine. Of course core developers have their habits > and a review dashboard, but fast/emergency reviews need to be > socialized to get prioritized. There are other attempts in the > community (like having a review priority in gerrit), but I am not > entirely sold on bringing a technical solution to something that should > be solved with more communication. > > > - Some patch sets are being reviewed and merged quickly, and some > > patch sets > > are not. I would like to know what makes this difference > > so that > > I can tell my developers how to do a better job writing and > > communicating > > patch sets. > > > > Here are some example patch sets currently in the review stage. > > > > 1. https://review.openstack.org/#/c/603971/ >> this ps has been > > discussed > > for its contents and scope. Could you please add if there is anything > > else > > we need to do other than wrapping some of the commit message? > > > > 2. https://review.openstack.org/#/c/633456/ >> this is a simple fix.
> > how can > > we make core reviewers notice this patch set so that they can quickly > > review it? > > > > 3. https://review.openstack.org/#/c/625803/ >> we have been getting > > feedback and questions on this patch set, which has been good, but the > > round-trip time for the recent comments takes a week or more. Because > > of > > that delay (?), the owner of this patch set needed to rebase it > > often. Will this kind of case be improved if the author engages more on the > > irc > > channel or via the mailing list to get feedback rather than relying on > > gerrit > > reviews? > > To me, the last one is more controversial than the others (I don't believe > we should give the opportunity to do that myself until we've done a > security impact analysis). This change is also bigger than others, > which is harder to both write and review. As far as I know, there was > no spec that preceded this work, so we couldn't discuss the approach > before the code was written. > > I don't mind not having specs for changes to be honest, but it makes > sense to have one if the subject is more controversial/harder, because > people will have a tendency to put the hard jobs aside. > > This review is the typical review that needs to be discussed in the > community meeting, advocating for or against it until a decision is > taken (merge or abandon). > I do agree with your analysis on this one. But one thing the author really wanted was feedback, either negative or positive. It could be a request to abandon, or to rewrite. But the lack of comments, with a long turnaround time between them (meaning the author waits days or weeks to see any additional comments), was the problem. It felt somewhat abandoned without any strong reason.
> > It doesn't matter if "it's a real issue" or "just the way it is". > If there is a feeling of burden/pain, we should tackle the issue. > > So, yes, it's very important to raise the issue you feel! > If you don't do it, nothing will change, the morale of developers will > fall, and the health of the project will suffer. > Transparency is key here. > > Thanks for voicing your opinion. > > > > > > > Thank you. > > > > > > I would say my key take-aways are: > 1) We need to review more > 2) We need to communicate/socialize more on patchsets and issues. Let's > be more active on IRC outside meetings. > Just one small note here: developers in my team sometimes prefer email communication, where they can have time to think about how to write their opinion in English. > 3) The priority list needs to be updated to be accurate. I am not sure > this list is complete (there is no mention of docs image building > there). > I really want this to happen. Things often suddenly show up in a patch set and get merged. It is a bit difficult to follow what exactly is happening in the openstack-helm community. Of course, this requires everyone's efforts. > 4) We need to extend the core team in different geographical regions > and companies as soon as possible > > But of course it's only my analysis. I would be happy to see Pete > answer here. > > Regards, > Jean-Philippe Evrard (evrardjp) > > > A bit unrelated to the topic, but I really want to say this. I DO REALLY appreciate the openstack-helm community's effort to accept non-English documents as official ones. (although it is slowly progressing ^^) I think this move is a more real diversity effort than any other move (recognizing there is good value the community needs to bring in "as-is", even though it is non-English information) Cheers, -- *Jaesuk Ahn*, Ph.D. Software R&D Center, SK Telecom -------------- next part -------------- An HTML attachment was scrubbed...
URL: From kailun.qin at intel.com Thu Jan 31 08:45:28 2019 From: kailun.qin at intel.com (Qin, Kailun) Date: Thu, 31 Jan 2019 08:45:28 +0000 Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core In-Reply-To: References: Message-ID: Big +1 ☺ Congrats Yulong, well-deserved! BR, Kailun From: Miguel Lavalle [mailto:miguel at mlavalle.com] Sent: Wednesday, January 30, 2019 7:19 AM To: openstack-discuss at lists.openstack.org Subject: [openstack-dev] [Neutron] Propose Liu Yulong for Neutron core Hi Stackers, I want to nominate Liu Yulong (irc: liuyulong) as a member of the Neutron core team. Liu started contributing to Neutron back in Mitaka, fixing bugs in HA routers. Since then, he has specialized in L3 networking, developing a deep knowledge of DVR. More recently, he single handedly implemented QoS for floating IPs with this series of patches: https://review.openstack.org/#/q/topic:bp/floating-ip-rate-limit+(status:open+OR+status:merged). He has also been very busy helping to improve the implementation of port forwardings and adding QoS to them. He also works for a large operator in China, which allows him to bring an important operational perspective from that part of the world to our project. The quality and number of his code reviews during the Stein cycle is on par with the leading members of the core team: https://www.stackalytics.com/?module=neutron-group. I will keep this nomination open for a week as customary. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Thu Jan 31 09:36:28 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Thu, 31 Jan 2019 10:36:28 +0100 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: <20190130152604.ik7zi2w7hrpabahd@redhat.com> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> Message-ID: Hi, On 1/30/19 4:26 PM, Lars Kellogg-Stedman wrote: > Howdy. 
> > I'm working with a group of people who are interested in enabling some > form of baremetal leasing/reservations using Ironic. There are three > key features we're looking for that aren't (maybe?) available right > now: > > - multi-tenancy: in addition to the ironic administrator, we need to > be able to define a node "owner" (someone who controls a specific > node) and a node "consumer" (someone who has been granted temporary > access to a specific node). An "owner" always has the ability to > control node power or access the console, can mark a node as > available or not, and can set lease policies (such as a maximum > lease lifetime) for a node. A "consumer" is granted access to power > control and console only when they hold an active lease, and > otherwise has no control over the node. FYI we have an "owner" field in Ironic that you can use, but Ironic itself does not restrict access based on it. Well, does not *yet*, we can probably talk about it ;) > > - leasing: a mechanism for marking nodes as available, requesting > nodes for a specific length of time, and returning those nodes to > the available pool when a lease has expired. We're getting allocation API, which makes a part of it much easier: http://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/allocation-api.html. It does not have a notion of lease time though. I suspect it is better to leave it to the upper level. It also does not have advanced filters (RAM >= 16G, etc), you can pre-filter nodes instead. > > - hardware only: we'd like the ability to leave os provisioning up to > the "consumer". For example, after someone acquires a node via the > leasing mechanism, they can use Foreman to provisioning an os onto > the node. Allocation API is independent of deployment process, so you can allocate a node and leave it as it is. This is, however, not compatible with Nova approach. Nova does reservation and deployment in a seemingly single step. 
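To make that pre-filtering idea concrete, here is a minimal sketch of matching a hardware request (and an owner-set lease policy, as in the workflow below) against an inventory. The node fields and function are invented for illustration; this is not Ironic's data model or API:

```python
# Hypothetical sketch: pre-filter candidate nodes before creating an
# allocation, since the Allocation API has no "RAM >= 48G" style filters.
# The inventory structure is invented for illustration.

def filter_candidates(nodes, min_memory_gb=0, min_gpus=0, max_lease_days=None):
    """Return names of nodes satisfying the hardware and lease constraints."""
    candidates = []
    for node in nodes:
        if node["memory_gb"] < min_memory_gb:
            continue
        if node["gpus"] < min_gpus:
            continue
        # Owner-set lease policy: skip nodes whose maximum lease
        # lifetime is shorter than the requested lease.
        if max_lease_days is not None and node["max_lease_days"] < max_lease_days:
            continue
        candidates.append(node["name"])
    return candidates

inventory = [
    {"name": "node-1", "memory_gb": 64, "gpus": 1, "max_lease_days": 5},
    {"name": "node-2", "memory_gb": 32, "gpus": 0, "max_lease_days": 5},
    {"name": "node-3", "memory_gb": 48, "gpus": 2, "max_lease_days": 2},
]

# Request: ">= 48GB of memory and >= 1 GPU", lease of 3 days.
# node-3 is excluded because its owner only allows 2-day leases.
print(filter_candidates(inventory, min_memory_gb=48, min_gpus=1, max_lease_days=3))
```

The surviving names could then be passed as candidate nodes to an allocation request, leaving the actual reservation to Ironic.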
> > For example, a workflow might look something like this: > > - The owner of a baremetal node makes the node part of a pool of > available hardware. They set a maximum lease lifetime of 5 days. > > - A consumer issues a lease request for "3 nodes with >= 48GB of > memory and >= 1 GPU" and "1 node with >= 16GB of memory and >= 1TB > of local disk", with a required lease time of 3 days. > > - The leasing system finds available nodes matching the hardware > requirements and with owner-set lease policies matching the lease > lifetime requirements. > > - The baremetal nodes are assigned to the consumer, who can then > attach them to networks and make use of their own provisioning tools > (which may be another Ironic instance?) to manage the hardware. The > consumer is able to control power on these nodes and access the > serial console. > > - At the end of the lease, the nodes are wiped and returned to the > pool of available hardware. The previous consumer no longer has any > access to the nodes. > > Our initial thought is to implement this as a service that sits in > front of Ironic and provides the multi-tenancy and policy logic, while > using Ironic to actually control the hardware. ++ > > Does this seem like a reasonable path forward? On paper there's a lot > of overlap here between what we want and features provided by things > like the Nova schedulers or the Placement api, but it's not clear > we can leverage those at the baremetal layer. > > Thanks for your thoughts, > From thierry at openstack.org Thu Jan 31 10:19:57 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 31 Jan 2019 11:19:57 +0100 Subject: [all][tc] Formalizing cross-project pop-up teams Message-ID: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> TL;DR: Maybe to help with cross-project work we should formalize temporary teams with clear objective and disband criteria, under the model of Kubernetes "working groups". 
Long version: Work in OpenStack is organized around project teams, who each own a set of git repositories. One well-known drawback of this organization is that it makes cross-project work harder, as someone has to coordinate activities that ultimately affects multiple project teams. We tried various ways to facilitate cross-project work in the past. It started with a top-level repository of cross-project specs, a formal effort which failed due to a disconnect between the spec approvers (TC), the people signed up to push the work, and the teams that would need to approve the independent work items. This was replaced by more informal "champions", doing project-management and other heavy lifting to get things done cross-project. This proved successful, but champions are often facing an up-hill battle and often suffer from lack of visibility / blessing / validation. SIGs are another construct that helps holding discussions and coordinating work around OpenStack problem spaces, beyond specific project teams. Those are great as a permanent structure, but sometimes struggle to translate into specific development work, and are a bit heavy-weight just to coordinate a given set of work items. Community goals fill the gap between champions and SIGs by blessing a given set of cross-community goals for a given release. However, given their nature (being blessed by the TC at every cycle), they are a better fit for small, cycle-long objectives that affect most of the OpenStack project teams, and great to push consistency across all projects. It feels like we are missing a way to formally describe a short-term, cross-project objective that only affects a number of teams, is not tied to a specific cycle, and organize work around a temporary team specifically formed to reach that objective. A team that would get support from the various affected project teams, increasing chances of success. 
Kubernetes encountered the same problem, with work organized around owners and permanent SIGs. They created the concept of a "working group"[1] with a clear limited objective, and a clear disband criteria. I feel like adopting something like it in OpenStack could help with work that affects multiple projects. We would not name it "working group" since that's already overloaded in OpenStack, but maybe "pop-up team" to stress the temporary nature of it. We've been sort-of informally using those in the past, but maybe formalizing and listing them could help getting extra visibility and prioritization. Thoughts? Alternate solutions? [1] https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md -- Thierry Carrez (ttx) From thierry at openstack.org Thu Jan 31 10:45:25 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 31 Jan 2019 11:45:25 +0100 Subject: [tc] The future of the "Help most needed" list Message-ID: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> Hi everyone, The "Help most needed" list[1] was created by the Technical Committee to clearly describe areas of the OpenStack open source project which were in the most need of urgent help. This was done partly to facilitate communications with corporate sponsors and engineering managers, and be able to point them to an official statement of need from "the project". [1] https://governance.openstack.org/tc/reference/help-most-needed.html This list encounters two issues. First it's hard to limit entries: a lot of projects teams, SIGs and other forms of working groups could use extra help. But more importantly, this list has had a very limited impact -- new contributors did not exactly magically show up in the areas we designated as in most need of help. When we raised that topic (again) at a Board+TC meeting, a suggestion was made that we should turn the list more into a "job description" style that would make it more palatable to the corporate world. 
I fear that would not really solve the underlying issue (which is that at our stage of the hype curve, no organization really has spare contributors to throw at random hard problems). So I wonder if we should not reframe the list and make it less "this team needs help" and more "I offer peer-mentoring in this team". A list of contributor internships offers, rather than a call for corporate help in the dark. I feel like that would be more of a win-win offer, and more likely to appeal to students, or OpenStack users trying to contribute back. Proper 1:1 mentoring takes a lot of time, and I'm not underestimating that. Only people that are ready to dedicate mentoring time should show up on this new "list"... which is why it should really list identified individuals rather than anonymous teams. It should also probably be one-off offers -- once taken, the offer should probably go off the list. Thoughts on that? Do you think reframing help-needed as mentoring-offered could help? Do you have alternate suggestions? -- Thierry Carrez (ttx) From pierre at stackhpc.com Thu Jan 31 10:58:58 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 31 Jan 2019 10:58:58 +0000 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: <20190130170501.hs2vsmm7iqdhmftc@redhat.com> References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> <20190130170501.hs2vsmm7iqdhmftc@redhat.com> Message-ID: On Wed, 30 Jan 2019 at 17:05, Lars Kellogg-Stedman wrote: > > On Wed, Jan 30, 2019 at 04:47:09PM +0000, Pierre Riteau wrote: > > However, Blazar is extendable, with a plugin architecture: a baremetal > > plugin could be developed that interacts directly with Ironic. > > This would require Ironic to support multi-tenancy first, right? Yes, assuming this would be available as per your initial message. 
Although technically you could use the Blazar API as a wrapper to provide the multi-tenancy, it would require duplicating a lot of the Ironic API into Blazar, so I wouldn't recommend this approach. > > Giving direct provisioning access to users means they will need BMC > > credentials and access to provisioning networks. If more isolation is > > required, you might want to take a look at HIL from the Mass Open > > Cloud [2]. I haven't used it but I have read one of their papers and it > > looks well-thought-out. > > Ironically (hah!), the group I am working with *is* the Massachusetts > Open Cloud, and we're looking to implement the ideas explored in > HIL/BMI on top of OpenStack services. Heh, it's a small world :-) I would be very happy to see these ideas implemented via OpenStack, it would surely help to get them more adopted. From dtantsur at redhat.com Thu Jan 31 11:09:07 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Thu, 31 Jan 2019 12:09:07 +0100 Subject: [ironic] Hardware leasing with Ironic In-Reply-To: References: <20190130152604.ik7zi2w7hrpabahd@redhat.com> <5354829D-31EA-4CB2-A054-239D105C7EC9@cern.ch> <20190130170501.hs2vsmm7iqdhmftc@redhat.com> Message-ID: On 1/31/19 11:58 AM, Pierre Riteau wrote: > On Wed, 30 Jan 2019 at 17:05, Lars Kellogg-Stedman wrote: >> >> On Wed, Jan 30, 2019 at 04:47:09PM +0000, Pierre Riteau wrote: >>> However, Blazar is extendable, with a plugin architecture: a baremetal >>> plugin could be developed that interacts directly with Ironic. >> >> This would require Ironic to support multi-tenancy first, right? > > Yes, assuming this would be available as per your initial message. Some first steps have been done: http://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/ownership-field.html. We need someone to drive the further design and implementation though.
> Although technically you could use the Blazar API as a wrapper to > provide the multi-tenancy, it would require duplicating a lot of the > Ironic API into Blazar, so I wouldn't recommend this approach. > >>> Giving direct provisioning access to users means they will need BMC >>> credentials and access to provisioning networks. If more isolation is >>> required, you might want to take a look at HIL from the Mass Open >>> Cloud [2]. I haven't used it but I have read one of their papers and it >>> looks well-thought-out. >> >> Ironically (hah!), the group I am working with *is* the Massachusetts >> Open Cloud, and we're looking to implement the ideas explored in >> HIL/BMI on top of OpenStack services. > > Heh, it's a small world :-) I would be very happy to see these ideas > implemented via OpenStack, it would surely help to get them more > adopted. > From zhengzhenyulixi at gmail.com Thu Jan 31 11:31:11 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Thu, 31 Jan 2019 19:31:11 +0800 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <491f036c485b9eb7e72ef74d22755215a8994d99.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> <491f036c485b9eb7e72ef74d22755215a8994d99.camel@redhat.com> Message-ID: Thanks a lot for bringing this up. If we decide to make a unique serial the only choice, I guess we have to sort out under what circumstances it will change the serial of instances that already exist. Should we have a way to preserve the serial for existing instances, in order not to cause any workload failure for our customers, as changing the serial may cause some problems?
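For readers following along: the knob at the center of this thread is nova's `[libvirt]/sysinfo_serial` option, which controls what the guest sees as its SMBIOS serial. A minimal nova.conf sketch, with the value semantics paraphrased from the config reference as of this Stein-era discussion ('unique' being the proposed per-instance behavior, so treat the details as assumptions to verify):

```ini
[libvirt]
# 'os'       -> expose the host's /etc/machine-id as the guest serial
# 'hardware' -> expose the host's hardware UUID
# 'auto'     -> prefer 'os', falling back to 'hardware'
# 'none'     -> expose no serial
# The proposal discussed here adds 'unique', which would use the
# instance UUID instead, changing serials seen by existing guests.
sysinfo_serial = auto
```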
On Sat, Jan 26, 2019 at 7:30 PM Stephen Finucane wrote: > On Fri, 2019-01-25 at 18:52 -0600, Matt Riedemann wrote: > > On 1/25/2019 10:35 AM, Stephen Finucane wrote: > > > He noted that one would be a valid point in > > > claiming the host OS identity should have been reported in > > > 'chassis.serial' instead of 'system.serial' in the first place [1] but > > > changing it now is definitely not zero risk. > > > > If I'm reading those docs correctly, chassis.serial was new in libvirt > > 4.1.0 which is quite a bit newer than our minimum libvirt version support. > > Good point. Guess it doesn't matter though if we have the two > alternatives you and Sean have suggested for figuring this stuff out? > The important thing is that release note. Setting 'chassis.serial' > would be a nice TODO if we have 4.1.0. > > Stephen > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Thu Jan 31 11:32:44 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 1 Feb 2019 00:32:44 +1300 Subject: [tc] The future of the "Help most needed" list In-Reply-To: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> Message-ID: Huge +1 from me. If a team wants help, they need to offer some help first. We could also work with internship programmes like Outreachy. Cheers, Lingxian Kong On Thu, Jan 31, 2019 at 11:48 PM Thierry Carrez wrote: > Hi everyone, > > The "Help most needed" list[1] was created by the Technical Committee to > clearly describe areas of the OpenStack open source project which were > in the most need of urgent help. This was done partly to facilitate > communications with corporate sponsors and engineering managers, and be > able to point them to an official statement of need from "the project". > > [1] https://governance.openstack.org/tc/reference/help-most-needed.html > > This list encounters two issues.
First it's hard to limit entries: a lot > of projects teams, SIGs and other forms of working groups could use > extra help. But more importantly, this list has had a very limited > impact -- new contributors did not exactly magically show up in the > areas we designated as in most need of help. > > When we raised that topic (again) at a Board+TC meeting, a suggestion > was made that we should turn the list more into a "job description" > style that would make it more palatable to the corporate world. I fear > that would not really solve the underlying issue (which is that at our > stage of the hype curve, no organization really has spare contributors > to throw at random hard problems). > > So I wonder if we should not reframe the list and make it less "this > team needs help" and more "I offer peer-mentoring in this team". A list > of contributor internships offers, rather than a call for corporate help > in the dark. I feel like that would be more of a win-win offer, and more > likely to appeal to students, or OpenStack users trying to contribute back. > > Proper 1:1 mentoring takes a lot of time, and I'm not underestimating > that. Only people that are ready to dedicate mentoring time should show > up on this new "list"... which is why it should really list identified > individuals rather than anonymous teams. It should also probably be > one-off offers -- once taken, the offer should probably go off the list. > > Thoughts on that? Do you think reframing help-needed as > mentoring-offered could help? Do you have alternate suggestions? > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sfinucan at redhat.com Thu Jan 31 11:55:22 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 31 Jan 2019 11:55:22 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> <491f036c485b9eb7e72ef74d22755215a8994d99.camel@redhat.com> Message-ID: <3069f87d74c4f88d6c77072bdb520b813c0d617f.camel@redhat.com> On Thu, 2019-01-31 at 19:31 +0800, Zhenyu Zheng wrote: > Thanks alot for bring this up, if we decided to make unique serial > the only choice, I guess we have to sort on what curcumstances it > willchange the serial of instances that already exists. Should we > have a way to preserve the serial for exisiting instances in order to > not > cause any workload failue for our customers as changing the serial > may cause some problem. I think all that's necessary here is to add a reno calling out this change in behavior (along with the alternatives put forth by Sean and Matt) and, ideally, start setting 'chassis.serial' if libvirt > 4.1.0? Stephen > On Sat, Jan 26, 2019 at 7:30 PM Stephen Finucane > wrote: > > On Fri, 2019-01-25 at 18:52 -0600, Matt Riedemann wrote: > > > > > On 1/25/2019 10:35 AM, Stephen Finucane wrote: > > > > > > He noted that one would be a valid point in > > > > > > claiming the host OS identity should have been reported in > > > > > > 'chassis.serial' instead of 'system.serial' in the first place > > [1] but > > > > > > changing it now is definitely not zero risk. > > > > > > > > > > If I'm reading those docs correctly, chassis.serial was new in > > libvirt > > > > > 4.1.0 which is quite a bit newer than our minimum libvirt version > > support. > > > > > > > > Good point. Guess it doesn't matter though if we have the two > > > > alternatives you and Sean have suggested for figuring this stuff > > out? > > > > The important thing is that release note. 
Setting 'chassis.serial' > > would be a nice TODO if we have 4.1.0. > > > > Stephen > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Jan 31 12:22:33 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 12:22:33 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: On Wed, 2019-01-30 at 22:24 +0000, David Lake wrote: > Hi Sean > > Thanks! > > Got it working and I can now spin up larger VMs. > > All I've got to do now is work out how to get SSSE3 support in my VM. I think I need to modify the flavour to > "Haswell" for that? there are a few ways but my preferred way is to configure libvirt to use the host's cpu features. add the following at the very end of your local.conf for future restacks [[post-config|/etc/nova/nova.conf]] [filter_scheduler] enabled_filters = RetryFilter,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,SameHostFilter,DifferentHostFilter,PciPassthroughFilter,NUMATopologyFilter [libvirt] cpu_mode = host-passthrough virt_type = kvm [[post-config|$NOVA_CPU_CONF]] [libvirt] cpu_mode = host-passthrough virt_type = kvm in the meantime you can just add [libvirt] cpu_mode = host-passthrough virt_type = kvm to /etc/nova/nova-cpu.conf and restart nova compute sudo systemctl restart devstack at n-cpu with those changes your vms will have access to all the cpu features of the host cpu.
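Once a guest is up with cpu_mode = host-passthrough, the check for a feature flag like SSSE3 boils down to parsing the flags line of /proc/cpuinfo inside the VM. A minimal sketch of that check follows; the sample cpuinfo text is invented for illustration, and in a real guest you would read the actual /proc/cpuinfo instead:

```python
# Sketch: verify a CPU feature flag (e.g. ssse3) is visible to the guest.
# SAMPLE_CPUINFO is a made-up snippet; inside a VM you would open
# /proc/cpuinfo and parse its real contents the same way.

SAMPLE_CPUINFO = """\
processor   : 0
model name  : Intel Core Processor (Haswell)
flags       : fpu vme de pse ssse3 sse4_1 sse4_2 avx avx2
"""

def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed on the 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags(SAMPLE_CPUINFO)
print("ssse3" in flags)  # True for this sample
```

The same parsing works for any other flag (avx2, sse4_2, ...), which is handy for confirming that the passthrough configuration above actually took effect.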
> David -----Original Message----- From: Sean Mooney Sent: 30 January 2019 19:58 To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org Cc: Ge, Chang Dr (Elec Electronic Eng) Subject: Re: Issue with launching instance with OVS-DPDK On Wed, 2019-01-30 at 19:02 +0000, David Lake wrote: Hi Sean I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install... that appears to be unrelated. you could disable the installation of tempest as a workaround but my guess is that it is related to the pip 19.0 or 19.0.1 release that was done in the last few days https://pypi.org/project/pip/#history pip config was introduced in pip 10.0.0b1 https://pip.pypa.io/en/stable/news/#b1-2018-03-31 to disable tempest add "disable_service tempest" to your local.conf then unstack and stack. David full create: /opt/stack/tempest/.tox/tempest ERROR: invocation failed (exit code 1), logfile: /opt/stack/tempest/.tox/tempest/log/full-0.log ERROR: actionid: full msg: getenv cmdargs: '/usr/bin/python -m virtualenv --python /usr/bin/python tempest' Already using interpreter /usr/bin/python New python executable in /opt/stack/tempest/.tox/tempest/bin/python Complete output from command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list: ERROR: unknown command "config" ---------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/usr/lib/python2.7/site-packages/virtualenv.py", line 2502, in main() File "/usr/lib/python2.7/site-packages/virtualenv.py", line 793, in main symlink=options.symlink, File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1087, in create_environment install_wheel(to_install, py_executable, search_dirs, download=download) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 935, in install_wheel
_install_wheel_with_search_dir(download, project_names, py_executable, search_dirs) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 964, in _install_wheel_with_search_dir config = _pip_config(py_executable, python_path) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1038, in _pip_config remove_from_env=["PIP_VERBOSE", "PIP_QUIET"], File "/usr/lib/python2.7/site-packages/virtualenv.py", line 886, in call_subprocess raise OSError("Command {} failed with error code {}".format(cmd_desc, proc.returncode)) OSError: Command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list failed with error code 1 ERROR: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths are not supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) ___________________________________ summary ____________________________________ ERROR: full: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths are not supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) -----Original Message----- From: Sean Mooney Sent: 29 January 2019 21:46 To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org Cc: Ge, Chang Dr (Elec Electronic Eng) Subject: Re: Issue with launching instance with OVS-DPDK On Tue, 2019-01-29 at 18:05 +0000, David Lake wrote: Answers
in-line
Thanks David -----Original Message----- From: Sean Mooney Sent: 29 January 2019 14:55 To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org Cc: Ge, Chang Dr (Elec Electronic Eng) Subject: Re: Issue with launching instance with OVS-DPDK On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: Hello I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. I can launch instances which use the “m1.small” flavour (which I have modified to include the hw:mem_size large as per the DPDK instructions) but as soon as I try to launch anything more than m1.small, I get this error: Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 25cfee28-08e9- 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: info: QEMU waiting for connection on: disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019 -0 1- 28T12:56:49.251071Z qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepa ge s/ libvirt/qemu/4-instance- 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM\n']#033[00m#033[00m My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. 
how much of that ram did you allocate as hugepages.
OVS_NUM_HUGEPAGES=3072
ok so you used networking-ovs-dpdk's ability to automatically allocate 2MB hugepages at runtime, so this should have allocated 6GB of hugepages per numa node. can you provide the output of cat /proc/meminfo
MemTotal: 526779552 kB MemFree: 466555316 kB MemAvailable: 487218548 kB Buffers: 2308 kB Cached: 22962972 kB SwapCached: 0 kB Active: 29493384 kB Inactive: 13344640 kB Active(anon): 20826364 kB Inactive(anon): 522012 kB Active(file): 8667020 kB Inactive(file): 12822628 kB Unevictable: 43636 kB Mlocked: 47732 kB SwapTotal: 4194300 kB SwapFree: 4194300 kB Dirty: 20 kB Writeback: 0 kB AnonPages: 19933028 kB Mapped: 171680 kB Shmem: 1450564 kB Slab: 1224444 kB SReclaimable: 827696 kB SUnreclaim: 396748 kB KernelStack: 69392 kB PageTables: 181020 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 261292620 kB Committed_AS: 84420252 kB VmallocTotal: 34359738367 kB VmallocUsed: 1352128 kB VmallocChunk: 34154915836 kB HardwareCorrupted: 0 kB AnonHugePages: 5365760 kB CmaTotal: 0 kB CmaFree: 0 kB HugePages_Total: 6144 since we have 6144 total and OVS_NUM_HUGEPAGES was set to 3072 this indicates the host has 2 numa nodes HugePages_Free: 2048 and you currently have 4G of 2MB hugepages free. however this will also be split across numa nodes. the qemu commandline you provided which i have copied below is trying to allocate 4G of hugepage memory from a single host numa node qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ libvirt/qemu/4-instance- 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM\n']#033[00m#033[00m as a result the vm is failing to boot because nova cannot create the vm with a single numa node. if you set hw:numa_nodes=2 this vm would likely boot but since you have a 512G host you should be able to increase OVS_NUM_HUGEPAGES to something like OVS_NUM_HUGEPAGES=14336. this will allocate 56G of 2MB hugepages total. if you want to allocate more than about 96G of hugepages you should set OVS_ALLOCATE_HUGEPAGES=False and instead allocate the hugepages on the kernel commandline using 1G hugepages. e.g.
default_hugepagesz=1G hugepagesz=1G hugepages=480 This is because it takes a long time for ovs-dpdk to scan all the hugepages on start up. setting default_hugepagesz=1G hugepagesz=1G hugepages=480 will leave 32G of ram for the host. if it is a compute node and not a controller you can safely reduce the free host ram to 16G e.g. default_hugepagesz=1G hugepagesz=1G hugepages=496 i would not advise allocating much more than 496G of hugepages as the qemu emulator overhead can easily get into the 10s of gigs if you have 50+ vms running. HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 746304 kB DirectMap2M: 34580480 kB DirectMap1G: 502267904 kB [stack at localhost devstack]$
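The hugepage arithmetic above can be checked with a small sketch. The assumptions (not from any official tool, just inferred from the meminfo output in this thread) are: OVS_NUM_HUGEPAGES is a per-NUMA-node count of 2MB pages, and the host has 2 NUMA nodes:

```python
# Sketch of the hugepage accounting discussed in this thread.
# Assumptions: OVS_NUM_HUGEPAGES counts 2MB pages per NUMA node,
# and the host has 2 NUMA nodes (HugePages_Total: 6144 = 2 x 3072).

PAGE_SIZE_MB = 2
NUMA_NODES = 2

def total_hugepage_gb(pages_per_node, numa_nodes=NUMA_NODES,
                      page_size_mb=PAGE_SIZE_MB):
    """Total hugepage memory in GB across all NUMA nodes."""
    return pages_per_node * numa_nodes * page_size_mb / 1024

# OVS_NUM_HUGEPAGES=3072 -> 6GB per node, 12GB total
print(total_hugepage_gb(3072))    # 12.0

# The failing guest wanted 4G (2048 x 2MB pages) from a single NUMA node,
# but the 2048 free pages were split across both nodes, ~1024 per node.
pages_needed = 4096 // PAGE_SIZE_MB    # 2048
free_per_node = 2048 // NUMA_NODES     # 1024
print(pages_needed <= free_per_node)   # False -> os_mem_prealloc fails
```

With OVS_NUM_HUGEPAGES=14336 the same function gives 56GB total, which is why the larger setting leaves plenty of room for a 4G single-node guest.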
Build is the latest git clone of Devstack. Thanks David From smooney at redhat.com Thu Jan 31 12:24:11 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 12:24:11 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: <5ac63cf1-7216-fceb-a4a6-f172d9d51fe6@gmail.com> References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> <5ac63cf1-7216-fceb-a4a6-f172d9d51fe6@gmail.com> Message-ID: <3e43a059f960dc4882ab8e790e37cba5615fe286.camel@redhat.com> On Wed, 2019-01-30 at 17:41 -0500, Jay Pipes wrote: > On 01/30/2019 02:58 PM, Sean Mooney wrote: > > On Wed, 2019-01-30 at 19:02 +0000, David Lake wrote: > > > Hi Sean > > > > > > I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install... > > > > that appears to be unrelated > > you could disable the installation of tempest as a workaround but > > my guess is that it is related to the pip 19.0 or 19.0.1 release that was done in the last > > few days https://pypi.org/project/pip/#history > > > > pip config was introduced in pip 10.0.0b1 https://pip.pypa.io/en/stable/news/#b1-2018-03-31 > > https://bugs.launchpad.net/devstack/+bug/1813860 ah yes, i completely forgot about the version cap in devstack. that makes sense, good catch. > > -jay > From smooney at redhat.com Thu Jan 31 12:34:28 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 12:34:28 +0000 Subject: [infra][zuul]Run only my 3rd party CI on my environment In-Reply-To: <17a356c2-9911-a4e9-43f3-6df04bf18a59@po.ntt-tx.co.jp> References: <17a356c2-9911-a4e9-43f3-6df04bf18a59@po.ntt-tx.co.jp> Message-ID: On Thu, 2019-01-31 at 14:27 +0900, Rikimaru Honjo wrote: > Hello, > > I have a question about Zuulv3. > > I'm preparing third party CI for the openstack/masakari project. > I'd like to run my CI with my Zuulv3 instance on my environment. > > In my understanding, I should add my pipeline to the project in the following .zuul.yaml for my purpose.
> > https://github.com/openstack/masakari/blob/master/.zuul.yaml > > But, as a result, my Zuulv3 instance also runs the existing pipelines (check & gate). > I want to run only my pipeline on my environment. > (And, the existing pipelines will be run on the openstack-infra environment.) > > How can I make my Zuulv3 instance ignore the other pipelines? you have two options that i know of. first, you can simply not define a pipeline called gate and check in your zuul config repo. since you are already using them that is not an option for you. second, if you have your own ci config project that is hosted separately from upstream gerrit, you can define in your pipeline config that the gate and check pipelines are only for that other source. e.g. if you have two connections defined in zuul you can use the pipeline triggers to define that the triggers for the gate and check pipelines only work with your own gerrit instance and not openstack's. i am similarly setting up a personal third-party ci at present. i have chosen to create a separate pipeline with a different name for running against upstream changes using the git.openstack.org gerrit source. i have not pushed the patch to trigger from upstream gerrit yet https://review.seanmooney.info/plugins/gitiles/ci-config/+/master/zuul.d/pipelines.yaml but you can see that my gate and check pipelines only trigger from the gerrit source which is my own gerrit instance at review.seanmooney.info. i will be adding a dedicated pipeline for upstream as, unlike my personal gerrit, i never want my ci to submit/merge patches upstream. i hope that helps.
the gerrit trigger docs can be found here https://zuul-ci.org/docs/zuul/admin/drivers/gerrit.html#trigger-configuration regards sean > > Best regards, From tobias.rydberg at citynetwork.eu Thu Jan 31 12:44:08 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Thu, 31 Jan 2019 13:44:08 +0100 Subject: [Openstack-sigs] [publiccloud-wg] Late reminder weekly meeting Public Cloud WG Message-ID: <5998c593-ab87-88cd-7c18-e4067f9cae64@citynetwork.eu> Hi everyone, Time for a new meeting for PCWG - today (31st) 1400 UTC in #openstack-publiccloud! Agenda found at https://etherpad.openstack.org/p/publiccloud-wg Sorry for the late reminder! Talk to you later today! Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From smooney at redhat.com Thu Jan 31 12:44:53 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 12:44:53 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <3069f87d74c4f88d6c77072bdb520b813c0d617f.camel@redhat.com> References: <78d9fe1d-0631-0552-d0ea-17bf44453dae@gmail.com> <5d71c05f6f234d7254d063a805bada10ba095bf5.camel@redhat.com> <491f036c485b9eb7e72ef74d22755215a8994d99.camel@redhat.com> <3069f87d74c4f88d6c77072bdb520b813c0d617f.camel@redhat.com> Message-ID: On Thu, 2019-01-31 at 11:55 +0000, Stephen Finucane wrote: > On Thu, 2019-01-31 at 19:31 +0800, Zhenyu Zheng wrote: > > Thanks alot for bring this up, if we decided to make unique serial the only choice, I guess we have to sort on what > > curcumstances it will > > change the serial of instances that already exists. Should we have a way to preserve the serial for exisiting > > instances in order to not > > cause any workload failue for our customers as changing the serial may cause some problem. 
> > I think all that's necessary here is to add a reno calling out this change in behavior (along with the alternatives > put forth by Sean and Matt) and, ideally, start setting 'chassis.serial' if libvirt > 4.1.0? since the workloads would already have to tolerate the serial changing after a migration anyway i think we should be fine with just the reno as stephen says. > > Stephen > > > On Sat, Jan 26, 2019 at 7:30 PM Stephen Finucane wrote: > > > On Fri, 2019-01-25 at 18:52 -0600, Matt Riedemann wrote: > > > > On 1/25/2019 10:35 AM, Stephen Finucane wrote: > > > > > He noted that one would be a valid point in > > > > > claiming the host OS identity should have been reported in > > > > > 'chassis.serial' instead of 'system.serial' in the first place [1] but > > > > > changing it now is definitely not zero risk. > > > > > > > > If I'm reading those docs correctly, chassis.serial was new in libvirt > > > > 4.1.0 which is quite a bit newer than our minimum libvirt version support. > > > > > > Good point. Guess it doesn't matter though if we have the two > > > alternatives you and Sean have suggested for figuring this stuff out? > > > The important thing is that release note. Setting 'chassis.serial' > > > would be a nice TODO if we have 4.1.0. > > > > > > Stephen > > > > > > > > From smooney at redhat.com Thu Jan 31 12:48:41 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 12:48:41 +0000 Subject: [tc] The future of the "Help most needed" list In-Reply-To: References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> Message-ID: <37f6eaf4cca0f566755015293646bc4356346705.camel@redhat.com> On Fri, 2019-02-01 at 00:32 +1300, Lingxian Kong wrote: > Huge +1 from me. > > If the team wants help, they need to offer some help first. We could also work with various internship programmes like > Outreachy.
> > Cheers, > Lingxian Kong > > > On Thu, Jan 31, 2019 at 11:48 PM Thierry Carrez wrote: > > Hi everyone, > > > > The "Help most needed" list[1] was created by the Technical Committee to > > clearly describe areas of the OpenStack open source project which were > > in the most need of urgent help. This was done partly to facilitate > > communications with corporate sponsors and engineering managers, and be > > able to point them to an official statement of need from "the project". > > > > [1] https://governance.openstack.org/tc/reference/help-most-needed.html > > > > This list encounters two issues. First it's hard to limit entries: a lot > > of projects teams, SIGs and other forms of working groups could use > > extra help. But more importantly, this list has had a very limited > > impact -- new contributors did not exactly magically show up in the > > areas we designated as in most need of help. > > > > When we raised that topic (again) at a Board+TC meeting, a suggestion > > was made that we should turn the list more into a "job description" > > style that would make it more palatable to the corporate world. I fear > > that would not really solve the underlying issue (which is that at our > > stage of the hype curve, no organization really has spare contributors > > to throw at random hard problems). > > > > So I wonder if we should not reframe the list and make it less "this > > team needs help" and more "I offer peer-mentoring in this team". A list > > of contributor internships offers, rather than a call for corporate help > > in the dark. I feel like that would be more of a win-win offer, and more > > likely to appeal to students, or OpenStack users trying to contribute back. > > > > Proper 1:1 mentoring takes a lot of time, and I'm not underestimating > > that. Only people that are ready to dedicate mentoring time should show > > up on this new "list"... which is why it should really list identified > > individuals rather than anonymous teams. 
It should also probably be > > one-off offers -- once taken, the offer should probably go off the list. > > > > Thoughts on that? Do you think reframing help-needed as > > mentoring-offered could help? Do you have alternate suggestions? perhaps, but since this is the first time i have heard of the help-needed list maybe we should just merge it into openstack-discuss and use a [help-needed] or [mentoring] label so it's more discoverable? > > From thierry at openstack.org Thu Jan 31 13:06:01 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 31 Jan 2019 14:06:01 +0100 Subject: [tc] The future of the "Help most needed" list In-Reply-To: <37f6eaf4cca0f566755015293646bc4356346705.camel@redhat.com> References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> <37f6eaf4cca0f566755015293646bc4356346705.camel@redhat.com> Message-ID: Sean Mooney wrote: > perhaps, but since this is the first time i have heard of the help-needed list > maybe we should just merge it into openstack-discuss and use a [help-needed] or [mentoring] label > so it's more discoverable? Oh, this is not a mailing-list. It's just a published document: https://governance.openstack.org/tc/reference/help-most-needed.html Source lives at: http://git.openstack.org/cgit/openstack/governance/tree/reference/help-most-needed.rst -- Thierry Carrez (ttx) From fungi at yuggoth.org Thu Jan 31 13:37:17 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 31 Jan 2019 13:37:17 +0000 Subject: [tc] The future of the "Help most needed" list In-Reply-To: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> Message-ID: <20190131133716.2g2q3ef7e6v6jspl@yuggoth.org> On 2019-01-31 11:45:25 +0100 (+0100), Thierry Carrez wrote: [...] > I wonder if we should not reframe the list and make it less "this > team needs help" and more "I offer peer-mentoring in this team".
A > list of contributor internships offers, rather than a call for > corporate help in the dark. [...] It would be good to better understand how this differs from or otherwise relates to the current "cohort mentoring" activities in the community. Would we ask them to advertise cohorts in the proposed new document, or combine efforts with other mentor volunteers who use the new framework? https://wiki.openstack.org/wiki/Mentoring#Cohort_Mentoring -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From colleen at gazlene.net Thu Jan 31 14:26:44 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 31 Jan 2019 15:26:44 +0100 Subject: [all][tc] Formalizing cross-project pop-up teams In-Reply-To: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> References: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> Message-ID: <1548944804.945378.1647818352.1EDB6215@webmail.messagingengine.com> On Thu, Jan 31, 2019, at 11:19 AM, Thierry Carrez wrote: > TL;DR: > Maybe to help with cross-project work we should formalize temporary > teams with clear objective and disband criteria, under the model of > Kubernetes "working groups". > > Long version: > > Work in OpenStack is organized around project teams, who each own a set > of git repositories. One well-known drawback of this organization is > that it makes cross-project work harder, as someone has to coordinate > activities that ultimately affects multiple project teams. > > We tried various ways to facilitate cross-project work in the past. It > started with a top-level repository of cross-project specs, a formal > effort which failed due to a disconnect between the spec approvers (TC), > the people signed up to push the work, and the teams that would need to > approve the independent work items. 
> > This was replaced by more informal "champions", doing project-management > and other heavy lifting to get things done cross-project. This proved > successful, but champions are often facing an up-hill battle and often > suffer from lack of visibility / blessing / validation. > > SIGs are another construct that helps holding discussions and > coordinating work around OpenStack problem spaces, beyond specific > project teams. Those are great as a permanent structure, but sometimes > struggle to translate into specific development work, and are a bit > heavy-weight just to coordinate a given set of work items. > > Community goals fill the gap between champions and SIGs by blessing a > given set of cross-community goals for a given release. However, given > their nature (being blessed by the TC at every cycle), they are a better > fit for small, cycle-long objectives that affect most of the OpenStack > project teams, and great to push consistency across all projects. > > It feels like we are missing a way to formally describe a short-term, > cross-project objective that only affects a number of teams, is not tied > to a specific cycle, and organize work around a temporary team > specifically formed to reach that objective. A team that would get > support from the various affected project teams, increasing chances of > success. > > Kubernetes encountered the same problem, with work organized around > owners and permanent SIGs. They created the concept of a "working > group"[1] with a clear limited objective, and a clear disband criteria. > I feel like adopting something like it in OpenStack could help with work > that affects multiple projects. We would not name it "working group" > since that's already overloaded in OpenStack, but maybe "pop-up team" to > stress the temporary nature of it. We've been sort-of informally using > those in the past, but maybe formalizing and listing them could help > getting extra visibility and prioritization. > > Thoughts? 
Alternate solutions? I like the idea. One question is, how would these groups be bootstrapped? At the moment, SIGs are formed by 1) people express an interest in a common idea 2) the SIG is proposed and approved by the TC and UC chairs 3) profit. With a more cross-project, deliverable-focused type of group, you would need to have buy-in from all project teams involved before bringing it up for approval by the TC - but getting that buy-in from many different groups can be difficult if you aren't already a blessed group. And if you didn't get buy-in first and the group became approved anyway, project teams may be resentful of having new objectives imposed on them when they may not even agree it's the right direction. Colleen > > [1] > https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md > > -- > Thierry Carrez (ttx) > From mriedemos at gmail.com Thu Jan 31 15:00:56 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 31 Jan 2019 09:00:56 -0600 Subject: [nova] Per-instance serial number implementation question In-Reply-To: References: Message-ID: <6ac4eac9-18ea-2cca-e2f7-76de8f80835b@gmail.com> I'm going to top post and try to summarize where we are on this thread since I have it on today's nova meeting agenda under the "stuck reviews" section. * The email started as a proposal to change the proposed image property and flavor extra spec from a confusing boolean to an enum. * The question was raised why even have any other option than a unique serial number for all instances based on the instance UUID. 
* Stephen asked Daniel Berrange (danpb) about the history of the [libvirt]/sysinfo_serial configuration option and it sounds like it was mostly added as a way to determine guests running on the same host, which can already be determined using the hostId parameter in the REST API (hostId is the hashed instance.host + instance.project_id so it's not exactly the same since it's unique per host and project, not just host). However, the hostId is not exposed to the guest in the metadata API / config drive - so that could be a regression for applications that used this somehow to calculate affinity within the guest based on the serial (note that mnaser has a patch to expose hostId in the metadata API / config drive [1]). * danpb said the system.serial we set today should really be chassis.serial but that's only available in libvirt >= 4.1.0 and our current minimum required version of libvirt is 1.3.1 so setting chassis.serial would have to be conditional on the running version of libvirt (this is common in that driver). * Applications that depend on the serial number within the guest were not guaranteed it would be unique or not change because migrating the guest to another host would change the serial number anyway (that's the point of the blueprint - to keep the serial unchanged for each guest), so if we just changed to always using unique serial numbers everywhere it should probably be OK (and tolerated/expected by guest applications). * Clearly we would have a release note if we change this behavior but keep in mind that end users are not reading release notes, and none of this is documented today anyway outside of the [libvirt]/sysinfo_serial config option. So a release note would really only help an operator or support personnel if they get a ticket due to the change in behavior (which we probably wouldn't hear about upstream for 2+ years given how slow openstack deployments upgrade). So where are we?
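For reference, the hostId construction mentioned above can be sketched as follows. The thread only states that hostId is a hash over instance.host and instance.project_id; the specific choice of SHA-224 over project_id + host in this sketch is an assumption about nova's implementation, not something established in this thread:

```python
import hashlib

def host_id(host, project_id):
    # hostId as described above: a hash over the compute host and the
    # project, so it is stable per (host, project) pair without exposing
    # the real hostname. SHA-224 over project_id + host is an assumption
    # here, mirroring what nova appears to use, not a guaranteed API.
    return hashlib.sha224((project_id + host).encode("utf-8")).hexdigest()

a = host_id("compute-01", "project-a")
b = host_id("compute-01", "project-b")
c = host_id("compute-02", "project-a")
print(a == b, a == c)  # False False: unique per host *and* project
```

This illustrates the point made above: two instances from different projects on the same host get different hostIds, which is why hostId is "not exactly the same" as a per-host serial.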
If we want the minimal amount of behavior change as possible then we just add the new image property / flavor extra spec / config option choice, but that arguably adds technical debt and virt-driver specific behavior to the API (again, that's not uncommon though). If we want to simplify, we don't add the image property / flavor extra spec. But what do we do about the existing config option? Do we add the 'unique' choice, make it the default, and then deprecate the option to at least signal the change is coming in Train? Or do we just deprecate the option in Stein and completely ignore it, always setting the unique serial number as the instance.uuid (and set the host serial in chassis.serial if libvirt>=4.1.0)? In addition, do we expose hostId in the metadata API / config drive via [1] so there is a true alternative *from within the guest* to determine guest affinity on the same host? I'm personally OK with [1] if there is some user documentation around it (as noted in the review). If we are not going to add the new image property / extra spec, my personal choice would be to: - Add the 'unique' choice to the [libvirt]/sysinfo_serial config option and make it the default for new deployments. - Deprecate the sysinfo_serial config option in Stein and remove it in Train. This would give at least some window of time for transition and/or raising a stink if someone thinks we should leave the old per-host behavior. - Merge mnaser's patch to expose hostId in the metadata API and config drive so users still have a way within the guest to determine that affinity for servers in the same project on the same host. What do others think? [1] https://review.openstack.org/#/c/577933/ On 1/24/2019 9:09 AM, Matt Riedemann wrote: > The proposal from the spec for this feature was to add an image property > (hw_unique_serial), flavor extra spec (hw:unique_serial) and new > "unique" choice to the [libvirt]/sysinfo_serial config option. 
The image > property and extra spec would be booleans but really only True values > make sense and False would be more or less ignored. There were no plans > to enforce strict checking of a boolean value, e.g. if the image > property was True but the flavor extra spec was False, we would not > raise an exception for incompatible values, we would just use OR logic > and take the image property True value. > > The boolean usage proposed is a bit confusing, as can be seen from > comments in the spec [1] and the proposed code change [2]. > > After thinking about this a bit, I'm now thinking maybe we should just > use a single-value enum for the image property and flavor extra spec: > > image: hw_guest_serial=unique > flavor: hw:guest_serial=unique > > If either are set, then we use a unique serial number for the guest. If > neither are set, then the serial number is based on the host > configuration as it is today. > > I think that's more clear usage, do others agree? Alex does. I can't > think of any cases where users would want hw_unique_serial=False, so > this removes that ability and confusion over whether or not to enforce > mismatching booleans. > > [1] > https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 > > [2] > https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 -- Thanks, Matt From smooney at redhat.com Thu Jan 31 15:19:15 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 31 Jan 2019 15:19:15 +0000 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <6ac4eac9-18ea-2cca-e2f7-76de8f80835b@gmail.com> References: <6ac4eac9-18ea-2cca-e2f7-76de8f80835b@gmail.com> Message-ID: On Thu, 2019-01-31 at 09:00 -0600, Matt Riedemann wrote: > I'm going to top post and try to summarize where we are on this thread > since I have it on today's nova meeting agenda under the "stuck reviews" > section. 
> > * The email started as a proposal to change the proposed image property > and flavor extra spec from a confusing boolean to an enum. > > * The question was raised why even have any other option than a unique > serial number for all instances based on the instance UUID. > > * Stephen asked Daniel Berrange (danpb) about the history of the > [libvirt]/sysinfo_serial configuration option and it sounds like it was > mostly added as a way to determine guests running on the same host, > which can already be determined using the hostId parameter in the REST > API (hostId is the hashed instance.host + instance.project_id so it's > not exactly the same since it's unique per host and project, not just > host). However, the hostId is not exposed to the guest in the metadata > API / config drive - so that could be a regression for applications that > used this somehow to calculate affinity within the guest based on the > serial (note that mnaser has a patch to expose hostId in the metadata > API / config drive [1]). > > * danpb said the system.serial we set today should really be > chassis.serial but that's only available in libvirt >= 4.1.0 and our > current minimum required version of libvirt is 1.3.1 so setting > chassis.serial would have to be conditional on the running version of > libvirt (this is common in that driver). > > * Applications that depend on the serial number within the guest were > not guaranteed it would be unique or not change because migrating the > guest to another host would change the serial number anyway (that's the > point of the blueprint - to keep the serial unchanged for each guest), > so if we just changed to always using unique serial numbers everywhere > it should probably be OK (and tolerated/expected by guest applications). 
> > * Clearly we would have a release note if we change this behavior but > keep in mind that end users are not reading release notes, and none of > this is documented today anyway outside of the [libvirt]/sysinfo_serial > config option. So a release note would really only help an operator or > support personal if they get a ticket due to the change in behavior > (which we probably wouldn't hear about upstream for 2+ years given how > slow openstack deployments upgrade). > > So where are we? If we want the minimal amount of behavior change as > possible then we just add the new image property / flavor extra spec / > config option choice, but that arguably adds technical debt and > virt-driver specific behavior to the API (again, that's not uncommon > though). > > If we want to simplify, we don't add the image property / flavor extra > spec. But what do we do about the existing config option? > > Do we add the 'unique' choice, make it the default, and then deprecate > the option to at least signal the change is coming in Train? > > Or do we just deprecate the option in Stein and completely ignore it, > always setting the unique serial number as the instance.uuid (and set > the host serial in chassis.serial if libvirt>=4.1.0)? personally i would do ^ assuming we also do v > > In addition, do we expose hostId in the metadata API / config drive via > [1] so there is a true alternative *from within the guest* to determine > guest affinity on the same host? I'm personally OK with [1] if there is > some user documentation around it (as noted in the review). > > If we are not going to add the new image property / extra spec, my > personal choice would be to: > > - Add the 'unique' choice to the [libvirt]/sysinfo_serial config option > and make it the default for new deployments. > - Deprecate the sysinfo_serial config option in Stein and remove it in > Train. 
This would give at least some window of time for transition > and/or raising a stink if someone thinks we should leave the old > per-host behavior. > - Merge mnaser's patch to expose hostId in the metadata API and config > drive so users still have a way within the guest to determine that > affinity for servers in the same project on the same host. > > What do others think? yes i think your personal choice above makes sense too so i would be +1 on that too as it gives a cycle for people to move if they care about the serials. > > [1] https://review.openstack.org/#/c/577933/ > > On 1/24/2019 9:09 AM, Matt Riedemann wrote: > > The proposal from the spec for this feature was to add an image property > > (hw_unique_serial), flavor extra spec (hw:unique_serial) and new > > "unique" choice to the [libvirt]/sysinfo_serial config option. The image > > property and extra spec would be booleans but really only True values > > make sense and False would be more or less ignored. There were no plans > > to enforce strict checking of a boolean value, e.g. if the image > > property was True but the flavor extra spec was False, we would not > > raise an exception for incompatible values, we would just use OR logic > > and take the image property True value. > > > > The boolean usage proposed is a bit confusing, as can be seen from > > comments in the spec [1] and the proposed code change [2]. > > > > After thinking about this a bit, I'm now thinking maybe we should just > > use a single-value enum for the image property and flavor extra spec: > > > > image: hw_guest_serial=unique > > flavor: hw:guest_serial=unique > > > > If either are set, then we use a unique serial number for the guest. If > > neither are set, then the serial number is based on the host > > configuration as it is today. > > > > I think that's more clear usage, do others agree? Alex does. 
I can't > > think of any cases where users would want hw_unique_serial=False, so > > this removes that ability and confusion over whether or not to enforce > > mismatching booleans. > > > > [1] > > https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 > > > > [2] > > https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 > > From ignaziocassano at gmail.com Thu Jan 31 15:23:34 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 31 Jan 2019 16:23:34 +0100 Subject: [manila][glusterfs] on queens error Message-ID: Hello All, I installed manila on my queens openstack based on centos 7. I configured two servers with glusterfs replication and Ganesha NFS. I configured manila.conf on my controllers, but when I try to create a share the manila scheduler log reports: Failed to schedule create_share: No valid host was found. Failed to find a weighted host, the last executed filter was CapabilitiesFilter.: NoValidHost: No valid host was found. Failed to find a weighted host, the last executed filter was CapabilitiesFilter. 2019-01-31 16:07:32.614 159380 INFO manila.message.api [req-241d66b3-8004-410b-b000-c6d2d3536e4a 89f76bc5de5545f381da2c10c7df7f15 59f1f232ce28409593d66d8f6495e434 - - -] Creating message record for request_id = req-241d66b3-8004-410b-b000-c6d2d3536e4a I did not understand if the controller nodes must be connected to the network where shares must be exported for virtual machines, so my glusterfs servers are connected to the management network where the openstack controllers are connected and to the network where virtual machines are connected.
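[Editorial note: the CapabilitiesFilter named in the error above matches the extra specs of the requested share type against the capabilities each back end reports to the scheduler; when no back end satisfies every spec, the scheduler raises NoValidHost. A rough sketch of that matching idea - a simplified illustration with made-up data, not manila's actual implementation, which also supports operators like `<is>` and `<in>`:]

```python
def capabilities_match(extra_specs, reported):
    # True only if every required extra spec is satisfied by the
    # capabilities this back end reported to the scheduler.
    for key, required in extra_specs.items():
        # scoped specs like "capabilities:thin_provisioning" drop the prefix
        k = key.split(":", 1)[1] if key.startswith("capabilities:") else key
        value = reported.get(k)
        if value is None or str(value) != str(required):
            return False
    return True

# e.g. a share type requiring DHSS=False against a Gluster back end:
specs = {"driver_handles_share_servers": "False"}
backend = {"driver_handles_share_servers": False,
           "share_backend_name": "gluster-manila565"}
print(capabilities_match(specs, backend))  # True
```

In practice this usually means checking that the manila-share service is up and reporting, and that the share type's required extra spec driver_handles_share_servers matches what the configured driver supports.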
My manila.conf section for glusterfs is the following: [gluster-manila565] driver_handles_share_servers = False share_driver = manila.share.drivers.glusterfs.GlusterfsShareDriver glusterfs_target = root at 10.102.184.229:/manila565 glusterfs_path_to_private_key = /etc/manila/id_rsa glusterfs_ganesha_server_username = root glusterfs_nfs_server_type = Ganesha glusterfs_ganesha_server_ip = 10.102.184.229 #glusterfs_servers = root at 10.102.185.19 ganesha_config_dir = /etc/ganesha PS 10.102.184.0/24 is the network where the controllers expose endpoints. 10.102.189.0/24 is the shared network inside openstack where virtual machines are connected. The gluster servers are connected to both. Any help, please? Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Thu Jan 31 15:32:02 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 31 Jan 2019 10:32:02 -0500 Subject: [nova] Per-instance serial number implementation question In-Reply-To: <6ac4eac9-18ea-2cca-e2f7-76de8f80835b@gmail.com> References: <6ac4eac9-18ea-2cca-e2f7-76de8f80835b@gmail.com> Message-ID: On Thu, Jan 31, 2019 at 10:05 AM Matt Riedemann wrote: > > I'm going to top post and try to summarize where we are on this thread > since I have it on today's nova meeting agenda under the "stuck reviews" > section. > > * The email started as a proposal to change the proposed image property > and flavor extra spec from a confusing boolean to an enum. > > * The question was raised why even have any other option than a unique > serial number for all instances based on the instance UUID.
> > * Stephen asked Daniel Berrange (danpb) about the history of the > [libvirt]/sysinfo_serial configuration option and it sounds like it was > mostly added as a way to determine guests running on the same host, > which can already be determined using the hostId parameter in the REST > API (hostId is the hashed instance.host + instance.project_id so it's > not exactly the same since it's unique per host and project, not just > host). However, the hostId is not exposed to the guest in the metadata > API / config drive - so that could be a regression for applications that > used this somehow to calculate affinity within the guest based on the > serial (note that mnaser has a patch to expose hostId in the metadata > API / config drive [1]). > > * danpb said the system.serial we set today should really be > chassis.serial but that's only available in libvirt >= 4.1.0 and our > current minimum required version of libvirt is 1.3.1 so setting > chassis.serial would have to be conditional on the running version of > libvirt (this is common in that driver). > > * Applications that depend on the serial number within the guest were > not guaranteed it would be unique or not change because migrating the > guest to another host would change the serial number anyway (that's the > point of the blueprint - to keep the serial unchanged for each guest), > so if we just changed to always using unique serial numbers everywhere > it should probably be OK (and tolerated/expected by guest applications). > > * Clearly we would have a release note if we change this behavior but > keep in mind that end users are not reading release notes, and none of > this is documented today anyway outside of the [libvirt]/sysinfo_serial > config option. So a release note would really only help an operator or > support personal if they get a ticket due to the change in behavior > (which we probably wouldn't hear about upstream for 2+ years given how > slow openstack deployments upgrade). 
> > So where are we? If we want the minimal amount of behavior change as > possible then we just add the new image property / flavor extra spec / > config option choice, but that arguably adds technical debt and > virt-driver specific behavior to the API (again, that's not uncommon > though). > > If we want to simplify, we don't add the image property / flavor extra > spec. But what do we do about the existing config option? > > Do we add the 'unique' choice, make it the default, and then deprecate > the option to at least signal the change is coming in Train? > > Or do we just deprecate the option in Stein and completely ignore it, > always setting the unique serial number as the instance.uuid (and set > the host serial in chassis.serial if libvirt>=4.1.0)? > > In addition, do we expose hostId in the metadata API / config drive via > [1] so there is a true alternative *from within the guest* to determine > guest affinity on the same host? I'm personally OK with [1] if there is > some user documentation around it (as noted in the review). > > If we are not going to add the new image property / extra spec, my > personal choice would be to: > > - Add the 'unique' choice to the [libvirt]/sysinfo_serial config option > and make it the default for new deployments. > - Deprecate the sysinfo_serial config option in Stein and remove it in > Train. This would give at least some window of time for transition > and/or raising a stink if someone thinks we should leave the old > per-host behavior. > - Merge mnaser's patch to expose hostId in the metadata API and config > drive so users still have a way within the guest to determine that > affinity for servers in the same project on the same host. I agree with this for a few reasons. Assuming that a matching system serial means a guest is colocated with another machine was just taking advantage of a bug in the first place.
That is not *documented* behaviour and serials should inherently be unique; it also exposes information about the host which should not be necessary. Matt has pointed me to an OSSN about this too: https://wiki.openstack.org/wiki/OSSN/OSSN-0028 I think we should indeed provide unique serials (only, ideally) to avoid having users shoot themselves in the foot by exposing information they didn't know they were exposing. The patch that I supplied was really meant to make that information available in a controllable way. It also provides a much more secure way of exposing that information, because hostId is actually hashed with the tenant ID, which means that one VM from one tenant can't know that it's hosted on the same host as another one by using the hostId (and with all of the recent processor issues, this is a big plus in security). > What do others think? > > [1] https://review.openstack.org/#/c/577933/ > > On 1/24/2019 9:09 AM, Matt Riedemann wrote: > > The proposal from the spec for this feature was to add an image property > > (hw_unique_serial), flavor extra spec (hw:unique_serial) and new > > "unique" choice to the [libvirt]/sysinfo_serial config option. The image > > property and extra spec would be booleans but really only True values > > make sense and False would be more or less ignored. There were no plans > > to enforce strict checking of a boolean value, e.g. if the image > > property was True but the flavor extra spec was False, we would not > > raise an exception for incompatible values, we would just use OR logic > > and take the image property True value. > > > > The boolean usage proposed is a bit confusing, as can be seen from > > comments in the spec [1] and the proposed code change [2].
> > > > After thinking about this a bit, I'm now thinking maybe we should just > > use a single-value enum for the image property and flavor extra spec: > > > > image: hw_guest_serial=unique > > flavor: hw:guest_serial=unique > > > > If either are set, then we use a unique serial number for the guest. If > > neither are set, then the serial number is based on the host > > configuration as it is today. > > > > I think that's more clear usage, do others agree? Alex does. I can't > > think of any cases where users would want hw_unique_serial=False, so > > this removes that ability and confusion over whether or not to enforce > > mismatching booleans. > > > > [1] > > https://review.openstack.org/#/c/612531/2/specs/stein/approved/per-instance-libvirt-sysinfo-serial.rst at 43 > > > > [2] > > https://review.openstack.org/#/c/619953/7/nova/virt/libvirt/driver.py at 4894 > > > -- > > Thanks, > > Matt > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From lbragstad at gmail.com Thu Jan 31 15:59:11 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Thu, 31 Jan 2019 09:59:11 -0600 Subject: [tc][all] Train Community Goals In-Reply-To: <6b498008e71b7dae651e54e29717f3ccedea50d1.camel@evrard.me> References: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz> <6b498008e71b7dae651e54e29717f3ccedea50d1.camel@evrard.me> Message-ID: Hello everyone, I thought it would be good to have a quick recap of the various goal proposals. *Project clean-up* Adrian and Tobias Rydberg have volunteered to champion the goal. There has also been some productive discussion around the approaches detailed in the etherpad [0]. At this point is it safe to assume we've come to a conclusion on the proposed approach? If so, I think the next logical step would be to do a gap analysis on what the proposed approach would mean work-wise for all projects. 
Note, Assaf Muller brought the approach Neutron takes to my attention [1] and I wanted to highlight this here since it establishes a template for us to follow, or at least look at. Note, Neutron's approach is client-based, which might not be orthogonal with the client goal. Just something to keep in mind if those two happen to be accepted for the same release. [0] https://etherpad.openstack.org/p/community-goal-project-deletion [1] https://github.com/openstack/python-neutronclient/blob/master/neutronclient/neutron/v2_0/purge.py *Moving legacy clients to python-openstackclient* Artem has done quite a bit of pre-work here [2], which has been useful in understanding the volume of work required to complete this goal in its entirety. I suggest we look for seams where we can break this into more consumable pieces of work for a given release. For example, one possible goal would be to work on parity with python-openstackclient and openstacksdk. A follow-on goal would be to move the legacy clients. Alternatively, we could start to move all the project clients' logic into python-openstackclient, and then have another goal to implement the common logic gaps into openstacksdk. Arriving at the same place but using different paths. The approach still has to be discussed and proposed. I do think it is apparent that we'll need to break this up, however. [2] https://etherpad.openstack.org/p/osc-gaps-analysis *Healthcheck middleware* There is currently no volunteer to champion this goal. The first iteration of the work on the oslo.middleware was updated [3], and a gap analysis was started on the mailing lists [4]. If you want to get involved in this goal, don't hesitate to answer on the ML thread there.
[3] https://review.openstack.org/#/c/617924/2 [4] https://ethercalc.openstack.org/di0mxkiepll8 Just a reminder that we would like to have all potential goals proposed for review in openstack/governance by the middle of this month, giving us 6 weeks to hash out details in Gerrit if we plan to have the goals merged by the end of March. This timeframe should give us 4 weeks to prepare any discussions we'd like to have in-person pertaining to those goals. Thanks for the time, Lance On Tue, Jan 8, 2019 at 4:11 AM Jean-Philippe Evrard wrote: > On Wed, 2018-12-19 at 06:58 +1300, Adrian Turjak wrote: > > I put my hand up during the summit for being at least one of the > > champions for the deletion of project resources effort. > > > > I have been meaning to do a follow up email and options as well as > > steps > > for how the goal might go, but my working holiday in Europe after the > > summit turned into more of a holiday than originally planned. > > > > I'll get a thread going around what I (and the public cloud working > > group) think project resource deletion should look like, and what the > > options are, and where we should aim to be with it. We can then turn > > that discussion into a final 'spec' of sorts. > > > > > > Great news! > > Do you need any help to get started there? > > Regards, > JP > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rtidwell at suse.com Thu Jan 31 16:02:34 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Thu, 31 Jan 2019 10:02:34 -0600 Subject: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode In-Reply-To: References: Message-ID: On 1/29/19 1:25 AM, Duarte Cardoso, Igor wrote: > > Hi Neutron, > >   > > I've been internally collaborating on the ``dvr_bridge`` L3 agent mode > [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the L3 > agent to make use of Open vSwitch / OpenFlow to implement > ``distributed`` IPv4 Routers thus bypassing kernel namespaces and > iptables and opening the door for higher performance by keeping > packets in OVS for longer. > >   > > I want to share a few questions in order to gather feedback from you. > I understand parts of these questions may have been answered in the > past before my involvement, but I believe it's still important to > revisit and clarify them. This can impact how long it's going to take > to complete the work and whether it can make it to stein-3. > >   > > 1. Should OVS support also be added to the legacy router? > > And if so, would it make more sense to have a new variable (not > ``agent_mode``) to specify what backend to use (OVS or kernel) instead > of creating more combinations? > Personally, I would like to see all routers implemented completely in the OVS data path. We can't do everything at once, so the DVR-first approach here seems reasonable to me. As to the question of config flags, agent_mode has a specific meaning. It effectively tells the agent what role it's playing (SNAT, SNAT_HA, etc.), not how to do it. dvr_bridge isn't a new mode, it's really a change to the backend implementation of the router (ie the "how"). Because of that, I'm partial to an "agent_mode" flag which will toggle the router implementation between OVS and namespace implementations. > >   > > 2. What is expected in terms of CI for this? 
Regarding testing, what > should this first patch include apart from the unit tests? (since the > l3_agent.ini needs to be configured differently). > >   > > 3. What problems can be anticipated by having the same agent managing > both kernel and OVS powered routers (depending on whether they were > created as ``distributed``)? > > We are experimenting with different ways of decoupling RouterInfo > (mainly as part of the L3 agent refactor patch) and haven't been able > to find the right balance yet. On one end we have an agent that is > still coupled with kernel-based RouterInfo, and on the other end we > have an agent that either only accepts OVS-based RouterInfos or only > kernel-based RouterInfos depending on the ``agent_mode``. > >   > > We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor > one should be able to pass Zuul after a recheck. > >   > > [1] Spec: > https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr > > [2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536 > > [3] Gerrit topic: > https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:merged) > > [4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29 > > [5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17 > >   > > Thank you! > >   > > Best regards, > > Igor D.C. > >   > -Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From rtidwell at suse.com Thu Jan 31 16:22:40 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Thu, 31 Jan 2019 10:22:40 -0600 Subject: [neutron] OVS OpenFlow L3 DVR / dvr_bridge agent_mode In-Reply-To: References: Message-ID: "I'm partial to an "agent_mode" flag which will toggle the router..." In my previous email I mention being in favor of not overloading agent_mode, I realized I had a typo that might be confusing. I'm partial to introducing something like "agent_backend" for toggling OVS vs. namespace routers, not "agent_mode". Sorry for the typo. 
-Ryan On 1/31/19 10:02 AM, Ryan Tidwell wrote: > > > On 1/29/19 1:25 AM, Duarte Cardoso, Igor wrote: >> >> Hi Neutron, >> >>   >> >> I've been internally collaborating on the ``dvr_bridge`` L3 agent >> mode [1][2][3] work (David Shaughnessy, Xubo Zhang), which allows the >> L3 agent to make use of Open vSwitch / OpenFlow to implement >> ``distributed`` IPv4 Routers thus bypassing kernel namespaces and >> iptables and opening the door for higher performance by keeping >> packets in OVS for longer. >> >>   >> >> I want to share a few questions in order to gather feedback from you. >> I understand parts of these questions may have been answered in the >> past before my involvement, but I believe it's still important to >> revisit and clarify them. This can impact how long it's going to take >> to complete the work and whether it can make it to stein-3. >> >>   >> >> 1. Should OVS support also be added to the legacy router? >> >> And if so, would it make more sense to have a new variable (not >> ``agent_mode``) to specify what backend to use (OVS or kernel) >> instead of creating more combinations? >> > Personally, I would like to see all routers implemented completely in > the OVS data path. We can't do everything at once, so the DVR-first > approach here seems reasonable to me. As to the question of config > flags, agent_mode has a specific meaning. It effectively tells the > agent what role it's playing (SNAT, SNAT_HA, etc.), not how to do it. > dvr_bridge isn't a new mode, it's really a change to the backend > implementation of the router (ie the "how"). Because of that, I'm > partial to an "agent_mode" flag which will toggle the router > implementation between OVS and namespace implementations. >> >>   >> >> 2. What is expected in terms of CI for this? Regarding testing, what >> should this first patch include apart from the unit tests? (since the >> l3_agent.ini needs to be configured differently). >> >>   >> >> 3. 
What problems can be anticipated by having the same agent managing >> both kernel and OVS powered routers (depending on whether they were >> created as ``distributed``)? >> >> We are experimenting with different ways of decoupling RouterInfo >> (mainly as part of the L3 agent refactor patch) and haven't been able >> to find the right balance yet. On one end we have an agent that is >> still coupled with kernel-based RouterInfo, and on the other end we >> have an agent that either only accepts OVS-based RouterInfos or only >> kernel-based RouterInfos depending on the ``agent_mode``. >> >>   >> >> We'd also appreciate reviews on the 2 patches [4][5]. The L3 refactor >> one should be able to pass Zuul after a recheck. >> >>   >> >> [1] Spec: >> https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr >> >> [2] RFE: https://bugs.launchpad.net/neutron/+bug/1705536 >> >> [3] Gerrit topic: >> https://review.openstack.org/#/q/topic:dvr_bridge+(status:open+OR+status:merged) >> >> [4] L3 agent refactor patch: https://review.openstack.org/#/c/528336/29 >> >> [5] dvr_bridge patch: https://review.openstack.org/#/c/472289/17 >> >>   >> >> Thank you! >> >>   >> >> Best regards, >> >> Igor D.C. >> >>   >> > -Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mjturek at linux.vnet.ibm.com Thu Jan 31 16:30:30 2019 From: mjturek at linux.vnet.ibm.com (Michael Turek) Date: Thu, 31 Jan 2019 11:30:30 -0500 Subject: [ironic] [thirdparty-ci] BaremetalBasicOps test Message-ID: <1bf8f3b4-ea39-6c17-3609-9289ceeeb7ed@linux.vnet.ibm.com> Hello all, Our ironic job has been broken and it seems to be due to a lack of IPs. We allocate two IPs to our job, one for the dhcp server, and one for the target node. This had been working for as long as the job has existed but recently (since about early December 2018), we've been broken. 
The job is able to clean the node during devstack, successfully deploy to the node during the tempest run, and is successfully validated via ssh. The node then moves to clean failed with a network error [1], and the job subsequently fails. Sometime between the validation and attempting to clean, the neutron port associated with the ironic port is deleted and a new port comes into existence. Where I'm having trouble is finding out what this port is. Based on its MAC address, it's a virtual port, and its MAC is not the same as the ironic port's. We could add an IP to the job to fix it, but I'd rather not do that needlessly. Any insight or advice would be appreciated here! Thanks, Mike Turek [1] http://paste.openstack.org/show/743191/ From artem.goncharov at gmail.com Thu Jan 31 16:52:02 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Thu, 31 Jan 2019 17:52:02 +0100 Subject: [tc][all] Train Community Goals In-Reply-To: References: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz> <6b498008e71b7dae651e54e29717f3ccedea50d1.camel@evrard.me> Message-ID: Hi Lance and everybody, thanks for the recap. With respect to *moving legacy clients to python-openstackclient*: at the bottom of the page Lance already mentioned [1], I have listed different options and a small volunteering table to understand what will be feasible to achieve. If people are interested and ready to contribute - please fill yourself in. I personally do not think any target is reachable unless people start picking things to be done. [1] https://etherpad.openstack.org/p/osc-gaps-analysis Regards, Artem
At this point is it safe to assume we've come to a conclusion > on the proposed approach? If so, I think the next logical step would be to > do a gap analysis on what the proposed approach would mean work-wise for > all projects. Note, Assaf Muller brought the approach Neutron takes to my > attention [1] and I wanted to highlight this here since it establishes a > template for us to follow, or at least look at. Note, Neutron's approach is > client-based, which might not be orthogonal with the client goal. Just > something to keep in mind if those two happen to be accepted for the same > release. > > [0] https://etherpad.openstack.org/p/community-goal-project-deletion > [1] > https://github.com/openstack/python-neutronclient/blob/master/neutronclient/neutron/v2_0/purge.py > > *Moving legacy clients to python-openstackclient* > > Artem has done quite a bit of pre-work here [2], which has been useful in > understanding the volume of work required to complete this goal in its > entirety. I suggest we look for seams where we can break this into more > consumable pieces of work for a given release. > > For example, one possible goal would be to work on parity with > python-openstackclient and openstacksdk. A follow-on goal would be to move > the legacy clients. Alternatively, we could start to move all the project > clients logic into python-openstackclient, and then have another goal to > implement the common logic gaps into openstacksdk. Arriving at the same > place but using different paths. The approach still has to be discussed and > proposed. I do think it is apparent that we'll need to break this up, > however. > > [2] https://etherpad.openstack.org/p/osc-gaps-analysis > > *Healthcheck middleware* > > There is currently no volunteer to champion for this goal. The first > iteration of the work on the oslo.middleware was updated [3], and a gap > analysis was started on the mailing lists [4]. 
> If you want to get involved in this goal, don't hesitate to answer on the > ML thread there. > > [3] https://review.openstack.org/#/c/617924/2 > [4] https://ethercalc.openstack.org/di0mxkiepll8 > > Just a reminder that we would like to have all potential goals proposed > for review in openstack/governance by the middle of this month, giving us 6 > weeks to hash out details in Gerrit if we plan to have the goals merged by > the end of March. This timeframe should give us 4 weeks to prepare any > discussions we'd like to have in-person pertaining to those goals. > > Thanks for the time, > > Lance > > On Tue, Jan 8, 2019 at 4:11 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> On Wed, 2018-12-19 at 06:58 +1300, Adrian Turjak wrote: >> > I put my hand up during the summit for being at least one of the >> > champions for the deletion of project resources effort. >> > >> > I have been meaning to do a follow up email and options as well as >> > steps >> > for how the goal might go, but my working holiday in Europe after the >> > summit turned into more of a holiday than originally planned. >> > >> > I'll get a thread going around what I (and the public cloud working >> > group) think project resource deletion should look like, and what the >> > options are, and where we should aim to be with it. We can then turn >> > that discussion into a final 'spec' of sorts. >> > >> > >> >> Great news! >> >> Do you need any help to get started there? >> >> Regards, >> JP >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Jan 31 16:59:12 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 31 Jan 2019 10:59:12 -0600 Subject: [release] Release countdown for week R-9, February 4-8 Message-ID: <20190131165911.GA28730@sm-workstation> Welcome back to the countdown emails. Some important deadlines are starting to come up quick, so please take a look at what might affect you. 
Development Focus ----------------- Teams should be focused on implementing planned work for the cycle. It is also a good time to review those plans and reprioritize anything if needed based on what progress has been made and what looks realistic to complete in the next few weeks. General Information ------------------- We have a few deadlines coming up as we get closer to the end of the cycle: * Non-client libraries (generally, any library that is not python-${PROJECT}client) must have a final release by February 28. Only critical bugfixes will be allowed past this point. Please make sure any important feature work has its required library changes in by this time. * Client libraries must have a final release by March 7. Quick reminder for teams with cycle-with-intermediary deliverables that are not libraries - we had mentioned earlier in the cycle that you may want to consider moving to the new cycle-with-rc model if you are not actually doing intermediary releases: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000465.html Next week we will be looking at what cycle-with-intermediary deliverables have not done a release yet in Stein. We can still give a little time if teams want to quickly get one in before Stein-3, but we will look at switching over those deliverables to cycle-with-rc if that appears to be a more appropriate release model for the way they are being released.
Upcoming Deadlines & Dates -------------------------- Non-client library freeze: February 28 Stein-3 milestone: March 7 -- Sean McGinnis (smcginnis) From Kevin.Fox at pnnl.gov Thu Jan 31 17:14:49 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Thu, 31 Jan 2019 17:14:49 +0000 Subject: [all][tc] Formalizing cross-project pop-up teams In-Reply-To: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> References: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> Message-ID: <1A3C52DFCD06494D8528644858247BF01C28E6FE@EX10MBOX03.pnnl.gov> +1 ________________________________________ From: Thierry Carrez [thierry at openstack.org] Sent: Thursday, January 31, 2019 2:19 AM To: openstack-discuss at lists.openstack.org Subject: [all][tc] Formalizing cross-project pop-up teams TL;DR: Maybe to help with cross-project work we should formalize temporary teams with a clear objective and disband criteria, under the model of Kubernetes "working groups". Long version: Work in OpenStack is organized around project teams, who each own a set of git repositories. One well-known drawback of this organization is that it makes cross-project work harder, as someone has to coordinate activities that ultimately affect multiple project teams. We tried various ways to facilitate cross-project work in the past. It started with a top-level repository of cross-project specs, a formal effort which failed due to a disconnect between the spec approvers (TC), the people signed up to push the work, and the teams that would need to approve the independent work items. This was replaced by more informal "champions", doing project-management and other heavy lifting to get things done cross-project. This proved successful, but champions often face an uphill battle and suffer from a lack of visibility / blessing / validation. SIGs are another construct that helps hold discussions and coordinate work around OpenStack problem spaces, beyond specific project teams.
Those are great as a permanent structure, but sometimes struggle to translate into specific development work, and are a bit heavyweight just to coordinate a given set of work items. Community goals fill the gap between champions and SIGs by blessing a given set of cross-community goals for a given release. However, given their nature (being blessed by the TC at every cycle), they are a better fit for small, cycle-long objectives that affect most of the OpenStack project teams, and great to push consistency across all projects. It feels like we are missing a way to formally describe a short-term, cross-project objective that only affects a number of teams, is not tied to a specific cycle, and to organize work around a temporary team specifically formed to reach that objective. A team that would get support from the various affected project teams, increasing chances of success. Kubernetes encountered the same problem, with work organized around owners and permanent SIGs. They created the concept of a "working group"[1] with a clear, limited objective and clear disband criteria. I feel like adopting something like it in OpenStack could help with work that affects multiple projects. We would not name it "working group" since that's already overloaded in OpenStack, but maybe "pop-up team" to stress the temporary nature of it. We've been sort of informally using those in the past, but maybe formalizing and listing them could help them gain extra visibility and prioritization. Thoughts? Alternate solutions?
[1] https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md -- Thierry Carrez (ttx) From lbragstad at gmail.com Thu Jan 31 17:45:43 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Thu, 31 Jan 2019 11:45:43 -0600 Subject: [all][tc] Formalizing cross-project pop-up teams In-Reply-To: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> References: <885eb5c9-55d7-2fea-ff83-b917b7d6c4d8@openstack.org> Message-ID: On Thu, Jan 31, 2019 at 4:22 AM Thierry Carrez wrote: > TL;DR: > Maybe to help with cross-project work we should formalize temporary > teams with clear objective and disband criteria, under the model of > Kubernetes "working groups". > > Long version: > > Work in OpenStack is organized around project teams, who each own a set > of git repositories. One well-known drawback of this organization is > that it makes cross-project work harder, as someone has to coordinate > activities that ultimately affects multiple project teams. > > We tried various ways to facilitate cross-project work in the past. It > started with a top-level repository of cross-project specs, a formal > effort which failed due to a disconnect between the spec approvers (TC), > the people signed up to push the work, and the teams that would need to > approve the independent work items. > > This was replaced by more informal "champions", doing project-management > and other heavy lifting to get things done cross-project. This proved > successful, but champions are often facing an up-hill battle and often > suffer from lack of visibility / blessing / validation. > > SIGs are another construct that helps holding discussions and > coordinating work around OpenStack problem spaces, beyond specific > project teams. Those are great as a permanent structure, but sometimes > struggle to translate into specific development work, and are a bit > heavy-weight just to coordinate a given set of work items. 
> > Community goals fill the gap between champions and SIGs by blessing a > given set of cross-community goals for a given release. However, given > their nature (being blessed by the TC at every cycle), they are a better > fit for small, cycle-long objectives that affect most of the OpenStack > project teams, and great to push consistency across all projects. > > It feels like we are missing a way to formally describe a short-term, > cross-project objective that only affects a number of teams, is not tied > to a specific cycle, and organize work around a temporary team > specifically formed to reach that objective. A team that would get > support from the various affected project teams, increasing chances of > success. > FWIW - I've participated in groups that have attempted to self-organize like this in the past, but without a formal blessing. We started by socializing the problem and the need for a solution [0]. We scheduled reoccurring meetings around it [1], attempted to document progress [2], and spin up specific efforts to help us design a solution that worked for our community [3][4]. After we felt comfortable with what we had, we attempted to use cross-project specs [5] (which we abandoned for the reasons you mentioned) and community goals to start moving the needle [6]. We also attempted to document the outcomes using project specifications [7][8] and other tools [9]. The difference between what we did and what you're proposing, in my opinion, is that we didn't define our disband criteria very well [10][11] and no one officially blessed us by any means. We were just a group that collected around an issue we cared about solving. I do think the effort was useful and helped us make progress on a challenging problem, which we're still trying to resolve. Outside of having a formal name, do we expect the "pop-up" teams to include processes that make what we went through easier? 
Ultimately, we still had to self-organize and do a bunch of socializing to make progress. [0] http://lists.openstack.org/pipermail/openstack-dev/2016-November/107137.html [1] https://review.openstack.org/#/c/398500/ [2] https://etherpad.openstack.org/p/keystone-policy-meeting [3] http://lists.openstack.org/pipermail/openstack-dev/2017-October/123069.html [4] http://lists.openstack.org/pipermail/openstack-dev/2017-October/123331.html [5] https://review.openstack.org/#/c/523973/ [6] https://governance.openstack.org/tc/goals/queens/policy-in-code.html [7] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-goals.html [8] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-security-roadmap.html [9] https://trello.com/b/bpWycnwa/policy-roadmap [10] https://review.openstack.org/#/c/581800/ [11] http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-07-03-16.00.log.html#l-346 > Kubernetes encountered the same problem, with work organized around > owners and permanent SIGs. They created the concept of a "working > group"[1] with a clear limited objective, and a clear disband criteria. > I feel like adopting something like it in OpenStack could help with work > that affects multiple projects. We would not name it "working group" > since that's already overloaded in OpenStack, but maybe "pop-up team" to > stress the temporary nature of it. We've been sort-of informally using > those in the past, but maybe formalizing and listing them could help > getting extra visibility and prioritization. > > Thoughts? Alternate solutions? > > [1] > > https://github.com/kubernetes/community/blob/master/committee-steering/governance/wg-governance.md > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jillr at redhat.com Thu Jan 31 18:00:13 2019 From: jillr at redhat.com (Jill Rouleau) Date: Thu, 31 Jan 2019 11:00:13 -0700 Subject: [tc] The future of the "Help most needed" list In-Reply-To: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> Message-ID: <1548957613.6476.16.camel@redhat.com> + openstack-mentoring On Thu, 2019-01-31 at 11:45 +0100, Thierry Carrez wrote: > Hi everyone, > > The "Help most needed" list[1] was created by the Technical Committee to > clearly describe areas of the OpenStack open source project which were > most in need of urgent help. This was done partly to facilitate > communications with corporate sponsors and engineering managers, and be > able to point them to an official statement of need from "the project". > > [1] https://governance.openstack.org/tc/reference/help-most-needed.html TIL - will start sharing this with our mentees when they sign up. > This list encounters two issues. First it's hard to limit entries: a lot > of project teams, SIGs and other forms of working groups could use > extra help. But more importantly, this list has had a very limited > impact -- new contributors did not exactly magically show up in the > areas we designated as in most need of help. > > When we raised that topic (again) at a Board+TC meeting, a suggestion > was made that we should turn the list more into a "job description" > style that would make it more palatable to the corporate world. I fear > that would not really solve the underlying issue (which is that at our > stage of the hype curve, no organization really has spare contributors > to throw at random hard problems). > > So I wonder if we should not reframe the list and make it less "this > team needs help" and more "I offer peer-mentoring in this team". A list > of contributor internship offers, rather than a call for corporate help > in the dark.
I feel like that would be more of a win-win offer, and > more  > likely to appeal to students, or OpenStack users trying to contribute > back. We've got a list of folks now who have volunteered to be mentors for various topics but we've struggled to get mentees and mentors engaged with each other and the program.  There seems to be a hurdle between "I need/want to help" and active participation.  A list of "this team needs this specific help" might actually be beneficial to getting people active, whereas I don't know that we'd gain much from another list of people who are generally open to helping (as much as that willingness to help is appreciated). > > Proper 1:1 mentoring takes a lot of time, and I'm not underestimating  > that. Only people that are ready to dedicate mentoring time should > show  > up on this new "list"... which is why it should really list > identified  > individuals rather than anonymous teams. It should also probably be  > one-off offers -- once taken, the offer should probably go off the > list. > Thoughts on that? Do you think reframing help-needed as  > mentoring-offered could help? Do you have alternate suggestions? Hopefully some of the folks who have signed up for the cohort mentoring program can share their thoughts here. -jill -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ashlee at openstack.org Thu Jan 31 18:29:14 2019 From: ashlee at openstack.org (Ashlee Ferguson) Date: Thu, 31 Jan 2019 12:29:14 -0600 Subject: Open Infrastructure Summit Denver - Community Voting Open Message-ID: <6B02F9A1-28A7-4F43-85E1-66AD570ED37B@openstack.org> Hi everyone, Community voting for the Open Infrastructure Summit Denver sessions is open! You can VOTE HERE , but what does that mean? Now that the Call for Presentations has closed, all submissions are available for community vote and input. 
After community voting closes, the volunteer Programming Committee members will receive the presentations to review and determine the final selections for the Summit schedule. While community votes are meant to help inform the decision, Programming Committee members are expected to exercise judgment in their area of expertise and help ensure diversity of sessions and speakers. View full details of the session selection process here. In order to vote, you need an OSF community membership. If you do not have an account, please create one by going to openstack.org/join. If you need to reset your password, you can do that here. Hurry, voting closes Monday, February 4 at 11:59pm Pacific Time (Tuesday, February 5 at 7:59 UTC). Continue to visit https://www.openstack.org/summit/denver-2019 for all Summit-related information. REGISTER Register for the Summit before prices increase in late February! VISA APPLICATION PROCESS Make sure to secure your Visa soon. More information about the Visa application process. TRAVEL SUPPORT PROGRAM February 27 is the last day to submit applications. Please submit your applications by 11:59pm Pacific Time (February 28 at 7:59am UTC). If you have any questions, please email summit at openstack.org. Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Thu Jan 31 19:35:34 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 31 Jan 2019 14:35:34 -0500 Subject: [openstack-ansible] bug squash day! In-Reply-To: <717c065910a2365e8d9674f987227771@arcor.de> References: <717c065910a2365e8d9674f987227771@arcor.de> Message-ID: On Tue, Jan 29, 2019 at 2:26 PM Frank Kloeker wrote: > > Am 2019-01-29 17:09, schrieb Mohammed Naser: > > Hi team, > > > > As you may have noticed, bug triage during our meetings has been > > something that has kinda killed attendance (really, no one seems to > > enjoy it, believe it or not!)
> > > > I wanted to propose for us to take a day to go through as many bugs as > > possible, triaging and fixing as much as we can. It'd be a fun day > > and we can also hop on a higher-bandwidth way to talk about this > > stuff while we grind through it all. > > > > Is this something that people are interested in? If so, are there any > > times/days that work better in the week to organize? > > Interesting. Something in the EU timezone would be nice. Or what about: Bug > around the clock? > So 24 hours of bug triage :) I'd be up for that too; we have a pretty distributed team, so that would be awesome. I'm still wondering if there are enough resources or folks available to be doing this, as we haven't had a response yet on a timeline that might work or on availabilities. > kind regards > > Frank -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From gouthampravi at gmail.com Thu Jan 31 19:55:55 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Thu, 31 Jan 2019 11:55:55 -0800 Subject: [manila][glusterfs] on queens error In-Reply-To: References: Message-ID: Hi Ignazio, On Thu, Jan 31, 2019 at 7:31 AM Ignazio Cassano wrote: > > Hello All, > I installed manila on my queens openstack based on centos 7. > I configured two servers with glusterfs replication and ganesha nfs. > I configured my controllers octavia,conf but when I try to create a share > the manila scheduler log reports: > > Failed to schedule create_share: No valid host was found. Failed to find a weighted host, the last executed filter was CapabilitiesFilter.: NoValidHost: No valid host was found. Failed to find a weighted host, the last executed filter was CapabilitiesFilter.
> 2019-01-31 16:07:32.614 159380 INFO manila.message.api [req-241d66b3-8004-410b-b000-c6d2d3536e4a 89f76bc5de5545f381da2c10c7df7f15 59f1f232ce28409593d66d8f6495e434 - - -] Creating message record for request_id = req-241d66b3-8004-410b-b000-c6d2d3536e4a The scheduler failure points out that you have a mismatch in expectations (backend capabilities vs share type extra-specs) and there was no host to schedule your share to. So a few things to check here: - What is the share type you're using? Can you list the share type extra-specs and confirm that the backend (your GlusterFS storage) capabilities match whatever you've set up as extra-specs ($ manila pool-list --detail)? - Is your backend operating correctly? You can list the manila services ($ manila service-list) and see if the backend is both 'enabled' and 'up'. If it isn't, there's a good chance there was a problem with driver initialization; please enable debug logging and look at the log file for the manila-share service, and you might see why and be able to fix it. Please be aware that we've been on the lookout for a maintainer for the GlusterFS driver for the past few releases. We're open to bug fixes and maintenance patches, but there is currently no active maintainer for this driver. > I did not understand if the controller nodes must be connected to the network where shares must be exported for virtual machines, so my glusterfs servers are connected to the management network where the openstack controllers are connected and to the network where the virtual machines are connected.
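The extra-spec/capability check described above can be pictured with a small sketch. This is a simplified illustration of how a CapabilitiesFilter-style match works, not manila's actual filter code; the `pool_matches` helper and the sample dictionaries are hypothetical, with only the `driver_handles_share_servers = False` value and the `gluster-manila565` backend name taken from this thread:

```python
# Simplified illustration: a share lands on a pool only if every
# extra-spec required by the share type is satisfied by the pool's
# reported capabilities. Not manila's real CapabilitiesFilter code.

def pool_matches(extra_specs, capabilities):
    """True if every required extra-spec is met by the pool's capabilities."""
    return all(capabilities.get(k) == v for k, v in extra_specs.items())

# e.g. a share type demanding DHSS=False against a Gluster pool:
share_type_specs = {"driver_handles_share_servers": "False"}
gluster_pool_caps = {"driver_handles_share_servers": "False",
                     "share_backend_name": "gluster-manila565"}
print(pool_matches(share_type_specs, gluster_pool_caps))  # True

# A mismatched DHSS value is a typical cause of "No valid host was found":
print(pool_matches({"driver_handles_share_servers": "True"},
                   gluster_pool_caps))                    # False
```

Comparing `manila type-show` output against `manila pool-list --detail` output by hand amounts to running this check yourself.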
> > My manila.conf section for glusterfs is the following > > [gluster-manila565] > driver_handles_share_servers = False > share_driver = manila.share.drivers.glusterfs.GlusterfsShareDriver > glusterfs_target = root at 10.102.184.229:/manila565 > glusterfs_path_to_private_key = /etc/manila/id_rsa > glusterfs_ganesha_server_username = root > glusterfs_nfs_server_type = Ganesha > glusterfs_ganesha_server_ip = 10.102.184.229 > #glusterfs_servers = root at 10.102.185.19 > ganesha_config_dir = /etc/ganesha > > > PS > 10.102.184.0/24 is the network where the controllers expose endpoints > > 10.102.189.0/24 is the shared network inside openstack where virtual machines are connected. > > The gluster servers are connected on both. > > > Any help, please? > > Ignazio From colleen at gazlene.net Thu Jan 31 20:31:55 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 31 Jan 2019 21:31:55 +0100 Subject: [tc] The future of the "Help most needed" list In-Reply-To: <1548957613.6476.16.camel@redhat.com> References: <713ef94c-27d9-ed66-cf44-f9aa98e49a4c@openstack.org> <1548957613.6476.16.camel@redhat.com> Message-ID: <1548966715.921386.1648087056.15F1B280@webmail.messagingengine.com> On Thu, Jan 31, 2019, at 7:00 PM, Jill Rouleau wrote: > + openstack-mentoring > On Thu, 2019-01-31 at 11:45 +0100, Thierry Carrez wrote: > > Hi everyone, > > > > The "Help most needed" list[1] was created by the Technical Committee to > > clearly describe areas of the OpenStack open source project which were > > most in need of urgent help. This was done partly to facilitate > > communications with corporate sponsors and engineering managers, and be > > able to point them to an official statement of need from "the project". > > > > [1] https://governance.openstack.org/tc/reference/help-most-needed.html > > TIL - will start sharing this with our mentees when they sign up. > > > This list encounters two issues.
First it's hard to limit entries: a lot > > of project teams, SIGs and other forms of working groups could use > > extra help. But more importantly, this list has had a very limited > > impact -- new contributors did not exactly magically show up in the > > areas we designated as in most need of help. > > > > When we raised that topic (again) at a Board+TC meeting, a suggestion > > was made that we should turn the list more into a "job description" > > style that would make it more palatable to the corporate world. I > > fear > > that would not really solve the underlying issue (which is that at > > our > > stage of the hype curve, no organization really has spare > > contributors > > to throw at random hard problems). > > > > So I wonder if we should not reframe the list and make it less "this > > team needs help" and more "I offer peer-mentoring in this team". A > > list > > of contributor internship offers, rather than a call for corporate > > help > > in the dark. I feel like that would be more of a win-win offer, and > > more > > likely to appeal to students, or OpenStack users trying to contribute > > back. This is a good point. I think for this reason it would be nice if the list was sort of a combination of the "job description" and peer-mentoring offer.
It should be specific enough that people know beforehand whether it's something they would be interested in, and it should include specific objectives so people have something concrete to work towards and to measure their success against as they are working with their mentor. > > > > > Proper 1:1 mentoring takes a lot of time, and I'm not underestimating  > > that. Only people that are ready to dedicate mentoring time should > > show  > > up on this new "list"... which is why it should really list > > identified  > > individuals rather than anonymous teams. It should also probably be  > > one-off offers -- once taken, the offer should probably go off the > > list. > > Thoughts on that? Do you think reframing help-needed as  > > mentoring-offered could help? Do you have alternate suggestions? > > Hopefully some of the folks who have signed up for the cohort mentoring > program can share their thoughts here. As an Outreachy mentor I agree that 1:1 mentoring is a lot of work, and coming up with small-scope tasks for new people is really challenging. The problem with Outreachy is that often the interns go away when their internship is over, so something like this that encourages long-term growing of new contributors will ultimately be more rewarding for the community. 
Colleen > > -jill > Email had 1 attachment: > + signature.asc > 1k (application/pgp-signature) From openstack at fried.cc Thu Jan 31 22:32:13 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 31 Jan 2019 16:32:13 -0600 Subject: [dev] How to develop changes in a series In-Reply-To: References: <1CC272501B5BC543A05DB90AA509DED527475067@ORSMSX162.amr.corp.intel.com> <20181205195227.4j3rkpinlgts3ujv@yuggoth.org> Message-ID: <9e9174cc-8c77-6623-34c5-6ca0c0056e7c@fried.cc> /me dusts off thread I have proposed this addition to the contributor guide: https://review.openstack.org/634333 I will now go and blithely add everyone who participated in this thread as reviewers :P -efried On 12/7/18 3:55 PM, Kendall Nelson wrote: > Thanks for mentioning the contributor guide! > > I'll happily review any patches you have for that section. I'm sure > Ildiko would be happy to as well. > > -Kendall (diablo_rojo) > > On Wed, Dec 5, 2018 at 12:41 PM William M Edmonds > wrote: > > Jeremy Stanley > wrote > on 12/05/2018 02:52:28 PM: > > On 2018-12-05 14:48:37 -0500 (-0500), William M Edmonds wrote: > > > Eric Fried wrote on 12/05/2018 12:18:37 PM: > > > > But I want to edit 1b2c453, while leaving ebb3505 properly > stacked on > > > > top of it. Here I use a tool called `git restack` (run `pip > install > > > > git-restack` to install it). > > > > > > It's worth noting that you can just use `git rebase` [1], you > don't have to > > > use git-restack. This is why later you're using `git rebase > --continue`, > > > because git-restack is actually using rebase under the covers. > > > > > > [1] https://stackoverflow.com/questions/1186535/how-to-modify-a-specified-commit > > > > You can; however, what git-restack does for you is figure out which > > commit to rebase on top of so that you don't inadvertently rebase > > your stack of changes onto a newer branch state and then make things > > harder on reviewers. > > -- > > Jeremy Stanley > > Ah, that's good to know.
> > Also, found this existing documentation [2] if someone wants to > propose an update or link from another location. Note that it > doesn't currently mention git-restack, just rebase. > > [2] > https://docs.openstack.org/contributors/code-and-documentation/patch-best-practices.html#how-to-handle-chains > From d.lake at surrey.ac.uk Wed Jan 30 19:02:50 2019 From: d.lake at surrey.ac.uk (David Lake) Date: Wed, 30 Jan 2019 19:02:50 +0000 Subject: Issue with launching instance with OVS-DPDK In-Reply-To: References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com> Message-ID: Hi Sean I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install... David full create: /opt/stack/tempest/.tox/tempest ERROR: invocation failed (exit code 1), logfile: /opt/stack/tempest/.tox/tempest/log/full-0.log ERROR: actionid: full msg: getenv cmdargs: '/usr/bin/python -m virtualenv --python /usr/bin/python tempest' Already using interpreter /usr/bin/python New python executable in /opt/stack/tempest/.tox/tempest/bin/python Complete output from command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list: ERROR: unknown command "config" ---------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/usr/lib/python2.7/site-packages/virtualenv.py", line 2502, in main() File "/usr/lib/python2.7/site-packages/virtualenv.py", line 793, in main symlink=options.symlink, File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1087, in create_environment install_wheel(to_install, py_executable, search_dirs, download=download) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 935, in install_wheel _install_wheel_with_search_dir(download, project_names, py_executable, search_dirs) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 964, 
in _install_wheel_with_search_dir config = _pip_config(py_executable, python_path) File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1038, in _pip_config remove_from_env=["PIP_VERBOSE", "PIP_QUIET"], File "/usr/lib/python2.7/site-packages/virtualenv.py", line 886, in call_subprocess raise OSError("Command {} failed with error code {}".format(cmd_desc, proc.returncode)) OSError: Command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list failed with error code 1 ERROR: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths are not supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) ___________________________________ summary ____________________________________ ERROR: full: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths are not supported by virtualenv. Error details: InvocationError('/usr/bin/python -m virtualenv --python /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) -----Original Message----- From: Sean Mooney Sent: 29 January 2019 21:46 To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org Cc: Ge, Chang Dr (Elec Electronic Eng) Subject: Re: Issue with launching instance with OVS-DPDK On Tue, 2019-01-29 at 18:05 +0000, David Lake wrote: > Answers
in-line
> > Thanks > > David > > -----Original Message----- > From: Sean Mooney > Sent: 29 January 2019 14:55 > To: Lake, David (PG/R - Elec Electronic Eng) ; > openstack-dev at lists.openstack.org > Cc: Ge, Chang Dr (Elec Electronic Eng) > Subject: Re: Issue with launching instance with OVS-DPDK > > On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > > Hello > > > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > > > I can launch instances which use the “m1.small” flavour (which I > > have modified to include the hw:mem_size large as per the DPDK > > instructions) but as soon as I try to launch anything more than m1.small, I get this error: > > > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR > > nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] > > #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > > instance_uuid=instance.uuid, reason=six.text_type(e))\n', > > u'RescheduledException: Build of instance 25cfee28-08e9- > > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu > > unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z > > qemu-kvm: -chardev > > socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: > > info: QEMU waiting for connection on: > > disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019-0 > > 1- > > 28T12:56:49.251071Z > > qemu-kvm: -object > > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepage > > s/ > > libvirt/qemu/4-instance- > > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > > os_mem_prealloc: Insufficient free host memory 
pages available to > > allocate guest RAM\n']#033[00m#033[00m > > > > > > My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. > > how much of that ram did you allocate as hugepages. > >
OVS_NUM_HUGEPAGES=3072
ok so you used networking-ovs-dpdk's ability to automatically allocate 2MB hugepages at runtime, so this should have allocated 6GB of hugepages per numa node. > > can you provide the output of cat /proc/meminfo > >
> >
> MemTotal: 526779552 kB
> MemFree: 466555316 kB
> MemAvailable: 487218548 kB
> Buffers: 2308 kB
> Cached: 22962972 kB
> SwapCached: 0 kB
> Active: 29493384 kB
> Inactive: 13344640 kB
> Active(anon): 20826364 kB
> Inactive(anon): 522012 kB
> Active(file): 8667020 kB
> Inactive(file): 12822628 kB
> Unevictable: 43636 kB
> Mlocked: 47732 kB
> SwapTotal: 4194300 kB
> SwapFree: 4194300 kB
> Dirty: 20 kB
> Writeback: 0 kB
> AnonPages: 19933028 kB
> Mapped: 171680 kB
> Shmem: 1450564 kB
> Slab: 1224444 kB
> SReclaimable: 827696 kB
> SUnreclaim: 396748 kB
> KernelStack: 69392 kB
> PageTables: 181020 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 261292620 kB
> Committed_AS: 84420252 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 1352128 kB
> VmallocChunk: 34154915836 kB
> HardwareCorrupted: 0 kB
> AnonHugePages: 5365760 kB
> CmaTotal: 0 kB
> CmaFree: 0 kB
> HugePages_Total: 6144

since we have 6144 total and OVS_NUM_HUGEPAGES was set to 3072, this indicates the host has 2 numa nodes

> HugePages_Free: 2048

and you currently have 4G of 2MB hugepages free. however this will also be split across the numa nodes.

the qemu commandline you provided, which i have copied below, is trying to allocate 4G of hugepage memory from a single host numa node

qemu-kvm: -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu/4-instance-00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: os_mem_prealloc: Insufficient free host memory pages available to allocate guest RAM

as a result the vm is failing to boot because nova cannot create the vm with a single numa node. if you set hw:numa_nodes=2 this vm would likely boot, but since you have a 512G host you should be able to increase OVS_NUM_HUGEPAGES to something like OVS_NUM_HUGEPAGES=14336.
if you want to allocate more than about 96G of hugepages you should set OVS_ALLOCATE_HUGEPAGES=False and instead allocate the hugepages on the kernel commandline using 1G hugepages, e.g. default_hugepagesz=1G hugepagesz=1G hugepages=480. This is because it takes a long time for ovs-dpdk to scan all the hugepages on start up.

setting default_hugepagesz=1G hugepagesz=1G hugepages=480 will leave 32G of ram for the host. if it is a compute node and not a controller you can safely reduce the free host ram to 16G, e.g. default_hugepagesz=1G hugepagesz=1G hugepages=496. i would not advise allocating much more than 496G of hugepages as the qemu emulator overhead can easily get into the 10s of gigs if you have 50+ vms running.

> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
> DirectMap4k: 746304 kB
> DirectMap2M: 34580480 kB
> DirectMap1G: 502267904 kB
> [stack at localhost devstack]$
> >
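Sean's arithmetic above can be sanity-checked with a few lines. The numbers below are the ones from this thread (6144 x 2MB pages host-wide, 2048 free, two NUMA nodes, a guest asking for 4 GiB from one node); the even-split assumption is a simplification, since the authoritative per-node free counts live in sysfs rather than /proc/meminfo:

```python
# Rough check of whether a hugepage-backed guest fits in one NUMA node.
# /proc/meminfo reports host-wide totals, so as an approximation we assume
# the free pages are split evenly across nodes (the real per-node numbers
# are under /sys/devices/system/node/node*/hugepages/).
PAGE_SIZE_KB = 2048  # "Hugepagesize: 2048 kB" from the meminfo above

def fits_on_one_node(free_pages_total, numa_nodes, guest_bytes,
                     page_size_kb=PAGE_SIZE_KB):
    per_node_free = (free_pages_total // numa_nodes) * page_size_kb * 1024
    return guest_bytes <= per_node_free

guest = 4 * 1024**3  # size=4294967296 from the qemu error message

# 2048 free pages split over 2 nodes is only 2 GiB per node -> too small
print(fits_on_one_node(2048, 2, guest))        # False

# OVS_NUM_HUGEPAGES=14336 gives 14336 pages (28 GiB) per node -> fits
print(fits_on_one_node(2 * 14336, 2, guest))   # True
```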
> > > > > Build is the latest git clone of Devstack. > > > > Thanks > > > > David > > From feilong at catalyst.net.nz Wed Jan 30 20:13:01 2019 From: feilong at catalyst.net.nz (Feilong Wang) Date: Thu, 31 Jan 2019 09:13:01 +1300 Subject: [openstack-ansible][magnum] In-Reply-To: References: <1F00FD58-4132-4C42-A9C2-41E3FF8A84C4@crandale.de> Message-ID: I'm echoing Von's comments. >From the log of cloud-init-output.log, you should be able to see below error: /Cloud-init v. 0.7.9 running 'modules:final' at Wed, 30 Jan 2019 08:33:41 +0000. Up 76.51 seconds.// //2019-01-30 08:37:49,209 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-011 [1]// //+ _prefix=docker.io/openstackmagnum/// //+ atomic install --storage ostree --system --system-package no --set REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt --name heat-container-agent docker.io/openstackmagnum/heat-container-agent:queens-stable// //The docker daemon does not appear to be running.// //+ systemctl start heat-container-agent// //Failed to start heat-container-agent.service: Unit heat-container-agent.service not found.// //2019-01-30 08:38:10,250 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-013 [5]/ Then please go to |/var/lib/cloud/instances//scripts| to find the script 011 and 013 to run it manually to get the root cause. And welcome to pop up into #openstack-containers irc channel. On 30/01/19 11:43 PM, Clemens Hardewig wrote: > Read the cloud-Init.log! There you can see that your > /var/lib/.../part-011 part of the config script finishes with error. > Check why. > > Von meinem iPhone gesendet > > Am 30.01.2019 um 10:11 schrieb Alfredo De Luca > >: > >> here are also the logs for the cloud init logs from the k8s master.... >> >> >> >> On Wed, Jan 30, 2019 at 9:30 AM Alfredo De Luca >> > wrote: >> >> >> In the meantime this is my cluster >>  template >> >> >> >> On Wed, Jan 30, 2019 at 9:17 AM Alfredo De Luca >> > wrote: >> >> hi Clemens and Ignazio. 
thanks for your support.
>> it must be network related, but apparently I don't do anything special
>> to create a simple k8s cluster.
>> I'll post configurations and logs later on, as you suggested, Clemens.
>>
>> Cheers
>>
>> On Tue, Jan 29, 2019 at 9:16 PM Clemens wrote:
>>
>> … and more important: check the other log cloud-init.log
>> for error messages (not only cloud-init-output.log)
>>
>>> On 29.01.2019 at 16:07 Alfredo De Luca wrote:
>>>
>>> Hi Ignazio and Clemens. I haven't configured the proxy
>>> and all the logs on the kube master keep saying the
>>> following
>>>
>>> + '[' ok = '[-]poststarthook/bootstrap-controller
>>> failed: not finished
>>> [+]poststarthook/extensions/third-party-resources ok
>>> [-]poststarthook/rbac/bootstrap-roles failed: not finished
>>> healthz check failed' ']'
>>> + sleep 5
>>> ++ curl --silent http://127.0.0.1:8080/healthz
>>> + '[' ok = '' ']'
>>> + sleep 5
>>> ++ curl --silent http://127.0.0.1:8080/healthz
>>> + '[' ok = '[-]poststarthook/bootstrap-controller
>>> failed: not finished
>>> [+]poststarthook/extensions/third-party-resources ok
>>> [-]poststarthook/rbac/bootstrap-roles failed: not finished
>>> healthz check failed' ']'
>>> + sleep 5
>>>
>>> Not sure what to do.
>>> My configuration is ...
>>> eth0 - 10.1.8.113
>>>
>>> But the openstack configuration in terms of networking is
>>> the default from ansible-openstack, which
>>> is 172.29.236.100/22
>>>
>>> Maybe that's the problem?
>>>
>>> On Tue, Jan 29, 2019 at 2:26 PM Ignazio Cassano wrote:
>>>
>>> Hello Alfredo,
>>> is your external network using a proxy?
>>> If you using a proxy, and yuo configured it in >>> cluster template, you must setup no proxy for 127.0.0.1 >>> Ignazio >>> >>> Il giorno mar 29 gen 2019 alle ore 12:26 Clemens >>> Hardewig >> > ha scritto: >>> >>> At least on fedora there is a second cloud Init >>> log as far as I remember-Look into both  >>> >>> Br c >>> >>> Von meinem iPhone gesendet >>> >>> Am 29.01.2019 um 12:08 schrieb Alfredo De Luca >>> >> >: >>> >>>> thanks Clemens. >>>> I looked at the cloud-init-output.log  on the >>>> master... and at the moment is doing the >>>> following.... >>>> >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> >>>> Network ....could be but not sure where to look at >>>> >>>> >>>> On Tue, Jan 29, 2019 at 11:34 AM Clemens >>>> Hardewig >>> > wrote: >>>> >>>> Yes, you should check the cloud-init logs >>>> of your master. Without having seen them, I >>>> would guess a network issue or you have >>>> selected for your minion nodes a flavor >>>> using swap perhaps ... >>>> So, log files are the first step you could >>>> dig into... >>>> Br c >>>> Von meinem iPhone gesendet >>>> >>>> Am 28.01.2019 um 15:34 schrieb Alfredo De >>>> Luca >>> >: >>>> >>>>> Hi all. >>>>> I finally instaledl successufully >>>>> openstack ansible (queens) but, after >>>>> creating a cluster template I create k8s >>>>> cluster, it stuck on  >>>>> >>>>> >>>>> kube_masters >>>>> >>>>> b7204f0c-b9d8-4ef2-8f0b-afe4c077d039 >>>>> >>>>> OS::Heat::ResourceGroup 16 minutes >>>>> Create In Progress state changed >>>>> >>>>> create in progress....and after around an >>>>> hour it says...time out. k8s master seems >>>>> to be up.....at least as VM.  >>>>> >>>>> any idea?  
>>>>>
>>>>> /*Alfredo*/
>>>>
>>>> --
>>>> /*Alfredo*/
>>>
>>> --
>>> /*Alfredo*/
>>
>> --
>> /*Alfredo*/
>>
>> --
>> /*Alfredo*/
>>
>> --
>> /*Alfredo*/

--
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang at catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------

From d.lake at surrey.ac.uk Wed Jan 30 22:24:57 2019
From: d.lake at surrey.ac.uk (David Lake)
Date: Wed, 30 Jan 2019 22:24:57 +0000
Subject: Issue with launching instance with OVS-DPDK
In-Reply-To:
References: <2c0edad2c1e27eca588188967c2ac71a13d9386c.camel@redhat.com>
Message-ID:

Hi Sean

Thanks! Got it working and I can now spin up larger VMs.

All I've got to do now is work out how to get SSSE3 support in my VM. I think I need to modify the flavour to "Haswell" for that?

David

-----Original Message-----
From: Sean Mooney
Sent: 30 January 2019 19:58
To: Lake, David (PG/R - Elec Electronic Eng) ; openstack-dev at lists.openstack.org
Cc: Ge, Chang Dr (Elec Electronic Eng)
Subject: Re: Issue with launching instance with OVS-DPDK

On Wed, 2019-01-30 at 19:02 +0000, David Lake wrote:
> Hi Sean
>
> I've set OVS_NUM_HUGEPAGES=14336 but now Devstack is failing to install...

that appears to be unrelated. you could disable the installation of tempest as a workaround, but my guess is that it is related to the pip 19.0 or 19.0.1 release that was done in the last few days https://pypi.org/project/pip/#history

pip config was introduced in pip 10.0.0b1 https://pip.pypa.io/en/stable/news/#b1-2018-03-31

to disable tempest add "disable_service tempest" to your local.conf then unstack and stack.
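Sean's workaround, as a minimal local.conf sketch (the section header is DevStack's standard localrc block; everything else in your local.conf stays as it is):

```ini
[[local|localrc]]
# ... your existing settings (OVS_NUM_HUGEPAGES etc.) ...

# Skip tempest until the virtualenv/pip breakage is resolved
disable_service tempest
```

After editing, run ./unstack.sh followed by ./stack.sh, as Sean describes.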
> > David > > full create: /opt/stack/tempest/.tox/tempest > ERROR: invocation failed (exit code 1), logfile: > /opt/stack/tempest/.tox/tempest/log/full-0.log > ERROR: actionid: full > msg: getenv > cmdargs: '/usr/bin/python -m virtualenv --python /usr/bin/python tempest' > > Already using interpreter /usr/bin/python New python executable in > /opt/stack/tempest/.tox/tempest/bin/python > Complete output from command /opt/stack/tempest/.tox/tempest/bin/python -m pip config list: > ERROR: unknown command "config" > ---------------------------------------- > Traceback (most recent call last): > File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main > "__main__", fname, loader, pkg_name) > File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code > exec code in run_globals > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 2502, in > main() > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 793, in main > symlink=options.symlink, > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1087, in create_environment > install_wheel(to_install, py_executable, search_dirs, download=download) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 935, in install_wheel > _install_wheel_with_search_dir(download, project_names, py_executable, search_dirs) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 964, in _install_wheel_with_search_dir > config = _pip_config(py_executable, python_path) > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 1038, in _pip_config > remove_from_env=["PIP_VERBOSE", "PIP_QUIET"], > File "/usr/lib/python2.7/site-packages/virtualenv.py", line 886, in call_subprocess > raise OSError("Command {} failed with error code > {}".format(cmd_desc, proc.returncode)) > OSError: Command /opt/stack/tempest/.tox/tempest/bin/python -m pip > config list failed with error code 1 > > ERROR: Error creating virtualenv. Note that some special characters > (e.g. 
':' and unicode symbols) in paths are not supported by > virtualenv. Error details: InvocationError('/usr/bin/python -m > virtualenv --python /usr/bin/python tempest (see /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) ___________________________________ summary ____________________________________ > ERROR: full: Error creating virtualenv. Note that some special characters (e.g. ':' and unicode symbols) in paths > are not supported by virtualenv. Error details: > InvocationError('/usr/bin/python -m virtualenv --python > /usr/bin/python tempest (see > /opt/stack/tempest/.tox/tempest/log/full-0.log)', 1) > > > -----Original Message----- > From: Sean Mooney > Sent: 29 January 2019 21:46 > To: Lake, David (PG/R - Elec Electronic Eng) ; > openstack-dev at lists.openstack.org > Cc: Ge, Chang Dr (Elec Electronic Eng) > Subject: Re: Issue with launching instance with OVS-DPDK > > On Tue, 2019-01-29 at 18:05 +0000, David Lake wrote: > > Answers
in-line
> > > > Thanks > > > > David > > > > -----Original Message----- > > From: Sean Mooney > > Sent: 29 January 2019 14:55 > > To: Lake, David (PG/R - Elec Electronic Eng) ; > > openstack-dev at lists.openstack.org > > Cc: Ge, Chang Dr (Elec Electronic Eng) > > Subject: Re: Issue with launching instance with OVS-DPDK > > > > On Mon, 2019-01-28 at 13:17 +0000, David Lake wrote: > > > Hello > > > > > > I’ve built an Openstack all-in-one using OVS-DPDK via Devstack. > > > > > > I can launch instances which use the “m1.small” flavour (which I > > > have modified to include the hw:mem_size large as per the DPDK > > > instructions) but as soon as I try to launch anything more than m1.small, I get this error: > > > > > > Jan 28 12:56:52 localhost nova-conductor: #033[01;31mERROR > > > nova.scheduler.utils [#033[01;36mNone req-917cd3b9-8ce6- > > > 41af-8d44-045002512c91 #033[00;36madmin admin#033[01;31m] > > > #033[01;35m[instance: 25cfee28-08e9-419c-afdb-4d0fe515fb2a] > > > #033[01;31mError from last host: localhost (node localhost): [u'Traceback (most recent call last):\n', u' File > > > "/opt/stack/nova/nova/compute/manager.py", line 1935, in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2215, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, reason=six.text_type(e))\n', > > > u'RescheduledException: Build of instance 25cfee28-08e9- > > > 419c-afdb-4d0fe515fb2a was re-scheduled: internal error: qemu > > > unexpectedly closed the monitor: 2019-01- 28T12:56:48.127594Z > > > qemu-kvm: -chardev > > > socket,id=charnet0,path=/var/run/openvswitch/vhu46b3c508-f8,server: > > > info: QEMU waiting for connection on: > > > disconnected:unix:/var/run/openvswitch/vhu46b3c508-f8,server\n2019 > > > -0 > > > 1- > > > 28T12:56:49.251071Z > > > qemu-kvm: -object > > > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepa > > > ge > > > s/ > > > libvirt/qemu/4-instance- > > > 
00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > > > os_mem_prealloc: Insufficient free host memory pages available to > > > allocate guest RAM\n']#033[00m#033[00m > > > > > > > > > My Hypervisor is reporting 510.7GB of RAM and 61 vCPUs. > > > > how much of that ram did you allocate as hugepages. > > > >
OVS_NUM_HUGEPAGES=3072
> > ok so you used networking-ovs-dpdk's ability to automatically allocate
> 2MB hugepages at runtime, so this should have allocated 6GB of hugepages per numa node.
> >
> > can you provide the output of cat /proc/meminfo
> >
> >
> > > > MemTotal: 526779552 kB > > MemFree: 466555316 kB > > MemAvailable: 487218548 kB > > Buffers: 2308 kB > > Cached: 22962972 kB > > SwapCached: 0 kB > > Active: 29493384 kB > > Inactive: 13344640 kB > > Active(anon): 20826364 kB > > Inactive(anon): 522012 kB > > Active(file): 8667020 kB > > Inactive(file): 12822628 kB > > Unevictable: 43636 kB > > Mlocked: 47732 kB > > SwapTotal: 4194300 kB > > SwapFree: 4194300 kB > > Dirty: 20 kB > > Writeback: 0 kB > > AnonPages: 19933028 kB > > Mapped: 171680 kB > > Shmem: 1450564 kB > > Slab: 1224444 kB > > SReclaimable: 827696 kB > > SUnreclaim: 396748 kB > > KernelStack: 69392 kB > > PageTables: 181020 kB > > NFS_Unstable: 0 kB > > Bounce: 0 kB > > WritebackTmp: 0 kB > > CommitLimit: 261292620 kB > > Committed_AS: 84420252 kB > > VmallocTotal: 34359738367 kB > > VmallocUsed: 1352128 kB > > VmallocChunk: 34154915836 kB > > HardwareCorrupted: 0 kB > > AnonHugePages: 5365760 kB > > CmaTotal: 0 kB > > CmaFree: 0 kB > > HugePages_Total: 6144 > > since we have 6144 total and OVS_NUM_HUGEPAGES was set to 3072 this > indicate the host has 2 numa nodes > > HugePages_Free: 2048 > > and you currently have 4G of 2MB hugepages free. > however this will also be split across numa nodes. > > the qemu commandline you provied which i have coppied below is trying > to allocate 4G of hugepage memory from a single host numa node > > qemu-kvm: -object > memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/ > libvirt/qemu/4-instance- > 00000005,share=yes,size=4294967296,host-nodes=0,policy=bind: > os_mem_prealloc: Insufficient free host memory pages available to > allocate guest RAM\n']#033[00m#033[00m > > as a result the vm is failing to boot because nova cannot create the vm with a singel numa node. > > if you set hw:numa_nodes=2 this vm would likely boot but since you > have a 512G hostyou should be able to increase OVS_NUM_HUGEPAGES to something like OVS_NUM_HUGEPAGES=14336. 
> this will allocate 60G of 2MB hugepages total. > > if you want to allocate more then about 96G of hugepages you should > set OVS_ALLOCATE_HUGEPAGES=False and instead allcoate the hugepages on the kernel commandline using 1G hugepages. > e.g. default_hugepagesz=1G hugepagesz=1G hugepages=480 This is becase > it take a long time for ovs-dpdk to scan all the hugepages on start up. > > setting default_hugepagesz=1G hugepagesz=1G hugepages=480 will leave 32G of ram for the host. > if it a comptue node and not a contorller you can safly reduce the the > free host ram to 16G e.g. default_hugepagesz=1G hugepagesz=1G > hugepages=496 i would not advice allocating much more above than 496G of hugepages as the qemu emularot over head can eaially get into the 10s of gigs if you have 50+ vms running. > > > > > HugePages_Rsvd: 0 > > HugePages_Surp: 0 > > Hugepagesize: 2048 kB > > DirectMap4k: 746304 kB > > DirectMap2M: 34580480 kB > > DirectMap1G: 502267904 kB > > [stack at localhost devstack]$ > > > >
> > > > Build is the latest git clone of Devstack.
> > >
> > > Thanks
> > >
> > > David

From satish.txt at gmail.com Thu Jan 31 02:10:01 2019
From: satish.txt at gmail.com (Satish Patel)
Date: Wed, 30 Jan 2019 21:10:01 -0500
Subject: Horizon extremely slow with 400 instances
Message-ID:

folks, we have a mid-size openstack cloud running 400 instances, and day by day it's getting slower. i can understand it renders every single machine while loading the instance page, but it seems like a design issue - why doesn't it load the page from MySQL instead of running a bunch of API calls behind the page? is this just me, or is someone else also having this issue? i am surprised there is no good and robust Web GUI for something as popular as openstack. I am curious how people run openstack in large environments using Horizon. I have tried all kinds of settings and tuning like memcache etc..

~S

From mjturek at linux.vnet.ibm.com Thu Jan 31 15:30:25 2019
From: mjturek at linux.vnet.ibm.com (Michael Turek)
Date: Thu, 31 Jan 2019 10:30:25 -0500
Subject: [3rd party CI] [ironic] BaremetalBasicOps test
Message-ID: <854eea7f-9de5-fb43-686c-c95e3b4d0ed9@linux.vnet.ibm.com>

Hello all,

Our ironic job has been broken and it seems to be due to a lack of IPs. We allocate two IPs to our job: one for the dhcp server, and one for the target node. This had been working for as long as the job has existed, but recently (since about early December 2018) we've been broken.

The job is able to clean the node during devstack, successfully deploy to the node during the tempest run, and is successfully validated via ssh. The node then moves to clean failed with a network error [1], and the job subsequently fails. Sometime between the validation and attempting to clean, the neutron port associated with the ironic port is deleted and a new port comes into existence. Where I'm having trouble is finding out what this port is.
Based on its MAC address, it's a virtual port, and its MAC is not the same as the ironic port's. We could add an IP to the job to fix it, but I'd rather not do that needlessly. Any insight or advice would be appreciated here!

Thanks,
Mike Turek

[1] http://paste.openstack.org/show/743191/
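One way to start identifying a mystery port from its MAC alone: Neutron's default base_mac is fa:16:3e:00:00:00, so a port whose MAC begins with fa:16:3e was allocated by Neutron itself, while other prefixes (e.g. 52:54:00, the QEMU/KVM OUI) point at something outside Neutron's allocator. base_mac is configurable per deployment, so treat this strictly as a heuristic; a short sketch:

```python
# Classify a MAC address by its prefix. fa:16:3e is Neutron's default
# base_mac and 52:54:00 is the QEMU/KVM OUI; base_mac can be overridden
# in neutron.conf, so this is a hint rather than proof.
NEUTRON_DEFAULT_BASE = "fa:16:3e"

def neutron_allocated(mac: str, base: str = NEUTRON_DEFAULT_BASE) -> bool:
    return mac.lower().startswith(base.lower())

print(neutron_allocated("FA:16:3E:12:34:56"))   # True
print(neutron_allocated("52:54:00:ab:cd:ef"))   # False
```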