From samuel at cassi.ba  Tue Jan  1 02:54:11 2019
From: samuel at cassi.ba (Samuel Cassiba)
Date: Mon, 31 Dec 2018 18:54:11 -0800
Subject: [chef] State of the Kitchen: 8th Edition
Message-ID: 

It has been some time since the last State of the Kitchen. Since the release of Chef OpenStack 17 (Queens), we have been operating in a minimal-churn mode to give people time to test/upgrade deployments and handle any regressions that emerge. There are three main areas that are still in progress upstream of Chef OpenStack, but can affect its cadence regardless. As a result, this update focuses more on those areas. Consider this more of a year-end review.

### Important Happenings

* *fog-openstack*[^1]

  Beginning in August, we started receiving reports of breakage due to changes in fog-core. As a reactive measure, we implemented upper version constraints in the client resource cookbook to maintain a consistent outcome. The fog-openstack library has continued to receive changes to further align with fog-core, and we are following its progress to find a good time to move ChefDK and Chef OpenStack to a post-1.0 release of fog-openstack. We are targeting the 18th release of Chef OpenStack, due to Keystone endpoint changes that need to happen.

* *Sous Chefs*[^2]

  One of the biggest strengths of Chef and OpenStack is the collective outcome of their unique communities. Within the Chef ecosystem, the Sous Chefs group was formed in response to the need for a long-term home for Chef components, libraries, and utilities. Across the globe, Sous Chefs work to keep some of the most heavily used cookbooks in existence, such as [apache2](https://supermarket.chef.io/cookbooks/apache2), [mysql](https://supermarket.chef.io/cookbooks/mysql) (and [mariadb](https://supermarket.chef.io/cookbooks/mariadb)!), [postgresql](https://supermarket.chef.io/cookbooks/postgresql), as well as [redisio](https://supermarket.chef.io/cookbooks/redisio), and many more. Chef OpenStack depends on MariaDB, Apache, and their related cookbooks for compatibility, without operators needing to plumb those resources internally.

* *poise-python*[^3]

  In early October, pip 18.1 was released, which made some additional waves in the ecosystem. Workarounds were devised and implemented to limit the fallout. Currently, the fix has been merged to poise-python's master, but cannot be released safely due to CI changes in the current workflow. There are limitations on what the Sous Chefs can reasonably maintain, and the maintenance of poise is rather beyond that boundary, not to discount or disparage anyone involved. Anyone interested, with spare cycles over the holiday season, might consider joining the conversation.

### Upcoming Changes

* In Chef OpenStack 18...
  - The MariaDB version will default to 10.3, consistent with the default in the 2.0 version of the cookbook. Please plan accordingly.
  - Keystone's endpoint will change to drop the hardcoded API version.
  - The cloud primitives (client) cookbook is in the process of migrating from cookbook-openstackclient to cookbook-openstack-client (named openstack_client, to conform with current best practices in the Chef community).
  - Ubuntu will be upgraded from 16.04 to 18.04, and as such we will be gating against Bionic at that time. Plainly put, previous Chef OpenStack releases will not be moving to Bionic jobs, and will continue to work at best effort until they succumb to the detritus of time.
### Meetings

Since the Summit, a few people have reached out through various means about Chef OpenStack and how to work together to improve the outcome. As a result, I would like to propose holding regular meetings for Chef OpenStack once more, setting aside a dedicated period where we can come together and talk about food, or other things. We have the IRC channel, but IRC has proven less effective for a small group to dedicate time consistently, so I would prefer something more high-bandwidth for technical conversations, such as video with a publicized method for joining and viewing. I will follow up with a more expanded proposal outside this update.

### On The Menu

This would not be a State of the Kitchen without something to eat. My partner and I try to cook with recipes that are not overly complicated, but can be infinitely complex with just the right nudge. Sometimes we incorporate our own opinions into someone else's recipe to make it our own thing, and sometimes they're great just as they come.

*Dat Dough, Doe*

* 170g / 6oz grated mozzarella or Edam, or another mild cheese with similar melting consistency
* 85g / 3oz almond meal/flour
* 28g / 2 tbsp cream cheese or Neufchatel
* 1 egg
* pinch of salt to taste

1. Mix the shredded/grated cheese and the almond meal in a microwaveable bowl, then add the cream cheese. Microwave on high for 1 minute. Stir the mixture, then microwave on high for another 30 seconds.
2. Add the egg, salt, and any additional spices or flavorings, and mix or fold gently.
3. Shape using parchment paper into the desired outcome, be it flat like a disc or rounded, like a boule.
4. Create vents to ensure that the finished product cooks evenly.
5. Fry, bake, broil or grill as desired. Lipids can be friends here.

More commonly known as "Fat Head" dough, these few ingredients can be turned into food that tastes every bit like pizza, pasta, bread, even pão de queijo, or perhaps cinnamon rolls or danishes. With these basic suggestions, one can apply their own opinions and set of requirements to create complex pieces of work, which can taste every bit like an art form and a science.

See you in 2019!

Your humble pastry chef,
-scas

[^1]: https://github.com/fog/fog-openstack/issues/434
[^2]: https://sous-chefs.org/
[^3]: https://github.com/poise/poise-python/issues/133

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From alfredo.deluca at gmail.com Tue Jan 1 04:15:39 2019
From: alfredo.deluca at gmail.com (Alfredo De Luca)
Date: Tue, 1 Jan 2019 15:15:39 +1100
Subject: openstack stack fails
In-Reply-To: 
References: 
Message-ID: 

Thanks Ignazio. I'll have a look asap.
Cheers

On Sun., 30 Dec. 2018, 6:43 pm Ignazio Cassano wrote:

> Hi Alfredo,
> attached here there is my magnum.conf for queens release
> As you can see my heat sections are empty
> When you create your cluster, I suggest to check heat logs and magnum logs
> for verifying what is wrong
> Ignazio
>
>
>
> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca <
> alfredo.deluca at gmail.com> ha scritto:
>
>> So, creating a stack either manually or via the dashboard works fine. The problem
>> seems to be when I create a cluster (kubernetes/swarm) that I get that
>> error.
>> Maybe the magnum conf is not properly set up?
>> In the heat section of the magnum.conf I have only
>> *[heat_client]*
>> *region_name = RegionOne*
>> *endpoint_type = internalURL*
>>
>> Cheers
>>
>>
>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca <
>> alfredo.deluca at gmail.com> wrote:
>>
>>> Yes.
Next step is to check with ansible. >>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1 . how did you run the last heat template? By dashboard ? >>>> 2. Using openstack command you can check if ansible configured heat >>>> user/domain correctly >>>> >>>> >>>> It seems a problem related to >>>> heat user rights? >>>> >>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. the problem is >>>>> that I installed openstack with ansible-openstack.... so I can't change >>>>> anything unless I re run everything. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check heat user and domani are c onfigured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I ll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> HI IGNAZIO >>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Anycase during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. 
>>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>> one.... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From flux.adam at gmail.com Tue Jan 1 12:34:44 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Tue, 1 Jan 2019 04:34:44 -0800 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: Message-ID: I'm just on my phone over the holidays, but it kinda looks like the code for this was just updated 12 days ago: https://review.openstack.org/#/c/619577/ If you're using that new code, I imagine it's possible there could be a bug that wasn't yet caught... If you're NOT using that code, maybe try it and see if it helps? I'm guessing it's related one way or another. If you come to the #openstack-lbaas channel once more people are around (later this week?), we can probably take a look. --Adam Harwell (rm_work) On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq wrote: > I have try creating load balancer with Heat. but always get this error : > > Resource CREATE failed: OctaviaClientException: resources.loadbalancer: > Validation failure: Missing project ID in request where one is required. > (HTTP 400) (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) > > OctaviaClientException: resources.loadbalancer: Validation failure: > Missing project ID in request where one is required. (HTTP 400) > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) > > I create 2 openstack environment : > > - Heat with Octavia (Octavia Heat Template : > http://paste.opensuse.org/view//33592182 ) > - Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : > http://paste.opensuse.org/view//71741503) > > But always error when creating with octavia : > > - Octavia Log (https://imgur.com/a/EsuWvla) > - LBaaS v2 (https://imgur.com/a/BqNGRPH) > > Are Heat code is broken to create Octavia Load Balancer? > > Best Regards, > Zufar Dhiyaulhaq > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liliueecg at gmail.com Tue Jan 1 17:28:05 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 1 Jan 2019 12:28:05 -0500 Subject: [Cyborg] no irc meeting this week Message-ID: Hi Team, Since it's public holiday for folks in US and Canada, we will not have the irc meeting this week. 
Enjoy the new year guys :) Thank you Regards Li Liu -------------- next part -------------- An HTML attachment was scrubbed... URL: From jiaopengju at cmss.chinamobile.com Tue Jan 1 07:39:38 2019 From: jiaopengju at cmss.chinamobile.com (=?utf-8?B?54Sm6bmP5Li+?=) Date: Tue, 1 Jan 2019 15:39:38 +0800 (CST) Subject: [dev][karbor]No meeting today Message-ID: <2aff5c2b179265e-00007.Richmail.00004030743850933408@cmss.chinamobile.com> Hi Karbor Team, We will skip karbor weekly meeting today due to holidays. Next meeting will come on 15 January. Thanks. Pengju Jiao -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Wed Jan 2 02:45:13 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 15:45:13 +1300 Subject: [Heat][Octavia] Is autoscaling feature missing? In-Reply-To: References: Message-ID: <26860a5b-38de-cb6d-301e-07fd7b332310@redhat.com> On 21/12/18 1:00 AM, Viktor Shulhin wrote: > Hi all, > > I am trying to create Heat template with autoscaling and loadbalancing. > I didn't find any similar Heat template examples. > Individually, loadbalancing and autoscaling work well, but loadbalancing > OS::Octavia::PoolMember can be added only manually. > Is there any way to use OS::Heat::AutoScalingGroup as server pool for > loadbalancing? Yes. The trick is that the thing you're using as the scaled unit in the Autoscaling group should not be just an OS::Nova::Server, but rather a Heat stack that contains both a server and an OS::Octavia::PoolMember. The example that Rabi linked to shows how to do it. Note that you need both of these files: https://github.com/openstack/heat-templates/blob/master/hot/autoscaling.yaml https://github.com/openstack/heat-templates/blob/master/hot/lb_server.yaml cheers, Zane. From zbitter at redhat.com Wed Jan 2 03:22:31 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 16:22:31 +1300 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: Message-ID: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> On 2/01/19 1:34 AM, Adam Harwell wrote: > I'm just on my phone over the holidays, but it kinda looks like the code > for this was just updated 12 days ago: > https://review.openstack.org/#/c/619577/ > > If you're using that new code, I imagine it's possible there could be a > bug that wasn't yet caught... If you're NOT using that code, maybe try > it and see if it helps? I'm guessing it's related one way or another. If > you come to the #openstack-lbaas channel once more people are around > (later this week?), we can probably take a look. That's only the example template (previously it was an example for LBaaSv2; now it's an example for Octavia); there's been no recent change to the code. >      --Adam Harwell (rm_work) > > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq > wrote: > > I have try creating load balancer with Heat. but always get this error : > > Resource CREATE failed: OctaviaClientException: > resources.loadbalancer: Validation failure: Missing project ID in > request where one is required. (HTTP 400) (Request-ID: > req-b45208e1-a200-47f9-8aad-b130c4c12272) > > OctaviaClientException: resources.loadbalancer: Validation failure: > Missing project ID in request where one is required. (HTTP 400) > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) What version of OpenStack are you using? The issue is that Heat is sending a "tenant_id" but Octavia wants a "project_id", which is the new name for the same thing. 
(I think you likely modified that template after trying it but before uploading it, because there is no "project_id" property in Heat's OS::Octavia::LoadBalancer resource type.) This bug has been reported and there is a patch up for review in Heat: https://storyboard.openstack.org/#!/story/2004650 There was a change to Octavia in Pike (https://review.openstack.org/455442) to add backwards compatibility, but it was either incomplete or the problem reoccurred and was fixed again in Rocky (https://review.openstack.org/569881). My guess is that it's likely broken in Pike and Queens. I'd certainly have expected Heat's gate tests to pick up the problem, and it's a bit of a mystery why they didn't. Perhaps we're not exercising the case where a project_id is required (using it at all is an admin-only feature, so that's not too surprising I guess; it's actually more surprising that there's a case where it's _required_). cheers, Zane. > I create 2 openstack environment : > > * Heat with Octavia (Octavia Heat Template : > http://paste.opensuse.org/view//33592182 ) > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : > http://paste.opensuse.org/view//71741503) > > But always error when creating with octavia : > > * Octavia Log (https://imgur.com/a/EsuWvla) > * LBaaS v2 (https://imgur.com/a/BqNGRPH) > > Are Heat code is broken to create Octavia Load Balancer? > > Best Regards, > Zufar Dhiyaulhaq > From yjf1970231893 at gmail.com Wed Jan 2 04:30:30 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Wed, 2 Jan 2019 12:30:30 +0800 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> Message-ID: Please confirm whether "auth_strategy" is set as ''keystone" in configure file. I remember that the value of "auth_strategy" is set as "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install octavia by rpm. If the value was set as "noauth", you must manually specify "project_id" for octavia. Jeff Yang 于2019年1月2日周三 下午12:26写道: > Please confirm whether "auth_strategy" is set as ''keystone" in > configure file. I remember that the value of "auth_strategy" is set as > "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install > octavia by rpm. If the value was set as "noauth", you must manually specify > "project_id" for octavia. > > Zane Bitter 于2019年1月2日周三 上午11:23写道: > >> On 2/01/19 1:34 AM, Adam Harwell wrote: >> > I'm just on my phone over the holidays, but it kinda looks like the >> code >> > for this was just updated 12 days ago: >> > https://review.openstack.org/#/c/619577/ >> > >> > If you're using that new code, I imagine it's possible there could be a >> > bug that wasn't yet caught... If you're NOT using that code, maybe try >> > it and see if it helps? I'm guessing it's related one way or another. >> If >> > you come to the #openstack-lbaas channel once more people are around >> > (later this week?), we can probably take a look. >> >> That's only the example template (previously it was an example for >> LBaaSv2; now it's an example for Octavia); there's been no recent change >> to the code. >> >> > --Adam Harwell (rm_work) >> > >> > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq > > > wrote: >> > >> > I have try creating load balancer with Heat. but always get this >> error : >> > >> > Resource CREATE failed: OctaviaClientException: >> > resources.loadbalancer: Validation failure: Missing project ID in >> > request where one is required. 
(HTTP 400) (Request-ID: >> > req-b45208e1-a200-47f9-8aad-b130c4c12272) >> > >> > OctaviaClientException: resources.loadbalancer: Validation failure: >> > Missing project ID in request where one is required. (HTTP 400) >> > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) >> >> What version of OpenStack are you using? >> >> The issue is that Heat is sending a "tenant_id" but Octavia wants a >> "project_id", which is the new name for the same thing. (I think you >> likely modified that template after trying it but before uploading it, >> because there is no "project_id" property in Heat's >> OS::Octavia::LoadBalancer resource type.) >> >> This bug has been reported and there is a patch up for review in Heat: >> https://storyboard.openstack.org/#!/story/2004650 >> >> There was a change to Octavia in Pike >> (https://review.openstack.org/455442) to add backwards compatibility, >> but it was either incomplete or the problem reoccurred and was fixed >> again in Rocky (https://review.openstack.org/569881). My guess is that >> it's likely broken in Pike and Queens. >> >> I'd certainly have expected Heat's gate tests to pick up the problem, >> and it's a bit of a mystery why they didn't. Perhaps we're not >> exercising the case where a project_id is required (using it at all is >> an admin-only feature, so that's not too surprising I guess; it's >> actually more surprising that there's a case where it's _required_). >> >> cheers, >> Zane. >> >> > I create 2 openstack environment : >> > >> > * Heat with Octavia (Octavia Heat Template : >> > http://paste.opensuse.org/view//33592182 ) >> > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : >> > http://paste.opensuse.org/view//71741503) >> > >> > But always error when creating with octavia : >> > >> > * Octavia Log (https://imgur.com/a/EsuWvla) >> > * LBaaS v2 (https://imgur.com/a/BqNGRPH) >> > >> > Are Heat code is broken to create Octavia Load Balancer? >> > >> > Best Regards, >> > Zufar Dhiyaulhaq >> > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramishra at redhat.com Wed Jan 2 05:49:25 2019 From: ramishra at redhat.com (Rabi Mishra) Date: Wed, 2 Jan 2019 11:19:25 +0530 Subject: [heat] Bug : Heat cannot create Octavia Load Balancer In-Reply-To: References: <11bd665e-ab0d-6ae6-49f9-6b3a7fbc4eea@redhat.com> Message-ID: On Wed, Jan 2, 2019 at 10:03 AM Jeff Yang wrote: > Please confirm whether "auth_strategy" is set as ''keystone" in > configure file. I remember that the value of "auth_strategy" is set as > "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install > octavia by rpm. If the value was set as "noauth", you must manually specify > "project_id" for octavia. > > Yeah, that could be the reason[1] (when deploying with puppet puppet-octavia sets it to keystone[2]), as the error is coming from octavia[3], when you don't specify a project_id in request and the context does not have it either. [1] https://github.com/rdo-packages/octavia-distgit/blob/rpm-master/octavia-dist.conf#L3 [2] https://github.com/openstack/puppet-octavia/blob/master/manifests/api.pp#L65 [3] https://github.com/openstack/octavia/blob/master/octavia/api/v2/controllers/load_balancer.py#L251 Jeff Yang 于2019年1月2日周三 下午12:26写道: > Please confirm whether "auth_strategy" is set as ''keystone" in >> configure file. I remember that the value of "auth_strategy" is set as >> "noauth" in "/usr/share/octavia/octavia-dist.conf" default if you install >> octavia by rpm. 
If the value was set as "noauth", you must manually specify >> "project_id" for octavia. >> >> Zane Bitter 于2019年1月2日周三 上午11:23写道: >> >>> On 2/01/19 1:34 AM, Adam Harwell wrote: >>> > I'm just on my phone over the holidays, but it kinda looks like the >>> code >>> > for this was just updated 12 days ago: >>> > https://review.openstack.org/#/c/619577/ >>> > >>> > If you're using that new code, I imagine it's possible there could be >>> a >>> > bug that wasn't yet caught... If you're NOT using that code, maybe try >>> > it and see if it helps? I'm guessing it's related one way or another. >>> If >>> > you come to the #openstack-lbaas channel once more people are around >>> > (later this week?), we can probably take a look. >>> >>> That's only the example template (previously it was an example for >>> LBaaSv2; now it's an example for Octavia); there's been no recent change >>> to the code. >>> >>> > --Adam Harwell (rm_work) >>> > >>> > On Sun, Dec 30, 2018, 03:37 Zufar Dhiyaulhaq < >>> zufardhiyaulhaq at gmail.com >>> > > wrote: >>> > >>> > I have try creating load balancer with Heat. but always get this >>> error : >>> > >>> > Resource CREATE failed: OctaviaClientException: >>> > resources.loadbalancer: Validation failure: Missing project ID in >>> > request where one is required. (HTTP 400) (Request-ID: >>> > req-b45208e1-a200-47f9-8aad-b130c4c12272) >>> > >>> > OctaviaClientException: resources.loadbalancer: Validation failure: >>> > Missing project ID in request where one is required. (HTTP 400) >>> > (Request-ID: req-b45208e1-a200-47f9-8aad-b130c4c12272) >>> >>> What version of OpenStack are you using? >>> >>> The issue is that Heat is sending a "tenant_id" but Octavia wants a >>> "project_id", which is the new name for the same thing. (I think you >>> likely modified that template after trying it but before uploading it, >>> because there is no "project_id" property in Heat's >>> OS::Octavia::LoadBalancer resource type.) >>> >>> This bug has been reported and there is a patch up for review in Heat: >>> https://storyboard.openstack.org/#!/story/2004650 >>> >>> There was a change to Octavia in Pike >>> (https://review.openstack.org/455442) to add backwards compatibility, >>> but it was either incomplete or the problem reoccurred and was fixed >>> again in Rocky (https://review.openstack.org/569881). My guess is that >>> it's likely broken in Pike and Queens. >>> >>> I'd certainly have expected Heat's gate tests to pick up the problem, >>> and it's a bit of a mystery why they didn't. Perhaps we're not >>> exercising the case where a project_id is required (using it at all is >>> an admin-only feature, so that's not too surprising I guess; it's >>> actually more surprising that there's a case where it's _required_). >>> >>> cheers, >>> Zane. >>> >>> > I create 2 openstack environment : >>> > >>> > * Heat with Octavia (Octavia Heat Template : >>> > http://paste.opensuse.org/view//33592182 ) >>> > * Heat with Neutron Lbaasv2 (Neutron LBaaSv2 Heat Template : >>> > http://paste.opensuse.org/view//71741503) >>> > >>> > But always error when creating with octavia : >>> > >>> > * Octavia Log (https://imgur.com/a/EsuWvla) >>> > * LBaaS v2 (https://imgur.com/a/BqNGRPH) >>> > >>> > Are Heat code is broken to create Octavia Load Balancer? >>> > >>> > Best Regards, >>> > Zufar Dhiyaulhaq >>> > >>> >>> >>> -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zbitter at redhat.com Wed Jan 2 06:25:37 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 2 Jan 2019 19:25:37 +1300 Subject: queens heat db deadlock In-Reply-To: <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com> <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com> <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com> <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> Message-ID: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> On 21/12/18 2:07 AM, Jay Pipes wrote: > On 12/20/2018 02:01 AM, Zane Bitter wrote: >> On 19/12/18 6:49 AM, Jay Pipes wrote: >>> On 12/18/2018 11:06 AM, Mike Bayer wrote: >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano >>>> wrote: >>>>> >>>>> Yes, I  tried on yesterday and this workaround solved. >>>>> Thanks >>>>> Ignazio >>>> >>>> OK, so that means this "deadlock" is not really a deadlock but it is a >>>> write-conflict between two Galera masters.      I have a long term >>>> goal to being relaxing this common requirement that Openstack apps >>>> only refer to one Galera master at a time.    If this is a particular >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction >>>> retry decorator for this operation?  This is the standard approach for >>>> other applications that are subject to galera multi-master writeset >>>> conflicts such as Neutron. >> >> The weird thing about this issue is that we actually have a retry >> decorator on the operation that I assume is the problem. It was added >> in Queens and largely fixed this issue in the gate: >> >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py >> >>> Correct. >>> >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big >>> cause of the aforementioned "deadlocks". >> >> AFAIK, no. In fact we were quite careful to design stuff that is >> expected to be subject to write contention to use UPDATE ... WHERE (by >> doing query().filter_by().update() in sqlalchemy), but it turned out >> to be those very statements that were most prone to causing deadlocks >> in the gate (i.e. we added retry decorators in those two places and >> the failures went away), according to me in the commit message for >> that patch: https://review.openstack.org/521170 >> >> Are we Doing It Wrong(TM)? > > No, it looks to me like you're doing things correctly. The OP mentioned > that this only happens when deleting a Magnum cluster -- and that it > doesn't occur in normal Heat template usage. > > I wonder (as I really don't know anything about Magnum, unfortunately), > is there something different about the Magnum cluster resource handling > in Heat that might be causing the wonkiness? There's no special-casing for Magnum within Heat. It's likely to be just that there's a lot of resources in a Magnum cluster - or more specifically, a lot of edges in the resource graph, which leads to more write contention (and, in a multi-master setup, more write conflicts). I'd assume that any similarly-complex template would have the same issues, and that Ignazio just didn't have anything else that complex to hand. That gives me an idea, though. I wonder if this would help: https://review.openstack.org/627914 Ignazio, could you possibly test with that ^ patch in multi-master mode to see if it resolves the issue? cheers, Zane. 
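The pattern under discussion looks roughly like the sketch below. This is an illustration only, not Heat's actual code; the Resource model, the values passed in, and the retry settings are placeholders. It shows an UPDATE ... WHERE issued through SQLAlchemy's query API, wrapped in oslo.db's retry decorator so that a DBDeadlock raised by a Galera write-set conflict is retried rather than failing the request:

from oslo_db import api as oslo_db_api
from oslo_db.sqlalchemy import enginefacade

from myservice.db import models  # placeholder; stands in for the real model module


@oslo_db_api.wrap_db_retry(max_retries=3, retry_on_deadlock=True)
def resource_update_atomic(context, resource_id, values):
    # The decorator re-runs the whole function if the database reports a
    # deadlock, which is how Galera certification failures surface to the
    # client.
    with enginefacade.writer.using(context) as session:
        # query(...).filter_by(...).update(...) emits a single
        # UPDATE ... WHERE statement rather than SELECT-then-UPDATE.
        return session.query(models.Resource).filter_by(
            id=resource_id).update(values)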
From ignaziocassano at gmail.com Wed Jan 2 07:55:10 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 2 Jan 2019 08:55:10 +0100 Subject: queens heat db deadlock In-Reply-To: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com> <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com> <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com> <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com> <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com> Message-ID: Hello, I'll try as soon as possible ans I will send you a response. Ignazio Il giorno mer 2 gen 2019 alle ore 07:28 Zane Bitter ha scritto: > On 21/12/18 2:07 AM, Jay Pipes wrote: > > On 12/20/2018 02:01 AM, Zane Bitter wrote: > >> On 19/12/18 6:49 AM, Jay Pipes wrote: > >>> On 12/18/2018 11:06 AM, Mike Bayer wrote: > >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano > >>>> wrote: > >>>>> > >>>>> Yes, I tried on yesterday and this workaround solved. > >>>>> Thanks > >>>>> Ignazio > >>>> > >>>> OK, so that means this "deadlock" is not really a deadlock but it is a > >>>> write-conflict between two Galera masters. I have a long term > >>>> goal to being relaxing this common requirement that Openstack apps > >>>> only refer to one Galera master at a time. If this is a particular > >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction > >>>> retry decorator for this operation? This is the standard approach for > >>>> other applications that are subject to galera multi-master writeset > >>>> conflicts such as Neutron. > >> > >> The weird thing about this issue is that we actually have a retry > >> decorator on the operation that I assume is the problem. It was added > >> in Queens and largely fixed this issue in the gate: > >> > >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py > >> > >>> Correct. > >>> > >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big > >>> cause of the aforementioned "deadlocks". > >> > >> AFAIK, no. In fact we were quite careful to design stuff that is > >> expected to be subject to write contention to use UPDATE ... WHERE (by > >> doing query().filter_by().update() in sqlalchemy), but it turned out > >> to be those very statements that were most prone to causing deadlocks > >> in the gate (i.e. we added retry decorators in those two places and > >> the failures went away), according to me in the commit message for > >> that patch: https://review.openstack.org/521170 > >> > >> Are we Doing It Wrong(TM)? > > > > No, it looks to me like you're doing things correctly. The OP mentioned > > that this only happens when deleting a Magnum cluster -- and that it > > doesn't occur in normal Heat template usage. > > > > I wonder (as I really don't know anything about Magnum, unfortunately), > > is there something different about the Magnum cluster resource handling > > in Heat that might be causing the wonkiness? > > There's no special-casing for Magnum within Heat. It's likely to be just > that there's a lot of resources in a Magnum cluster - or more > specifically, a lot of edges in the resource graph, which leads to more > write contention (and, in a multi-master setup, more write conflicts). > I'd assume that any similarly-complex template would have the same > issues, and that Ignazio just didn't have anything else that complex to > hand. > > That gives me an idea, though. 
I wonder if this would help:
>
> https://review.openstack.org/627914
>
> Ignazio, could you possibly test with that ^ patch in multi-master mode
> to see if it resolves the issue?
>
> cheers,
> Zane.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From zhengzhenyulixi at gmail.com Wed Jan 2 08:57:28 2019
From: zhengzhenyulixi at gmail.com (Zhenyu Zheng)
Date: Wed, 2 Jan 2019 16:57:28 +0800
Subject: [Nova] Suggestion needed for detach-boot-volume design
Message-ID: 

Hi Nova,

Happy New Year! I've been working on detach-boot-volume[1] in Stein. We got the initial design merged, and while implementing it we have met some new problems, so now I'm amending the spec to cover these new problems[2].

The thing I want to discuss for wider opinion is that in the initial design, we planned to support detaching the root volume only for STOPPED and SHELVED/SHELVE_OFFLOADED instances. But then we found out that we allow detaching volumes for RESIZED/PAUSED/SOFT_DELETED instances as well. Should we allow detaching the root volume for instances in these states too? Cases like RESIZE could be complicated for the revert resize action, and it also seems unnecessary.

Thoughts?

BR,

[1] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/volume-backed-server-rebuild.html
[2] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/volume-backed-server-rebuild.html

Kevin Zheng
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ltoscano at redhat.com Wed Jan 2 10:13:54 2019
From: ltoscano at redhat.com (Luigi Toscano)
Date: Wed, 02 Jan 2019 11:13:54 +0100
Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints
Message-ID: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com>

Hi all,

I'm working on adding support for APIv2 to the Sahara tempest plugin. If I understand it correctly, there are two main steps:

1) Make sure that the tempest client works with APIv2 (and doesn't regress with APIv1.1). This mainly means implementing the tempest client for Sahara APIv2, which should not be too complicated.

On the other hand, we hit an issue with the v1.1 client in an APIv2 environment. A change associated with APIv2 is the use of an unversioned endpoint for the deployment (see https://review.openstack.org/#/c/622330/ , without the /v1.1/$(tenant_id) suffix), which should magically work with both API variants, but it seems that the current tempest client fails in this case:

http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544

Does anyone know if this is an issue with the code of the tempest tests (which should maybe have some logic to build the expected endpoint when it's unversioned, like saharaclient does) or somewhere else?

2) Fix the tests to support APIv2. Should I duplicate the tests for APIv1.1 and APIv2? Other projects which support different APIs seem to do this. But can I freely move the existing tests under a subdirectory (sahara_tempest_plugin/tests/api/ -> sahara_tempest_plugin/tests/api/v1/), or are there any compatibility concerns? Are the test IDs enough to ensure that everything works as before?

And what about the CLI tests currently under sahara_tempest_plugin/tests/cli/ ? They support both API versions through a configuration flag. Should they be duplicated as well?

Ciao (and happy new year if you have a new one in your calendar!)
-- 
Luigi

From dtantsur at redhat.com Wed Jan 2 11:18:40 2019
From: dtantsur at redhat.com (Dmitry Tantsur)
Date: Wed, 2 Jan 2019 12:18:40 +0100
Subject: [ironic] [qa] ironic-tempest-plugin CI bloat
Message-ID: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com>

Hi all and happy new year :)

As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin has to run tests on all supported branches. Currently it amounts to 16 (!) voting devstack jobs. With each of them having some small probability of a random failure, it is impossible to land anything without at least one recheck, usually more.

The bad news is, we only run the master API tests job, and these tests are changed more often than the others. We already had a minor stable branch breakage because of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just spotted a missing master multinode job, which is defined but does not run for some reason :(

Here is my proposal to deal with gate bloat on ironic-tempest-plugin:

1. Do not run CI jobs at all for unsupported branches and branches in extended maintenance. For Ocata this has already been done in [2].

2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus remove them from the gate queue). I have a gut feeling that a change that breaks N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to have N-2 voting.

3. Make the discovery and the multinode jobs from all stable branches non-voting. These jobs cover the tests that get changed very infrequently (if ever). These are also the jobs with the highest random failure rate.

4. Add the API tests, voting for Queens to master, non-voting for Pike (as proposed above).

This should leave us with 20 jobs, but with only 11 of them voting. Which is still a lot, but probably manageable.

The corresponding change is [3], please comment here or there.

Dmitry

[1] https://review.openstack.org/622177
[2] https://review.openstack.org/621537
[3] https://review.openstack.org/627955

From ignaziocassano at gmail.com Wed Jan 2 11:27:13 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Wed, 2 Jan 2019 12:27:13 +0100
Subject: queens heat db deadlock
In-Reply-To: <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com>
References: <38bae882-b4ac-4b55-5345-e27edbd582f3@redhat.com>
 <82a140e4-55b7-453f-593d-d7423ac34e64@gmail.com>
 <94442ceb-1278-a573-1456-9f44204a8ccd@redhat.com>
 <028c4ec2-d6a7-d5a2-190d-91065d7231ee@gmail.com>
 <353d8c64-17c8-f10b-7de3-fe5471e46b4f@redhat.com>
Message-ID: 

Hello Zane,
we applied the patch and modified our haproxy: unfortunately it does not solve the db deadlock issue.
Ignazio & Gianpiero

Il giorno mer 2 gen 2019 alle ore 07:28 Zane Bitter ha scritto:

> On 21/12/18 2:07 AM, Jay Pipes wrote:
> > On 12/20/2018 02:01 AM, Zane Bitter wrote:
> >> On 19/12/18 6:49 AM, Jay Pipes wrote:
> >>> On 12/18/2018 11:06 AM, Mike Bayer wrote:
> >>>> On Tue, Dec 18, 2018 at 12:36 AM Ignazio Cassano
> >>>> wrote:
> >>>>>
> >>>>> Yes, I tried on yesterday and this workaround solved.
> >>>>> Thanks
> >>>>> Ignazio
> >>>>
> >>>> OK, so that means this "deadlock" is not really a deadlock but it is a
> >>>> write-conflict between two Galera masters. I have a long term
> >>>> goal to being relaxing this common requirement that Openstack apps
> >>>> only refer to one Galera master at a time. If this is a particular
> >>>> hotspot for Heat (no pun intended) can we pursue adding a transaction
> >>>> retry decorator for this operation?
This is the standard approach for > >>>> other applications that are subject to galera multi-master writeset > >>>> conflicts such as Neutron. > >> > >> The weird thing about this issue is that we actually have a retry > >> decorator on the operation that I assume is the problem. It was added > >> in Queens and largely fixed this issue in the gate: > >> > >> https://review.openstack.org/#/c/521170/1/heat/db/sqlalchemy/api.py > >> > >>> Correct. > >>> > >>> Heat doesn't use SELECT .. FOR UPDATE does it? That's also a big > >>> cause of the aforementioned "deadlocks". > >> > >> AFAIK, no. In fact we were quite careful to design stuff that is > >> expected to be subject to write contention to use UPDATE ... WHERE (by > >> doing query().filter_by().update() in sqlalchemy), but it turned out > >> to be those very statements that were most prone to causing deadlocks > >> in the gate (i.e. we added retry decorators in those two places and > >> the failures went away), according to me in the commit message for > >> that patch: https://review.openstack.org/521170 > >> > >> Are we Doing It Wrong(TM)? > > > > No, it looks to me like you're doing things correctly. The OP mentioned > > that this only happens when deleting a Magnum cluster -- and that it > > doesn't occur in normal Heat template usage. > > > > I wonder (as I really don't know anything about Magnum, unfortunately), > > is there something different about the Magnum cluster resource handling > > in Heat that might be causing the wonkiness? > > There's no special-casing for Magnum within Heat. It's likely to be just > that there's a lot of resources in a Magnum cluster - or more > specifically, a lot of edges in the resource graph, which leads to more > write contention (and, in a multi-master setup, more write conflicts). > I'd assume that any similarly-complex template would have the same > issues, and that Ignazio just didn't have anything else that complex to > hand. > > That gives me an idea, though. I wonder if this would help: > > https://review.openstack.org/627914 > > Ignazio, could you possibly test with that ^ patch in multi-master mode > to see if it resolves the issue? > > cheers, > Zane. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Wed Jan 2 13:08:00 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 2 Jan 2019 14:08:00 +0100 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <9487fc46-6957-82b2-49a7-3fc8cca53842@redhat.com> On 1/2/19 12:18 PM, Dmitry Tantsur wrote: > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 (!) > voting devstack jobs. With each of them have some small probability of a random > failure, it is impossible to land anything without at least one recheck, usually > more. > > The bad news is, we only run master API tests job, and these tests are changed > more often that the other. We already had a minor stable branch breakage because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just > spotted a missing master multinode job, which is defined but does not run for > some reason :( Better news: the API tests did not have a separate job before Rocky, so we only need to add Rocky. However, we'll get to 4 jobs in the future. 
The multinode job is missing because it was renamed on master, and apparently Zuul does not report it Oo > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1. Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus > remove them from the gate queue). I have a gut feeling that a change that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to > have N-2 voting. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). Only Rocky here for now. > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. > > The corresponding change is [3], please comment here or there. > > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 From eblock at nde.ag Wed Jan 2 13:39:23 2019 From: eblock at nde.ag (Eugen Block) Date: Wed, 02 Jan 2019 13:39:23 +0000 Subject: [Openstack] [Nova][Glance] Nova imports flat images from base file despite ceph backend In-Reply-To: References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> <20181004124417.Horde.py2wEG4JmO1oFXbjX5u1uw3@webmail.nde.ag> <20181009080101.Horde.---iO9LIrKkWvTsNJwWk_Mj@webmail.nde.ag> <679352a8-c082-d851-d8a5-ea7b2348b7d3@gmail.com> <20181012215027.Horde.t5xm_KfkoEE4YEnrewHQZPG@webmail.nde.ag> <9df7167b-ea3b-51d6-9fad-7c9298caa7be@gmail.com> <72242CC2-621E-4037-A8F0-8AE56C4A6F36@italy1.com> Message-ID: <20190102133923.Horde.CY4bM26RNgf_UaNTjY-WZYe@webmail.nde.ag> Hello and a happy new year! I need to reopen this thread because there are still things going on that I don't fully understand. I changed the disk_format of all the images that were affected by my mistake a couple of weeks ago. 
I deleted the base files in /var/lib/nova/instances/_base and launched new instances, leading to expected cow clones: ---cut here--- control:~ # openstack image show 5f486361-5468-42a0-9993-9cdda3450b0e | grep disk_format | disk_format | raw| control:~ # rbd children images/5f486361-5468-42a0-9993-9cdda3450b0e at snap images/029bfb90-cbab-4a7c-a51c-27807ab41ce7_disk images/ccd4498f-c0a0-480a-ab4b-6224d63e78fa_disk images/d5379918-40a0-4119-9e47-773c0ab8c0f3_disk ---cut here--- These are only three clones, but I have 16 instances based on the same image: ---cut here--- +--------------------------------------+ | uuid | +--------------------------------------+ | 87dfbb4f-784d-4390-a0c9-d3162a56ea7e | | bb56995d-d3ea-464d-94fe-382880bf2a92 | | d4fde2fe-140e-4904-90e8-996cc418302d | | 7bd4fdc0-d6b7-47cb-adb3-324abff6a0e5 | | bef8e4fe-b2f4-44f5-a144-36ced37007ac | | bcc005fc-fa12-4dd9-b531-cbccfb7c426a | ###| ccd4498f-c0a0-480a-ab4b-6224d63e78fa | ###| 029bfb90-cbab-4a7c-a51c-27807ab41ce7 | ###| d5379918-40a0-4119-9e47-773c0ab8c0f3 | | abd2c0bc-2e66-4e2a-9f4c-f006937c7b27 | | e07c0f5e-403e-4ad3-b04f-fb306e99869e | | edec4aa0-9459-4444-bbb8-03097bccba4c | | ca4449f5-921d-4ce5-8559-fc719cfbc845 | | 8bff1799-391e-4620-9b11-134962084b84 | | 9d927ac4-d7d3-45cb-8628-b267a6d0e668 | | 54cfeede-44f9-49bd-9f49-ef4dcacff953 | +--------------------------------------+ ---cut here--- Those three cow clones were created after adjusting the disk_format, some of the rest were created before the changes, the others have been created after the changes. Can anyone explain why a new base image is created? I downloaded the glance image and double-checked the file format, I also exported the snapshot which should be used to create clones, there's only one discrepancy (disk size), but I don't think this could be relevant: ---cut here--- control:~ # qemu-img info /var/lib/glance/images/image-snap.img image: /var/lib/glance/images/image-snap.img file format: raw virtual size: 5.0G (5368709120 bytes) disk size: 2.5G control:~ # qemu-img info /var/lib/glance/images/image.img image: /var/lib/glance/images/image.img file format: raw virtual size: 5.0G (5368709120 bytes) disk size: 5.0G control:~ # md5sum /var/lib/glance/images/image.img b9b28dd300a6fbb1de2081f1cb8a07d0 /var/lib/glance/images/image.img control:~ # md5sum /var/lib/glance/images/image-snap.img b9b28dd300a6fbb1de2081f1cb8a07d0 /var/lib/glance/images/image-snap.img ---cut here--- I also tried to find something with the glance-cache-manage cli, but I can't even connect to that service, I guess it has never been used. control:~ # glance-cache-manage list-cached Failed to show cached images. Got error: Connect error/bad request to Auth service at URL http://control1.cloud.hh.nde.ag:5000/v3/tokens. I don't know how to fix that yet, but could this be a way to resolve that issue? Regards, Eugen Zitat von melanie witt : > On Fri, 12 Oct 2018 20:06:04 -0700, Remo Mattei wrote: >> I do not have it handy now but you can verify that the image is >> indeed raw or qcow2 >> >> As soon as I get home I will dig the command and pass it on. I have >> seen where images have extensions thinking it is raw and it is not. 
> > You could try 'qemu-img info ' and get output like this, > notice "file format": > > $ qemu-img info test.vmdk > (VMDK) image open: flags=0x2 filename=test.vmdk > image: test.vmdk > file format: vmdk > virtual size: 20M (20971520 bytes) > disk size: 17M > > [1] https://en.wikibooks.org/wiki/QEMU/Images#Getting_information > > -melanie > > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From bence.romsics at gmail.com Wed Jan 2 14:20:10 2019 From: bence.romsics at gmail.com (Bence Romsics) Date: Wed, 2 Jan 2019 15:20:10 +0100 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <2a13e4e3-2add-76ef-9fd9-018dfc493cdb@gmail.com> References: <1545231821.28650.2@smtp.office365.com> <2a13e4e3-2add-76ef-9fd9-018dfc493cdb@gmail.com> Message-ID: Hi Matt, Sorry for the slow response over the winter holidays. First I have to correct myself: > On 12/20/2018 8:02 AM, Bence Romsics wrote: > > ... in neutron we > > don't directly control the list of extensions loaded. Instead what we > > control (through configuration) is the list of service plugins loaded. > > The 'resource_request' extension is implemented by the 'qos' service > > plugin. But the 'qos' service plugin implements other API extensions > > and features too. A cloud admin may want to use these other features > > of the 'qos' service plugin, but not the guaranteed minimum bandwidth. This is the default behavior, but it can be overcome by a small patch like this: https://review.openstack.org/627978 With a patch like that we could control loading the port-resource-request extension (and by that the presence of the resource_request attribute) on its own (independently of all other extensions implemented by the qos service plugin). On Thu, Dec 20, 2018 at 6:58 PM Matt Riedemann wrote: > Can't the resource_request part of this be controlled via policy rules > or something similar? Is this question still relevant given the above? Even if we could control the resource-request attribute via policy rules wouldn't that be just as undiscoverable as a config-only feature flag? > Barring that, are policy rules something that > could be used for deployers could decide which users can use this > feature while it's being rolled out? Using a standalone neutron extension (controlled on its own by a neutron config option) as a feature flag (and keeping it not loaded by default until the feature is past experimental) would lessen the number of cloud deployments (probably to zero) where the experimental feature is unintentionally exposed. On the other hand - now that Jay called my attention to the undiscoverability of feature flags - I realize that this approach is not enough to distinguish the complete and experimental versions of the feature, given the experimental version was exposed intentionally. Cheers, Bence From jaypipes at gmail.com Wed Jan 2 14:47:01 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 2 Jan 2019 09:47:01 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> Message-ID: <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> On 12/21/2018 03:45 AM, Rui Zang wrote: > It was advised in today's nova team meeting to bring this up by email. 
> > There has been some discussion on the how to track persistent memory > resource in placement on the spec review [1]. > > Background: persistent memory (PMEM) needs to be partitioned to > namespaces to be consumed by VMs. Due to fragmentation issues, the spec > proposed to use fixed sized PMEM namespaces. The spec proposed to use fixed sized namespaces that is controllable by the deployer, not fixed-size-for-everyone :) Just want to make sure we're being clear here. > The spec proposed way to represent PMEM namespaces is to use one > Resource Provider (RP) for one PMEM namespace. An new standard Resource > Class (RC) -- 'VPMEM_GB` is introduced to classify PMEM namspace RPs. > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > 'total' and 'step_size` are all set to the size of the PMEM namespace. > In this way, it is guaranteed each RP will be consumed as a whole at one > time. > > An alternative was brought out in the review. Different Custom Resource > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM namespaces of > different sizes. The size of the PMEM namespace is encoded in the name > of the custom Resource Class. And multiple PMEM namespaces of the same > size  (say 128G) can be represented by one RP of the same Not represented by "one RP of the same CUSTOM_PMEM_128G". There would be only one resource provider: the compute node itself. It would have an inventory of, say, 8 CUSTOM_PMEM_128G resources. > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit'  and 'total' > as the total number of the PMEM namespaces of the certain size. And the > values of 'min_unit' and 'step_size' could set to 1. No, the max_unit, min_unit, step_size and total would refer to the number of *PMEM namespaces*, not the amount of GB of memory represented by those namespaces. Therefore, min_unit and step_size would be 1, max_unit would be the total number of *namespaces* that could simultaneously be attached to a single consumer (VM), and total would be 8 in our example where the compute node had 8 of these pre-defined 128G PMEM namespaces. > We believe both way could work. We would like to have a community > consensus on which way to use. > Email replies and review comments to the spec [1] are both welcomed. Custom resource classes were invented for precisely this kind of use case. The resource being represented is a namespace. The resource is not "a Gibibyte of persistent memory". Best, -jay > Regards, > Zang, Rui > > > [1] https://review.openstack.org/#/c/601596/ > From openstack at nemebean.com Wed Jan 2 16:23:14 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 10:23:14 -0600 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: > Dear all, > > I have a problem when use 'notification listener' oslo-message for HA > Openstack. > > It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to > connect to AMQP server on 172.16.4.125:5672 >  after inf tries: Exchange.declare: (406) > PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in > vhost '/': received 'false' but current is 'true''. > > How can i fix this?. I think settings default in my program set > 'durable' is False so it can't listen RabbitMQ Openstack? It probably depends on which rabbit client library you're using to listen for notifications. Presumably there should be some way to configure it to set durable to True. 
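If the listener is itself built on oslo.messaging (an assumption; the thread does not name the consumer's client library), a minimal sketch of such a consumer might look like the following. Here listener.conf, the credentials, and the endpoint class are placeholders; the important part is that the consumer's [oslo_messaging_rabbit] options match nova's, including amqp_durable_queues = true, so it declares the exchange with the same arguments:

import oslo_messaging
from oslo_config import cfg


class NotificationEndpoint(object):
    # Receives INFO-level notifications (e.g. compute.instance.*).
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        print(event_type)


conf = cfg.CONF
# listener.conf is assumed to carry the same rabbit options as nova:
#   [oslo_messaging_rabbit]
#   rabbit_ha_queues = true
#   amqp_durable_queues = true
conf(['--config-file', 'listener.conf'], project='listener')

transport = oslo_messaging.get_notification_transport(
    conf, url='rabbit://openstack:secret@172.16.4.125:5672/')
targets = [oslo_messaging.Target(topic='notifications')]

listener = oslo_messaging.get_notification_listener(
    transport, targets, [NotificationEndpoint()], executor='threading')
listener.start()
listener.wait()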
I guess the other option is to disable durable queues in the Nova config, but then you lose the contents of any queues when Rabbit gets restarted. It would be better to figure out how to make the consuming application configure durable queues instead. > > This is my nova.conf > > http://paste.openstack.org/show/738813/ > > > And section [oslo_messaging_rabbit] > > [oslo_messaging_rabbit] > rabbit_ha_queues = true > rabbit_retry_interval = 1 > rabbit_retry_backoff = 2 > amqp_durable_queues= true > > > > *Nguyễn Bá Thành* > > *Mobile*:    0128 748 0391 > > *Email*: bathanhtlu at gmail.com > From hongbin.lu at huawei.com Wed Jan 2 16:51:00 2019 From: hongbin.lu at huawei.com (Hongbin Lu) Date: Wed, 2 Jan 2019 16:51:00 +0000 Subject: [neutron] bug deputy report (Dec 24 - Dec 30) Message-ID: <0957CD8F4B55C0418161614FEC580D6B308D68A4@yyzeml705-chm.china.huawei.com> Hi all, Below is the bug deputy report for last week. Since it is on the holiday, there are not too much reporting bugs. REFs: * https://bugs.launchpad.net/neutron/+bug/1809628 [RFE] Enable driver field for the api of service_providers * https://bugs.launchpad.net/neutron/+bug/1809878 [RFE] Move sanity-checks to neutron-status CLI tool Incomplete: * https://bugs.launchpad.net/neutron/+bug/1809907 Unstable ping during zuul testing Best regards, Hongbin -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Wed Jan 2 17:05:05 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 12:05:05 -0500 Subject: [tc] agenda for Technical Committee Meeting 3 Jan 2019 @ 1400 UTC Message-ID: TC Members, Our next meeting will be this Thursday, 3 Jan at 1400 UTC in #openstack-tc. This email contains the agenda for the meeting, based on the content of the wiki [0]. If you will not be able to attend, please include your name in the "Apologies for Absence" section of the wiki page [0]. [0] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee We have several items of old business to wrap up: * technical vision for openstack We have approved the first draft of the vision [1], and have one update proposed by gmann [2]. What are the next steps for this change to the vision? Are we ready to remove this topic from the tracker now, and treat further updates as individual work items? [1] https://governance.openstack.org/tc/reference/technical-vision.html [2] https://review.openstack.org/#/c/621516/ * next step in TC vision/defining the role of the TC We also approved a document explaining the role of the TC [3]. Is there more work to do here, or are we ready to remove this from the tracker now? [3] https://governance.openstack.org/tc/reference/role-of-the-tc.html * keeping up with python 3 releases We have approved all of the patches for documenting the policy and for selecting the versions to be covered in Stein. What are the next steps for ensuring that any implementation work is handled? * Reviewing TC Office Hour Times and Locations The most recent mailing list thread [4] was resolved with no changes to the number of office hours. Does anyone want to propose different times, or are we happy with the current schedule? [4] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000542.html and we also have a couple of items of new business to take up: * Train cycle goals selection update Checking with lbragstad and evrardjp for any updates to share with us this month. * health check status for stein How is it going contacting the PTLs for the health check for stein? 
Does anyone have anything to raise based on what they have learned in their conversations? -- Doug From doug at doughellmann.com Wed Jan 2 17:30:39 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 12:30:39 -0500 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: Ben Nemec writes: > On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: >> Dear all, >> >> I have a problem when use 'notification listener' oslo-message for HA >> Openstack. >> >> It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to >> connect to AMQP server on 172.16.4.125:5672 >>  after inf tries: Exchange.declare: (406) >> PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in >> vhost '/': received 'false' but current is 'true''. >> >> How can i fix this?. I think settings default in my program set >> 'durable' is False so it can't listen RabbitMQ Openstack? > > It probably depends on which rabbit client library you're using to > listen for notifications. Presumably there should be some way to > configure it to set durable to True. IIRC, the "exchange" needs to be declared consistently among all listeners because the first client to connect causes the exchange to be created. > I guess the other option is to disable durable queues in the Nova > config, but then you lose the contents of any queues when Rabbit gets > restarted. It would be better to figure out how to make the consuming > application configure durable queues instead. > >> >> This is my nova.conf >> >> http://paste.openstack.org/show/738813/ >> >> >> And section [oslo_messaging_rabbit] >> >> [oslo_messaging_rabbit] >> rabbit_ha_queues = true >> rabbit_retry_interval = 1 >> rabbit_retry_backoff = 2 >> amqp_durable_queues= true You say that is your nova.conf. Is that the same configuration file your client is using when it connects? >> >> >> >> *Nguyễn Bá Thành* >> >> *Mobile*:    0128 748 0391 >> >> *Email*: bathanhtlu at gmail.com >> > -- Doug From doug at doughellmann.com Wed Jan 2 18:09:12 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 02 Jan 2019 13:09:12 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: Doug Hellmann writes: > Today devstack requires each project to explicitly indicate that it can > be installed under python 3, even when devstack itself is running with > python 3 enabled. > > As part of the python3-first goal, I have proposed a change to devstack > to modify that behavior [1]. With the change in place, when devstack > runs with python3 enabled all services are installed under python 3, > unless explicitly listed as not supporting python 3. > > If your project has a devstack plugin or runs integration or functional > test jobs that use devstack, please test your project with the patch > (you can submit a trivial change to your project and use Depends-On to > pull in the devstack change). > > [1] https://review.openstack.org/#/c/622415/ > -- > Doug > We have had a few +1 votes on the patch above with comments that indicate at least a couple of projects have taken the time to test and verify that things won't break for them with the change. Are we ready to proceed with merging the change? 
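For anyone who has not tried it yet, the test change really can be trivial:
a do-not-merge docs or whitespace touch in your own repo whose commit
message footer pulls in the devstack patch, something along these lines:

    DNM: exercise python3-by-default devstack

    Testing our devstack plugin and functional jobs against the
    proposed change to install services under python 3 by default
    when python 3 is enabled.

    Depends-On: https://review.openstack.org/622415

Zuul will then include the devstack change in the test environment for that
review, so you can check the results without merging anything.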
-- Doug From openstack at nemebean.com Wed Jan 2 18:17:12 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 12:17:12 -0600 Subject: [oslo] Parallel Privsep is Proposed for Release Message-ID: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Yay alliteration! :-) I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. -Ben 1: https://review.openstack.org/628019 2: https://review.openstack.org/#/c/593556/ From cboylan at sapwetik.org Wed Jan 2 18:24:09 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 02 Jan 2019 10:24:09 -0800 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> On Wed, Jan 2, 2019, at 3:18 AM, Dmitry Tantsur wrote: > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic- > tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 > (!) > voting devstack jobs. With each of them have some small probability of a > random > failure, it is impossible to land anything without at least one recheck, > usually > more. > > The bad news is, we only run master API tests job, and these tests are > changed > more often that the other. We already had a minor stable branch breakage > because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And > I've just > spotted a missing master multinode job, which is defined but does not > run for > some reason :( > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1. Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and > thus > remove them from the gate queue). I have a gut feeling that a change > that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's > enough to > have N-2 voting. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. 
> > The corresponding change is [3], please comment here or there. > > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 > From dtantsur at redhat.com Wed Jan 2 18:39:00 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 2 Jan 2019 19:39:00 +0100 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> Message-ID: On 1/2/19 7:24 PM, Clark Boylan wrote: > On Wed, Jan 2, 2019, at 3:18 AM, Dmitry Tantsur wrote: >> Hi all and happy new year :) >> >> As you know, tempest plugins are branchless, so the CI of ironic- >> tempest-plugin >> has to run tests on all supported branches. Currently it amounts to 16 >> (!) >> voting devstack jobs. With each of them have some small probability of a >> random >> failure, it is impossible to land anything without at least one recheck, >> usually >> more. >> >> The bad news is, we only run master API tests job, and these tests are >> changed >> more often that the other. We already had a minor stable branch breakage >> because >> of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And >> I've just >> spotted a missing master multinode job, which is defined but does not >> run for >> some reason :( >> >> Here is my proposal to deal with gate bloat on ironic-tempest-plugin: >> >> 1. Do not run CI jobs at all for unsupported branches and branches in extended >> maintenance. For Ocata this has already been done in [2]. >> >> 2. Make jobs running with N-3 (currently Pike) and older non-voting (and >> thus >> remove them from the gate queue). I have a gut feeling that a change >> that breaks >> N-3 is very likely to break N-2 (currently Queens) as well, so it's >> enough to >> have N-2 voting. >> >> 3. Make the discovery and the multinode jobs from all stable branches >> non-voting. These jobs cover the tests that get changed very infrequently (if >> ever). These are also the jobs with the highest random failure rate. > > Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. From my experience it's PXE failing or just generic timeout on slow nodes. Note that they still don't fail too often, it's their total number that makes it problematic. When you have 20 jobs each failing with, say, 5% rate it's just 35% chance of passing (unless I cannot do math). But to answer your question, yes, we do put work in that. We just never got to 0% of random failures. > >> >> 4. Add the API tests, voting for Queens to master, non-voting for Pike (as >> proposed above). >> >> This should leave us with 20 jobs, but with only 11 of them voting. Which is >> still a lot, but probably manageable. >> >> The corresponding change is [3], please comment here or there. 
>> >> Dmitry >> >> [1] https://review.openstack.org/622177 >> [2] https://review.openstack.org/621537 >> [3] https://review.openstack.org/627955 >> > From openstack at nemebean.com Wed Jan 2 18:49:32 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 2 Jan 2019 12:49:32 -0600 Subject: [nova] Granular locks in the API In-Reply-To: References: <7e570da6-af7a-1439-9172-e454590d52cf@gmail.com> Message-ID: On 12/20/18 4:58 PM, Lance Bragstad wrote: > > > On Thu, Dec 20, 2018 at 3:50 PM Matt Riedemann > wrote: > > On 12/20/2018 1:45 PM, Lance Bragstad wrote: > > > > One way you might be able to do this is by shoveling off the policy > > check using oslo.policy's http_check functionality [0]. But, it > still > > doesn't fix the problem that users have roles on projects, and > that's > > the standard for relaying information from keystone to services > today. > > > > Hypothetically, the external policy system *could* be an API that > allows > > operators to associate users to different policies that are more > > granular than what OpenStack offers today (I could POST to this > policy > > system that a specific user can do everything but resize up this > > *specific* instance). When nova parses a policy check, it hands > control > > to oslo.policy, which shuffles it off to this external system for > > enforcement. This external policy system evaluates the policies > based on > > what information nova passes it, which would require the policy > check > > string, context of the request like the user, and the resource > they are > > trying operate on (the instance in this case). The external policy > > system could query it's own policy database for any policies > matching > > that data, run the decisions, and return the enforcement decision > per > > the oslo.limit API. > > One thing I'm pretty sure of in nova is we do not do a great job of > getting the target of the policy check before actually doing the check. > In other words, our target is almost always the project/user from the > request context, and not the actual resource upon which the action is > being performed (the server in most cases). I know John Garbutt had a > spec for this before. It always confused me. > > > I doubt nova is alone in this position. I would bet there are a lot of > cases across OpenStack where we could be more consistent in how this > information is handed to oslo.policy. We attempted to solve this for the > other half of the equation, which is the `creds` dictionary. Turns out a > lot of what was in this arbitrary `creds` dict, was actually just > information from the request context object. The oslo.policy library now > supports context objects directly [0], as opposed to hoping services > build the dictionary properly. Target information will be a bit harder > to do because it's different across services and even APIs within the > same service. But yeah, I totally sympathize with the complexity it puts > on developers. > > [0] https://review.openstack.org/#/c/578995/ > > > > > > Conversely, you'll have a performance hit since the policy > decision and > > policy enforcement points are no longer oslo.policy *within* > nova, but > > some external system being called by oslo.policy... > > Yeah. The other thing is if I'm just looking at my server, I can see if > it's locked or not since it's an attribute of the server resource. With > policy I would only know if I can perform a certain action if I get a > 403 or not, which is fine in most cases. 
Being able to see via some > list > of locked actions per server is arguably more user friendly. This also > reminds me of reporting / capabilities APIs we've talked about over the > years, e.g. what I can do on this cloud, on this host, or with this > specific server? > > > Yeah - I wouldn't mind picking that conversation up, maybe in a separate > thread. An idea we had with keystone was to run a user's request through > all registered policies and return a list of the ones they could access > (e.g., take my token and tell me what I can do with it.) There are > probably other issues with this, since policy names are mostly operator > facing and end users don't really care at the moment. > > > > > > Might not be the best idea, but food for thought based on the > > architecture we have today. > > Definitely, thanks for the alternative. This is something one could > implement per-provider based on need if we don't have a standard > solution. > > > Right, I always thought it would be a good fit for people providing > super-specific policy checks or have a custom syntax they want to > implement. It keeps most of that separate from the services and > oslo.policy. So long as we pass target and context information > consistently, they essentially have an API they can write policies against. I know we fixed a number of bugs in services around the time of the first Denver PTG because a user wanted to offload policy checks to an external system and used HTTPCheck for it. They ran across a number of places where the data passed to oslo.policy was either missing or incorrect, which meant their policy system didn't have enough to make a decision. I haven't heard anything new about this in a while, so it's either still working for them or they gave up on the idea. There's also a spec proposing that we add more formal support for external policy engines to oslo.policy: https://review.openstack.org/#/c/578719/ It probably doesn't solve this problem any more than the HTTPCheck option does, but if one were to go down that path it would make external policy engines easier to use (no need to write a custom policy file to replace every rule with HTTPCheck, for example). > > > -- > > Thanks, > > Matt > From chris.friesen at windriver.com Wed Jan 2 19:31:03 2019 From: chris.friesen at windriver.com (Chris Friesen) Date: Wed, 2 Jan 2019 13:31:03 -0600 Subject: [nova] Granular locks in the API In-Reply-To: References: Message-ID: On 12/20/2018 1:07 PM, Matt Riedemann wrote: > I wanted to float something that we talked about in the public cloud SIG > meeting today [1] which is the concept of making the lock API more > granular to lock on a list of actions rather than globally locking all > actions that can be performed on a server. > > The primary use case we discussed was around a pre-paid pricing model > for servers. A user can pre-pay resources at a discount if let's say > they are going to use them for a month at a fixed rate. However, once > they do, they can't resize those servers without going through some kind > of approval (billing) process to resize up. With this, the provider > could lock the user from performing the resize action on the server but > the user could do other things like stop/start/reboot/snapshot/etc. On the operator side, it seems like you could just auto-switch the user from fixed-rate to variable-rate for that instance (assuming you have their billing info). It almost sounds like this is just a convenience thing for the user, so they don't accidentally resize the instance. 
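For reference, the HTTPCheck approach is just a matter of pointing the rules
you care about at the external endpoint in the service's policy file,
roughly like this (the URL is made up):

    "os_compute_api:servers:resize": "http://policy.example.com:9000/check"
    "os_compute_api:servers:start": "http://policy.example.com:9000/check"
    "os_compute_api:servers:stop": "http://policy.example.com:9000/check"

oslo.policy POSTs the credentials and target data to that URL and treats a
response body of "True" as allow and anything else as deny, which is why the
quality of the target data being passed matters so much for this to work.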
Looking at it more generally, are there any other user-callable Compute API calls that would make sense to selectively disable for a specific resource? Chris From juliaashleykreger at gmail.com Wed Jan 2 21:44:42 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 2 Jan 2019 13:44:42 -0800 Subject: [ironic] Mid-cycle call times Message-ID: Greetings everyone, During our ironic team meeting in December, we discussed if we should go ahead and have a "mid-cycle" call in order to try sync up on where we are at during this cycle, and the next steps for us to take as a team. With that said, I have created a doodle poll[1] in an attempt to identify some days that might work. Largely the days available on the poll are geared around my availability this month. Ideally, I would like to find three days where we can schedule some 2-4 hour blocks of time. I've gone ahead and started an etherpad[2] to get us started on brainstorming. Once we have some ideas, we will be able to form a schedule and attempt to identify the amount of time required. -Julia [1]: https://doodle.com/poll/uqwywaxuxsiu7zde [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbitter at redhat.com Wed Jan 2 21:49:54 2019 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 3 Jan 2019 10:49:54 +1300 Subject: [all][ptl][heat][senlin][magnum]New SIG for Autoscaling? plus Session Summary: Autoscaling Integration, improvement, and feedback In-Reply-To: References: Message-ID: On 29/11/18 1:00 AM, Rico Lin wrote: > Dear all > Tl;dr; > I gonna use this ML to give a summary of the forum [1] and asking for > feedback for the idea of new SIG. > We have a plan to create a new SIG for autoscaling which to cover the > common library, docs, and tests for cross-project services (Senlin, > Heat, Monasca, etc.) and cross-community (OpenStack, Kubernetes, etc). > And the goal is not just to have a place to keep those resources that > make sure we can guarantee the availability for use cases, also to have > a force to keep push the effort to integrate across services or at least > make sure we don't go to a point where everyone just do their own > service and don't care about any duplication. > So if you have any thoughts for the new SIG (good or bad) please share > it here. So +1, obviously (for the benefit of those who weren't at the Forum, I... may have suggested it?). Even if there were only one way to do autoscaling in OpenStack, it would be an inherently cross-project thing - and in fact there are multiple options for each piece of the puzzle. I really like Rico's idea of building documentation and test suites to make user experience of autoscaling better, and the best place to host those would be a SIG rather than any individual project. We know there are a *lot* of folks out there under the radar using autoscaling in OpenStack, so maybe a SIG will also provide a way to draw some of them out of the woodwork to tell us more about their use cases. > Here I summarize our discussion in the forum session `Autoscaling > Integration, improvement, and feedback`. If you like to learn more or > input your thoughts, feel free to put it in etherpad [1] or simply > reply to this email. > In the forum, we have been discussed the scope and possibility to > integrate effort from Heat, Senlin, and also autoscaling across > OpenStack to K8s. 
There are some long-term goals that we can start on > like create a document for general Autoscaling on OpenStack, Common > library for cross-project usage, or create real scenario cases to test > on our CI. > And the most important part is how can we help users and satisfied use > cases without confuse them or making too much duplication effort across > communities/projects. > So here's an action we agree on, is to trigger discussion for either we > need to create a new SIG for autoscaling. We need to define the scope > and the goal for this new SIG before we go ahead and create one. > The new SIG will cover the common library, docs, and tests for > cross-project services (Senlin, Heat, Monasca, etc.) and cross-community > (OpenStack, Kubernetes, etc). And the goal is not just to have a place > to keep those resources that make sure we can guarantee the availability > for use cases, also to have a force to keep push the effort to integrate > across services or at least make sure we don't go to a point where > everyone just do their own service and don't care about any duplication. > For example, we can have a document about do autoscaling in OpenStack, > but we need a place to put it and keep maintain it. And we can even have > a place to set up CI to test all scenario for autoscaling. > I think it's possible to extend the definition of this SIG, but we have > to clear our goal and make sure we actually doing a good thing and make > everyone's life easier. On the other hand we also need to make sure we > do not duplicate the effort of other SIGs/WGs. > Also The reason I add `ptl` tag for this ML is that this SIG or the > concept of `autoscaling` might be very deferent to different projects. > So I really wish to hear from anyone and any projects who are > interested in this topic. > > [1] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback > > > > -- > May The Force of OpenStack Be With You, > */Rico Lin > /*irc: ricolin > > From pawel at suder.info Wed Jan 2 22:06:41 2019 From: pawel at suder.info (=?UTF-8?Q?Pawe=C5=82?= Suder) Date: Wed, 02 Jan 2019 23:06:41 +0100 Subject: [neutron] Bug deputy report 17th~23rd Dec 2018 In-Reply-To: <1545647749.6273.1.camel@suder.info> References: <1545647749.6273.1.camel@suder.info> Message-ID: <1546466801.15393.1.camel@suder.info> Hello, I noticed that email sent to openstack-dev was not sent.. Resending it, Paweł Date 24.12.2018, time 11∶35 +0100, Paweł Suder wrote: > Hello Neutrons, > > Following bugs/RFE/issues have been raised during last week. Some of > them were already recognized, triaged, checked: > > From oldest: > > https://bugs.launchpad.net/neutron/+bug/1808731 [RFE] Needs to > restart > metadata proxy with the start/restart of l3/dhcp agent - IMO need to > discus how to do that. > > https://bugs.launchpad.net/neutron/+bug/1808916 Update mailinglist > from > dev to discuss - it is done. > > https://bugs.launchpad.net/neutron/+bug/1808917 RetryRequest > shouldn't > log stack trace by default, or it should be configurable by the > exception - confirmed by Sławek ;) > > https://bugs.launchpad.net/neutron/+bug/1809037 [RFE] Add > anti_affinity_group to binding:profile - RFE connected with another > RFE > from Nova related to NFV and (anti)affinity of resources like PFs. 
> > https://bugs.launchpad.net/neutron/+bug/1809080 reload_cfg doesn't > work > correctly - change in review, I tried to review, but no comments left > - > new contributor > > https://bugs.launchpad.net/neutron/+bug/1809134 - TypeError in QoS > gateway_ip code in l3-agent logs - review in progress > > https://bugs.launchpad.net/neutron/+bug/1809238 - [l3] > `port_forwarding` cannot be set before l3 `router` in service_plugins > - > review in progress > > https://bugs.launchpad.net/neutron/+bug/1809447 - performance > regression from mitaka to ocata - not triaged, I am not sure how to > handle it - it is a wide thing.. > > https://bugs.launchpad.net/neutron/+bug/1809497 - bug noticed on > gates > related to another bug opened last week: https://bugs.launchpad.net/n > eu > tron/+bug/1809134 > > Wish you a good time this week and for the next year! :) > > Cheers, > Paweł From jeremyfreudberg at gmail.com Wed Jan 2 22:29:27 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Wed, 2 Jan 2019 17:29:27 -0500 Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints In-Reply-To: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> References: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> Message-ID: Hey Luigi. I poked around in Tempest and saw these code bits: https://github.com/openstack/tempest/blob/master/tempest/lib/common/rest_client.py#L210 https://github.com/openstack/tempest/blob/f9650269a32800fdcb873ff63f366b7bc914b3d7/tempest/lib/auth.py#L53 Here's a patch which takes advantage of those bits to append the version to the unversioned base URL: https://review.openstack.org/#/c/628056/ Hope it works without regression (I'm a bit worried since Tempest does its own URL mangling rather than nicely use keystoneauth...) On Wed, Jan 2, 2019 at 5:19 AM Luigi Toscano wrote: > > Hi all, > > I'm working on adding support for APIv2 to the Sahara tempest plugin. > > If I get it correctly, there are two main steps > > 1) Make sure that that tempest client works with APIv2 (and don't regress with > APIv1.1). > > This mainly mean implementing the tempest client for Sahara APIv2, which > should not be too complicated. > > On the other hand, we hit an issue with the v1.1 client in an APIv2 > environment. > A change associated with API v2 is usage of an unversioned endpoint for the > deployment (see https://review.openstack.org/#/c/622330/ , without the /v1,1/$ > (tenant_id) suffix) which should magically work with both API variants, but it > seems that the current tempest client fails in this case: > > http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544 > > Does anyone know if this is an issue with the code of the tempest tests (which > should maybe have some logic to build the expected endpoint when it's > unversioned, like saharaclient does) or somewhere else? > > > 2) fix the tests to support APIv2. > > Should I duplicate the tests for APIv1.1 and APIv2? Other projects which > supports different APIs seems to do this. > But can I freely move the existing tests under a subdirectory > (sahara_tempest_plugins/tests/api/ -> sahara_tempest_plugins/tests/api/v1/), > or are there any compatibility concerns? Are the test ID enough to ensure that > everything works as before? > > And what about CLI tests currently under sahara_tempest_plugin/tests/cli/ ? > They supports both API versions through a configuration flag. Should they be > duplicated as well? 
> > > Ciao > (and happy new year if you have a new one in your calendar!) > -- > Luigi > > > From yongli.he at intel.com Thu Jan 3 03:15:26 2019 From: yongli.he at intel.com (yonglihe) Date: Thu, 3 Jan 2019 11:15:26 +0800 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 2018/12/18 下午4:20, yonglihe wrote: > Hi, guys > > This spec needs input and discuss for move on. Jay suggest we might be good to use a new sub node to hold topology stuff,  it's option 2, here. And split the PCI stuff out of this NUMA thing spec, use a /devices node to hold all 'devices' stuff instead, then this node is generic and not only for PCI itself. I'm OK for Jay's suggestion,  it contains more key words and seems crystal clear and straight forward. The problem is we need aligned about this. This spec need gain more input thanks, Jay, Matt. Regards Yongli He > > Currently the spec is under reviewing: > https://review.openstack.org/#/c/612256/8 > > Plus with POC code: > https://review.openstack.org/#/c/621476/3 > > and related stein PTG discuss: > https://etherpad.openstack.org/p/nova-ptg-stein > start from line 897 > > NUMA topology had lots of information to expose, for saving you time > to jumping into to the spec, the information need to > return include NUMA related like: > numa_node,cpu_pinning,cpu_thread_policy,cpuset,siblings, > mem,pagesize,sockets,cores, > threads, and PCI device's information. > > Base on IRC's discuss, we may have 3 options about how to deal with > those blobs: > > 1) include those directly in the server response details, like the > released POC does: > https://review.openstack.org/#/c/621476/3 > > 2) add a new sub-resource endpoint to servers, most likely use key > word 'topology' then: > "GET /servers/{server_id}/topology" returns the NUMA information for > one server. > > 3) put the NUMA info under existing 'diagnostics' API. > "GET /servers/{server_id}/diagnostics" > this is admin only API, normal user loss the possible to check their > topology. > > when the information put into diagnostics, they will be look like: > { >    .... >    "numa_topology": { >       cells  [ >                { >                     "numa_node" : 3 >                     "cpu_pinning": {0:5, 1:6}, >                     "cpu_thread_policy": "prefer", >                     "cpuset": [0,1,2,3], >                     "siblings": [[0,1],[2,3]], >                     "mem": 1024, >                     "pagesize": 4096, >                     "sockets": 0, >                     "cores": 2, >                      "threads": 2, >                  }, >              ... >            ] # cells >     } >     "emulator_threads_policy": "share" > >     "pci_devices": [ >         { >                 "address":"00:1a.0", >                 "type": "VF", >                 "vendor": "8086", >                 "product": "1526" >         }, >     ] >  } > > > Regards > Yongli He > > > From soulxu at gmail.com Thu Jan 3 04:08:26 2019 From: soulxu at gmail.com (Alex Xu) Date: Thu, 3 Jan 2019 12:08:26 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> Message-ID: Jay Pipes 于2019年1月2日周三 下午10:48写道: > On 12/21/2018 03:45 AM, Rui Zang wrote: > > It was advised in today's nova team meeting to bring this up by email. 
> > > > There has been some discussion on the how to track persistent memory > > resource in placement on the spec review [1]. > > > > Background: persistent memory (PMEM) needs to be partitioned to > > namespaces to be consumed by VMs. Due to fragmentation issues, the spec > > proposed to use fixed sized PMEM namespaces. > > The spec proposed to use fixed sized namespaces that is controllable by > the deployer, not fixed-size-for-everyone :) Just want to make sure > we're being clear here. > > > The spec proposed way to represent PMEM namespaces is to use one > > Resource Provider (RP) for one PMEM namespace. An new standard Resource > > Class (RC) -- 'VPMEM_GB` is introduced to classify PMEM namspace RPs. > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > 'total' and 'step_size` are all set to the size of the PMEM namespace. > > In this way, it is guaranteed each RP will be consumed as a whole at one > > time. > > > > An alternative was brought out in the review. Different Custom Resource > > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM namespaces of > > different sizes. The size of the PMEM namespace is encoded in the name > > of the custom Resource Class. And multiple PMEM namespaces of the same > > size (say 128G) can be represented by one RP of the same > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There would be > only one resource provider: the compute node itself. It would have an > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit' and 'total' > > as the total number of the PMEM namespaces of the certain size. And the > > values of 'min_unit' and 'step_size' could set to 1. > > No, the max_unit, min_unit, step_size and total would refer to the > number of *PMEM namespaces*, not the amount of GB of memory represented > by those namespaces. > > Therefore, min_unit and step_size would be 1, max_unit would be the > total number of *namespaces* that could simultaneously be attached to a > single consumer (VM), and total would be 8 in our example where the > compute node had 8 of these pre-defined 128G PMEM namespaces. > > > We believe both way could work. We would like to have a community > > consensus on which way to use. > > Email replies and review comments to the spec [1] are both welcomed. > > Custom resource classes were invented for precisely this kind of use > case. The resource being represented is a namespace. The resource is not > "a Gibibyte of persistent memory". > The point of the initial design is avoid to encode the `size` in the resource class name. If that is ok for you(I remember people hate to encode size and number into the trait name), then we will update the design. Probably based on the namespace configuration, nova will be responsible for create those custom RC first. Sounds works. > > Best, > -jay > > > Regards, > > Zang, Rui > > > > > > [1] https://review.openstack.org/#/c/601596/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andre at florath.net Thu Jan 3 08:07:45 2019 From: andre at florath.net (Andre Florath) Date: Thu, 3 Jan 2019 09:07:45 +0100 Subject: [glance] Question about container_format and disk_format In-Reply-To: References: Message-ID: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> Hello! After digging through the source code, I'd answer my own question: image disk_format and container_format are (A) the formats of the image file that is passed in. 
Reasoning: Glance's so called flows use those parameters as input like ovf_process.py [1]: if image.container_format == 'ova': When there is a conversion done in the flow, the target format is the one from the configuration (like [2]): target_format = CONF.image_conversion.output_format After a possible conversion, the new disk and container formats are set (e.g. [3]): image.disk_format = target_format image.container_format = 'bare' (At some points instead of using the disk and container format parameters, a call to 'qemu-img info' is done to extract those information from the image - like in [4]: stdout, stderr = putils.trycmd("qemu-img", "info", "--output=json", ... ... metadata = json.loads(stdout) source_format = metadata.get('format') ) So it looks that the idea is, that the disk_format and container_format should always reflect the current format of the image. Can anybody please confirm / comment? Kind regards Andre [1] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/ovf_process.py#n87 [2] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n78 [3] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n129 [4] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n87 On 12/18/18 11:07 AM, Andre Florath wrote: > Hello! > > I do not completely understand the parameters 'container_format' > and 'disk_format' as described in [1]. The documentation always > uses 'the format' but IMHO there might be two formats involved. > > Are those formats either > > (A) the formats of the image file that is passed in. > > Like (from the official documentation [2]) > > $ openstack image create --disk-format qcow2 --container-format bare \ > --public --file ./centos63.qcow2 centos63-image > > qcow2 / bare are the formats of the passed in image. > > or > > (B) the formats that are used internally to store the image > > Like > > $ openstack image create --disk-format vmdk --container-format ova \ > --public --file ./centos63.qcow2 centos63-image > > vmdk / ova are formats that are used internally in OpenStack glance > to store the image. > In this case there must be an auto-detection of the image file format > that is passed in and an automatic conversion into the new format. > > Kind regards > > Andre > > > [1] https://developer.openstack.org/api-ref/image/v2/index.html?expanded=create-image-detail#create-image > [2] https://docs.openstack.org/glance/pike/admin/manage-images.html > From jaypipes at gmail.com Thu Jan 3 12:39:40 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Thu, 3 Jan 2019 07:39:40 -0500 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 01/02/2019 10:15 PM, yonglihe wrote: > On 2018/12/18 下午4:20, yonglihe wrote: >> Hi, guys >> >> This spec needs input and discuss for move on. > > Jay suggest we might be good to use a new sub node to hold topology > stuff,  it's option 2, here. And split > > the PCI stuff out of this NUMA thing spec, use a /devices node to hold > all 'devices' stuff instead, then this node > > is generic and not only for PCI itself. > > I'm OK for Jay's suggestion,  it contains more key words and seems > crystal clear and straight forward. > > The problem is we need aligned about this. This spec need gain more > input thanks, Jay, Matt. 
Also, I mentioned that you need not (IMHO) combine both PCI/devices and NUMA topology in a single spec. We could proceed with the /topology API endpoint and work out the more generic /devices API endpoint in a separate spec. Best, -jay From jaypipes at gmail.com Thu Jan 3 13:31:51 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Thu, 3 Jan 2019 08:31:51 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> Message-ID: <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> On 01/02/2019 11:08 PM, Alex Xu wrote: > Jay Pipes > 于2019年1月2 > 日周三 下午10:48写道: > > On 12/21/2018 03:45 AM, Rui Zang wrote: > > It was advised in today's nova team meeting to bring this up by > email. > > > > There has been some discussion on the how to track persistent memory > > resource in placement on the spec review [1]. > > > > Background: persistent memory (PMEM) needs to be partitioned to > > namespaces to be consumed by VMs. Due to fragmentation issues, > the spec > > proposed to use fixed sized PMEM namespaces. > > The spec proposed to use fixed sized namespaces that is controllable by > the deployer, not fixed-size-for-everyone :) Just want to make sure > we're being clear here. > > > The spec proposed way to represent PMEM namespaces is to use one > > Resource Provider (RP) for one PMEM namespace. An new standard > Resource > > Class (RC) -- 'VPMEM_GB` is introduced to classify PMEM namspace > RPs. > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > 'total' and 'step_size` are all set to the size of the PMEM > namespace. > > In this way, it is guaranteed each RP will be consumed as a whole > at one > > time. >  > > > An alternative was brought out in the review. Different Custom > Resource > > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM > namespaces of > > different sizes. The size of the PMEM namespace is encoded in the > name > > of the custom Resource Class. And multiple PMEM namespaces of the > same > > size  (say 128G) can be represented by one RP of the same > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There > would be > only one resource provider: the compute node itself. It would have an > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit'  and > 'total' > > as the total number of the PMEM namespaces of the certain size. > And the > > values of 'min_unit' and 'step_size' could set to 1. > > No, the max_unit, min_unit, step_size and total would refer to the > number of *PMEM namespaces*, not the amount of GB of memory represented > by those namespaces. > > Therefore, min_unit and step_size would be 1, max_unit would be the > total number of *namespaces* that could simultaneously be attached to a > single consumer (VM), and total would be 8 in our example where the > compute node had 8 of these pre-defined 128G PMEM namespaces. > > > We believe both way could work. We would like to have a community > > consensus on which way to use. > > Email replies and review comments to the spec [1] are both welcomed. > > Custom resource classes were invented for precisely this kind of use > case. The resource being represented is a namespace. The resource is > not > "a Gibibyte of persistent memory". > > > The point of the initial design is avoid to encode the `size` in the > resource class name. 
If that is ok for you(I remember people hate to > encode size and number into the trait name), then we will update the > design. Probably based on the namespace configuration, nova will be > responsible for create those custom RC first. Sounds works. A couple points... 1) I was/am opposed to putting the least-fine-grained size in a resource class name. For example, I would have preferred DISK_BYTE instead of DISK_GB. And MEMORY_BYTE instead of MEMORY_MB. 2) After reading the original Intel PMEM specification (http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf), it seems to me that what you are describing with a generic PMEM_GB (or PMEM_BYTE) resource class is more appropriate for the block mode translation system described in the PDF versus the PMEM namespace system described therein. From a lay person's perspective, I see the difference between the two as similar to the difference between describing the bytes that are in block storage versus a filesystem that has been formatted, wiped, cleaned, etc on that block storage. In Nova, the DISK_GB resource class describes the former: it's a bunch of blocks that are reserved in the underlying block storage for use by the virtual machine. The virtual machine manager then formats that bunch of blocks as needed and lays down a formatted image. We don't have a resource class that represents "a filesystem" or "a partition" (yet). But the proposed PMEM namespaces in your spec definitely seem to be more like a "filesystem resource" than a "GB of block storage" resource. Best, -jay From sean.mcginnis at gmx.com Thu Jan 3 13:34:26 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:34:26 -0600 Subject: Fwd: [cinder] Is the =?utf-8?Q?cinder_?= =?utf-8?Q?Active-Active_feature_OK=EF=BC=9F?= In-Reply-To: References: Message-ID: <20190103133426.GA27473@sm-workstation> On Tue, Dec 25, 2018 at 05:26:54PM +0800, Jaze Lee wrote: > Hello, > In my opinion, all rest api will get to manager of cinder volume. > If the manager is using DLM, we can say cinder volume can support > active-active. So, can we rewrite the comments and option's help in > page https://github.com/openstack/cinder/blob/master/cinder/cmd/volume.py#L78 > ? > > Thanks a lot. The work isn't entirely complete for active-active. The one last step we've had to complete is for backend storage vendors to validate that their storage and Cinder drivers work well when running with the higher concurrency of operations that active-active HA would allow to happen. As far as I'm aware, none of the vendors have enabled active-active with their drivers as described in [1]. The other services (not cinder-volume) should be fine, but the to complete the work we need vendors on board to support it with the drivers. Sean [1] https://docs.openstack.org/cinder/latest/contributor/high_availability.html#enabling-active-active-on-drivers From sean.mcginnis at gmx.com Thu Jan 3 13:35:33 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:35:33 -0600 Subject: [Infra][Cinder] Request for voting permission from LINBIT LINSTOR CI (linbit_ci@linbit.com) for Cinder projects In-Reply-To: References: Message-ID: <20190103133532.GB27473@sm-workstation> On Fri, Dec 28, 2018 at 01:55:30PM -0800, Woojay Poynter wrote: > Hello, > > I would like to request a voting permission on LINBIT LINSTOR CI for Cinder > project. We are looking to satisfy third-party testing requirement for > Cinder volume driver for LINSTOR (https://review.openstack.org/#/c/624233/). 
> The CI's test result of the lastest patch is at > http://us.linbit.com:8080/CI-LINSTOR/33/456/ > Thanks Woojay, but in Cinder we do not allow any third party CI to be voting. Only Zuul gets to vote. All third party CI's should comment on the pass/fail results, but it does not need to vote. Sean From rosmaita.fossdev at gmail.com Thu Jan 3 13:46:29 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 3 Jan 2019 08:46:29 -0500 Subject: [glance] Question about container_format and disk_format In-Reply-To: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> References: <5ba6f641-c298-0d1a-274d-dfae909a3b8b@florath.net> Message-ID: On 1/3/19 3:07 AM, Andre Florath wrote: > Hello! > > After digging through the source code, I'd answer my own question: Sorry you had to answer your own question, but glad you were willing to dig into the source code! > image disk_format and container_format are > > (A) the formats of the image file that is passed in. This is "sort of" correct. In general, Glance does not verify either the disk_format or container_format of the image data, so these values are whatever the image owner has specified. Glance doesn't verify these because disk/container formats are developed independently of OpenStack, and in the heady days of 2010, it seemed like a good idea that new disk/container formats be usable without having to wait for a new Glance release. (There isn't much incentive for an image owner to lie about the disk/container format, because specifying the wrong one could make the image unusable by any consuming service that relies on these image properties.) > > Reasoning: > > Glance's so called flows use those parameters as input > like ovf_process.py [1]: > > if image.container_format == 'ova': > > When there is a conversion done in the flow, the target > format is the one from the configuration (like [2]): > > target_format = CONF.image_conversion.output_format > > After a possible conversion, the new disk and container formats are > set (e.g. [3]): > > image.disk_format = target_format > image.container_format = 'bare' > > (At some points instead of using the disk and container format > parameters, a call to 'qemu-img info' is done to extract those > information from the image - like in [4]: > > > stdout, stderr = putils.trycmd("qemu-img", "info", > "--output=json", ... Note to fans of CVE 2015-5162: the above call to qemu-img is time restricted. > ... > metadata = json.loads(stdout) > source_format = metadata.get('format') > ) Remember that the "flows" are optional, so in general you cannot rely upon Glance setting these values correctly for you. > > So it looks that the idea is, that the disk_format and > container_format should always reflect the current format of the > image. > > Can anybody please confirm / comment? Yes, the image properties associated with an image are meant to describe the image data associated with that image record. > Kind regards > > Andre Happy new year! brian > > > [1] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/ovf_process.py#n87 > [2] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n78 > [3] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n129 > [4] https://git.openstack.org/cgit/openstack/glance/tree/glance/async_/flows/plugins/image_conversion.py#n87 > > > > On 12/18/18 11:07 AM, Andre Florath wrote: >> Hello! 
>> >> I do not completely understand the parameters 'container_format' >> and 'disk_format' as described in [1]. The documentation always >> uses 'the format' but IMHO there might be two formats involved. >> >> Are those formats either >> >> (A) the formats of the image file that is passed in. >> >> Like (from the official documentation [2]) >> >> $ openstack image create --disk-format qcow2 --container-format bare \ >> --public --file ./centos63.qcow2 centos63-image >> >> qcow2 / bare are the formats of the passed in image. >> >> or >> >> (B) the formats that are used internally to store the image >> >> Like >> >> $ openstack image create --disk-format vmdk --container-format ova \ >> --public --file ./centos63.qcow2 centos63-image >> >> vmdk / ova are formats that are used internally in OpenStack glance >> to store the image. >> In this case there must be an auto-detection of the image file format >> that is passed in and an automatic conversion into the new format. >> >> Kind regards >> >> Andre >> >> >> [1] https://developer.openstack.org/api-ref/image/v2/index.html?expanded=create-image-detail#create-image >> [2] https://docs.openstack.org/glance/pike/admin/manage-images.html >> > > From sean.mcginnis at gmx.com Thu Jan 3 13:51:55 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 07:51:55 -0600 Subject: Review-Priority for Project Repos In-Reply-To: References: Message-ID: <20190103135155.GC27473@sm-workstation> On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > Dear All, > > There are many occasion when we want to priorities some of the patches > whether it is related to unblock the gates or blocking the non freeze > patches during RC. > > So adding the Review-Priority will allow more precise dashboard. As > Designate and Cinder projects already experiencing this[1][2] and after > discussion with Jeremy brought this to ML to interact with these team > before landing [3], as there is possibility that reapply the priority vote > following any substantive updates to change could make it more cumbersome > than it is worth. With Cinder this is fairly new, but I think it is working well so far. The oddity we've run into, that I think you're referring to here, is how those votes carry forward with updates. I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when a patchset is updates, the -1 and +2 carry forward. But for some reason we can't get the +1 to be sticky. So far, that's just a slight inconvenience. It would be great if we can figure out a way to have them all be sticky, but if we need to live with reapplying +1 votes, that's manageable to me. The one thing I have been slightly concerned with is the process around using these priority votes. It hasn't been an issue, but I could see a scenario where one core (in Cinder we have it set up so all cores can use the priority voting) has set something like a procedural -1, then been pulled away or is absent for an extended period. Like a Workflow -2, another core cannot override that vote. So until that person is back to remove the -1, that patch would not be able to be merged. Granted, we've lived with this with Workflow -2's for years and it's never been a major issue, but I think as far as centralizing control, it may make sense to have a separate smaller group (just the PTL, or PTL and a few "deputies") that are able to set priorities on patches just to make sure the folks setting it are the ones that are actively tracking what the priorities are for the project. 
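On the stickiness point: if I read the Gerrit docs right, it comes down to
which copy* flags the label carries in the ACL. Something like the following
(an illustration, not the actual Cinder project.config) would produce exactly
what we see, because copyMinScore only carries the lowest score (-1) forward
and copyMaxScore only the highest (+2), leaving +1 to be re-applied on every
new patch set:

    [label "Review-Priority"]
        function = AnyWithBlock
        value = -1 Procedural block
        value = 0 No priority
        value = +1 Important
        value = +2 Gate blocker
        copyMinScore = true
        copyMaxScore = true
        copyAllScoresOnTrivialRebase = true

Gerrit also has copyAllScoresIfNoCodeChange, but as far as I know that only
helps for commit-message-only updates, so re-applying +1 may just be the
cost of doing business.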
Anyway, my 2 cents. I can imagine this would work really well for some teams, less well for others. So if you think it can help you manage your project priorities, I would recommend giving it a shot and seeing how it goes. You can always drop it if it ends up not being effective or causing issues. Sean From fungi at yuggoth.org Thu Jan 3 14:22:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 14:22:29 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: <20190103142229.y4o2syjwrq5jqfsp@yuggoth.org> On 2019-01-03 07:51:55 -0600 (-0600), Sean McGinnis wrote: [...] > The one thing I have been slightly concerned with is the process > around using these priority votes. It hasn't been an issue, but I > could see a scenario where one core (in Cinder we have it set up > so all cores can use the priority voting) has set something like a > procedural -1, then been pulled away or is absent for an extended > period. Like a Workflow -2, another core cannot override that > vote. So until that person is back to remove the -1, that patch > would not be able to be merged. [...] Please treat it as only a last resort, but the solution to this is that a Gerrit admin (find us in #openstack-infra on Freenode or the openstack-infra ML on lists.openstack.org or here on openstack-discuss with an [infra] subject tag) can selectively delete votes on a change at the request of a project leader (PTL, infra liaison, TC member...) to unblock your work. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Thu Jan 3 14:52:23 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 03 Jan 2019 09:52:23 -0500 Subject: [tc] agenda for Technical Committee Meeting 3 Jan 2019 @ 1400 UTC In-Reply-To: References: Message-ID: Doug Hellmann writes: > TC Members, > > Our next meeting will be this Thursday, 3 Jan at 1400 UTC in > #openstack-tc. This email contains the agenda for the meeting, based on > the content of the wiki [0]. > The logs from the meeting can be found at: Minutes: http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-01-03-14.01.html Log: http://eavesdrop.openstack.org/meetings/tc/2019/tc.2019-01-03-14.01.log.html -- Doug From marios at redhat.com Thu Jan 3 15:19:26 2019 From: marios at redhat.com (Marios Andreou) Date: Thu, 3 Jan 2019 17:19:26 +0200 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline Message-ID: o/ TripleO's & Happy New Year \o/ if you are tracking the ci squad you may know that one area of focus recently is moving the scenario-multinode jobs to harder/better/faster/stronger (most importantly *smaller* but not as cool) scenario-standalone equivalents. Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is doing cleanup on those. This includes making sure the new jobs are used in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will follow) and fixing any missing services or any other nits we find. Once done we can move on to the rest - scenarios 5/6 etc. We are looking for any feedback about the jobs in general or any one in particular if you have some special interest in a particular service (see [5] for reminder about services and scenarios). Most importantly those jobs are now being set as voting (e.g. 
already done for 1/4 at [1]) and the next natural step once voting is to add them into the master promotion pipeline. Please let us know if you think this is a bad idea or with any other feedback or suggestion. regards &thanks for reading! marios [1] https://github.com/openstack-infra/tripleo-ci/blob/3d634dc2874f95a9d4fd97a1ac87e0b07f20bd80/zuul.d/standalone-jobs.yaml#L85-L181 [2] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-3 [3] https://review.openstack.org/#/q/topic:replace-scen1 [4] https://review.openstack.org/#/q/topic:replace-scen4 [5] https://github.com/openstack/tripleo-heat-templates#service-testing-matrix -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfidente at redhat.com Thu Jan 3 15:26:23 2019 From: gfidente at redhat.com (Giulio Fidente) Date: Thu, 3 Jan 2019 16:26:23 +0100 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline In-Reply-To: References: Message-ID: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> On 1/3/19 4:19 PM, Marios Andreou wrote: > o/ TripleO's & Happy New Year \o/  > > if you are tracking the ci squad you may know that one area of focus > recently is moving the scenario-multinode jobs to > harder/better/faster/stronger (most importantly *smaller* but not as > cool) scenario-standalone equivalents. > > Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is > doing cleanup on those. This includes making sure the new jobs are used > in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will > follow) and fixing any missing services or any other nits we find. Once > done we can move on to the rest - scenarios 5/6 etc. > > We are looking for any feedback about the jobs in general or any one in > particular if you have some special interest in a particular service > (see [5] for reminder about services and scenarios). > > Most importantly those jobs are now being set as voting (e.g. already > done for 1/4 at [1]) and the next natural step once voting is to add > them into the master promotion pipeline. > Please let us know if you think this is a bad idea or with any other > feedback or suggestion. thanks a lot! note that these are the two scenarios testing ceph: I guess what I'm trying to say, is that if I can change, and you can change, everybody can change! (rocky balboa TM) -- Giulio Fidente GPG KEY: 08D733BA From cdent+os at anticdent.org Thu Jan 3 16:54:00 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 3 Jan 2019 16:54:00 +0000 (GMT) Subject: [placement] [packaging] [docs] WIP placement install docs Message-ID: I've started a very wippy work in progress [1] for placement installation docs within the extracted placement repository, using the existing docs from nova as a base. I'd like to draw the attention of rdo, ubuntu, and obs packagers as the docs follow the existing pattern of documenting install on those three distros. However, since placement is not fully packaged yet, I'm making some guesses in the docs about how things will be set up and how packages will be named. So if people who are interested in such things could have a look that would be helpful. I've stubbed out, but not yet completed, a from-pypi page as well, as with placement, installation can become as simple as pip install + run a single command line with the right args, if that's what you want. I think people should be able to see that too. Thanks. 
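For anyone curious, the minimal from-pypi path could end up looking roughly like the sketch below; the package, command, and WSGI script names here are assumptions in the same spirit as the guesses already in the docs, not settled names:

    pip install openstack-placement
    placement-manage db sync    # assumes a config file with the placement database connection already exists
    uwsgi --http :8000 --wsgi-file $(which placement-api)    # or any other WSGI server

The review at [1] below is the place to correct or confirm the real instructions as they land.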
[1] https://review.openstack.org/628220 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From sean.mcginnis at gmx.com Thu Jan 3 16:59:45 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 10:59:45 -0600 Subject: [release] Release countdown for week R-13, Jan 7-11 Message-ID: <20190103165944.GA5101@sm-workstation> Welcome back from the holiday lull! Development Focus ----------------- Teams should be focused on completing milestone 2 activities by January 10 and checking overall progress for the development cycle. General Information ------------------- The Stein membership freeze coincides with the Stein-2 milestone on the 10th. While doing an audit of all officially governed team repos, there are quite a few that have not had an official release done yet. If your team has added any repos for deliverables you would like to have included in the Stein coordinated release, please add at least an empty template deliverable file for now so we can help track those. We understand some may not be quite ready for a full release yet, but if you have something minimally viable to get released it would be good to do a 0.x release to exercise the release tooling for your deliverables. Another reminder about the changes this cycle with library deliverables that follow the cycle-with-milestones release model. As announced, we will be automatically proposing releases for these libraries at milestone 2 if there have been any functional changes since the last release to help ensure those changes are picked up by consumers with plenty of time to identify and correct issues. More detail can be found in the original mailing list post describing the changes: http://lists.openstack.org/pipermail/openstack-dev/2018-October/135689.html Any other cycle-with-intermediary deliverables that have not requested a release by January 10 will be switched to the cycle-with-rc model as discussed previously: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000465.html Upcoming Deadlines & Dates -------------------------- Stein-2 Milestone: January 10 -- Sean McGinnis (smcginnis) From mriedemos at gmail.com Thu Jan 3 17:40:22 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 11:40:22 -0600 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1545992000.14055.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On 12/28/2018 4:13 AM, Balázs Gibizer wrote: > I'm wondering that introducing an API microversion could act like a > feature flag I need and at the same time still make the feautre > discoverable as you would like to see it. Something like: Create a > feature flag in the code but do not put it in the config as a settable > flag. Instead add an API microversion patch to the top of the series > and when the new version is requested it enables the feature via the > feature flag. This API patch can be small and simple enough to > cherry-pick to earlier into the series for local end-to-end testing if > needed. Also in functional test I can set the flag via a mock so I can > add and run functional tests patch by patch. That may work. 
It's not how I would have done this, I would have started from the bottom and worked my way up with the end to end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now so that's not really something you want to change at this point. I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions. -- Thanks, Matt From me at not.mn Thu Jan 3 18:41:26 2019 From: me at not.mn (John Dickinson) Date: Thu, 03 Jan 2019 10:41:26 -0800 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: On 3 Jan 2019, at 5:51, Sean McGinnis wrote: > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: >> Dear All, >> >> There are many occasion when we want to priorities some of the patches >> whether it is related to unblock the gates or blocking the non freeze >> patches during RC. >> >> So adding the Review-Priority will allow more precise dashboard. As >> Designate and Cinder projects already experiencing this[1][2] and after >> discussion with Jeremy brought this to ML to interact with these team >> before landing [3], as there is possibility that reapply the priority vote >> following any substantive updates to change could make it more cumbersome >> than it is worth. > > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > a patchset is updates, the -1 and +2 carry forward. But for some reason we > can't get the +1 to be sticky. > > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me. > > The one thing I have been slightly concerned with is the process around using > these priority votes. It hasn't been an issue, but I could see a scenario where > one core (in Cinder we have it set up so all cores can use the priority voting) > has set something like a procedural -1, then been pulled away or is absent for > an extended period. Like a Workflow -2, another core cannot override that vote. > So until that person is back to remove the -1, that patch would not be able to > be merged. > > Granted, we've lived with this with Workflow -2's for years and it's never been > a major issue, but I think as far as centralizing control, it may make sense to > have a separate smaller group (just the PTL, or PTL and a few "deputies") that > are able to set priorities on patches just to make sure the folks setting it > are the ones that are actively tracking what the priorities are for the > project. > > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean This looks pretty interesting. I have a few question about how it's practically working out. 
I get the impression that the values of the votes are configurable? So you've chosen -1, +1, and +2, but you could have chosen 1, 2, 3, 4, 5 (for example)? Do you have an example of a dashboard that's using these values? IMO, gerrit's display of votes is rather bad. I'd prefer that votes like this could be aggregated. How do you manage discovering what patches are priority or not? I guess that's where the dashboards come in? I get the impression that particular priority votes have the ability to block a merge. How does that work? half-plug/half-context, I've attempted to solve priority discovery in the Swift community with some customer dashboards. We've got a review dashboard in gerrit [1] that shows "starred by the ptl". I've also created a tool that finds all the starred patches by all contributors, weights each contributor by how much they've contributed recently, and then sorts the resulting totals as a list of "stuff the community thinks is important"[2]. As a community, we also manage our own wiki page for prioritization[3]. I'd love to see if some functionality in gerrit, eg these priority review votes, could supplant some of our other tools. [1] http://not.mn/reviews.html [2] http://d.not.mn/swift_community_dashboard.html [3] https://wiki.openstack.org/wiki/Swift/PriorityReviews --John -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 850 bytes Desc: OpenPGP digital signature URL: From bcafarel at redhat.com Thu Jan 3 19:06:53 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Thu, 3 Jan 2019 20:06:53 +0100 Subject: [neutron] Switching Ocata branch to Extended maintenance Message-ID: Hello and happy new year, as discussed in our final 2018 meeting [0], we plan to switch the Ocata branch of Neutron to Extended maintenance [1]. There will be a final Ocata release for Neutron itself. Stadium projects backports only included test fixes so do not really need a final release. We will then mark the branch as ocata-em. Any comments or objections? Especially from driver projects maintainers, as these have independent releases. [0] http://eavesdrop.openstack.org/meetings/networking/2018/networking.2018-12-18-14.00.log.html [1] https://docs.openstack.org/project-team-guide/stable-branches.html#extended-maintenance -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu Jan 3 19:12:11 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 13:12:11 -0600 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 1/3/2019 6:39 AM, Jay Pipes wrote: > On 01/02/2019 10:15 PM, yonglihe wrote: >> On 2018/12/18 下午4:20, yonglihe wrote: >>> Hi, guys >>> >>> This spec needs input and discuss for move on. >> >> Jay suggest we might be good to use a new sub node to hold topology >> stuff,  it's option 2, here. And split >> >> the PCI stuff out of this NUMA thing spec, use a /devices node to hold >> all 'devices' stuff instead, then this node >> >> is generic and not only for PCI itself. >> >> I'm OK for Jay's suggestion,  it contains more key words and seems >> crystal clear and straight forward. >> >> The problem is we need aligned about this. This spec need gain more >> input thanks, Jay, Matt. > > Also, I mentioned that you need not (IMHO) combine both PCI/devices and > NUMA topology in a single spec. 
We could proceed with the /topology API > endpoint and work out the more generic /devices API endpoint in a > separate spec. > > Best, > -jay I said earlier in the email thread that I was OK with option 2 (sub-resource) or the diagnostics API, and leaned toward the diagnostics API since it was already admin-only. As long as this information is admin-only by default, not part of the main server response body and therefore not parting of listing servers with details (GET /servers/detail) then I'm OK either way and GET /servers/{server_id}/topology is OK with me also. -- Thanks, Matt From whayutin at redhat.com Thu Jan 3 19:12:59 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Thu, 3 Jan 2019 12:12:59 -0700 Subject: [tripleo] Scenario Standalone ci jobs update - voting and promotion pipeline In-Reply-To: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> References: <719707f3-9f50-cf03-214a-ef9df206cfde@redhat.com> Message-ID: On Thu, Jan 3, 2019 at 8:33 AM Giulio Fidente wrote: > On 1/3/19 4:19 PM, Marios Andreou wrote: > > o/ TripleO's & Happy New Year \o/ > > > > if you are tracking the ci squad you may know that one area of focus > > recently is moving the scenario-multinode jobs to > > harder/better/faster/stronger (most importantly *smaller* but not as > > cool) scenario-standalone equivalents. > > > > Scenarios 1-4 are now merged [1]. For the current sprint [2] ci squad is > > doing cleanup on those. This includes making sure the new jobs are used > > in all the places the multinode jobs were e.g.[3][4] (& scens 2/3 will > > follow) and fixing any missing services or any other nits we find. Once > > done we can move on to the rest - scenarios 5/6 etc. > > > > We are looking for any feedback about the jobs in general or any one in > > particular if you have some special interest in a particular service > > (see [5] for reminder about services and scenarios). > > > > Most importantly those jobs are now being set as voting (e.g. already > > done for 1/4 at [1]) and the next natural step once voting is to add > > them into the master promotion pipeline. > > Please let us know if you think this is a bad idea or with any other > > feedback or suggestion. > thanks a lot! > > note that these are the two scenarios testing ceph: I guess what I'm > trying to say, is that if I can change, and you can change, everybody > can change! > > (rocky balboa TM) > -- > Giulio Fidente > GPG KEY: 08D733BA > Well said Giulio, lolz. Folks, please do take time to help review and merge these patches. Noting this is a big part in reducing TripleO's upstream resource footprint. You may recall upstream infra detailing our very large consumption of upstream resources [1][2]. 99% of the multinode scenario 1-4 jobs should be able to moved to the standalone deployment. If you think you have an exception please reach out to the team in #tripleo. Forward looking, once scenario 1-4 are updated across all the projects we'll start to tackle the other scenario jobs one at a time [3] Thanks [1] https://gist.github.com/notmyname/8bf3dbcb7195250eb76f2a1a8996fb00 [2] http://lists.openstack.org/pipermail/openstack-dev/2018-September/134867.html [3] https://github.com/openstack/tripleo-heat-templates/blob/master/README.rst#service-testing-matrix -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Thu Jan 3 19:41:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 19:41:52 +0000 Subject: [all] One month with openstack-discuss (a progress report) Message-ID: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> First, I want to thank everyone here for the remarkably smooth transition to openstack-discuss at the end of November. It's been exactly one month today since we shuttered the old openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists and forwarded all subsequent posts for them to the new list address instead. The number of posts from non-subscribers has dwindled to the point where it's now only a few each day (many of whom also subscribe immediately after receiving the moderation autoresponse). As of this moment, we're up to 708 subscribers. Unfortunately it's hard to compare raw subscriber counts because the longer a list is in existence the more dead addresses it accumulates. Mailman does its best to unsubscribe addresses which explicitly reject/bounce multiple messages in a row, but these days many E-mail addresses grow defunct without triggering any NDRs (perhaps because they've simply been abandoned, or because their MTAs just blackhole new messages for deleted accounts). Instead, it's a lot more concrete to analyze active participants on mailing lists, especially since ours are consistently configured to require a subscription if you want to avoid your messages getting stuck in the moderation queue. Over the course of 2018 (at least until the lists were closed on December 3) there were 1075 unique E-mail addresses posting to one of more of the openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists. Now, a lot of those people sent one or maybe a handful of messages to ask some question they had, and then disappeared again... they didn't really follow ongoing discussions, so probably won't subscribe to openstack-discuss until they have something new to bring up. On the other hand, if we look at addresses which sent 10 or more messages in 2018 (an arbitrary threshold admittedly), there were 245. Comparing those to the list of addresses subscribed to openstack-discuss today, there are 173 matches. That means we now have *at least* 70% of the people who sent 10 or more messages to the old lists subscribed to the new one. I say "at least" because we don't have an easy way to track address changes, and even if we did that's never going to get us to 100% because there are always going to be people who leave the lists abruptly for various reasons (perhaps even disappearing from our community entirely). Seems like a good place to be after only one month, especially considering the number of folks who may not have even been paying attention at all during end-of-year holidays. As for message volume, we had a total of 912 posts to openstack-discuss in the month of December; comparing to the 1033 posts in total we saw to the four old lists in December of 2017, that's a 12% drop. Consider, though, that right at 10% of the messages on the old lists were duplicates from cross-posting, so that's really more like a 2% drop in actual (deduplicated) posting volume. It's far less of a reduction than I would have anticipated based on year-over-year comparisons (for example, December of 2016 had 1564 posts across those four lists). I think based on this, it's safe to say the transition to openstack-discuss hasn't hampered discussion, at least for its first full month in use. 
-- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jimmy at openstack.org Thu Jan 3 19:44:15 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 03 Jan 2019 13:44:15 -0600 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: <5C2E660F.4010609@openstack.org> Thanks for the update on this, Jeremy! I was curious about the details behind those numbers :) > Jeremy Stanley > January 3, 2019 at 1:41 PM > First, I want to thank everyone here for the remarkably smooth > transition to openstack-discuss at the end of November. It's been > exactly one month today since we shuttered the old openstack, > openstack-dev, openstack-operators and openstack-sigs mailing lists > and forwarded all subsequent posts for them to the new list address > instead. The number of posts from non-subscribers has dwindled to > the point where it's now only a few each day (many of whom also > subscribe immediately after receiving the moderation autoresponse). > > As of this moment, we're up to 708 subscribers. Unfortunately it's > hard to compare raw subscriber counts because the longer a list is > in existence the more dead addresses it accumulates. Mailman does > its best to unsubscribe addresses which explicitly reject/bounce > multiple messages in a row, but these days many E-mail addresses > grow defunct without triggering any NDRs (perhaps because they've > simply been abandoned, or because their MTAs just blackhole new > messages for deleted accounts). Instead, it's a lot more concrete to > analyze active participants on mailing lists, especially since ours > are consistently configured to require a subscription if you want to > avoid your messages getting stuck in the moderation queue. > > Over the course of 2018 (at least until the lists were closed on > December 3) there were 1075 unique E-mail addresses posting to one > of more of the openstack, openstack-dev, openstack-operators and > openstack-sigs mailing lists. Now, a lot of those people sent one or > maybe a handful of messages to ask some question they had, and then > disappeared again... they didn't really follow ongoing discussions, > so probably won't subscribe to openstack-discuss until they have > something new to bring up. > > On the other hand, if we look at addresses which sent 10 or more > messages in 2018 (an arbitrary threshold admittedly), there were > 245. Comparing those to the list of addresses subscribed to > openstack-discuss today, there are 173 matches. That means we now > have *at least* 70% of the people who sent 10 or more messages to > the old lists subscribed to the new one. I say "at least" because we > don't have an easy way to track address changes, and even if we did > that's never going to get us to 100% because there are always going > to be people who leave the lists abruptly for various reasons > (perhaps even disappearing from our community entirely). Seems like > a good place to be after only one month, especially considering the > number of folks who may not have even been paying attention at all > during end-of-year holidays. 
> > As for message volume, we had a total of 912 posts to > openstack-discuss in the month of December; comparing to the 1033 > posts in total we saw to the four old lists in December of 2017, > that's a 12% drop. Consider, though, that right at 10% of the > messages on the old lists were duplicates from cross-posting, so > that's really more like a 2% drop in actual (deduplicated) posting > volume. It's far less of a reduction than I would have anticipated > based on year-over-year comparisons (for example, December of 2016 > had 1564 posts across those four lists). I think based on this, it's > safe to say the transition to openstack-discuss hasn't hampered > discussion, at least for its first full month in use. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Jan 3 19:50:48 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 3 Jan 2019 13:50:48 -0600 Subject: Review-Priority for Project Repos In-Reply-To: References: <20190103135155.GC27473@sm-workstation> Message-ID: <20190103195048.GB10975@sm-workstation> > > This looks pretty interesting. I have a few question about how it's practically working out. > > > I get the impression that the values of the votes are configurable? So you've chosen -1, +1, and +2, but you could have chosen 1, 2, 3, 4, 5 (for example)? > Yes, I believe so. Maybe someone more familiar with how this works in Gerrit can correct me if I'm misstating that. > > Do you have an example of a dashboard that's using these values? > Being able to easily query and create a dashboard for this is the big benefit. Here's what I came up with for Cinder: https://tiny.cc/CinderPriorities > > IMO, gerrit's display of votes is rather bad. I'd prefer that votes like this could be aggregated. How do you manage discovering what patches are priority or not? I guess that's where the dashboards come in? > We haven't been aggregating votes, rather just a core can decide to flag things are priority or not. > > I get the impression that particular priority votes have the ability to block a merge. How does that work? > I believe this is what controls that: http://git.openstack.org/cgit/openstack-infra/project-config/tree/gerrit/acls/openstack/cinder.config#n29 So I believe that could be NoBlock or NoOp to allow just ranking without enforcing any kind of blocking. From fungi at yuggoth.org Thu Jan 3 20:04:29 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 3 Jan 2019 20:04:29 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190103195048.GB10975@sm-workstation> References: <20190103135155.GC27473@sm-workstation> <20190103195048.GB10975@sm-workstation> Message-ID: <20190103200429.pwafqqohb7lxccw7@yuggoth.org> On 2019-01-03 13:50:48 -0600 (-0600), Sean McGinnis wrote: [...] > > I get the impression that the values of the votes are > > configurable? So you've chosen -1, +1, and +2, but you could > > have chosen 1, 2, 3, 4, 5 (for example)? > > Yes, I believe so. Maybe someone more familiar with how this works > in Gerrit can correct me if I'm misstating that. [...] The value ranges are entirely arbitrary as far as I know. Keep in mind though that the Gerrit configuration to carry over votes to new patch sets under specific conditions can apparently only be set to carry over the highest and lowest possible values, but none in between. I really don't understand that design choice on their part, but that's what it does. 
> So I believe that could be NoBlock or NoOp to allow just ranking > without enforcing any kind of blocking. Yes, that should work if it's the behavior you're looking for. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Thu Jan 3 20:38:24 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 3 Jan 2019 14:38:24 -0600 Subject: [tripleo] OVB stable/1.0 branch created Message-ID: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> Just a quick update on the status of OVB. There's some discussion on [1] about how to handle importing it to Gerrit, and it looks like we first want to wrap up the 2.0-dev feature branch so we don't have to mess with that after import. As a result, I've cut the stable/1.0 branch from current OVB master. Anyone who's not ready to try out 2.0 should start pointing at stable/1.0 instead of master as the 2.0-dev branch will soon be merged back to master. This will leave us with a branch structure that better matches most other OpenStack projects. We should probably give consumers some time to get switched over, but it sounds like TripleO CI already pins to a specific commit in OVB so it may not affect that. Once a little time has passed we can do the 2.0-dev merge and proceed with the import. I also made a wordier blog post about this[2]. Let me know if you have any feedback. Thanks. -Ben 1: https://review.openstack.org/#/c/620613/ 2: http://blog.nemebean.com/content/openstack-virtual-baremetal-import-plans From msm at redhat.com Thu Jan 3 20:59:47 2019 From: msm at redhat.com (Michael McCune) Date: Thu, 3 Jan 2019 15:59:47 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <5C2E660F.4010609@openstack.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> <5C2E660F.4010609@openstack.org> Message-ID: On Thu, Jan 3, 2019 at 2:47 PM Jimmy McArthur wrote: > > Thanks for the update on this, Jeremy! I was curious about the details behind those numbers :) seconded, i really appreciate the update and all the work that went into the transition. it's been completely smooth and painless on my end. kudos peace o/ From whayutin at redhat.com Thu Jan 3 21:04:16 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Thu, 3 Jan 2019 14:04:16 -0700 Subject: [tripleo] OVB stable/1.0 branch created In-Reply-To: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> References: <27d1a32e-2cce-4892-fb99-7da8c3522df5@nemebean.com> Message-ID: Thanks for the update Ben! On Thu, Jan 3, 2019 at 1:43 PM Ben Nemec wrote: > Just a quick update on the status of OVB. There's some discussion on [1] > about how to handle importing it to Gerrit, and it looks like we first > want to wrap up the 2.0-dev feature branch so we don't have to mess with > that after import. As a result, I've cut the stable/1.0 branch from > current OVB master. Anyone who's not ready to try out 2.0 should start > pointing at stable/1.0 instead of master as the 2.0-dev branch will soon > be merged back to master. This will leave us with a branch structure > that better matches most other OpenStack projects. > > We should probably give consumers some time to get switched over, but it > sounds like TripleO CI already pins to a specific commit in OVB so it > may not affect that. Once a little time has passed we can do the 2.0-dev > merge and proceed with the import. 
> > I also made a wordier blog post about this[2]. > > Let me know if you have any feedback. Thanks. > > -Ben > > 1: https://review.openstack.org/#/c/620613/ > 2: > http://blog.nemebean.com/content/openstack-virtual-baremetal-import-plans > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Thu Jan 3 21:57:39 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 13:57:39 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> (Matt Riedemann's message of "Tue, 18 Dec 2018 20:04:00 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> Message-ID: > 1. This can already be done using existing APIs (as noted) client-side > if monitoring the live migration and it times out for whatever you > consider a reasonable timeout at the time. There's another thing to point out here, which is that this is also already doable by adjusting (rightly libvirt-specific) config tunables on a compute node that is being evacuated. Those could be hot-reloadable, meaning they could be changed without restarting the compute service when the evac process begins. It doesn't let you control it per-instance, granted, but there *is* a server-side solution to this based on existing stuff. > 2. The libvirt driver is the only one that currently supports abort > and force-complete. > > For #1, while valid as a workaround, is less than ideal since it would > mean having to orchestrate that into any tooling that needs that kind > of workaround, be that OSC, openstacksdk, python-novaclient, > gophercloud, etc. I think it would be relatively simple to pass those > parameters through with the live migration request down to > nova-compute and have the parameters override the config options and > then it's natively supported in the API. > > For #2, while also true, I think is not a great reason *not* to > support per-instance timeouts/actions in the API when we already have > existing APIs that do the same thing and have the same backend compute > driver limitations. To ease this, I think we can sort out two things: > > a) Can other virt drivers that support live migration (xenapi, hyperv, > vmware in tree, and powervm out of tree) also support abort and > force-complete actions? John Garbutt at least thought it should be > possible for xenapi at the Stein PTG. I don't know about the others - > driver maintainers please speak up here. The next challenge would be > getting driver maintainers to actually add that feature parity, but > that need not be a priority for Stein as long as we know it's possible > to add the support eventually. I think that we asked Eric and he said that powervm would/could not support such a thing because they hand the process off to the hypevisor and don't pay attention to what happens after that (and/or can't cancel it). I know John said he thought it would be doable for xenapi, but even if it is, I'm not expecting it will happen. I'd definitely like to hear from the others. > b) There are pre-live migration checks that happen on the source > compute before we initiate the actual guest transfer. If a user > (admin) specified these new parameters and the driver does not support > them, we could fail the live migration early. 
This wouldn't change the > instance status but the migration would fail and an instance action > event would be recorded to explain why it didn't work, and then the > admin can retry without those parameters. This would shield us from > exposing something in the API that could give a false sense of > functionality when the backend doesn't support it. This is better than nothing, granted. What I'm concerned about is not that $driver never supports these, but rather that $driver shows up later and wants *different* parameters. Or even that libvirt/kvm migration changes in such a way that these no longer make sense even for it. We already have an example this in-tree today, where the recently-added libvirt post-copy mode makes the 'abort' option invalid. > Given all of this, are these reasonable compromises to continue trying > to drive this feature forward, and more importantly, are other > operators looking to see this functionality added to nova? Huawei > public cloud operators want it because they routinely are doing live > migrations as part of maintenance activities and want to be able to > control these values per-instance. I assume there are other > deployments that would like the same. I don't need to hold this up if everyone else is on board, but I don't really want to +2 it. I'll commit to not -1ing it if it specifically confirms support before starting a migration that won't honor the requested limits. --Dan From mriedemos at gmail.com Thu Jan 3 22:17:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 16:17:33 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> Message-ID: <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> On 1/3/2019 3:57 PM, Dan Smith wrote: > Or even that libvirt/kvm > migration changes in such a way that these no longer make sense even for > it. We already have an example this in-tree today, where the > recently-added libvirt post-copy mode makes the 'abort' option invalid. I'm not following you here. As far as I understand, post-copy in the libvirt driver is triggered on the force complete action and only if (1) it's available and (2) nova is configured to allow it, otherwise the force complete action for the libvirt driver pauses the VM. The abort operation aborts the job in libvirt [1] which I believe triggers a rollback [2]. [1] https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7388 [2] https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7454 -- Thanks, Matt From dms at danplanet.com Thu Jan 3 22:37:03 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 14:37:03 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> (Matt Riedemann's message of "Thu, 3 Jan 2019 16:17:33 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> Message-ID: >> Or even that libvirt/kvm >> migration changes in such a way that these no longer make sense even for >> it. We already have an example this in-tree today, where the >> recently-added libvirt post-copy mode makes the 'abort' option invalid. > > I'm not following you here. 
As far as I understand, post-copy in the > libvirt driver is triggered on the force complete action and only if > (1) it's available and (2) nova is configured to allow it, otherwise > the force complete action for the libvirt driver pauses the VM. The > abort operation aborts the job in libvirt [1] which I believe triggers > a rollback [2]. > > [1] > https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7388 > [2] > https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7454 Because in nova[0] we currently only switch to post-copy after we decide we're not making progress right? If we later allow a configuration where post-copy is the default from the start (as I believe is the actual current recommendation from the virt people[1]), and someone triggers a migration with a short timeout and abort action, we'll not be able to actually do the abort. I'm guessing we'd just need to refuse a request where abort is specified with any timeout if post-copy will be used from the beginning. Since the API user can't know how the virt driver is configured, we just have to refuse to do the migration and hope they'll understand :) 0: Sorry, I shouldn't have said "in tree" because I meant "in the libvirt world" 1: look for "in summary" here: https://www.berrange.com/posts/2016/05/12/analysis-of-techniques-for-ensuring-migration-completion-with-kvm/ --Dan From mriedemos at gmail.com Thu Jan 3 23:23:21 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 17:23:21 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> Message-ID: <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> On 1/3/2019 4:37 PM, Dan Smith wrote: > Because in nova[0] we currently only switch to post-copy after we decide > we're not making progress right? If you're referring to the "live_migration_progress_timeout" option that has been deprecated and was replaced in Stein with the live_migration_timeout_action option, which was a pre-requisite for the per-instance timeout + action spec. In Stein, we only switch to post-copy if we hit live_migration_completion_timeout and live_migration_timeout_action=force_complete and live_migration_permit_post_copy=True (and libvirt/qemu are new enough for post-copy), otherwise we pause the guest. So I don't think the stalled progress stuff has applied for awhile (OSIC found problems with it in Ocata and disabled/deprecated it). > If we later allow a configuration where > post-copy is the default from the start (as I believe is the actual > current recommendation from the virt people[1]), and someone triggers a > migration with a short timeout and abort action, we'll not be able to > actually do the abort. Sorry but I don't understand this, how does "post-copy from the start" apply? If I specify a short timeout and abort action in the API, and the timeout is reached before the migration is complete, it should abort, just like if I abort it via the API. As noted above, post-copy should only be triggered once we reach the timeout, and if you overwrite that action to abort (per instance, in the API), it should abort rather than switch to post-copy. 
-- Thanks, Matt From dms at danplanet.com Thu Jan 3 23:45:25 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 03 Jan 2019 15:45:25 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> (Matt Riedemann's message of "Thu, 3 Jan 2019 17:23:21 -0600") References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> Message-ID: Matt Riedemann writes: > On 1/3/2019 4:37 PM, Dan Smith wrote: >> Because in nova[0] we currently only switch to post-copy after we decide >> we're not making progress right? > > If you're referring to the "live_migration_progress_timeout" option > that has been deprecated and was replaced in Stein with the > live_migration_timeout_action option, which was a pre-requisite for > the per-instance timeout + action spec. > > In Stein, we only switch to post-copy if we hit > live_migration_completion_timeout and > live_migration_timeout_action=force_complete and > live_migration_permit_post_copy=True (and libvirt/qemu are new enough > for post-copy), otherwise we pause the guest. > > So I don't think the stalled progress stuff has applied for awhile > (OSIC found problems with it in Ocata and disabled/deprecated it). Yeah, I'm trying to point out something _other_ than what is currently nova behavior. >> If we later allow a configuration where >> post-copy is the default from the start (as I believe is the actual >> current recommendation from the virt people[1]), and someone triggers a >> migration with a short timeout and abort action, we'll not be able to >> actually do the abort. > > Sorry but I don't understand this, how does "post-copy from the start" > apply? If I specify a short timeout and abort action in the API, and > the timeout is reached before the migration is complete, it should > abort, just like if I abort it via the API. As noted above, post-copy > should only be triggered once we reach the timeout, and if you > overwrite that action to abort (per instance, in the API), it should > abort rather than switch to post-copy. You can't abort a post-copy migration once it has started. If we were to add an "always do post-copy" mode to Nova, per the recommendation from the post I linked, then we would start a migration in post-copy mode, which would make it un-cancel-able. That means not only could you not cancel it, but we would have to refuse to start the migration if the user requested an abort action via this new proposed API with any timeout value. Anyway, my point here is just that libvirt already (but not nova/libvirt yet) has a live migration mode where we would not be able to honor a request of "abort after N seconds". If config specified that, we could warn or fail on startup, but via the API all we'd be able to do is refuse to start the migration. I'm just trying to highlight that baking "force/abort after N seconds" into our API is not only just libvirt-specific at the moment, but even libvirt-pre-copy specific. 
--Dan From mriedemos at gmail.com Fri Jan 4 00:02:16 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 3 Jan 2019 18:02:16 -0600 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> Message-ID: <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> On 1/3/2019 5:45 PM, Dan Smith wrote: > You can't abort a post-copy migration once it has started. If we were to > add an "always do post-copy" mode to Nova, per the recommendation from > the post I linked, then we would start a migration in post-copy mode, > which would make it un-cancel-able. That means not only could you not > cancel it, but we would have to refuse to start the migration if the > user requested an abort action via this new proposed API with any > timeout value. > > Anyway, my point here is just that libvirt already (but not nova/libvirt > yet) has a live migration mode where we would not be able to honor a > request of "abort after N seconds". If config specified that, we could > warn or fail on startup, but via the API all we'd be able to do is > refuse to start the migration. I'm just trying to highlight that > baking "force/abort after N seconds" into our API is not only just > libvirt-specific at the moment, but even libvirt-pre-copy specific. OK, sorry, I'm following you now. I didn't make the connection that you were talking about something we could do in the future (in nova) to initiate the live migration in post-copy mode. Yeah I agree in that case if the user said abort we'd just have to reject it and say you can't do that based on how the source host is configured. -- Thanks, Matt From jungleboyj at gmail.com Fri Jan 4 00:26:04 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Thu, 3 Jan 2019 18:26:04 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. It is unfortunate that we can't get +1's to carry forward but I don't think this negates the value of having the priorities.  I have been using our review dashboard quite a bit lately and plan to set up processes that involve it as we move forward to using/documenting Storyboard for Cinder. > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me.  Is there someway that we could allow the owner to reset this priority after pushing up a new patch.  That would lower the dependence on the cores in that case. > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean > As usual, the biggest problem I am seeing is getting enough people to do the reviews and really set up all the priorities appropriately.  There are just a couple of us doing it right now. I am hoping to see more participation in the coming months to make the output more beneficial for all. 
From chris at openstack.org Fri Jan 4 00:27:25 2019 From: chris at openstack.org (Chris Hoge) Date: Thu, 3 Jan 2019 16:27:25 -0800 Subject: [loci] Loci Meeting for January 3, 2019 Message-ID: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> On Friday, January 3 we will resume our Loci team meetings at 7 AM PT/ 15 UTC after an extended end-of-year break. We will revisit our plan for building stable branches (in light of several failures to build stable branches on a few distributions because of requirements failures), coming up with a more robust testing strategy, and planning new development priorities for the remainder of the cycle. Thanks, Chris https://etherpad.openstack.org/p/loci-meeting From chris at openstack.org Fri Jan 4 00:29:20 2019 From: chris at openstack.org (Chris Hoge) Date: Thu, 3 Jan 2019 16:29:20 -0800 Subject: [loci] Loci Meeting for January 3, 2019 In-Reply-To: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> References: <1BE8273A-D24B-4F22-A225-9451117FD0F6@openstack.org> Message-ID: <096E5BE5-2F28-4BBE-A6DA-69E95B0D7FB1@openstack.org> Correction, the date of the meeting is January 4. > On Jan 3, 2019, at 4:27 PM, Chris Hoge wrote: > > On Friday, January 3 we will resume our Loci team meetings at 7 AM PT/ 15 > UTC after an extended end-of-year break. We will revisit our plan for > building stable branches (in light of several failures to build stable > branches on a few distributions because of requirements failures), coming > up with a more robust testing strategy, and planning new development > priorities for the remainder of the cycle. > > Thanks, > Chris > > https://etherpad.openstack.org/p/loci-meeting > > From cboylan at sapwetik.org Fri Jan 4 01:22:26 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 03 Jan 2019 17:22:26 -0800 Subject: Review-Priority for Project Repos In-Reply-To: References: <20190103135155.GC27473@sm-workstation> Message-ID: <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> On Thu, Jan 3, 2019, at 4:26 PM, Jay Bryant wrote: > > snip > > So far, that's just a slight inconvenience. It would be great if we can figure > > out a way to have them all be sticky, but if we need to live with reapplying +1 > > votes, that's manageable to me. > > > >  Is there someway that we could allow the owner to reset this priority > after pushing up a new patch.  That would lower the dependence on the > cores in that case. If you use a three value label: [-1: +1] then you could set copy min and max scores so all values are carried forward on new patchsets. This would allow you to have -1 "Don't review", 0 "default no special priority", and +1 "this is a priority please review now". This may have to take advantage of the fact that if you don't set a value its roughly the same as 0 (I don't know if this is explicitly true in Gerrit but we can approximate it since -1 and +1 would be explicitly set and query on those values). If you need an explicit copy all values function in Gerrit you'll want to get that merged upstream first then we could potentially backport it to our Gerrit. This will likely require writing Java. For some reason I thought that Prolog predicates could be written for these value copying functions, but docs seem to say otherwise. Prolog is only for determining if a label's value allows a change to be submitted (merged). 
Clark From bathanhtlu at gmail.com Fri Jan 4 02:09:47 2019 From: bathanhtlu at gmail.com (Thành Nguyễn Bá) Date: Fri, 4 Jan 2019 09:09:47 +0700 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: No, it isn't. The error was raised when I used the default settings in my client based on "oslo_messaging". When I created the config file and passed it via "oslo_config" to the transport (get_notification_transport), it worked :D Thanks for your help. *Nguyễn Bá Thành* *Mobile*: 0128 748 0391 *Email*: bathanhtlu at gmail.com On Thu, Jan 3, 2019 at 00:30 Doug Hellmann wrote: > Ben Nemec writes: > > > On 12/27/18 8:22 PM, Thành Nguyễn Bá wrote: > >> Dear all, > >> > >> I have a problem when use 'notification listener' oslo-message for HA > >> Openstack. > >> > >> It raise 'oslo_messaging.exceptions.MessageDeliveryFailure: Unable to > >> connect to AMQP server on 172.16.4.125:5672 > >> after inf tries: Exchange.declare: (406) > >> PRECONDITION_FAILED - inequivalent arg 'durable' for exchange 'nova' in > >> vhost '/': received 'false' but current is 'true''. > >> > >> How can i fix this?. I think settings default in my program set > >> 'durable' is False so it can't listen RabbitMQ Openstack? > > > > It probably depends on which rabbit client library you're using to > > listen for notifications. Presumably there should be some way to > > configure it to set durable to True. > > IIRC, the "exchange" needs to be declared consistently among all > listeners because the first client to connect causes the exchange to be > created. > > > I guess the other option is to disable durable queues in the Nova > > config, but then you lose the contents of any queues when Rabbit gets > > restarted. It would be better to figure out how to make the consuming > > application configure durable queues instead. > > > >> > >> This is my nova.conf > >> > >> http://paste.openstack.org/show/738813/ > >> > >> > >> And section [oslo_messaging_rabbit] > >> > >> [oslo_messaging_rabbit] > >> rabbit_ha_queues = true > >> rabbit_retry_interval = 1 > >> rabbit_retry_backoff = 2 > >> amqp_durable_queues= true > > You say that is your nova.conf. Is that the same configuration file > your client is using when it connects? > > >> > >> > >> > >> *Nguyễn Bá Thành* > >> > >> *Mobile*: 0128 748 0391 > >> > >> *Email*: bathanhtlu at gmail.com > >> > > > > -- > Doug -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Fri Jan 4 02:14:01 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Thu, 3 Jan 2019 20:14:01 -0600 Subject: [openstack-dev] [neutron] Cancelling Neutron drivers meeting on January 4th Message-ID: Hi Neutrinos, Today I spent time triaging RFEs. I posted comments and questions in several of them in order to get them in good shape to be discussed by the Drivers team. None are ready to be discussed, though. As a consequence, I am cancelling the meeting on January 4th. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Fri Jan 4 07:19:04 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Fri, 4 Jan 2019 15:19:04 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? Message-ID: Dear Octavia team: This email aims to ask about the development progress of the l3-active-active blueprint. I noticed that the work in this area has been stagnant for eight months.
https://review.openstack.org/#/q/l3-active-active I want to know the community's next work plan in this regard. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Fri Jan 4 08:07:33 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:07:33 -0800 Subject: [nova][ops] Trying to get per-instance live migration timeout action spec unstuck In-Reply-To: <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> References: <4f696e01-9c04-2da9-34b6-86bdce0c6cdd@gmail.com> <167395f9-2178-c18a-223c-c75592dc6d68@gmail.com> <04ebc84f-ed61-2c34-a4cf-6c046b4ce3ff@gmail.com> <65c0c5c3-51eb-1b25-2818-0f149a1125fe@gmail.com> Message-ID: On Thu, 3 Jan 2019 18:02:16 -0600, Matt Riedemann wrote: > On 1/3/2019 5:45 PM, Dan Smith wrote: >> You can't abort a post-copy migration once it has started. If we were to >> add an "always do post-copy" mode to Nova, per the recommendation from >> the post I linked, then we would start a migration in post-copy mode, >> which would make it un-cancel-able. That means not only could you not >> cancel it, but we would have to refuse to start the migration if the >> user requested an abort action via this new proposed API with any >> timeout value. >> >> Anyway, my point here is just that libvirt already (but not nova/libvirt >> yet) has a live migration mode where we would not be able to honor a >> request of "abort after N seconds". If config specified that, we could >> warn or fail on startup, but via the API all we'd be able to do is >> refuse to start the migration. I'm just trying to highlight that >> baking "force/abort after N seconds" into our API is not only just >> libvirt-specific at the moment, but even libvirt-pre-copy specific. > > OK, sorry, I'm following you now. I didn't make the connection that you > were talking about something we could do in the future (in nova) to > initiate the live migration in post-copy mode. Yeah I agree in that case > if the user said abort we'd just have to reject it and say you can't do > that based on how the source host is configured. This seems like a reasonable way to handle the future case of a live migration initiated in post-copy mode. Overall, I'm in support of the idea of adding finer-grained control over live migrations, being that we have multiple operators who've expressed the usefulness they'd get from it and it seems like a relatively simple change. It also sounds like we have answers for the concerns about bad UX by checking pre-live-migration whether the driver supports the new parameters and fail fast in that case. And in the future if we have live migrations able to be initiated in post-copy mode, fail fast with instance action info similarly. -melanie From melwittt at gmail.com Fri Jan 4 08:48:45 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:48:45 -0800 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann wrote: > On 12/28/2018 4:13 AM, Balázs Gibizer wrote: >> I'm wondering that introducing an API microversion could act like a >> feature flag I need and at the same time still make the feautre >> discoverable as you would like to see it. Something like: Create a >> feature flag in the code but do not put it in the config as a settable >> flag. 
Instead add an API microversion patch to the top of the series >> and when the new version is requested it enables the feature via the >> feature flag. This API patch can be small and simple enough to >> cherry-pick to earlier into the series for local end-to-end testing if >> needed. Also in functional test I can set the flag via a mock so I can >> add and run functional tests patch by patch. > > That may work. It's not how I would have done this, I would have started > from the bottom and worked my way up with the end to end functional > testing at the end, as already noted, but I realize you've been pushing > this boulder for a couple of releases now so that's not really something > you want to change at this point. > > I guess the question is should this change have a microversion at all? > That's been wrestled in the spec review and called out in this thread. I > don't think a microversion would be *wrong* in any sense and could only > help with discoverability on the nova side, but am open to other opinions. Sorry to be late to this discussion, but this brought up in the nova meeting today to get more thoughts. I'm going to briefly summarize my thoughts here. IMHO, I think this change should have a microversion, to help with discoverability. I'm thinking, how will users be able to detect they're able to leverage the new functionality otherwise? A microversion would signal the availability. As for dealing with the situation where a user specifies an older microversion combined with resource requests, I think it should behave similarly to how multiattach works, where the request will be rejected straight away if microversion too low + resource requests are passed. Current behavior today would be, the resource requests are ignored. If we only ignored the resource requests when they're passed with an older microversion, it seems like it would be an unnecessarily poor UX to have their parameters ignored and likely lead them on a debugging journey if and when they realize things aren't working the way they expect given the resource requests they specified. -melanie From melwittt at gmail.com Fri Jan 4 08:53:47 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 00:53:47 -0800 Subject: [nova][dev] tracking spec reviews ahead of spec freeze In-Reply-To: <67c6b11f-a6a5-67cf-1b31-8ae8b5e580ae@gmail.com> References: <67c6b11f-a6a5-67cf-1b31-8ae8b5e580ae@gmail.com> Message-ID: On Wed, 19 Dec 2018 08:49:10 -0800, Melanie Witt wrote: > Hey all, > > Spec freeze is coming up shortly after the holidays on January 10, 2019. > Since we don't have much time after returning to work next year before > spec freeze, let's keep track of specs that are close to approval or > otherwise candidates to focus on in the last stretch before freeze. > > Here's an etherpad where I've collected a list of possible candidates > for focus the first week of January. Feel free to add notes and specs I > might have missed: > > https://etherpad.openstack.org/p/nova-stein-blueprint-spec-freeze Just wanted to bump up this message now that we're back from the holiday break. Milestone s-2 and our spec/blueprint freeze is next Thursday January 10. Please use this etherpad to help with spec reviews and blueprint approvals ahead of the freeze. Best, -melanie From rafaelweingartner at gmail.com Fri Jan 4 12:13:24 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 4 Jan 2019 10:13:24 -0200 Subject: Openstack CLI using a non-zero return code for successful command. 
Message-ID: Hello OpenStackers, I have been using "openstack" CLI for a while now, and for most of the commands (the ones I used so far), when it is successful, I receive a return code (RC) 0 (ZERO). However, when using the command "openstack federation protocol set --identity-provider --mapping ", I am getting an RC 1 (ONE) as the exit code for successful executions as well. Is this a known bug? -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL:

From smooney at redhat.com Fri Jan 4 13:20:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 04 Jan 2019 13:20:54 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote: > On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann > wrote: > > On 12/28/2018 4:13 AM, Balázs Gibizer wrote: > > > I'm wondering that introducing an API microversion could act like a > > > feature flag I need and at the same time still make the feautre > > > discoverable as you would like to see it. Something like: Create a > > > feature flag in the code but do not put it in the config as a settable > > > flag. Instead add an API microversion patch to the top of the series > > > and when the new version is requested it enables the feature via the > > > feature flag. This API patch can be small and simple enough to > > > cherry-pick to earlier into the series for local end-to-end testing if > > > needed. Also in functional test I can set the flag via a mock so I can > > > add and run functional tests patch by patch. > > > > That may work. It's not how I would have done this, I would have started > > from the bottom and worked my way up with the end to end functional > > testing at the end, as already noted, but I realize you've been pushing > > this boulder for a couple of releases now so that's not really something > > you want to change at this point. > > > > I guess the question is should this change have a microversion at all? > > That's been wrestled in the spec review and called out in this thread. I > > don't think a microversion would be *wrong* in any sense and could only > > help with discoverability on the nova side, but am open to other opinions. > > Sorry to be late to this discussion, but this brought up in the nova > meeting today to get more thoughts. I'm going to briefly summarize my > thoughts here. > > IMHO, I think this change should have a microversion, to help with > discoverability. I'm thinking, how will users be able to detect they're > able to leverage the new functionality otherwise? A microversion would > signal the availability. As for dealing with the situation where a user > specifies an older microversion combined with resource requests, I think > it should behave similarly to how multiattach works, where the request > will be rejected straight away if microversion too low + resource > requests are passed.

This has implications for upgrades and version compatibility. If a newer version of neutron is used with an older nova, the behavior will change when nova is upgraded to a version that has the new microversion.

My concern is as follows. A given deployment has Rocky nova and Rocky neutron. A tenant defines a minimum bandwidth policy and applies it to a network. They create a port on that network.
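To make that scenario concrete, here is a rough CLI sketch of the kind of setup being described (a minimal sketch only: the policy, network, port and server names are placeholders, and option spellings can vary between releases):

  # Create a QoS policy with an egress minimum-bandwidth rule (10 Mbps here).
  openstack network qos policy create min-bw
  openstack network qos rule create min-bw \
      --type minimum-bandwidth --min-kbps 10000 --egress

  # Attach the policy to the tenant network; ports created on that network
  # inherit it (it could equally be set on the port itself).
  openstack network set --qos-policy min-bw tenant-net

  # Pre-create a port on that network and boot a VM with it.
  openstack port create --network tenant-net bw-port
  openstack server create --image cirros --flavor m1.small \
      --nic port-id=$(openstack port show bw-port -f value -c id) vm1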
Neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network, but we could also assume the tenant applied the policy to the port directly if we liked. The tenant then boots a VM with that port.

When the VM is scheduled to a node, neutron will ask the network backend via the ml2 driver to configure the minimum bandwidth policy, if the network backend supports it, as part of the bind port call. The ml2 driver can refuse to bind the port at this point if it cannot fulfil the request, to prevent the VM from spawning. Assuming the binding succeeds, the backend will configure the minimum bandwidth policy on the interface. Nova in Rocky will not schedule based on the QoS policy, as there is no resource request in the port and placement will not model bandwidth availability.

Note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other SDN controller backends several years ago, but ODL did not implement the required features, so this mechanism was never used. I am not aware of any ml2 driver that actually implemented the bandwidth check, but before placement was created this was the mechanism that at least my team at Intel and some others had been planning to use.

So in Rocky the VM should boot, there will be no prevention of oversubscription in placement, and neutron will configure the minimum bandwidth policy if the network backend supports it. Ingress QoS minimum bandwidth rules were only added to neutron more recently, but egress QoS minimum bandwidth support was added in Newton with https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec so there will be a lot of existing cases where ports have minimum bandwidth policies from before Stein.

If we repeat the same exercise with Rocky nova and Stein neutron, this changes slightly in that neutron will look at the QoS policy associated with the port and add a resource request. As Rocky nova will not have code to parse the resource requests from the neutron port, they will be ignored and the VM will boot, the neutron backend will configure minimum bandwidth enforcement on the port, and placement will model the bandwidth as an inventory, but no allocation will be created for the VM.

Note: I have not checked the neutron code to confirm that the QoS plugin will still work without the placement allocation, but if it does not, that is a bug, as Stein neutron would no longer work with pre-Stein nova. As such we would have broken the ability to upgrade nova and neutron separately.

If you use Stein nova and Stein neutron and the new microversion, then the VM boots, we allocate the bandwidth in placement, and we configure the enforcement in the networking backend if it supports it, which is our end goal.

The last configuration is Stein nova and Stein neutron with an old microversion. This will happen in two cases: first, when no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest microversion; second, when an explicit older microversion is passed.

If the last Rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then it would work the way it did with Rocky nova and Stein neutron above. If we choose to reject the request instead, anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to Stein.
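As a concrete illustration of what opting in would look like for a client (rather than relying on the ignore-or-reject fallback just described), the new microversion would have to be requested explicitly; 2.XX below is only a placeholder, since no version has actually been assigned to this feature:

  # Explicitly request a new-enough compute API microversion so the port's
  # resource request would be honoured (2.XX is a placeholder, not a real
  # version).
  openstack --os-compute-api-version 2.XX server create \
      --image cirros --flavor m1.small \
      --nic port-id=$(openstack port show bw-port -f value -c id) vm1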
While the fact that oversubscription may happen could be problematic to debug for some, I think the UX cost is less than the cost of updating all software that has used egress QoS since it was introduced in Newton to explicitly pass the latest microversion. I am in favor of adding a microversion, by the way; I just think we should ignore the resource request if an old microversion is used.

> Current behavior today would be, the resource > requests are ignored. If we only ignored the resource requests when > they're passed with an older microversion, it seems like it would be an > unnecessarily poor UX to have their parameters ignored and likely lead > them on a debugging journey if and when they realize things aren't > working the way they expect given the resource requests they specified. > > -melanie > > > >

From skaplons at redhat.com Fri Jan 4 13:28:04 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Fri, 4 Jan 2019 14:28:04 +0100 Subject: [oslo] Parallel Privsep is Proposed for Release In-Reply-To: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Hi, I just found that functional tests in Neutron have been failing since today or maybe yesterday. See [1]. I was able to reproduce it locally and it looks like it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 the tests are fine. [1] https://bugs.launchpad.net/neutron/+bug/1810518 — Slawek Kaplonski Senior software engineer Red Hat

> On 02.01.2019, at 19:17, Ben Nemec wrote: > > Yay alliteration! :-) > > I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. > > While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. > > So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. > > -Ben > > 1: https://review.openstack.org/628019 > 2: https://review.openstack.org/#/c/593556/ >

From doug at doughellmann.com Fri Jan 4 14:26:22 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:26:22 -0500 Subject: [oslo] Problem when use library "oslo.messaging" for HA Openstack In-Reply-To: References: Message-ID: Thành Nguyễn Bá writes: > No, it isn't. It raised when i use default settings on my client base on > "olso_messaging". And when i create the config file and use "oslo_config" > passed to tranport (get_notification_transport), it work :D > > Thank for your help. I'm glad to hear that it is working!
-- Doug From doug at doughellmann.com Fri Jan 4 14:41:28 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:41:28 -0500 Subject: Review-Priority for Project Repos In-Reply-To: <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> References: <20190103135155.GC27473@sm-workstation> <1546564946.3332290.1625035808.384EA5BF@webmail.messagingengine.com> Message-ID: Clark Boylan writes: > On Thu, Jan 3, 2019, at 4:26 PM, Jay Bryant wrote: >> >> > > snip > >> > So far, that's just a slight inconvenience. It would be great if we can figure >> > out a way to have them all be sticky, but if we need to live with reapplying +1 >> > votes, that's manageable to me. >> >> >> >>  Is there someway that we could allow the owner to reset this priority >> after pushing up a new patch.  That would lower the dependence on the >> cores in that case. > > If you use a three value label: [-1: +1] then you could set copy min > and max scores so all values are carried forward on new > patchsets. This would allow you to have -1 "Don't review", 0 "default > no special priority", and +1 "this is a priority please review > now". This may have to take advantage of the fact that if you don't > set a value its roughly the same as 0 (I don't know if this is > explicitly true in Gerrit but we can approximate it since -1 and +1 > would be explicitly set and query on those values). It is possible to tell the difference between not having a value set and having the default set, but as you point out if the dashboards are simply configured to look for +1 and -1 then the other distinction isn't important. > > If you need an explicit copy all values function in Gerrit you'll want > to get that merged upstream first then we could potentially backport > it to our Gerrit. This will likely require writing Java. We could also have a bot do it. The history of each patch is available, so it's possible to determine that a priority was set but lost when a new patch is submitted. The first step to having a bot would be to write the logic to fix the lost priorities, and if someone does that as a CLI then teams could use that by hand until someone configures the bot. > > For some reason I thought that Prolog predicates could be written for these value copying functions, but docs seem to say otherwise. Prolog is only for determining if a label's value allows a change to be submitted (merged). > > Clark > -- Doug From doug at doughellmann.com Fri Jan 4 14:46:55 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 04 Jan 2019 09:46:55 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: Jeremy Stanley writes: > First, I want to thank everyone here for the remarkably smooth > transition to openstack-discuss at the end of November. It's been > exactly one month today since we shuttered the old openstack, > openstack-dev, openstack-operators and openstack-sigs mailing lists > and forwarded all subsequent posts for them to the new list address > instead. The number of posts from non-subscribers has dwindled to > the point where it's now only a few each day (many of whom also > subscribe immediately after receiving the moderation autoresponse). > > As of this moment, we're up to 708 subscribers. Unfortunately it's > hard to compare raw subscriber counts because the longer a list is > in existence the more dead addresses it accumulates. 
Mailman does > its best to unsubscribe addresses which explicitly reject/bounce > multiple messages in a row, but these days many E-mail addresses > grow defunct without triggering any NDRs (perhaps because they've > simply been abandoned, or because their MTAs just blackhole new > messages for deleted accounts). Instead, it's a lot more concrete to > analyze active participants on mailing lists, especially since ours > are consistently configured to require a subscription if you want to > avoid your messages getting stuck in the moderation queue. > > Over the course of 2018 (at least until the lists were closed on > December 3) there were 1075 unique E-mail addresses posting to one > of more of the openstack, openstack-dev, openstack-operators and > openstack-sigs mailing lists. Now, a lot of those people sent one or > maybe a handful of messages to ask some question they had, and then > disappeared again... they didn't really follow ongoing discussions, > so probably won't subscribe to openstack-discuss until they have > something new to bring up. > > On the other hand, if we look at addresses which sent 10 or more > messages in 2018 (an arbitrary threshold admittedly), there were > 245. Comparing those to the list of addresses subscribed to > openstack-discuss today, there are 173 matches. That means we now > have *at least* 70% of the people who sent 10 or more messages to > the old lists subscribed to the new one. I say "at least" because we > don't have an easy way to track address changes, and even if we did > that's never going to get us to 100% because there are always going > to be people who leave the lists abruptly for various reasons > (perhaps even disappearing from our community entirely). Seems like > a good place to be after only one month, especially considering the > number of folks who may not have even been paying attention at all > during end-of-year holidays. > > As for message volume, we had a total of 912 posts to > openstack-discuss in the month of December; comparing to the 1033 > posts in total we saw to the four old lists in December of 2017, > that's a 12% drop. Consider, though, that right at 10% of the > messages on the old lists were duplicates from cross-posting, so > that's really more like a 2% drop in actual (deduplicated) posting > volume. It's far less of a reduction than I would have anticipated > based on year-over-year comparisons (for example, December of 2016 > had 1564 posts across those four lists). I think based on this, it's > safe to say the transition to openstack-discuss hasn't hampered > discussion, at least for its first full month in use. > -- > Jeremy Stanley Thank you, Jeremy, both for producing those reassuring stats and for managing the transition. The change has been much less disruptive than I was worried it would be (even though I considered it necessary from the start) and much of the credit for that goes to you for the careful way you have planned and implemented the merge. Nice job! -- Doug From cdent+os at anticdent.org Fri Jan 4 15:29:19 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 4 Jan 2019 15:29:19 +0000 (GMT) Subject: [placement] update 19-00 Message-ID: HTML: https://anticdent.org/placement-update-19-00.html Welcome to the first placement update of 2019. May all your placements have sufficient resources this year. # Most Important A few different people have mentioned that we're approaching crunch time on pulling the trigger on deleting the placement code from nova. 
The week of the 14th there will be a meeting to iron out the details of what needs to be done prior to that. If this is important to you, watch out for an announcement of when it will be. This is a separate issue from putting placement under its own governance, but some of the requirements [declared](http://lists.openstack.org/pipermail/openstack-dev/2018-September/134541.html) for that, notably a deployment tool demonstrating an upgrade from placement-in-nova to placement-alone, are relevant. Therefore, reviewing and tracking the deployment tool related work remains critical. Those are listed below. Also, it is spec freeze next week. There are quite a lot of specs that are relevant to placement and scheduling that are close, but not quite. Mel has sent out [an email](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001408.html) about which specs most need attention. # What's Changed * There's an [os-resource-classes](https://pypi.org/p/os-resource-classes) now, already merged in placement, with a change posted [for nova](https://review.openstack.org/#/c/628278/). It's effectively the same as os-traits, but for resource classes. * There's a release of a 0.1.0 of placement [pending](https://review.openstack.org/628400). This won't have complete documentation, but will mean that there's an actually usable openstack-placement on PyPI, with what we expect to be the final python module requirements. * This has been true for a while, but it seems worth mentioning, via coreycb: "you can install placement-api on bionic with the stein cloud archive enabled". * A `db stamp` command has been added to `placement-manage` tool which makes it possible for someone who has migrated their data from nova to say "I'm at version X". * placement functional tests have been removed from nova. * Matt did a mess of work to make initializing the scheduler report client in nova less expensive and redundant. * Improving the handling of [allocation ratios](https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#allocation-ratios) has merged, allowing for "initial allocation ratios". # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 15. -2. * [In progress placement bugs](https://goo.gl/vzGGDQ) 15. +2 # Specs Spec freeze next week! Only one of the previously listed specs has merged since early December and a new one has been added (at the end). * Account for host agg allocation ratio in placement (Still in rocky/) * Add subtree filter for GET /resource_providers * Resource provider - request group mapping in allocation candidate * VMware: place instances on resource pool (still in rocky/) * Standardize CPU resource tracking * Allow overcommit of dedicated CPU (Has an alternative which changes allocations to a float) * Modelling passthrough devices for report to placement * Nova Cyborg interaction specification. * supporting virtual NVDIMM devices * Proposes NUMA topology with RPs * Count quota based on resource class * Adds spec for instance live resize * Provider config YAML file * Propose counting quota usage from placement and API database * Resource modeling in cyborg. * Support filtering of allocation_candidates by forbidden aggregates * support virtual persistent memory # Main Themes ## Making Nested Useful Progress continues on gpu-reshaping for libvirt and xen: * Also making use of nested is bandwidth-resource-provider: * There's a [review guide](http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001129.html) for those patches. 
Eric's in the process of doing lots of cleanups to how often the ProviderTree in the resource tracker is checked against placement, and a variety of other "let's make this more right" changes in the same neighborhood: * Stack at: ## Extraction The [extraction etherpad](https://etherpad.openstack.org/p/placement-extract-stein-4) is starting to contain more strikethrough text than not. Progress is being made. I'll refactor that soon so it is more readable, before the week of the 14th meeting. The main tasks are the reshaper work mentioned above and the work to get deployment tools operating with an extracted placement: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) Loci's change to have an extracted placement has merged. Kolla has a patch to [include the upgrade script](https://review.openstack.org/#/q/topic:upgrade-placement). It raises the question of how or if the `mysql-migrate-db.sh` should be distributed. Should it maybe end up in the pypi distribution? Documentation tuneups: * Release-notes: This is blocked until we refactor the release notes to reflect _now_ better. * The main remaining task here is participating in [openstack-manuals](https://docs.openstack.org/doc-contrib-guide/doc-index.html), to that end: * A stack of changes to nova to remove placement from the install docs. * Install docs in placement. I wrote to the [mailing list](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001379.html) asking for input on making sure these things are close to correct, especially with regard to distro-specific things like package names. * Change to openstack-manuals to assert that placement is publishing install docs. Depends on the above. * There is a patch to [delete placement](https://review.openstack.org/#/c/618215/) from nova that we've put an administrative -2 on while we determine where things are (see about the meeting above). * There's a pending patch to support [online data migrations](https://review.openstack.org/#/c/624942/). This is important to make sure that fixup commands like `create_incomplete_consumers` can be safely removed from nova and implemented in placement. # Other There are currently 13 [open changes](https://review.openstack.org/#/q/project:openstack/placement+status:open) in placement itself. Most of the time critical work is happening elsewhere (notably the deployment tool changes listed above). Of those placement changes, the [database-related](https://review.openstack.org/#/q/owner:nakamura.tetsuro%2540lab.ntt.co.jp+status:open+project:openstack/placement) ones from Tetsuro are the most important. Outside of placement: * Neutron minimum bandwidth implementation * Add OWNERSHIP $SERVICE traits * zun: Use placement for unified resource management * WIP: add Placement aggregates tests (in tempest) * blazar: Consider the number of reservation inventory * Add placement client for basic GET operations (to tempest) # End Lot's of good work in progress. Our main task is making sure it all gets review and merged. The sooner we do, the sooner people get to use it and find all the bugs we're sure to have left lying around. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From mriedemos at gmail.com Fri Jan 4 15:50:46 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 09:50:46 -0600 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: Message-ID: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: > I've been working on detach-boot-volume[1] in Stein, we got the initial > design merged and while implementing we have meet some new problems and > now I'm amending the spec to cover these new problems[2]. [2] is https://review.openstack.org/#/c/619161/ > > The thing I want to discuss for wider opinion is that in the initial > design, we planned to support detach root volume for only STOPPED and > SHELVED/SHELVE_OFFLOADED instances. But then we found out that we > allowed to detach volumes for RESIZED/PAUSED/SOFT_DELETED instances as > well. Should we allow detaching root volume for instances in these > status too? Cases like RESIZE could be complicated for the revert resize > action, and it also seems unnecesary. The full set of allowed states for attaching and detaching are here: https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 Concerning those other states: RESIZED: There might be a case for attaching/detaching volumes based on flavor during a resize, but I'm not sure about the root volume in that case (that really sounds more like rebuild with a new image to me, which is a different blueprint). I'm also not sure how much people know about the ability to do this or what the behavior is on revert if you have changed the volumes while the server is resized. If we consider that when a user reverts a resize, they want to go back to the way things were for the root disk image, then I would think we should not allow changing out the root volume while resized. PAUSED: First, I'm not sure how much anyone uses the pause API (or suspend for that matter) although most of the virt drivers implement it. At one point you could attach volumes to suspended servers as well, but because libvirt didn't support it that was removed from the API (yay for non-discoverable backend-specific API behavior changes): https://review.openstack.org/#/c/83505/ Anyway, swapping the root volume on a paused instance seems dangerous to me, so until someone really has a good use case for it, then I think we should avoid that one as well. SOFT_DELETED: I really don't understand the use case for attaching/detaching volumes to/from a (soft) deleted server. If the server is deleted and only hanging around because it hasn't been reclaimed yet, there are really no guarantees that this would work, so again, I would just skip this one for the root volume changes. If the user really wants to play with the volumes attached to a soft deleted server, they should restore it first. So in summary, I think we should just not support any of those other states for attach/detach root volumes and only focus on stopped or shelved instances. -- Thanks, Matt From juliaashleykreger at gmail.com Fri Jan 4 15:53:54 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 4 Jan 2019 07:53:54 -0800 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision Message-ID: As some of you may or may not have heard, recently the Technical Committee approved a technical vision document [1]. 
The goal of the technical vision document is to try to provide a reference point for cloud infrastructure software in an ideal universe. It is naturally recognized that not all items will apply to all projects. With that in mind, we want to encourage projects to leverage the vision by performing a realistic self-evaluation to determine how their individual project compares to the technical vision: What gaps exist in the project that could be closed to be more in alignment with the vision? Are there aspects of the vision which are inappropriate for the project to such a degree that the vision itself should change, not the project? We envision the results of the evaluation to be added to each project's primary contributor documentation tree (/doc/source/contributor/vision-reflection.rst) as a list of bullet points detailing areas where a project feels they need adjustment to better align with the technical vision, and if the project already has visibility into a path forward, that as well. As with all things of this nature, we anticipate projects to treat the document as a living document and update it as each project's contributors feel necessary. If an individual project community feels something in the overall OpenStack community technical vision does not apply, that is okay. If the project community feels that something in the vision is wrong for the whole of OpenStack, please feel free to submit a revision to gerrit in order to start that discussion. Once projects have performed a realistic self-evaluation, we ask each project to then consider those items they identified in their future planning as areas that could use the attention of contributors. To be very explicit about this, the intent is to help enable projects to identify areas for improved alignment with the rest of OpenStack using a short, concise, easily consumable list that can be referenced in planning, or even by drive-by contributors if they are intrigued by a specific problem. Thanks, Julia Kreger & Chris Dent [1] https://governance.openstack.org/tc/reference/technical-vision.html From mriedemos at gmail.com Fri Jan 4 17:11:03 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 11:11:03 -0600 Subject: [goals][upgrade-checkers] Week R-14 Update Message-ID: <3bbb7683-1581-5414-1698-a08a0abed10b@gmail.com> There has not been much progress since the R-16 update [1] let's assume because of the holiday break. There is a new trove patch which replaces the placeholder check with a real upgrade check [2]. I have left comments on that review. I am not sure if that is due to some new changes in trove which require the upgrade check, or if this is just something that has always been needed when upgrading trove. [1] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001266.html [2] https://review.openstack.org/#/c/627555/ -- Thanks, Matt From miguel at mlavalle.com Fri Jan 4 17:56:49 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Fri, 4 Jan 2019 11:56:49 -0600 Subject: [openstack-dev] [neutron] Changing the meeting channel for Neutron upgrades weekly meeting Message-ID: Lujin and Nate and Neutrinos, Please be aware that the meeting room for the Neutron upgrades channel is being changed: https://review.openstack.org/#/c/626182. due to infra optimization. Cheers Miguel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cboylan at sapwetik.org Fri Jan 4 18:12:25 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 04 Jan 2019 10:12:25 -0800 Subject: Update on gate status for the new year Message-ID: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> I'm still not entirely caught up on everything after the holidays, but thought I would attempt to do another update on gate reliability issues since those were well received last month. Overall things look pretty good based on elastic-recheck data. That said I think this is mostly due to low test volume over the holidays and our 10 day index window. We should revisit this next week or the week after to get a more accurate view of things. On the infra team side of things we've got quota issues in a cloud region that has decreased our test node capacity. Waiting on people to return from holidays to take a look at that. We also started tracking hypervisor IDs for our test instances (thank you pabelanger) to try and help identify when specific hypervisors might be the cause of some of our issues. https://review.openstack.org/628642 is a followup to index that data with our job log data in Elasticsearch. We've seen some ssh failures in tripleo jobs on limestone [0] and neutron and zuul report constrained IOPS there resulting in failed database migrations. I think the idea with 628642 is to see if we can narrow that down to specific hypervisors. On the project side of things our categorization rates are quite low [1][2]. If your changes are evicted from the gate due to failures it would be helpful if you could spend a few minutes to try and identify and fingerprint those failures. We'll check back in a week or two when we should have a much better data set to look at. [0] http://status.openstack.org/elastic-recheck/index.html#18100542 [1] http://status.openstack.org/elastic-recheck/data/integrated_gate.html [2] http://status.openstack.org/elastic-recheck/data/others.html Clark From mihalis68 at gmail.com Fri Jan 4 18:44:32 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Fri, 4 Jan 2019 13:44:32 -0500 Subject: [all] One month with openstack-discuss (a progress report) In-Reply-To: References: <20190103194151.zhnqx5esj76xhkxa@yuggoth.org> Message-ID: Yes Jeremy actually did an amazing job, agree with all the positive comments above. Chris On Fri, Jan 4, 2019 at 9:50 AM Doug Hellmann wrote: > Jeremy Stanley writes: > > > First, I want to thank everyone here for the remarkably smooth > > transition to openstack-discuss at the end of November. It's been > > exactly one month today since we shuttered the old openstack, > > openstack-dev, openstack-operators and openstack-sigs mailing lists > > and forwarded all subsequent posts for them to the new list address > > instead. The number of posts from non-subscribers has dwindled to > > the point where it's now only a few each day (many of whom also > > subscribe immediately after receiving the moderation autoresponse). > > > > As of this moment, we're up to 708 subscribers. Unfortunately it's > > hard to compare raw subscriber counts because the longer a list is > > in existence the more dead addresses it accumulates. Mailman does > > its best to unsubscribe addresses which explicitly reject/bounce > > multiple messages in a row, but these days many E-mail addresses > > grow defunct without triggering any NDRs (perhaps because they've > > simply been abandoned, or because their MTAs just blackhole new > > messages for deleted accounts). 
Instead, it's a lot more concrete to > > analyze active participants on mailing lists, especially since ours > > are consistently configured to require a subscription if you want to > > avoid your messages getting stuck in the moderation queue. > > > > Over the course of 2018 (at least until the lists were closed on > > December 3) there were 1075 unique E-mail addresses posting to one > > of more of the openstack, openstack-dev, openstack-operators and > > openstack-sigs mailing lists. Now, a lot of those people sent one or > > maybe a handful of messages to ask some question they had, and then > > disappeared again... they didn't really follow ongoing discussions, > > so probably won't subscribe to openstack-discuss until they have > > something new to bring up. > > > > On the other hand, if we look at addresses which sent 10 or more > > messages in 2018 (an arbitrary threshold admittedly), there were > > 245. Comparing those to the list of addresses subscribed to > > openstack-discuss today, there are 173 matches. That means we now > > have *at least* 70% of the people who sent 10 or more messages to > > the old lists subscribed to the new one. I say "at least" because we > > don't have an easy way to track address changes, and even if we did > > that's never going to get us to 100% because there are always going > > to be people who leave the lists abruptly for various reasons > > (perhaps even disappearing from our community entirely). Seems like > > a good place to be after only one month, especially considering the > > number of folks who may not have even been paying attention at all > > during end-of-year holidays. > > > > As for message volume, we had a total of 912 posts to > > openstack-discuss in the month of December; comparing to the 1033 > > posts in total we saw to the four old lists in December of 2017, > > that's a 12% drop. Consider, though, that right at 10% of the > > messages on the old lists were duplicates from cross-posting, so > > that's really more like a 2% drop in actual (deduplicated) posting > > volume. It's far less of a reduction than I would have anticipated > > based on year-over-year comparisons (for example, December of 2016 > > had 1564 posts across those four lists). I think based on this, it's > > safe to say the transition to openstack-discuss hasn't hampered > > discussion, at least for its first full month in use. > > -- > > Jeremy Stanley > > Thank you, Jeremy, both for producing those reassuring stats and for > managing the transition. The change has been much less disruptive than I > was worried it would be (even though I considered it necessary from the > start) and much of the credit for that goes to you for the careful way > you have planned and implemented the merge. Nice job! > > -- > Doug > > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ken at jots.org Fri Jan 4 19:45:43 2019 From: ken at jots.org (Ken D'Ambrosio) Date: Fri, 04 Jan 2019 14:45:43 -0500 Subject: Per-VM CPU & RAM allocation? Message-ID: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Hi! If I go into the UI, I can easily see how much each VM is allocated for RAM and CPU. However, I've googled until I'm blue in the face, and can't seem to see a way -- either through CLI or API -- to get that info. "nova limits --tenant " SEEMS like it should... except (at least on my Juno cloud), the "used" column is either full of zeros or dashes. Clearly, if it's in the UI, it's possible... somehow. 
But it seemed like it might be easier to ask the list than go down the rabbit hole of tcp captures. Any ideas? Thanks! -Ken P.S. Not interested in CPU-hours/usage -- I'm just looking for how much is actually allocated. From jungleboyj at gmail.com Fri Jan 4 19:53:39 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 4 Jan 2019 13:53:39 -0600 Subject: [cinder] Addition mid-cycle details ... Message-ID: Team, We are at about a month away from the Cinder Mid-Cycle in Raleigh.  I have started requesting drinks/snacks and have rooms reserved.  I soon will need a firm number of people attending so that I can finalize the various requests. If you are planning to physically attend and have not yet added your name to our planning etherpad [1] please do so ASAP. I have reserved a room at the Hyatt House Raleigh Durham Airport (10962 Chapel Hill Road, Morrisville, NC, 27560, USA ) if people want to stay at the same hotel as myself.  It is close to the Lenovo site which will make it easier to travel if we have unexpected snowy weather there.  We can also carpool to reduce the problem of finding parking. I am arriving the afternoon of 2/4/19 and leaving the morning of 2/9/19. Also a reminder to please add topics to the mid-cycle planning etherpad regardless of whether you are able to attend or not. Look forward to seeing you in Raleigh next month! Jay (jungleboyj) [1] https://etherpad.openstack.org/p/cinder-stein-mid-cycle-planning -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri Jan 4 19:59:11 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 13:59:11 -0600 Subject: Update on gate status for the new year In-Reply-To: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> References: <1546625545.3567722.1625642368.48EE2F8A@webmail.messagingengine.com> Message-ID: On 1/4/2019 12:12 PM, Clark Boylan wrote: > Overall things look pretty good based on elastic-recheck data. That said I think this is mostly due to low test volume over the holidays and our 10 day index window. We should revisit this next week or the week after to get a more accurate view of things. > > On the infra team side of things we've got quota issues in a cloud region that has decreased our test node capacity. Waiting on people to return from holidays to take a look at that. We also started tracking hypervisor IDs for our test instances (thank you pabelanger) to try and help identify when specific hypervisors might be the cause of some of our issues.https://review.openstack.org/628642 is a followup to index that data with our job log data in Elasticsearch. > > We've seen some ssh failures in tripleo jobs on limestone [0] and neutron and zuul report constrained IOPS there resulting in failed database migrations. I think the idea with 628642 is to see if we can narrow that down to specific hypervisors. > > On the project side of things our categorization rates are quite low [1][2]. If your changes are evicted from the gate due to failures it would be helpful if you could spend a few minutes to try and identify and fingerprint those failures. On a side note, I've noticed tempest jobs failing and elastic-recheck wasn't commenting on the changes. Turns out that's because we're using a really limited regex for the jobs that e-r will process in order to comment on a change in gerrit. 
The following patch should help with that: https://review.openstack.org/#/c/628669/ But since "dsvm" isn't standard in job names anymore it's clear that e-r is going to be skipping a lot of project-specific jobs which otherwise have categorized failures. -- Thanks, Matt From sean.mcginnis at gmx.com Fri Jan 4 20:03:00 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 4 Jan 2019 14:03:00 -0600 Subject: Per-VM CPU & RAM allocation? In-Reply-To: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> References: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Message-ID: <20190104200300.GB22595@sm-workstation> On Fri, Jan 04, 2019 at 02:45:43PM -0500, Ken D'Ambrosio wrote: > Hi! If I go into the UI, I can easily see how much each VM is allocated for > RAM and CPU. However, I've googled until I'm blue in the face, and can't > seem to see a way -- either through CLI or API -- to get that info. "nova > limits --tenant " SEEMS like it should... except (at least on my Juno > cloud), the "used" column is either full of zeros or dashes. Clearly, if > it's in the UI, it's possible... somehow. But it seemed like it might be > easier to ask the list than go down the rabbit hole of tcp captures. > > Any ideas? > > Thanks! > > -Ken > > P.S. Not interested in CPU-hours/usage -- I'm just looking for how much is > actually allocated. > Hey Ken, Those values are set based on the flavor that is chosen when creating the instance. You can get that information of a running instance by: openstack server show And looking at the flavor of the instance. I believe it's in the format of "flavor.name (id)". You can then do: openstack flavor show or just: openstack flavor list to get the RAM and VCPUs values defined for that flavor. There are corresponding API calls you can make. Add "--debug" to those CLI calls to get the debug output that shows curl examples of the REST APIs being called. Hope that helps. Sean From mriedemos at gmail.com Fri Jan 4 20:05:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 4 Jan 2019 14:05:08 -0600 Subject: Per-VM CPU & RAM allocation? In-Reply-To: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> References: <5b7ab82d45e3ecdfa11a7768647b1040@jots.org> Message-ID: <8bd9b015-48d5-62a0-dafc-035e9407ff76@gmail.com> On 1/4/2019 1:45 PM, Ken D'Ambrosio wrote: > Hi!  If I go into the UI, I can easily see how much each VM is allocated > for RAM and CPU.  However, I've googled until I'm blue in the face, and > can't seem to see a way -- either through CLI or API -- to get that > info.  "nova limits --tenant " SEEMS like it should... except (at > least on my Juno cloud), the "used" column is either full of zeros or > dashes.  Clearly, if it's in the UI, it's possible... somehow.  But it > seemed like it might be easier to ask the list than go down the rabbit > hole of tcp captures. You might be looking for the os-simple-tenant-usages API [1]. That is per-tenant but the response has per-server results in it. If you were on something newer (Ocata+) you could use [2] to get per-instance resource allocations for VCPU and MEMORY_MB. I'm not sure what the UI is doing, but it might simply be getting the flavor used for each VM and showing the details about that flavor, which you could also get (more reliably) with the server details themselves starting in microversion 2.47 (added in Pike). 
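A minimal CLI sketch of the flavor-based approach described above (the server and flavor names are placeholders):

  # Show which flavor the VM was booted with.
  openstack server show my-vm -c name -c flavor

  # The flavor holds the allocated vCPU, RAM and disk values.
  openstack flavor show m1.small -c vcpus -c ram -c disk

  # Add --debug to any of these to see the underlying REST calls.
  openstack server show my-vm --debug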
[1] https://developer.openstack.org/api-ref/compute/?expanded=show-usage-statistics-for-tenant-detail#show-usage-statistics-for-tenant [2] https://developer.openstack.org/api-ref/placement/?expanded=list-allocations-detail#list-allocations -- Thanks, Matt From melwittt at gmail.com Fri Jan 4 23:33:00 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 15:33:00 -0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann wrote: > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: >> I've been working on detach-boot-volume[1] in Stein, we got the initial >> design merged and while implementing we have meet some new problems and >> now I'm amending the spec to cover these new problems[2]. > > [2] is https://review.openstack.org/#/c/619161/ > >> >> The thing I want to discuss for wider opinion is that in the initial >> design, we planned to support detach root volume for only STOPPED and >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we >> allowed to detach volumes for RESIZED/PAUSED/SOFT_DELETED instances as >> well. Should we allow detaching root volume for instances in these >> status too? Cases like RESIZE could be complicated for the revert resize >> action, and it also seems unnecesary. > > The full set of allowed states for attaching and detaching are here: > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 > > Concerning those other states: > > RESIZED: There might be a case for attaching/detaching volumes based on > flavor during a resize, but I'm not sure about the root volume in that > case (that really sounds more like rebuild with a new image to me, which > is a different blueprint). I'm also not sure how much people know about > the ability to do this or what the behavior is on revert if you have > changed the volumes while the server is resized. If we consider that > when a user reverts a resize, they want to go back to the way things > were for the root disk image, then I would think we should not allow > changing out the root volume while resized. Yeah, if someone attaches/detaches a regular volume while the instance is in VERIFY_RESIZE state and then reverts the resize, I assume we probably don't attempt to change or restore anything with the volume attachments to put them back to how they were attached before the resize. But as you point out, the situation does seem different regarding a root volume. If a user changes that while in VERIFY_RESIZE and reverts the resize, and we leave the root volume alone, then they end up with a different root disk image than they had before the resize. Which seems weird. I agree it seems better not to allow this for now and come back to it later if people start asking for it. > PAUSED: First, I'm not sure how much anyone uses the pause API (or > suspend for that matter) although most of the virt drivers implement it. 
> At one point you could attach volumes to suspended servers as well, but > because libvirt didn't support it that was removed from the API (yay for > non-discoverable backend-specific API behavior changes): > > https://review.openstack.org/#/c/83505/ > > Anyway, swapping the root volume on a paused instance seems dangerous to > me, so until someone really has a good use case for it, then I think we > should avoid that one as well. > > SOFT_DELETED: I really don't understand the use case for > attaching/detaching volumes to/from a (soft) deleted server. If the > server is deleted and only hanging around because it hasn't been > reclaimed yet, there are really no guarantees that this would work, so > again, I would just skip this one for the root volume changes. If the > user really wants to play with the volumes attached to a soft deleted > server, they should restore it first. > > So in summary, I think we should just not support any of those other > states for attach/detach root volumes and only focus on stopped or > shelved instances. Again, agree, I think we should just not allow the other states for the initial implementation and revisit later if it turns out people need these. -melanie From aspiers at suse.com Fri Jan 4 23:45:30 2019 From: aspiers at suse.com (Adam Spiers) Date: Fri, 4 Jan 2019 23:45:30 +0000 Subject: [docs] question about deprecation badges Message-ID: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Hi all, I'm currently hacking on the deprecation badges in openstack-manuals, and there's a couple of things I don't understand. Any chance someone could explain why www/latest/badge.html doesn't just do: {% include 'templates/deprecated_badge.tmpl' %} like all the others? https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 I'm also what exactly would be wrong with the included CSS path if CSSDIR was used in www/templates/deprecated_badge.tmpl instead of heeding this caveat: https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 It would be good to fix it to use CSSDIR because currently it's awkward to test CSS changes. Thanks! Adam From johnsomor at gmail.com Sat Jan 5 00:02:43 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Fri, 4 Jan 2019 16:02:43 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi Jeff, Unfortunately the team that was working on that code had stopped due to internal reasons. I hope to make the reference active/active blueprint a priority again during the Train cycle. Following that I may be able to look at the L3 distributor option, but I cannot commit to that at this time. If you are interesting in picking up that work, please let me know and we can sync up on that status of the WIP patches, etc. Michael On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > Dear Octavia team: > The email aims to ask the development progress about l3-active-active blueprint. I > noticed that the work in this area has been stagnant for eight months. > https://review.openstack.org/#/q/l3-active-active > I want to know the community's next work plan in this regard. > Thanks. 
From melwittt at gmail.com Sat Jan 5 00:35:21 2019 From: melwittt at gmail.com (melanie witt) Date: Fri, 4 Jan 2019 16:35:21 -0800 Subject: [nova] review guide for the bandwidth patches In-Reply-To: References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> Message-ID: <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> On Fri, 04 Jan 2019 13:20:54 +0000, Sean Mooney wrote: > On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote: >> On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann >> wrote: >>> On 12/28/2018 4:13 AM, Balázs Gibizer wrote: >>>> I'm wondering that introducing an API microversion could act like a >>>> feature flag I need and at the same time still make the feautre >>>> discoverable as you would like to see it. Something like: Create a >>>> feature flag in the code but do not put it in the config as a settable >>>> flag. Instead add an API microversion patch to the top of the series >>>> and when the new version is requested it enables the feature via the >>>> feature flag. This API patch can be small and simple enough to >>>> cherry-pick to earlier into the series for local end-to-end testing if >>>> needed. Also in functional test I can set the flag via a mock so I can >>>> add and run functional tests patch by patch. >>> >>> That may work. It's not how I would have done this, I would have started >>> from the bottom and worked my way up with the end to end functional >>> testing at the end, as already noted, but I realize you've been pushing >>> this boulder for a couple of releases now so that's not really something >>> you want to change at this point. >>> >>> I guess the question is should this change have a microversion at all? >>> That's been wrestled in the spec review and called out in this thread. I >>> don't think a microversion would be *wrong* in any sense and could only >>> help with discoverability on the nova side, but am open to other opinions. >> >> Sorry to be late to this discussion, but this brought up in the nova >> meeting today to get more thoughts. I'm going to briefly summarize my >> thoughts here. >> >> IMHO, I think this change should have a microversion, to help with >> discoverability. I'm thinking, how will users be able to detect they're >> able to leverage the new functionality otherwise? A microversion would >> signal the availability. As for dealing with the situation where a user >> specifies an older microversion combined with resource requests, I think >> it should behave similarly to how multiattach works, where the request >> will be rejected straight away if microversion too low + resource >> requests are passed. > > this has implcations for upgrades and virsion compatiablity. > if a newver version of neutron is used with older nova then > behavior will change when nova is upgraded to a version of > nova the has the new micoversion. > > my concern is as follows. > a given deployment has rocky nova and rocky neutron. > a teant define a minium bandwidth policy and applise it to a network. > they create a port on that network. > neutorn will automatically apply the minium bandwith policy to the port when it is created on the network. > but we could also assuume the tenatn applied the policy to the port if we liked. > the tanant then boots a vm with that port. > > when the vm is schduled to a node neutron will ask the network backend via the ml2 driver to configure the minium > bandwith policy if the network backend supports it as part of the bind port call. 
the ml2 driver can refuse to bind the > port at this point if it cannot fulfil the request, to prevent the vm from spawning. assuming the binding succeeds, the > backend will configure the minimum bandwidth policy on the interface. nova in rocky will not schedule based on the qos > policy as there is no resource request in the port and placement will not model bandwidth availability. > > note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other sdn controller > backends several years ago, but odl did not implement the required features so this mechanism was never used. > i am not aware of any ml2 driver that actually implemented the bandwidth check, but before placement was created this was > the mechanism that at least my team at intel and some others had been planning to use. > > > so in rocky the vm should boot, there will be no prevention of oversubscription in placement, and neutron will configure > the minimum bandwidth policy if the network backend supports it. the ingress qos minimum bandwidth rules were only added to > neutron later, but egress qos minimum bandwidth support was added in newton with > https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf24270d57417#diff-4bbb0b6d12a0d060196c0e3f10e57cec > so there will be a lot of existing cases where ports have minimum bandwidth policies before stein. > > if we repeat the same exercise with rocky nova and stein neutron, this changes slightly in that > neutron will look at the qos policy associated with the port and add a resource request. as rocky nova > will not have code to parse the resource requests from the neutron port, they will be ignored and > the vm will boot, the neutron backend will configure minimum bandwidth enforcement on the port, and placement will > model the bandwidth as an inventory but no allocation will be created for the vm. > > note: i have not checked the neutron side to confirm the qos plugin will still work without the placement allocation, > but if it does not it is a bug, as stein neutron would no longer work with pre-stein nova. as such we would have > broken the ability to upgrade nova and neutron separately. > > if you use stein nova and stein neutron and the new microversion then the vm boots, we allocate the bandwidth in > placement and configure the enforcement in the networking backend if it supports it, which is our end goal. > > the last configuration is stein nova and stein neutron with an old microversion. > this will happen in two cases: > first, no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest > microversion; second, an explicit (older) microversion is passed. > > if the last rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then > it would work the way it did with rocky nova and stein neutron above. if we choose to reject the request instead, > anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to stein. > > while the fact that oversubscription may happen could be problematic to debug for some, i think the ux cost is less than > the cost of updating all software that has used egress qos since it was introduced in newton to explicitly pass the latest > microversion. > > i am in favor of adding a microversion by the way, i just think we should ignore the resource request if an old > microversion is used. 
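For reference, a minimal sketch of the resource_request that Neutron attaches to a port carrying a minimum bandwidth QoS policy. The values are invented and the field, resource class and trait names reflect my understanding of the Stein-era QoS bandwidth design, so treat it as illustrative rather than authoritative:

    # Illustrative only: rough shape of a Neutron port's resource_request
    # for a minimum-bandwidth QoS policy. Amounts are made up; class and
    # trait names are assumptions based on the Stein QoS bandwidth work.
    port_resource_request = {
        "resources": {
            "NET_BW_EGR_KILOBIT_PER_SEC": 1000,   # egress minimum, in kbps
            "NET_BW_IGR_KILOBIT_PER_SEC": 1000,   # ingress minimum, in kbps
        },
        # Traits that steer the allocation to the right physnet / vnic-type
        # resource provider.
        "required": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"],
    }

An older nova that does not know about this field simply never reads it, which is the "ignored" case described above.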
Thanks for describing this detailed scenario -- I wasn't realizing that today, you can get _some_ QoS support by pre-creating ports in neutron with resource requests attached and specifying those ports when creating a server. I understand now the concern with the idea of rejecting requests < new microversion + port.resource_request existing on pre-created ports. And there's no notion of being able to request QoS support via ports created by Nova (no change in Nova API or flavor extra-specs in the design). So, I could see this situation being reason enough not to reject requests when an old microversion is specified. But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this. >> Current behavior today would be, the resource >> requests are ignored. If we only ignored the resource requests when >> they're passed with an older microversion, it seems like it would be an >> unnecessarily poor UX to have their parameters ignored and likely lead >> them on a debugging journey if and when they realize things aren't >> working the way they expect given the resource requests they specified. >> >> -melanie >> >> >> >> > From yjf1970231893 at gmail.com Sat Jan 5 03:29:11 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Sat, 5 Jan 2019 11:29:11 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Yes, I want to reboot this work recently. And, I want to replace exabgp with os-ken, So I may need to rewrite some patches. Michael Johnson 于2019年1月5日周六 上午8:02写道: > Hi Jeff, > > Unfortunately the team that was working on that code had stopped due > to internal reasons. > > I hope to make the reference active/active blueprint a priority again > during the Train cycle. Following that I may be able to look at the L3 > distributor option, but I cannot commit to that at this time. > > If you are interesting in picking up that work, please let me know and > we can sync up on that status of the WIP patches, etc. > > Michael > > On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > > > Dear Octavia team: > > The email aims to ask the development progress about > l3-active-active blueprint. I > > noticed that the work in this area has been stagnant for eight months. > > https://review.openstack.org/#/q/l3-active-active > > I want to know the community's next work plan in this regard. > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saphi070 at gmail.com Sat Jan 5 03:58:35 2019 From: saphi070 at gmail.com (Sa Pham) Date: Sat, 5 Jan 2019 10:58:35 +0700 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: <3F4AE7E0-CE23-4098-8C79-225781E0BBBF@gmail.com> Hi Jeff, Do you have design and specs for that. Best, Sa Pham Dang Cloud RnD Team - VCCLOUD Phone: 0986849582 Skype: great_bn > On Jan 5, 2019, at 10:29 AM, Jeff Yang wrote: > > Yes, I want to reboot this work recently. And, I want to replace exabgp with > os-ken, So I may need to rewrite some patches. > > Michael Johnson 于2019年1月5日周六 上午8:02写道: >> Hi Jeff, >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. 
Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. >> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From smarcet at gmail.com Sat Jan 5 04:19:21 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:19:21 -0300 Subject: [docs] question about deprecation badges In-Reply-To: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: Hi Adam, that approach is followed in order to be consumed from other projects like theme docs, using an ajax call doing a get and including its content dynamically on the page, also the css dir its not used bc otherwise it will not point to the rite css path on the fore-mentioned approach, instead its using the absolute path in order to load the css correctly through the ajax call ( check https://review.openstack.org/#/c/585516/) on docs theme the deprecation badge is loaded using this snippet on file openstackdocstheme/theme/openstackdocs/layout.html hope that shed some lite , have in mind that its a first iteration and any better approach its welcome regards El vie., 4 ene. 2019 a las 20:45, Adam Spiers () escribió: > Hi all, > > I'm currently hacking on the deprecation badges in openstack-manuals, > and there's a couple of things I don't understand. Any chance someone > could explain why www/latest/badge.html doesn't just do: > > {% include 'templates/deprecated_badge.tmpl' %} > > like all the others? > > > https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 > > I'm also what exactly would be wrong with the included CSS path if > CSSDIR was used in www/templates/deprecated_badge.tmpl instead of > heeding this caveat: > > > https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 > > It would be good to fix it to use CSSDIR because currently it's > awkward to test CSS changes. > > Thanks! > Adam > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Sat Jan 5 04:31:28 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Sat, 5 Jan 2019 12:31:28 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: <3F4AE7E0-CE23-4098-8C79-225781E0BBBF@gmail.com> Message-ID: I have no new specs, I plan to follow the original blueprint. https://docs.openstack.org/octavia/latest/contributor/specs/version1.1/active-active-l3-distributor.html Jeff Yang 于2019年1月5日周六 下午12:16写道: > I have no new specs, I plan to follow the original blueprint. 
> > https://docs.openstack.org/octavia/latest/contributor/specs/version1.1/active-active-l3-distributor.html > > Sa Pham 于2019年1月5日周六 上午11:58写道: > >> Hi Jeff, >> >> Do you have design and specs for that. >> >> >> Best, >> >> Sa Pham Dang >> Cloud RnD Team - VCCLOUD >> Phone: 0986849582 >> Skype: great_bn >> >> On Jan 5, 2019, at 10:29 AM, Jeff Yang wrote: >> >> Yes, I want to reboot this work recently. And, I want to replace >> exabgp with >> os-ken, So I may need to rewrite some patches. >> >> Michael Johnson 于2019年1月5日周六 上午8:02写道: >> >>> Hi Jeff, >>> >>> Unfortunately the team that was working on that code had stopped due >>> to internal reasons. >>> >>> I hope to make the reference active/active blueprint a priority again >>> during the Train cycle. Following that I may be able to look at the L3 >>> distributor option, but I cannot commit to that at this time. >>> >>> If you are interesting in picking up that work, please let me know and >>> we can sync up on that status of the WIP patches, etc. >>> >>> Michael >>> >>> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang >>> wrote: >>> > >>> > Dear Octavia team: >>> > The email aims to ask the development progress about >>> l3-active-active blueprint. I >>> > noticed that the work in this area has been stagnant for eight months. >>> > https://review.openstack.org/#/q/l3-active-active >>> > I want to know the community's next work plan in this regard. >>> > Thanks. >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smarcet at gmail.com Sat Jan 5 04:34:11 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:34:11 -0300 Subject: [docs] question about deprecation badges In-Reply-To: References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: and the latest release its different bc its a corner case, there is not way to figure it out from the template logic if current navigated release its the latest so that is why its has its own template and logic regards El sáb., 5 ene. 2019 a las 1:19, Sebastian Marcet () escribió: > Hi Adam, that approach is followed in order to be consumed from other > projects like theme docs, using an ajax call doing a get and including its > content dynamically on the page, also the css dir its not used bc otherwise > it will not point to the rite css path on the fore-mentioned approach, > instead its using the absolute path in order to load the css correctly > through the ajax call ( check https://review.openstack.org/#/c/585516/) > > on docs theme the deprecation badge is loaded using this snippet > > > > on file openstackdocstheme/theme/openstackdocs/layout.html > > hope that shed some lite , have in mind that its a first iteration and any > better approach its welcome > > regards > > > El vie., 4 ene. 2019 a las 20:45, Adam Spiers () > escribió: > >> Hi all, >> >> I'm currently hacking on the deprecation badges in openstack-manuals, >> and there's a couple of things I don't understand. Any chance someone >> could explain why www/latest/badge.html doesn't just do: >> >> {% include 'templates/deprecated_badge.tmpl' %} >> >> like all the others? 
>> >> >> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 >> >> I'm also what exactly would be wrong with the included CSS path if >> CSSDIR was used in www/templates/deprecated_badge.tmpl instead of >> heeding this caveat: >> >> >> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 >> >> It would be good to fix it to use CSSDIR because currently it's >> awkward to test CSS changes. >> >> Thanks! >> Adam >> > > > -- > Sebastian Marcet > https://ar.linkedin.com/in/smarcet > SKYPE: sebastian.marcet > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From smarcet at gmail.com Sat Jan 5 04:40:21 2019 From: smarcet at gmail.com (Sebastian Marcet) Date: Sat, 5 Jan 2019 01:40:21 -0300 Subject: [docs] question about deprecation badges In-Reply-To: References: <20190104234530.nn3f7ay4izzfgy5b@pacific.linksys.moosehall> Message-ID: you could check the reason here https://review.openstack.org/#/c/585517/ https://review.openstack.org/#/c/585517/1/openstackdocstheme/theme/openstackdocs/layout.html regards El sáb., 5 ene. 2019 a las 1:34, Sebastian Marcet () escribió: > and the latest release its different bc its a corner case, there is not > way to figure it out from the template logic if current navigated release > its the latest so that is why its has its own template and logic regards > > El sáb., 5 ene. 2019 a las 1:19, Sebastian Marcet () > escribió: > >> Hi Adam, that approach is followed in order to be consumed from other >> projects like theme docs, using an ajax call doing a get and including its >> content dynamically on the page, also the css dir its not used bc otherwise >> it will not point to the rite css path on the fore-mentioned approach, >> instead its using the absolute path in order to load the css correctly >> through the ajax call ( check https://review.openstack.org/#/c/585516/) >> >> on docs theme the deprecation badge is loaded using this snippet >> >> >> >> on file openstackdocstheme/theme/openstackdocs/layout.html >> >> hope that shed some lite , have in mind that its a first iteration and >> any better approach its welcome >> >> regards >> >> >> El vie., 4 ene. 2019 a las 20:45, Adam Spiers () >> escribió: >> >>> Hi all, >>> >>> I'm currently hacking on the deprecation badges in openstack-manuals, >>> and there's a couple of things I don't understand. Any chance someone >>> could explain why www/latest/badge.html doesn't just do: >>> >>> {% include 'templates/deprecated_badge.tmpl' %} >>> >>> like all the others? >>> >>> >>> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-61d0adc734c25e15fa375c6acd344703 >>> >>> I'm also what exactly would be wrong with the included CSS path if >>> CSSDIR was used in www/templates/deprecated_badge.tmpl instead of >>> heeding this caveat: >>> >>> >>> https://github.com/openstack/openstack-manuals/commit/961f544a4ec383d8b500afd82dda5dc333f689d1#diff-67d1669c09d2cddc437c6d803a5d6c02R4 >>> >>> It would be good to fix it to use CSSDIR because currently it's >>> awkward to test CSS changes. >>> >>> Thanks! 
>>> Adam >>> >> >> >> -- >> Sebastian Marcet >> https://ar.linkedin.com/in/smarcet >> SKYPE: sebastian.marcet >> > > > -- > Sebastian Marcet > https://ar.linkedin.com/in/smarcet > SKYPE: sebastian.marcet > -- Sebastian Marcet https://ar.linkedin.com/in/smarcet SKYPE: sebastian.marcet -------------- next part -------------- An HTML attachment was scrubbed... URL: From qianbiao.ng at turnbig.net Sat Jan 5 09:11:26 2019 From: qianbiao.ng at turnbig.net (qianbiao.ng at turnbig.net) Date: Sat, 5 Jan 2019 17:11:26 +0800 Subject: Ironic ibmc driver for Huawei server Message-ID: <201901051711257416397@turnbig.net>+5FE654154F3343E2 Hi julia, According to the comment of story, 1. The spec for huawei ibmc drvier has been post here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. 2. About the third-party CI part, we provide mocked unittests for our driver's code. Not sure what third-party CI works for in this case. What else we should do? Thanks Qianbiao.NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Mon Jan 7 03:22:55 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Mon, 7 Jan 2019 11:22:55 +0800 Subject: [murano]Retire of murano-deployment project Message-ID: Hi, teams, Murano-deployment project is the third-party CI scripts and automatic tools. Since the third-party CI have already switched to openstack CI, this project doesn't need anymore. So we decided to retire murano-deployment project. Thanks, Rong Zhu -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... URL: From yongli.he at intel.com Mon Jan 7 05:06:24 2019 From: yongli.he at intel.com (yonglihe) Date: Mon, 7 Jan 2019 13:06:24 +0800 Subject: [nova] implementation options for nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: On 2019/1/4 上午3:12, Matt Riedemann wrote: > On 1/3/2019 6:39 AM, Jay Pipes wrote: >> On 01/02/2019 10:15 PM, yonglihe wrote: >>> On 2018/12/18 下午4:20, yonglihe wrote: >>>> Hi, guys >>>> >>>> This spec needs input and discuss for move on. >>> >>> Jay suggest we might be good to use a new sub node to hold topology >>> stuff,  it's option 2, here. And split >>> >>> the PCI stuff out of this NUMA thing spec, use a /devices node to >>> hold all 'devices' stuff instead, then this node >>> >>> is generic and not only for PCI itself. >>> >>> I'm OK for Jay's suggestion,  it contains more key words and seems >>> crystal clear and straight forward. >>> >>> The problem is we need aligned about this. This spec need gain more >>> input thanks, Jay, Matt. >> >> Also, I mentioned that you need not (IMHO) combine both PCI/devices >> and NUMA topology in a single spec. We could proceed with the >> /topology API endpoint and work out the more generic /devices API >> endpoint in a separate spec. >> >> Best, >> -jay > > I said earlier in the email thread that I was OK with option 2 > (sub-resource) or the diagnostics API, and leaned toward the > diagnostics API since it was already admin-only. > > As long as this information is admin-only by default, not part of the > main server response body and therefore not parting of listing servers > with details (GET /servers/detail) then I'm OK either way and GET > /servers/{server_id}/topology is OK with me also. Thanks. the spec updated to use topology. 
Regards Yongli he From rui.zang at yandex.com Mon Jan 7 08:23:51 2019 From: rui.zang at yandex.com (rui zang) Date: Mon, 07 Jan 2019 16:23:51 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> Message-ID: <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> An HTML attachment was scrubbed... URL: From zhengzhenyulixi at gmail.com Mon Jan 7 08:37:50 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Mon, 7 Jan 2019 16:37:50 +0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: Thanks alot for the replies, lets wait for some more comments, and I will update the follow-up spec about this within two days. On Sat, Jan 5, 2019 at 7:37 AM melanie witt wrote: > On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann > wrote: > > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: > >> I've been working on detach-boot-volume[1] in Stein, we got the initial > >> design merged and while implementing we have meet some new problems and > >> now I'm amending the spec to cover these new problems[2]. > > > > [2] is https://review.openstack.org/#/c/619161/ > > > >> > >> The thing I want to discuss for wider opinion is that in the initial > >> design, we planned to support detach root volume for only STOPPED and > >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we > >> allowed to detach volumes for RESIZED/PAUSED/SOFT_DELETED instances as > >> well. Should we allow detaching root volume for instances in these > >> status too? Cases like RESIZE could be complicated for the revert resize > >> action, and it also seems unnecesary. > > > > The full set of allowed states for attaching and detaching are here: > > > > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 > > > > > https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 > > > > Concerning those other states: > > > > RESIZED: There might be a case for attaching/detaching volumes based on > > flavor during a resize, but I'm not sure about the root volume in that > > case (that really sounds more like rebuild with a new image to me, which > > is a different blueprint). I'm also not sure how much people know about > > the ability to do this or what the behavior is on revert if you have > > changed the volumes while the server is resized. If we consider that > > when a user reverts a resize, they want to go back to the way things > > were for the root disk image, then I would think we should not allow > > changing out the root volume while resized. > > Yeah, if someone attaches/detaches a regular volume while the instance > is in VERIFY_RESIZE state and then reverts the resize, I assume we > probably don't attempt to change or restore anything with the volume > attachments to put them back to how they were attached before the > resize. But as you point out, the situation does seem different > regarding a root volume. If a user changes that while in VERIFY_RESIZE > and reverts the resize, and we leave the root volume alone, then they > end up with a different root disk image than they had before the resize. > Which seems weird. > > I agree it seems better not to allow this for now and come back to it > later if people start asking for it. 
> > > PAUSED: First, I'm not sure how much anyone uses the pause API (or > > suspend for that matter) although most of the virt drivers implement it. > > At one point you could attach volumes to suspended servers as well, but > > because libvirt didn't support it that was removed from the API (yay for > > non-discoverable backend-specific API behavior changes): > > > > https://review.openstack.org/#/c/83505/ > > > > Anyway, swapping the root volume on a paused instance seems dangerous to > > me, so until someone really has a good use case for it, then I think we > > should avoid that one as well. > > > > SOFT_DELETED: I really don't understand the use case for > > attaching/detaching volumes to/from a (soft) deleted server. If the > > server is deleted and only hanging around because it hasn't been > > reclaimed yet, there are really no guarantees that this would work, so > > again, I would just skip this one for the root volume changes. If the > > user really wants to play with the volumes attached to a soft deleted > > server, they should restore it first. > > > > So in summary, I think we should just not support any of those other > > states for attach/detach root volumes and only focus on stopped or > > shelved instances. > > Again, agree, I think we should just not allow the other states for the > initial implementation and revisit later if it turns out people need these. > > -melanie > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Mon Jan 7 09:17:53 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Mon, 7 Jan 2019 17:17:53 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi Michael, I found that you forbid import eventlet in octavia.[1] I guess the eventlet has a conflict with gunicorn, is that? But, I need to import eventlet for os-ken that used to implement bgp speaker.[2] I am studying eventlet and gunicorn deeply. Have you some suggestions to resolve this conflict? [1] https://review.openstack.org/#/c/462334/ [2] https://review.openstack.org/#/c/628915/ Michael Johnson 于2019年1月5日周六 上午8:02写道: > Hi Jeff, > > Unfortunately the team that was working on that code had stopped due > to internal reasons. > > I hope to make the reference active/active blueprint a priority again > during the Train cycle. Following that I may be able to look at the L3 > distributor option, but I cannot commit to that at this time. > > If you are interesting in picking up that work, please let me know and > we can sync up on that status of the WIP patches, etc. > > Michael > > On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: > > > > Dear Octavia team: > > The email aims to ask the development progress about > l3-active-active blueprint. I > > noticed that the work in this area has been stagnant for eight months. > > https://review.openstack.org/#/q/l3-active-active > > I want to know the community's next work plan in this regard. > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jan 7 09:32:50 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 7 Jan 2019 01:32:50 -0800 Subject: [neutron] Functional tests job broken Message-ID: Hi Neutrinos, Since few days we have an issue with neutron-functional job [1]. Please don’t recheck Your patches now. It will not help until this bug will be fixed/workarouded. 
[1] https://bugs.launchpad.net/neutron/+bug/1810518 — Slawek Kaplonski Senior software engineer Red Hat From smooney at redhat.com Mon Jan 7 10:05:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 10:05:54 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> Message-ID: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> On Mon, 2019-01-07 at 16:23 +0800, rui zang wrote: > Hey Jay, > > I replied to your comments to the spec however missed this email. > Please see my replies in line. > > Thanks, > Zang, Rui > > > > 03.01.2019, 21:31, "Jay Pipes" : > > On 01/02/2019 11:08 PM, Alex Xu wrote: > > > Jay Pipes > 于2019年1月2 > > > 日周三 下午10:48写道: > > > > > > On 12/21/2018 03:45 AM, Rui Zang wrote: > > > > It was advised in today's nova team meeting to bring this up by > > > email. > > > > > > > > There has been some discussion on the how to track persistent memory > > > > resource in placement on the spec review [1]. > > > > > > > > Background: persistent memory (PMEM) needs to be partitioned to > > > > namespaces to be consumed by VMs. Due to fragmentation issues, > > > the spec > > > > proposed to use fixed sized PMEM namespaces. > > > > > > The spec proposed to use fixed sized namespaces that is controllable by > > > the deployer, not fixed-size-for-everyone :) Just want to make sure > > > we're being clear here. > > > > > > > The spec proposed way to represent PMEM namespaces is to use one > > > > Resource Provider (RP) for one PMEM namespace. An new standard > > > Resource > > > > Class (RC) -- 'VPMEM_GB` is introduced to classify PMEM namspace > > > RPs. > > > > For each PMEM namespace RP, the values for 'max_unit', 'min_unit', > > > > 'total' and 'step_size` are all set to the size of the PMEM > > > namespace. > > > > In this way, it is guaranteed each RP will be consumed as a whole > > > at one > > > > time. > > > > > > > > An alternative was brought out in the review. Different Custom > > > Resource > > > > Classes ( CUSTOM_PMEM_XXXGB) can be used to designate PMEM > > > namespaces of > > > > different sizes. The size of the PMEM namespace is encoded in the > > > name > > > > of the custom Resource Class. And multiple PMEM namespaces of the > > > same > > > > size (say 128G) can be represented by one RP of the same > > > > > > Not represented by "one RP of the same CUSTOM_PMEM_128G". There > > > would be > > > only one resource provider: the compute node itself. It would have an > > > inventory of, say, 8 CUSTOM_PMEM_128G resources. > > > > > > > CUSTOM_PMEM_128G. In this way, the RP could have 'max_unit' and > > > 'total' > > > > as the total number of the PMEM namespaces of the certain size. > > > And the > > > > values of 'min_unit' and 'step_size' could set to 1. > > > > > > No, the max_unit, min_unit, step_size and total would refer to the > > > number of *PMEM namespaces*, not the amount of GB of memory represented > > > by those namespaces. > > > > > > Therefore, min_unit and step_size would be 1, max_unit would be the > > > total number of *namespaces* that could simultaneously be attached to a > > > single consumer (VM), and total would be 8 in our example where the > > > compute node had 8 of these pre-defined 128G PMEM namespaces. 
> > > > > > > We believe both way could work. We would like to have a community > > > > consensus on which way to use. > > > > Email replies and review comments to the spec [1] are both welcomed. > > > > > > Custom resource classes were invented for precisely this kind of use > > > case. The resource being represented is a namespace. The resource is > > > not > > > "a Gibibyte of persistent memory". > > > > > > > > > The point of the initial design is avoid to encode the `size` in the > > > resource class name. If that is ok for you(I remember people hate to > > > encode size and number into the trait name), then we will update the > > > design. Probably based on the namespace configuration, nova will be > > > responsible for create those custom RC first. Sounds works. > > > > A couple points... > > > > 1) I was/am opposed to putting the least-fine-grained size in a resource > > class name. For example, I would have preferred DISK_BYTE instead of > > DISK_GB. And MEMORY_BYTE instead of MEMORY_MB. > > I agree the more precise the better as far as resource tracking is concerned. > However, as for persistent memory, it usually comes out in large capacity -- > terabytes are normal. And the targeting applications are also expected to use > persistent memory in that quantity. GB is a reasonable unit not to make > the number too nasty. so im honestly not that concernetd with large numbers. if we want to imporve the user experience we can do what we do with hugepage memory. we suppport passing a sufix. so we can say 2M or 1G. if you are concerned with capasity its a relitivly simple exerises to show that if we use a 64 int or even 48bit we have plenty of headroom over where teh technology is. NVDIMs are speced for a max capasity of 512GB per module. if i recall correctly you can also only have 12 nvdim with 4 ram dimms per socket acting as a cache so that effectivly limits you to 6TB per socket or 12 TB per 1/2U with standard density servers. moderen x86 processors i belive still use a 48 bit phyical adress spaces with the last 16 bits reserved for future use meaning a host can adress a maxium of 2^48 or 256 TiB of memory such a system. note persistent memory is stream memory so it base 2 not base 10 so when we state it 1GB we technically mean 1 GiB or 2^10 bytes not 10^9 bytes whiile it unlikely we will ever need byte level granularity in allocations to guest im not sure i buy the argument that this will only be used by applications in large allocations in the 100GB or TBs range. i think i share jays preference here in increasing the granularity and eiter tracking the allocation in MiBs or Bytes. i do somewhat agree that bytes is likely to fine grained hence my perference for mebibytes. > > > 2) After reading the original Intel PMEM specification > > (http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf), it seems to me > > that what you are describing with a generic PMEM_GB (or PMEM_BYTE) > > resource class is more appropriate for the block mode translation system > > described in the PDF versus the PMEM namespace system described therein. > > > > From a lay person's perspective, I see the difference between the two > > as similar to the difference between describing the bytes that are in > > block storage versus a filesystem that has been formatted, wiped, > > cleaned, etc on that block storage. > > First let's talk about "block mode" v.s. "persistent memory mode". > They are not tiered up, they are counterparts. Each of them describes an access > method to the unlerlying hardware. 
Quote some sectors from > https://www.kernel.org/doc/Documentation/nvdimm/nvdimm.txt > inside the dash line block. > > ------------------------------8<------------------------------------------------------------------- > Why BLK? > -------- > > While PMEM provides direct byte-addressable CPU-load/store access to > NVDIMM storage, it does not provide the best system RAS (recovery, > availability, and serviceability) model. An access to a corrupted > system-physical-address address causes a CPU exception while an access > to a corrupted address through an BLK-aperture causes that block window > to raise an error status in a register. The latter is more aligned with > the standard error model that host-bus-adapter attached disks present. > Also, if an administrator ever wants to replace a memory it is easier to > service a system at DIMM module boundaries. Compare this to PMEM where > data could be interleaved in an opaque hardware specific manner across > several DIMMs. > > PMEM vs BLK > BLK-apertures solve these RAS problems, but their presence is also the > major contributing factor to the complexity of the ND subsystem. They > complicate the implementation because PMEM and BLK alias in DPA space. > Any given DIMM's DPA-range may contribute to one or more > system-physical-address sets of interleaved DIMMs, *and* may also be > accessed in its entirety through its BLK-aperture. Accessing a DPA > through a system-physical-address while simultaneously accessing the > same DPA through a BLK-aperture has undefined results. For this reason, > DIMMs with this dual interface configuration include a DSM function to > store/retrieve a LABEL. The LABEL effectively partitions the DPA-space > into exclusive system-physical-address and BLK-aperture accessible > regions. For simplicity a DIMM is allowed a PMEM "region" per each > interleave set in which it is a member. The remaining DPA space can be > carved into an arbitrary number of BLK devices with discontiguous > extents. > ------------------------------8<------------------------------------------------------------------- > > You can see that "block mode" does not provide "direct access", thus not the best > performance. That is the reason "persistent memory mode" is proposed in the spec. the block mode will allow any exsing applciation that is coded to work with a block device to just use the NVDIM storage as a faster from of solid state storage. direct mode reqiures applications to be specifcialy coded to support it. form an openstack perspective we will eventually want to support exposing the deivce both as a block deivce (e.g. via virtio-blk or virtio-scsi devices if/when qemu supports that) and direct mode pmem device to the guest. i understand why persistent memory mode is more appealing from a vendor perspecitve to lead with but pratically speaking there are very few application that actully supprot pmem to date and supporting app direct mode only seams like it would hurt adoption of this feautre more generally then encourage it. > > However, people can still create a block device out of a "persistent memory mode" > namespace. And further more, create a file system on top of that block device. > Applications can map files from that file system into their memory namespaces, > and if the file system is DAX (direct-access) capable. The application's access to > the hardware is still direct-access which means direct byte-addressable > CPU-load/store access to NVDIMM storage. 
> This is perfect so far, as one can think of why not just track the DAX file system > and let the VM instances map the files of the file system? > However, this usage model is reported to have severe issues with hardware > pass-ed through. So the recommended model is still mapping namespaces > of "persistent memory mode" into applications' address space. > intels nvdimm technology works in 3 modes, app direct, block and system memory. the direct and block modes were discussed at some lenght in the spec and this thread. does libvirt support using a nvdims pmem namespaces in devdax mode to back a guest memory instead of system ram. based on https://docs.pmem.io/getting-started-guide/creating-development-environments/virtualization/qemu qemu does support such a configuration and honestly haveing the capablity to alter the guest meory backing to run my vms with 100s or GB of ram would as compeeling as app direct mode as it would allow all my legacy application to work without modification and would deliver effectivly the same perfromance. perhaps we should also consider a hw:mem_page_backing extra spec to complement the hw:mem_page_size we have already hugepages today. this would proably be a seperate spec but i would hope we dont make desisions today that would block other useage models in the future. > > > > > In Nova, the DISK_GB resource class describes the former: it's a bunch > > of blocks that are reserved in the underlying block storage for use by > > the virtual machine. The virtual machine manager then formats that bunch > > of blocks as needed and lays down a formatted image. > > > > We don't have a resource class that represents "a filesystem" or "a > > partition" (yet). But the proposed PMEM namespaces in your spec > > definitely seem to be more like a "filesystem resource" than a "GB of > > block storage" resource. > > > > Best, > > -jay From balazs.gibizer at ericsson.com Mon Jan 7 12:52:35 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 7 Jan 2019 12:52:35 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> Message-ID: <1546865551.29530.0@smtp.office365.com> > But, let's chat more about it via a hangout the week after next (week > of January 14 when Matt is back), as suggested in #openstack-nova > today. We'll be able to have a high-bandwidth discussion then and > agree on a decision on how to move forward with this. Thank you all for the discussion. I agree to have a real-time discussion about the way forward. Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]? I see the following topics we need to discuss: * backward compatibility with already existing SRIOV ports having min bandwidth * introducing microversion(s) for this feature in Nova * allowing partial support for this feature in Nova in Stein (E.g.: only server create/delete but no migrate support). * step-by-step verification of the really long commit chain in Nova I will post a summar of each issue to the ML during this week. 
Cheers, gibi [1] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190114T170000 [2] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI From fungi at yuggoth.org Mon Jan 7 13:11:37 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 7 Jan 2019 13:11:37 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <20190107131137.4xc4lue7t333iosu@yuggoth.org> On 2019-01-07 10:05:54 +0000 (+0000), Sean Mooney wrote: [...] > note persistent memory is stream memory so it base 2 not base 10 > so when we state it 1GB we technically mean 1 GiB or 2^10 bytes > not 10^9 bytes [...] Not to get pedantic, but a gibibyte is 2^30 bytes (2^10 is a kibibyte). I'm quite sure you (and most of the rest of us) know this, just pointing it out for the sake of clarity. -- Jeremy Stanley From smooney at redhat.com Mon Jan 7 13:47:28 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 13:47:28 +0000 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <20190107131137.4xc4lue7t333iosu@yuggoth.org> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> <20190107131137.4xc4lue7t333iosu@yuggoth.org> Message-ID: <37db2f34f4f39b302df1a30662013c75f4f61853.camel@redhat.com> On Mon, 2019-01-07 at 13:11 +0000, Jeremy Stanley wrote: > On 2019-01-07 10:05:54 +0000 (+0000), Sean Mooney wrote: > [...] > > note persistent memory is stream memory so it base 2 not base 10 > > so when we state it 1GB we technically mean 1 GiB or 2^10 bytes > > not 10^9 bytes > > [...] > > Not to get pedantic, but a gibibyte is 2^30 bytes (2^10 is a > kibibyte). I'm quite sure you (and most of the rest of us) know > this, just pointing it out for the sake of clarity. yep i spotted that when i was reading the mail back after i send it :) i kind of wanted to fix it but i assumed most would see its a typo and didnt wnat to spam. the main point i wanted to convay was that nvdimm-p is being standarised by JEDEC and will be using there unit definitions rather then the IEC definitions typically used by block storage. thanks for giving me the opertunity to calrify :) From jaypipes at gmail.com Mon Jan 7 14:02:40 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 7 Jan 2019 09:02:40 -0500 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> On 01/07/2019 05:05 AM, Sean Mooney wrote: > i think i share jays preference here in increasing the granularity and eiter tracking > the allocation in MiBs or Bytes. 
i do somewhat agree that bytes is likely to fine grained > hence my perference for mebibytes. Actually, that's not at all my preference for PMEM :) My preference is to use custom resource classes like "CUSTOM_PMEM_NAMESPACE_1TB" because the resource is the namespace, not the bunch of blocks/bytes of storage. With regards to the whole "finest-grained unit" thing, I was just responding to Alex Xu's comment: "The point of the initial design is avoid to encode the `size` in the resource class name. If that is ok for you(I remember people hate to encode size and number into the trait name), then we will update the design. Probably based on the namespace configuration, nova will be responsible for create those custom RC first. Sounds works." Best, -jay From ignaziocassano at gmail.com Mon Jan 7 15:22:03 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 7 Jan 2019 16:22:03 +0100 Subject: Queens octavia error Message-ID: Hello All, I installed octavia on queens with centos 7, but when I create a load balance with the command openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I got some errors in octavia worker.log: 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server failures[0].reraise() 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in reraise 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server six.reraise(*self._exc_info) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result = task.execute(**arguments) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 192, in execute 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server raise exceptions.ComputeBuildException(fault=fault) 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server ComputeBuildException: Failed to build compute instance due to: {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} Anyone could help me, please ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Mon Jan 7 15:25:43 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 7 Jan 2019 09:25:43 -0600 Subject: [neutron][oslo] Functional tests job broken (oslo.privsep) In-Reply-To: References: Message-ID: <20190107152543.kugiskrwk4kuawtf@mthode.org> On 19-01-07 01:32:50, Slawomir Kaplonski wrote: > Hi Neutrinos, > > Since few days we have an issue with neutron-functional job [1]. > Please don’t recheck Your patches now. It will not help until this bug > will be fixed/workarouded. > > [1] https://bugs.launchpad.net/neutron/+bug/1810518 > Adding an oslo tag. As far as can be determined the new oslo.privsep code impacts neutron. There is a requirements review out to restict the version of oslo.privsep but I'd like an ack from oslo people before we take a step back. 
-- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From derekh at redhat.com Mon Jan 7 16:24:13 2019 From: derekh at redhat.com (Derek Higgins) Date: Mon, 7 Jan 2019 16:24:13 +0000 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default Message-ID: Hi All, Shortly before the holidays CI jobs moved from xenial to bionic, for Ironic this meant a bunch failures[1], all have now been dealt with, with the exception of the UEFI job. It turns out that during this job our (virtual) baremetal nodes use tftp to download a ipxe image. In order to track these tftp connections we have been making use of the fact that nf_conntrack_helper has been enabled by default. In newer kernel versions[2] this is no longer the case and I'm now trying to figure out the best way to deal with the new behaviour. I've put together some possible solutions along with some details on why they are not ideal and would appreciate some opinions 1. Why not enable the conntrack helper with echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper The router namespace is still created with nf_conntrack_helper==0 as it follows the default the nf_conntrack module was loaded with 2. Enable it in modprobe.d # cat /etc/modprobe.d/conntrack.conf options nf_conntrack nf_conntrack_helper=1 This works but requires the nf_conntrack module to be unloaded if it has already been loaded, for devstack and I guess in the majority of cases (including CI nodes) this means a reboot stage or a potentially error prone sequence of stopping the firewall and unloading nf_conntrack modules. This also globally turns on the helper on the host reintroducing the security concerns it comes with 3. Enable the contrack helper in the router network namespace when it is created[3] This works for ironic CI, but there may be better solutions that can be worked within neutron that I'm not aware of. Of the 3 options above this would be most transparent to other operators as the original behaviour would be maintained. thoughts on any of the above? or better solutions? 1 - https://storyboard.openstack.org/#!/story/2004604 2 - https://kernel.googlesource.com/pub/scm/linux/kernel/git/horms/ipvs-next/+/3bb398d925ec73e42b778cf823c8f4aecae359ea 3 - https://review.openstack.org/#/c/628493/1 From mihalis68 at gmail.com Mon Jan 7 16:29:32 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Mon, 7 Jan 2019 11:29:32 -0500 Subject: [Ops] ops meetups team meeting 2018-12-18 In-Reply-To: References: Message-ID: Hello Everyone, The next OpenStack ops meetup team meeting will be tomorrow (2019-1-8) at 10am EST on #openstack-operators on freenode. It is important that we get back to work, as there is an ops meetup to organise! The only offer received in written form was the Deutsche Telekom offer to host in Berlin (see https://etherpad.openstack.org/p/ops-meetup-venue-discuss-1st-2019-berlin) and those present in previous meetings favored opting for Thursday March 7th and Friday March 8th so as to adjoin the weekend, making a personal weekend in berlin immediately after the meetup more doable. The OpenStack "ops meetups team" is charged with making these events happen, but we could definitely do with some help. If you'd like to be involved, see our charter here https://wiki.openstack.org/wiki/Ops_Meetups_Team and/or attend the meeting on IRC tomorrow. 
A draft agenda is now posted at the top of our agenda etherpad, see https://etherpad.openstack.org/p/ops-meetups-team All being well, I hope we can formally agree on the Berlin proposal and get moving with all the usual prep work. We have two months to get it done. See you tomorrow! Chris On Tue, Dec 18, 2018 at 10:54 AM Chris Morgan wrote: > Meeting minutes from our meeting today on IRC are linked below. > > Key points: > - next meeting Jan 8th 2019 > - ops meetup #1 looks likely to be at Deutsche Telekom, Berlin, March 7,8 > 2019 > - the team hopes to confirm this soon and commence technical agenda > planning early January > - team meetings will continue to be 10AM EST Tuesdays on > #openstack-operators > > Meeting ended Tue Dec 18 15:44:05 2018 UTC. > 10:44 AM Minutes: > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.html > 10:44 AM Minutes (text): > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.txt > 10:44 AM Log: > http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-12-18-15.03.log.html > > About the OpenStack Operators Meetups team: > https://wiki.openstack.org/wiki/Ops_Meetups_Team > > Chris > > -- > Chris Morgan > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jan 7 16:44:05 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 7 Jan 2019 08:44:05 -0800 Subject: [neutron] Bug deputy report - week 1 2019 Message-ID: Hi Neutrios, I was bug deputy last week. Below is summary of bugs reported this week: Critical bugs: * https://bugs.launchpad.net/neutron/+bug/1810314  - neutron objects base get_values() fails with KeyError - I marked it as Critical because it cause gate failures for Tricircle project, there is patch proposed for that https://review.openstack.org/#/c/628857/ but some OVO expert should take a look at it, * https://bugs.launchpad.net/neutron/+bug/1810518 - neutron-functional tests failing with oslo.privsep 1.31 - set to Critical as it cause gate failures, Ben Nemec is looking at it from oslo.privsep side. I found that all problems are caused by tests from neutron.tests.functional.agent.linux.test_netlink_lib.NetlinkLibTestCase so maybe if someone familiar with this code in Neutron can take a look at it too. High bugs: * https://bugs.launchpad.net/neutron/+bug/1809238 - [l3] `port_forwarding` cannot be set before l3 `router` in service_plugins - I set it to High - it looks that we now require proper order of service plugins in config file, it is in progress, LIU Yulong is working on it. We also discussed that on last L3 sub team meeting. 
* https://bugs.launchpad.net/neutron/+bug/1810504 - neutron-tempest-iptables_hybrid job failing with internal server error while listining ports - set to High as it cause gate failures from time to time, patch proposed already: https://review.openstack.org/#/c/628492/ * https://bugs.launchpad.net/neutron/+bug/1810764 - XenServer cannot enable tunneling - this is fresh bug from today, I set it to High and it needs attention from someone familiar with Xenserver, Other bugs: * https://bugs.launchpad.net/neutron/+bug/1810025 - ovs-agent do not clear QoS rules after restart - in progress, I set it to Medium, patch proposed https://review.openstack.org/#/c/627779/ * https://bugs.launchpad.net/neutron/+bug/1810349 - agent gw ports created on non dvr destination hosts - DVR related issue, I set it to Medium - patch proposed: https://review.openstack.org/628071 * https://bugs.launchpad.net/neutron/+bug/1810563 - adding rules to security groups is slow - set to Medium, patch proposed https://review.openstack.org/628691 — Slawek Kaplonski Senior software engineer Red Hat From juliaashleykreger at gmail.com Mon Jan 7 16:48:48 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 08:48:48 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: Message-ID: Thanks for bringing this up Derek! Comments below. On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins wrote: > > Hi All, > > Shortly before the holidays CI jobs moved from xenial to bionic, for > Ironic this meant a bunch failures[1], all have now been dealt with, > with the exception of the UEFI job. It turns out that during this job > our (virtual) baremetal nodes use tftp to download a ipxe image. In > order to track these tftp connections we have been making use of the > fact that nf_conntrack_helper has been enabled by default. In newer > kernel versions[2] this is no longer the case and I'm now trying to > figure out the best way to deal with the new behaviour. I've put > together some possible solutions along with some details on why they > are not ideal and would appreciate some opinions The git commit message suggests that users should explicitly put in rules such that the traffic is matched. I feel like the kernel change ends up being a behavior change in this case. I think the reasonable path forward is to have a configuration parameter that the l3 agent can use to determine to set the netfilter connection tracker helper. Doing so, allows us to raise this behavior change to operators minimizing the need of them having to troubleshoot it in production, and gives them a choice in the direction that they wish to take. [trim] > 3. Enable the contrack helper in the router network namespace when it > is created[3] > This works for ironic CI, but there may be better solutions that can > be worked within neutron that I'm not aware of. Of the 3 options above > this would be most transparent to other operators as the original > behaviour would be maintained. > My thoughts exactly. > thoughts on any of the above? or better solutions? I think we should just raise it as a configuration option. Coupled with a release note, provides operators visibility to the kernel change. 
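To make that concrete, a rough sketch of the per-namespace behaviour being discussed (this is only an illustration of option 3 / a possible l3 agent option, not the actual neutron code). The nf_conntrack_helper sysctl is network-namespace scoped, so it can be flipped inside the qrouter namespace without re-enabling the helper globally on the host:

    # Sketch: enable automatic helper assignment inside one router namespace
    # only, leaving the host-wide default (0) untouched.
    import subprocess

    def enable_conntrack_helper(namespace):
        subprocess.check_call([
            "ip", "netns", "exec", namespace,
            "sysctl", "-w", "net.netfilter.nf_conntrack_helper=1",
        ])

    # e.g. enable_conntrack_helper("qrouter-<router-id>")
    #
    # The alternative the kernel commit message recommends is to keep the
    # sysctl at 0 and attach the helper explicitly to the traffic that needs
    # it, for example:
    #   ip netns exec qrouter-<router-id> iptables -t raw -A PREROUTING \
    #       -p udp --dport 69 -j CT --helper tftp
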
> > 1 - https://storyboard.openstack.org/#!/story/2004604 > 2 - https://kernel.googlesource.com/pub/scm/linux/kernel/git/horms/ipvs-next/+/3bb398d925ec73e42b778cf823c8f4aecae359ea > 3 - https://review.openstack.org/#/c/628493/1 > From cboylan at sapwetik.org Mon Jan 7 17:05:38 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 07 Jan 2019 09:05:38 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: Message-ID: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > Thanks for bringing this up Derek! > Comments below. > > On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins wrote: > > > > Hi All, > > > > Shortly before the holidays CI jobs moved from xenial to bionic, for > > Ironic this meant a bunch failures[1], all have now been dealt with, > > with the exception of the UEFI job. It turns out that during this job > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > order to track these tftp connections we have been making use of the > > fact that nf_conntrack_helper has been enabled by default. In newer > > kernel versions[2] this is no longer the case and I'm now trying to > > figure out the best way to deal with the new behaviour. I've put > > together some possible solutions along with some details on why they > > are not ideal and would appreciate some opinions > > The git commit message suggests that users should explicitly put in rules such > that the traffic is matched. I feel like the kernel change ends up > being a behavior > change in this case. > > I think the reasonable path forward is to have a configuration > parameter that the > l3 agent can use to determine to set the netfilter connection tracker helper. > > Doing so, allows us to raise this behavior change to operators minimizing the > need of them having to troubleshoot it in production, and gives them a choice > in the direction that they wish to take. https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? > > [trim] > [more trimming] From rui.zang at yandex.com Mon Jan 7 17:17:59 2019 From: rui.zang at yandex.com (rui zang) Date: Tue, 08 Jan 2019 01:17:59 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> Message-ID: <15042191546881479@iva4-031ea4da33a1.qloud-c.yandex.net> An HTML attachment was scrubbed... 
URL: From rui.zang at yandex.com Mon Jan 7 17:20:17 2019 From: rui.zang at yandex.com (rui zang) Date: Tue, 08 Jan 2019 01:20:17 +0800 Subject: [nova] Persistent memory resource tracking model In-Reply-To: <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> References: <16197751545381914@sas2-857317bd6599.qloud-c.yandex.net> <57136e18-31e5-27e6-c6e2-bf442af8a779@gmail.com> <8d3c2249-583f-0286-c708-596e712406c7@gmail.com> <43764701546849431@myt6-add70abb4f02.qloud-c.yandex.net> <943b469faa92ecbe5bd413afc62469abff302926.camel@redhat.com> <5b70f20b-fbae-97f9-0253-1d54d84057e3@gmail.com> Message-ID: <48252031546881617@sas1-d856b3d759c7.qloud-c.yandex.net> An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jan 7 17:28:39 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 7 Jan 2019 11:28:39 -0600 Subject: [all] [uc] OpenStack UC Meeting @ 1900 UTC Message-ID: Hi everyone, Just a reminder that the UC meeting will be in #openstack-uc in a little more than an hour and a half from now. Please feel empowered to add to the agenda here - https://etherpad.openstack.org/p/uc - and we hope to see you there! -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Jan 7 17:32:32 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 07 Jan 2019 17:32:32 +0000 Subject: [nova] Mempage fun Message-ID: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> We've been looking at a patch that landed some months ago and have spotted some issues: https://review.openstack.org/#/c/532168 In summary, that patch is intended to make the memory check for instances memory pagesize aware. The logic it introduces looks something like this: If the instance requests a specific pagesize (#1) Check if each host cell can provide enough memory of the pagesize requested for each instance cell Otherwise If the host has hugepages (#2) Check if each host cell can provide enough memory of the smallest pagesize available on the host for each instance cell Otherwise (#3) Check if each host cell can provide enough memory for each instance cell, ignoring pagesizes This also has the side-effect of allowing instances with hugepages and instances with a NUMA topology but no hugepages to co-exist on the same host, because the latter will now be aware of hugepages and won't consume them. However, there are a couple of issues with this: 1. It breaks overcommit for instances without pagesize request running on hosts with different pagesizes. This is because we don't allow overcommit for hugepages, but case (#2) above means we are now reusing the same functions previously used for actual hugepage checks to check for regular 4k pages 2. It doesn't fix the issue when non-NUMA instances exist on the same host as NUMA instances with hugepages. The non-NUMA instances don't run through any of the code above, meaning they're still not pagesize aware We could probably fix issue (1) by modifying those hugepage functions we're using to allow overcommit via a flag that we pass for case (#2). We can mitigate issue (2) by advising operators to split hosts into aggregates for 'hw:mem_page_size' set or unset (in addition to 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but I think this may be the case in some docs (sean-k-mooney said Intel used to do this. I don't know about Red Hat's docs or upstream). 
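To make the three branches above a little more concrete, here is a rough sketch of that logic in simplified Python. It uses toy data structures (free memory per pagesize for each host cell), not nova's real NUMA topology objects or function names.

# Rough sketch of the pagesize-aware check described above. Toy data
# structures only: host_cell_pages maps pagesize (KiB) -> free memory (KiB).

def cell_can_fit(host_cell_pages, mem_kb, pagesize):
    # Enough free memory of exactly this pagesize; no overcommit applied.
    return host_cell_pages.get(pagesize, 0) >= mem_kb


def check_instance_cell(host_cell_pages, inst_mem_kb, requested_pagesize,
                        host_has_hugepages):
    if requested_pagesize is not None:
        # (#1) the instance asked for an explicit pagesize.
        return cell_can_fit(host_cell_pages, inst_mem_kb, requested_pagesize)
    if host_has_hugepages:
        # (#2) no explicit request, but the host has hugepages: check only
        # the smallest pagesize so hugepages are not silently consumed.
        # Note no overcommit ratio is applied here either, which is exactly
        # issue (1) above.
        smallest = min(host_cell_pages)
        return cell_can_fit(host_cell_pages, inst_mem_kb, smallest)
    # (#3) no hugepages on the host: plain memory check, pagesizes ignored.
    return sum(host_cell_pages.values()) >= inst_mem_kb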
In addition, we did actually called that out in the original spec: https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact However, if we're doing that for non-NUMA instances, one would have to question why the patch is necessary/acceptable for NUMA instances. For what it's worth, a longer fix would be to start tracking hugepages in a non-NUMA aware way too but that's a lot more work and doesn't fix the issue now. As such, my question is this: should be look at fixing issue (1) and documenting issue (2), or should we revert the thing wholesale until we work on a solution that could e.g. let us track hugepages via placement and resolve issue (2) too. Thoughts? Stephen From juliaashleykreger at gmail.com Mon Jan 7 17:42:23 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 09:42:23 -0800 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: On Mon, Jan 7, 2019 at 9:11 AM Clark Boylan wrote: > > On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: [trim] > > > > Doing so, allows us to raise this behavior change to operators minimizing the > > need of them having to troubleshoot it in production, and gives them a choice > > in the direction that they wish to take. > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. > > Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? > Great link Clark, thanks! It could be viable to ask operators to explicitly set their security groups for tftp to be passed. I guess we actually have multiple cases where there are issues and the only non-impacted case is when the ironic conductor host is directly attached to the flat network the machine is booting from. In the case of a flat network, it doesn't seem viable for us to change rules ad-hoc since we would need to be able to signal that the helper is needed, but it does seem viable to say "make sure connectivity works x way". Where as with multitenant networking, we use dedicated networks, so conceivably it is just a static security group setting that an operator can keep in place. Explicit static rules like that seem less secure to me without conntrack helpers. :( Does anyone in Neutron land have any thoughts? > > > > [trim] > > > > [more trimming] > From derekh at redhat.com Mon Jan 7 17:53:40 2019 From: derekh at redhat.com (Derek Higgins) Date: Mon, 7 Jan 2019 17:53:40 +0000 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: On Mon, 7 Jan 2019 at 17:08, Clark Boylan wrote: > > On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > > Thanks for bringing this up Derek! > > Comments below. > > > > On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins wrote: > > > > > > Hi All, > > > > > > Shortly before the holidays CI jobs moved from xenial to bionic, for > > > Ironic this meant a bunch failures[1], all have now been dealt with, > > > with the exception of the UEFI job. 
It turns out that during this job > > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > > order to track these tftp connections we have been making use of the > > > fact that nf_conntrack_helper has been enabled by default. In newer > > > kernel versions[2] this is no longer the case and I'm now trying to > > > figure out the best way to deal with the new behaviour. I've put > > > together some possible solutions along with some details on why they > > > are not ideal and would appreciate some opinions > > > > The git commit message suggests that users should explicitly put in rules such > > that the traffic is matched. I feel like the kernel change ends up > > being a behavior > > change in this case. > > > > I think the reasonable path forward is to have a configuration > > parameter that the > > l3 agent can use to determine to set the netfilter connection tracker helper. > > > > Doing so, allows us to raise this behavior change to operators minimizing the > > need of them having to troubleshoot it in production, and gives them a choice > > in the direction that they wish to take. > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to cover this. Basically you should explicitly enable specific helpers when you need them rather than relying on the auto helper rules. Thanks, I forgot to point out the option of adding these rules, If I understand it correctly they would need to be added inside the router namespace when neutron creates it, somebody from neutron might be able to indicate if this is a workable solution. > > Maybe even avoid the configuration option entirely if ironic and neutron can set the required helper for tftp when tftp is used? > > > > > [trim] > > > > [more trimming] > From openstack at nemebean.com Mon Jan 7 18:11:21 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 7 Jan 2019 12:11:21 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Renamed the thread to be more descriptive. Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: > Hi, > > I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] > I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. > > [1] https://bugs.launchpad.net/neutron/+bug/1810518 > > — > Slawek Kaplonski > Senior software engineer > Red Hat > >> Wiadomość napisana przez Ben Nemec w dniu 02.01.2019, o godz. 19:17: >> >> Yay alliteration! :-) >> >> I wanted to draw attention to this release[1] in particular because it includes the parallel privsep change[2]. While it shouldn't have any effect on the public API of the library, it does significantly affect how privsep will process calls on the back end. Specifically, multiple calls can now be processed at the same time, so if any privileged code is not reentrant it's possible that new race bugs could pop up. 
>> >> While this sounds scary, it's a necessary change to allow use of privsep in situations where a privileged call may take a non-trivial amount of time. Cinder in particular has some privileged calls that are long-running and can't afford to block all other privileged calls on them. >> >> So if you're a consumer of oslo.privsep please keep your eyes open for issues related to this new release and contact the Oslo team if you find any. Thanks. >> >> -Ben >> >> 1: https://review.openstack.org/628019 >> 2: https://review.openstack.org/#/c/593556/ >> > From smooney at redhat.com Mon Jan 7 19:19:46 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 07 Jan 2019 19:19:46 +0000 Subject: [nova] [placement] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: On Mon, 2019-01-07 at 17:32 +0000, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware. The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. This is because we don't > allow overcommit for hugepages, but case (#2) above means we are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages > 2. It doesn't fix the issue when non-NUMA instances exist on the same > host as NUMA instances with hugepages. The non-NUMA instances don't > run through any of the code above, meaning they're still not > pagesize aware > > We could probably fix issue (1) by modifying those hugepage functions > we're using to allow overcommit via a flag that we pass for case (#2). > We can mitigate issue (2) by advising operators to split hosts into > aggregates for 'hw:mem_page_size' set or unset (in addition to > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > I think this may be the case in some docs (sean-k-mooney said Intel > used to do this. I don't know about Red Hat's docs or upstream). In > addition, we did actually called that out in the original spec: > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > However, if we're doing that for non-NUMA instances, one would have to > question why the patch is necessary/acceptable for NUMA instances. For > what it's worth, a longer fix would be to start tracking hugepages in a > non-NUMA aware way too but that's a lot more work and doesn't fix the > issue now. 
> > As such, my question is this: should be look at fixing issue (1) and > documenting issue (2), or should we revert the thing wholesale until we > work on a solution that could e.g. let us track hugepages via placement > and resolve issue (2) too.

For what it's worth, the review in question https://review.openstack.org/#/c/532168 actually attempts to implement option 1 from https://bugs.launchpad.net/nova/+bug/1439247

The first time I tried to fix issue 2 was with my proposal for the AggregateTypeExtraSpecsAffinityFilter https://review.openstack.org/#/c/183876/4/specs/liberty/approved/aggregate-flavor-extra-spec-affinity-filter.rst which became the out-of-tree AggregateInstanceTypeFilter after 3 cycles of trying to get it upstream: https://github.com/openstack/nfv-filters/blob/master/nfv_filters/nova/scheduler/filters/aggregate_instance_type_filter.py

The AggregateTypeExtraSpecsAffinityFilter, or AggregateInstanceTypeFilter, was a filter we developed specifically to enforce separation of instances that use explicit memory pages from those that do not, to cater for the DPDK hugepage requirement, and to enforce separation of pinned and unpinned guests. We finally got approval to publish a blog on the topic in January of 2017 https://software.intel.com/en-us/blogs/2017/01/04/filter-by-host-aggregate-metadata-or-by-image-extra-specs based on the content in the second version of the spec https://review.openstack.org/#/c/314097/12/specs/newton/approved/aggregate-instance-type-filter.rst

This filter was used in a semi-production 4G trial deployment, in addition to lab use with some partners I was working with at the time, but we decided to stop supporting it with the assumption that placement would solve it :)

A lot of the capabilities of the out-of-tree filter could likely be achieved with some extensions to placement but are not supported by placement today. I have raised the topic in the past of required traits on a resource provider that need to be present in the request for an allocation to be made against that resource provider. Similarly, I have raised the idea of forbidden traits on a resource provider that eliminate the resource provider as a candidate if present in the request. This is the inverse of the required and forbidden traits we have today, but it is what the filter we implemented in 2015 did before placement, using aggregate metadata. I think there is a generalised problem statement here that would be a legitimate use case for placement, outside of simply tracking hugepages (or, preferably, memory of all page sizes) in placement.

I would be in favor of fixing oversubscription (issue 1) this cycle as a short-term solution which we could backport, since that is clearly a bug, and then exploring addressing both issues 1 and 2 with placement, or by reproposing the out-of-tree filter if placement deemed it out of scope.

That said, I too am interested to hear what others think, especially the placement folks. You can just use host aggregates and the existing filters to address issue 2, but it is really easy to get wrong and it is not very well documented that it is required.

> > Thoughts? > Stephen > > From haleyb.dev at gmail.com Mon Jan 7 20:05:06 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 7 Jan 2019 15:05:06 -0500 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: Hi Ben, On 1/7/19 1:11 PM, Ben Nemec wrote: > Renamed the thread to be more descriptive.
> > Just to update the list on this, it looks like the problem is a segfault > when the netlink_lib module makes a C call. Digging into that code a > bit, it appears there is a callback being used[1]. I've seen some > comments that when you use a callback with a Python thread, the thread > needs to be registered somehow, but this is all uncharted territory for > me. Suggestions gratefully accepted. :-) > > 1: > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 Maybe it's something as mentioned in the end of this section? https://docs.python.org/2/library/ctypes.html#callback-functions "Note Make sure you keep references to CFUNCTYPE() objects as long as they are used from C code. ctypes doesn’t, and if you don’t, they may be garbage collected, crashing your program when a callback is made. Also, note that if the callback function is called in a thread created outside of Python’s control (e.g. by the foreign code that calls the callback), ctypes creates a new dummy Python thread on every invocation. This behavior is correct for most purposes, but it means that values stored with threading.local will not survive across different callbacks, even when those calls are made from the same C thread." I can try keeping a reference to the callback function and see if it makes any difference, but I'm assuming it's not that easy. -Brian > On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >> Hi, >> >> I just found that functional tests in Neutron are failing since today >> or maybe yesterday. See [1] >> I was able to reproduce it locally and it looks that it happens with >> oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >>> Wiadomość napisana przez Ben Nemec w dniu >>> 02.01.2019, o godz. 19:17: >>> >>> Yay alliteration! :-) >>> >>> I wanted to draw attention to this release[1] in particular because >>> it includes the parallel privsep change[2]. While it shouldn't have >>> any effect on the public API of the library, it does significantly >>> affect how privsep will process calls on the back end. Specifically, >>> multiple calls can now be processed at the same time, so if any >>> privileged code is not reentrant it's possible that new race bugs >>> could pop up. >>> >>> While this sounds scary, it's a necessary change to allow use of >>> privsep in situations where a privileged call may take a non-trivial >>> amount of time.  Cinder in particular has some privileged calls that >>> are long-running and can't afford to block all other privileged calls >>> on them. >>> >>> So if you're a consumer of oslo.privsep please keep your eyes open >>> for issues related to this new release and contact the Oslo team if >>> you find any. Thanks. >>> >>> -Ben >>> >>> 1: https://review.openstack.org/628019 >>> 2: https://review.openstack.org/#/c/593556/ >>> >> > From doug at doughellmann.com Mon Jan 7 20:12:21 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 07 Jan 2019 15:12:21 -0500 Subject: [goal][python3] week R-13 update Message-ID: This is the weekly update for the "Run under Python 3 by default" goal (https://governance.openstack.org/tc/goals/stein/python3-first.html). == Ongoing and Completed Work == This week is the second milestone for the Stein cycle. By this point I hoped to have python 3 functional test jobs in place for all projects, but we still have quite a ways to go to achieve that. 
I have added a few missing projects to the wiki page [1] and there is a *lot* of red on that page. I count 21 projects without functional test jobs running under python 3. We also have a few projects who don't seem to have voting python 3 unit test jobs, still. [1] https://wiki.openstack.org/wiki/Python3#Other_OpenStack_Applications_and_Projects Now that folks are mostly back from the holidays, my patch to change the default version of python in devstack [2] is ready for approval. See the other thread on this list [3] for details. [2] https://review.openstack.org/#/c/622415/ [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001356.html == Next Steps == PTLs, please review the information for your projects on that page. If you have a functional test job, please update that part of the table with the name of the job. If you do not have a functional test job, please add any information you have about plans to implement one (blue prints, bugs, etc.) to the comments column in the table. == How can you help? == 1. Choose a patch that has failing tests and help fix it. https://review.openstack.org/#/q/topic:python3-first+status:open+(+label:Verified-1+OR+label:Verified-2+) 2. Review the patches for the zuul changes. Keep in mind that some of those patches will be on the stable branches for projects. 3. Work on adding functional test jobs that run under Python 3. == How can you ask for help? == If you have any questions, please post them here to the openstack-dev list with the topic tag [python3] in the subject line. Posting questions to the mailing list will give the widest audience the chance to see the answers. We are using the #openstack-dev IRC channel for discussion as well, but I'm not sure how good our timezone coverage is so it's probably better to use the mailing list. == Reference Material == Goal description: https://governance.openstack.org/tc/goals/stein/python3-first.html Open patches needing reviews: https://review.openstack.org/#/q/topic:python3-first+is:open Storyboard: https://storyboard.openstack.org/#!/board/104 Zuul migration notes: https://etherpad.openstack.org/p/python3-first Zuul migration tracking: https://storyboard.openstack.org/#!/story/2002586 Python 3 Wiki page: https://wiki.openstack.org/wiki/Python3 -- Doug From jungleboyj at gmail.com Mon Jan 7 21:43:12 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 7 Jan 2019 15:43:12 -0600 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: References: Message-ID: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> Julia and Chris, Thanks for putting this together.  Wanted to share some thoughts in-line below: On 1/4/2019 9:53 AM, Julia Kreger wrote: > As some of you may or may not have heard, recently the Technical > Committee approved a technical vision document [1]. > > The goal of the technical vision document is to try to provide a > reference point for cloud infrastructure software in an ideal > universe. It is naturally recognized that not all items will apply to > all projects. The document is a really good high level view of what each OpenStack project should hopefully conform to.  I think it would be good to get this into the Upstream Institute education in some way as I think it is something that new contributors should understand and keep in mind.  It certainly would have helped me as a newbie to think about this. 
> We envision the results of the evaluation to be added to each > project's primary contributor documentation tree > (/doc/source/contributor/vision-reflection.rst) as a list of bullet > points detailing areas where a project feels they need adjustment to > better align with the technical vision, and if the project already has > visibility into a path forward, that as well. Good idea to have teams go through this.  I will work on doing the above for Cinder. Jay From juliaashleykreger at gmail.com Mon Jan 7 22:38:14 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 7 Jan 2019 14:38:14 -0800 Subject: [dev][tc][ptl] Evaluating projects in relation to OpenStack cloud vision In-Reply-To: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> References: <8da07091-1fec-174b-af81-6ccc008bab2f@gmail.com> Message-ID: On Mon, Jan 7, 2019 at 1:48 PM Jay Bryant wrote: [trim] > > > > We envision the results of the evaluation to be added to each > > project's primary contributor documentation tree > > (/doc/source/contributor/vision-reflection.rst) as a list of bullet > > points detailing areas where a project feels they need adjustment to > > better align with the technical vision, and if the project already has > > visibility into a path forward, that as well. > > > > Good idea to have teams go through this. I will work on doing the above > for Cinder. > > Jay > > Thanks Jay! Putting on my Ironic TL hat for a while, I ended up with a fairly short list [1]. Maybe some naming/words should change, but overall I hope that it kind of gets the level ideas across to a reader. [1]: https://review.openstack.org/#/c/629060/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From codeology.lab at gmail.com Mon Jan 7 23:35:29 2019 From: codeology.lab at gmail.com (Cody) Date: Mon, 7 Jan 2019 18:35:29 -0500 Subject: [openstack-ansible]Enable DVR support with routed network (L3 spine) Message-ID: Hi OSA, Greetings! I wish to enable DVR in a routed network environment (i.e. a Leaf/Spine topology) using OSA. Has this feature been production ready under the OSA project? An example for the user_variables.yml would be great. Thank you very much. Regards, Cody From codeology.lab at gmail.com Mon Jan 7 23:45:19 2019 From: codeology.lab at gmail.com (Cody) Date: Mon, 7 Jan 2019 18:45:19 -0500 Subject: [openstack-ansible]Enable DVR with routed network (Spine/Leaf) Message-ID: Hi OSA, Greetings! I wish to enable DVR in a routed network environment (i.e. a Leaf/Spine topology) using OSA. Has this feature been production ready under the OSA project? An example for the user_variables.yml and openstack_user_config.yml would be much appreciated. Thank you very much. Regards, Cody From liliueecg at gmail.com Tue Jan 8 06:31:36 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 8 Jan 2019 01:31:36 -0500 Subject: [Cyborg] IRC meeting Message-ID: The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) This week's agenda: 1. Review Sundar's feature branch 2. Review and try to merge CoCo's patches https://review.openstack.org/#/c/625630/ https://review.openstack.org/#/c/624138/ 3. Track status updates -- Thank you Regards Li -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chkumar246 at gmail.com Tue Jan 8 07:11:07 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Tue, 8 Jan 2019 12:41:07 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update V - Jan 08, 2019 Message-ID: Hello, Happy New Year all! Here is the first update (Dec, 19th, 18 to Jan 08, 19) of 2019 on collaboration on os_tempest[1] role between TripleO and OpenStack-Ansible projects. Things got merged: os_tempest: * Update all plugin urls to use https rather than git - https://review.openstack.org/625670 * Remove octavia in favor of octavia-tempest-plugin - https://review.openstack.org/625828 * Add the manila-tempest-plugin - https://review.openstack.org/626181 * Added support for installing python-tempestconf from git - https://review.openstack.org/625904 * Use tempest_tempestconf_profile for handling named args - https://review.openstack.org/623187 * Use tempest_cloud_name for setting cloudname - https://review.openstack.org/628610 python-tempestconf * Add profile argument - https://review.openstack.org/621567 * Add unit test for profile feature - https://review.openstack.org/626889 * Fixed SafeConfigParser deprecation warning for py3 - https://review.openstack.org/628130 * Fix diff in gates - https://review.openstack.org/628180 * Added python-tempestconf-tempest-devstack-admin/demo-py3 - https://review.openstack.org/622865 Summary: * On os_tempest side, we have finished the python-tempestconf support and introduced tempest_cloud_name var in order to set cloud name from clouds.yaml for tempest tests. * python-tempestconf got --profile feature and added py3 based devstack jobs. Note: when we use tempest run --subunit It always return exit status 0, It is the desired behaviour of stestr [https://github.com/mtreinish/stestr/issues/210]. The docs are now getting updated. We are working on implementing tempest last subcommand [https://review.openstack.org/#/c/511172/]related to the same. Things In-progress: os_tempest * Better tempest black and whitelist management - https://review.openstack.org/621605 * Add support for aarch64 images - https://review.openstack.org/620032 * Fix tempest workspace path - https://review.openstack.org/628182 * Configuration drives don't appear to work on aarch64+kvm - https://review.openstack.org/626592 * Use the inventory to enable/disable services by default - https://review.openstack.org/628979 * Synced tempest plugins names and services - https://review.openstack.org/628926 python-tempestconf * Create functional-tests role - https://review.openstack.org/626539 * Enable manila plugin in devstack - https://review.openstack.org/625191 Apart from this we have started working on integrating os_tempest with devstack and Tripleo standalone job. * Devstack - https://review.openstack.org/627482 * Tripleo CI - https://review.openstack.org/627500 We will try to finish os_tempest docs cleanup, whitelist/Blacklist tests management and os_tempest integration specs. I would like to thanks mkopec, arxcruz, cloudnull (reviewing patches in holidays), mnaser, jrosser, odyssey4me, marios & quiquell (from tripleo CI team on helping on os_tempest integration with Tripleo CI). Here is the 4th update [2] Have queries, Feel free to ping us on #tripleo or #openstack-ansible channel. Links: [1.] http://git.openstack.org/cgit/openstack/openstack-ansible-os_tempest [2.] 
http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001116.html Thanks, Chandan Kumar From skaplons at redhat.com Tue Jan 8 08:06:11 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 8 Jan 2019 09:06:11 +0100 Subject: [neutron][oslo] Functional tests job broken (oslo.privsep) In-Reply-To: <20190107152543.kugiskrwk4kuawtf@mthode.org> References: <20190107152543.kugiskrwk4kuawtf@mthode.org> Message-ID: <97336A61-5F37-4A0D-A08F-6D2BE5B1F131@redhat.com> Hi, So requirements patch is now merged and oslo.privsep version is now lowered to 1.30.1. Neutron functional job should be good for now, You can recheck Your patches. — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Matthew Thode w dniu 07.01.2019, o godz. 16:25: > > On 19-01-07 01:32:50, Slawomir Kaplonski wrote: >> Hi Neutrinos, >> >> Since few days we have an issue with neutron-functional job [1]. >> Please don’t recheck Your patches now. It will not help until this bug >> will be fixed/workarouded. >> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> > > Adding an oslo tag. As far as can be determined the new oslo.privsep > code impacts neutron. There is a requirements review out to restict the > version of oslo.privsep but I'd like an ack from oslo people before we > take a step back. > > -- > Matthew Thode From ignaziocassano at gmail.com Tue Jan 8 08:11:15 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 09:11:15 +0100 Subject: openstack queens octavia security group not found Message-ID: Hello everyone, I installed octavia with centos 7 queens. When I crreate a load balancer the amphora instance is not created because nova conductor cannot find the security group specified in octavia.conf. I am sure the security group id is correct but the nova condictor reports: 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils [req-75df2561-4bc3-4bde-86d0-40469058250c 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2108, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] Please, what is wrong ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Tue Jan 8 08:54:39 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 8 Jan 2019 08:54:39 +0000 Subject: [nova] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: <1546937673.17763.2@smtp.office365.com> On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware. 
The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance > cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the > same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. This is because we > don't > allow overcommit for hugepages, but case (#2) above means we > are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages > 2. It doesn't fix the issue when non-NUMA instances exist on the > same > host as NUMA instances with hugepages. The non-NUMA instances > don't > run through any of the code above, meaning they're still not > pagesize aware > > We could probably fix issue (1) by modifying those hugepage functions > we're using to allow overcommit via a flag that we pass for case (#2). > We can mitigate issue (2) by advising operators to split hosts into > aggregates for 'hw:mem_page_size' set or unset (in addition to > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > I think this may be the case in some docs (sean-k-mooney said Intel > used to do this. I don't know about Red Hat's docs or upstream). In > addition, we did actually called that out in the original spec: > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > However, if we're doing that for non-NUMA instances, one would have to > question why the patch is necessary/acceptable for NUMA instances. For > what it's worth, a longer fix would be to start tracking hugepages in > a > non-NUMA aware way too but that's a lot more work and doesn't fix the > issue now. > > As such, my question is this: should be look at fixing issue (1) and > documenting issue (2), or should we revert the thing wholesale until > we > work on a solution that could e.g. let us track hugepages via > placement > and resolve issue (2) too. If you feel that fixing (1) is pretty simple then I suggest to do that and document the limitation of (2) while we think about a proper solution. gibi > > Thoughts? > Stephen > > From jean-philippe at evrard.me Tue Jan 8 09:27:09 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Jan 2019 10:27:09 +0100 Subject: [openstack-ansible]Enable DVR with routed network (Spine/Leaf) In-Reply-To: References: Message-ID: On Mon, 2019-01-07 at 18:45 -0500, Cody wrote: > Hi OSA, > > Greetings! > > I wish to enable DVR in a routed network environment (i.e. a > Leaf/Spine topology) using OSA. Has this feature been production > ready > under the OSA project? An example for the user_variables.yml and > openstack_user_config.yml would be much appreciated. > > Thank you very much. > > Regards, > Cody > That might have changed, but there are not many people that are using OVS + DVR in OSA. 
We don't have a full scenario testing of this (only testing in neutron), AFAIK. I would say if you are looking for example files, you might want to discuss with people on our channel (#openstack-ansible), maybe you'll find more people that might help you there. Also it might be worth checking in neutron role tests. Patches are always welcome to test full end-to-end coverage of this feature :) NB: Is there a particular reason you want DVR? Would calico fit the bill? Wouldn't other SDN solutions fit the bill better than DVR? Before going the DVR route, I am generally asking why :) Regards, Jean-Philippe Evrard (evrardjp) From ignaziocassano at gmail.com Tue Jan 8 09:39:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 10:39:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: Hello, I do not have an octavia project but only a service project. Octavia user belongs to admin and service project :-( Documentation does not seem clear about it Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann ha scritto: > Hi, > > did you create the security group in the octavia project? > > Can you see the sg if you login with the octavia credentials? > > > Fabian > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > Hello everyone, > > I installed octavia with centos 7 queens. > > When I crreate a load balancer the amphora instance is not created > > because nova conductor cannot find the security group specified in > > octavia.conf. > > I am sure the security group id is correct but the nova condictor > reports: > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent > > call last):\n', u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, > > in _do_build_and_run_instance\n filter_properties, request_spec)\n', > > u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > line 2108, in _build_and_run_instance\n instance_uuid=instance.uuid, > > reason=six.text_type(e))\n', u'RescheduledException: Build of instance > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > Please, what is wrong ? > > > > Regards > > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 09:44:47 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 10:44:47 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: Hi, in which project should octavia start its amphora instances? In this project you should create a suitable sg. Fabian Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > Hello, I do not have an octavia project but only a service project. > Octavia user belongs to admin and service project :-( > Documentation  does not seem clear about it > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > ha scritto: > > Hi, > > did you create the security group in the octavia project? > > Can you see the sg if you login with the octavia credentials? 
> > >   Fabian > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > Hello everyone, > > I installed octavia with centos 7 queens. > > When I crreate a load balancer the amphora instance is not created > > because nova conductor cannot find the security group specified in > > octavia.conf. > > I am sure the security group id is correct but the nova condictor > reports: > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > Error > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > recent > > call last):\n', u'  File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > 1828, > > in _do_build_and_run_instance\n    filter_properties, > request_spec)\n', > > u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > line 2108, in _build_and_run_instance\n > instance_uuid=instance.uuid, > > reason=six.text_type(e))\n', u'RescheduledException: Build of > instance > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > group > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > Please, what is wrong ? > > > > Regards > > Ignazio > From jean-philippe at evrard.me Tue Jan 8 09:57:03 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Jan 2019 10:57:03 +0100 Subject: [loci] Stable Branches in Loci In-Reply-To: References: <3855B170-6E38-4DB9-A91C-9389D16D387F@openstack.org> <64a34fd9-d31b-5d7d-ae94-053d9bdebbad@openstack.org> <20181218054413.GG6373@thor.bakeyournoodle.com> Message-ID: <6572b304b7a340a798bad518061dd95d71efc04a.camel@evrard.me> On Thu, 2018-12-20 at 09:58 -0800, Chris Hoge wrote: > > On Dec 17, 2018, at 9:44 PM, Tony Breeds > > wrote: > > > > On Thu, Dec 13, 2018 at 09:43:37AM -0800, Chris Hoge wrote: > > > There is a need for us to work out whether Loci is even > > > appropriate for > > > stable branch development. Over the last week or so the CentOS > > > libvirt > > > update has broken all stable branch builds as it introduced an > > > incompatibility between the stable upper contraints of python- > > > libvirt and > > > libvirt. > > > > Yup, as we've seen on https://review.openstack.org/#/c/622262 this > > is a > > common thing and happens with every CentOS minor release. We're > > working > > the update to make sure we don't cause more breakage as we try to > > fix > > this thing. > > > > > libvirt. If we start running stable builds, it might provide a > > > useful > > > gate signal for when stable source builds break against upstream > > > distributions. It's something for the Loci team to think about as > > > we > > > work through refactoring our gate jobs. > > > > That's interesting idea. Happy to discuss how we can do that in a > > way > > that makes sense for each project. How long does LOCI build take? > > Loci makes one build for each OpenStack project you want to deploy. > The > requirements container takes the most time, as it does a pip wheel of > every requirement listed in the openstack/requirements repository, > then > bind-mounts the complete set of wheels into the service containers > during > those builds to ensure a complete and consistent set of dependencies. > Requirements must be done serially, but the rest of the builds can be > done in parallel. 
> > What I'm thinking is if we do stable builds of Loci that stand up a > simplified all-in-one environment we can run Tempest against, we > would > both get a signal for the Loci stable build (as well as master) and a > signal for requirements. Co-gating means we can check that an update > to > requirements to fix one distrubution does not negatively impact the > stability of other distributions. > > I have some very initial work on this in a personal project (this is > how > I like to spend some of my holiday down time), and we can bring it up > as > an agenda item for the Loci meeting tomorrow morning. > > -Chris > > I like the idea of having REAL testing of the loci images. Currently we just install software, and it's up to deployment tools to configure the images to match their needs. Doing a real test for all distros would be very nice, and a positive addition. I am curious about how we'd do this though. I suppose though it might require a new job, which will take far more time: After doing a building of the necessary images (more than one project!), we need to deploy them together and run tempest on them (therefore ensuring proper image building and co-installability). Or did you mean that you wanted to test each image building separately by running the minimum smoke tests for each image? What about reusing a deployment project job that's using loci in an experimental pipeline? Not sure to understand what you have written :) Regards, JP From ignaziocassano at gmail.com Tue Jan 8 09:57:40 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 10:57:40 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: It started in service project but service project not read securty groups that I created in admin project. So I modified /etc/octavia.conf secitons service_auth and keystone_authoken and I put project_name = admin instead of project_name = service With the above modifications the amphora instance starts in admin projects abd can read from it the security group id. But the load balancer remains in pending and then the ambora instance is automatically deleted. Another problem is that in both cases it does not start to create the amphra instance when I specify amp_ssh_key_name in octavia.conf In admin project case it shoud read it, because this key is in the admin project :-( So I started without ssh_key. Could you help me with my wrong configuration,please ? Regards Ignazio Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto: > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > > Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. > > Octavia user belongs to admin and service project :-( > > Documentation does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > did you create the security group in the octavia project? > > > > Can you see the sg if you login with the octavia credentials? > > > > > > Fabian > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > Hello everyone, > > > I installed octavia with centos 7 queens. > > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. 
> > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 10:06:31 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 11:06:31 +0100 Subject: [nova] Mempage fun In-Reply-To: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> Message-ID: <20190108100631.GA4852@canonical> On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > We've been looking at a patch that landed some months ago and have > spotted some issues: > > https://review.openstack.org/#/c/532168 > > In summary, that patch is intended to make the memory check for > instances memory pagesize aware. The logic it introduces looks > something like this: > > If the instance requests a specific pagesize > (#1) Check if each host cell can provide enough memory of the > pagesize requested for each instance cell > Otherwise > If the host has hugepages > (#2) Check if each host cell can provide enough memory of the > smallest pagesize available on the host for each instance cell > Otherwise > (#3) Check if each host cell can provide enough memory for > each instance cell, ignoring pagesizes > > This also has the side-effect of allowing instances with hugepages and > instances with a NUMA topology but no hugepages to co-exist on the same > host, because the latter will now be aware of hugepages and won't > consume them. However, there are a couple of issues with this: > > 1. It breaks overcommit for instances without pagesize request > running on hosts with different pagesizes. This is because we don't > allow overcommit for hugepages, but case (#2) above means we are now > reusing the same functions previously used for actual hugepage > checks to check for regular 4k pages I think that we should not accept any overcommit. Only instances with an InstanceNUMATopology associated pass to this part of check. Such instances want to use features like guest NUMA topology so their memory mapped on host NUMA nodes or CPU pinning. Both cases are used for performance reason and to avoid any cross memory latency. > 2. It doesn't fix the issue when non-NUMA instances exist on the same > host as NUMA instances with hugepages. The non-NUMA instances don't > run through any of the code above, meaning they're still not > pagesize aware That is an other issue. 
We report to the resource tracker all the physical memory (small pages + hugepages allocated). The difficulty is that we can't just change the virt driver to report only small pages. Some instances wont be able to get scheduled. We should basically change the resource tracker so it can take into account the different kind of page memory. But it's not really an issue since instances that use "NUMA features" (in Nova world) should be isolated to an aggregate and not be mixed with no-NUMA instances. The reason is simple no-NUMA instances do not have boundaries and break rules of NUMA instances. > We could probably fix issue (1) by modifying those hugepage functions > we're using to allow overcommit via a flag that we pass for case (#2). > We can mitigate issue (2) by advising operators to split hosts into > aggregates for 'hw:mem_page_size' set or unset (in addition to > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > I think this may be the case in some docs (sean-k-mooney said Intel > used to do this. I don't know about Red Hat's docs or upstream). In > addition, we did actually called that out in the original spec: > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > However, if we're doing that for non-NUMA instances, one would have to > question why the patch is necessary/acceptable for NUMA instances. For > what it's worth, a longer fix would be to start tracking hugepages in a > non-NUMA aware way too but that's a lot more work and doesn't fix the > issue now. > > As such, my question is this: should be look at fixing issue (1) and > documenting issue (2), or should we revert the thing wholesale until we > work on a solution that could e.g. let us track hugepages via placement > and resolve issue (2) too. > > Thoughts? > Stephen > From jan.vondra at ultimum.io Tue Jan 8 10:08:43 2019 From: jan.vondra at ultimum.io (Jan Vondra) Date: Tue, 8 Jan 2019 11:08:43 +0100 Subject: [Kolla] Queens for debian images Message-ID: Dear Kolla team, during project for one of our customers we have upgraded debian part of kolla project using a queens debian repositories (http://stretch-queens.debian.net/debian stretch-queens-backports) and we would like to share this work with community. I would like to ask what's the proper process of contributing since the patches affects both kolla and kolla-ansible repositories. Also any other comments regarding debian in kolla would be appriciated. Thanks, Jan Vondra Ultimum Technologies s.r.o. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic http://ultimum.io From jean-philippe at evrard.me Tue Jan 8 10:09:53 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 08 Jan 2019 11:09:53 +0100 Subject: [tc][all] Train Community Goals In-Reply-To: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz> References: <66d73db6-9f84-1290-1ab8-cf901a7fb355@catalyst.net.nz> Message-ID: <6b498008e71b7dae651e54e29717f3ccedea50d1.camel@evrard.me> On Wed, 2018-12-19 at 06:58 +1300, Adrian Turjak wrote: > I put my hand up during the summit for being at least one of the > champions for the deletion of project resources effort. > > I have been meaning to do a follow up email and options as well as > steps > for how the goal might go, but my working holiday in Europe after the > summit turned into more of a holiday than originally planned. 
> > I'll get a thread going around what I (and the public cloud working > group) think project resource deletion should look like, and what the > options are, and where we should aim to be with it. We can then turn > that discussion into a final 'spec' of sorts. > > Great news! Do you need any help to get started there? Regards, JP From ignaziocassano at gmail.com Tue Jan 8 10:14:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 11:14:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: PS Now I added the ssh key to octavia user and it assignes it to amphora instance. Still load balancer remains in pending create and after 3 minutes the amphora instance is automatically deleted. Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto: > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > > Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. > > Octavia user belongs to admin and service project :-( > > Documentation does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > did you create the security group in the octavia project? > > > > Can you see the sg if you login with the octavia credentials? > > > > > > Fabian > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > Hello everyone, > > > I installed octavia with centos 7 queens. > > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. > > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 10:24:16 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 11:24:16 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: PS on the amphore instance there is nothng on port 9443 Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann ha scritto: > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. 
> > Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. > > Octavia user belongs to admin and service project :-( > > Documentation does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > did you create the security group in the octavia project? > > > > Can you see the sg if you login with the octavia credentials? > > > > > > Fabian > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > Hello everyone, > > > I installed octavia with centos 7 queens. > > > When I crreate a load balancer the amphora instance is not created > > > because nova conductor cannot find the security group specified in > > > octavia.conf. > > > I am sure the security group id is correct but the nova condictor > > reports: > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd > - > > > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > Error > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most > > recent > > > call last):\n', u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > 1828, > > > in _do_build_and_run_instance\n filter_properties, > > request_spec)\n', > > > u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > line 2108, in _build_and_run_instance\n > > instance_uuid=instance.uuid, > > > reason=six.text_type(e))\n', u'RescheduledException: Build of > > instance > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security > > group > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > Please, what is wrong ? > > > > > > Regards > > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Tue Jan 8 10:47:47 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 10:47:47 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108100631.GA4852@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> Message-ID: <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. 
It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > I think that we should not accept any overcommit. Only instances with > an InstanceNUMATopology associated pass to this part of check. Such > instances want to use features like guest NUMA topology so their > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > for performance reason and to avoid any cross memory latency. This issue with this is that we had previously designed everything *to* allow overcommit: https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 The only time this doesn't apply is if CPU pinning is also in action (remembering that CPU pinning and NUMA topologies are tightly bound and CPU pinning implies a NUMA topology, much to Jay's consternation). As noted below, our previous advice was not to mix hugepage instances and non-hugepage instances, meaning hosts handling non-hugepage instances should not have hugepages (or should mark the memory consumed by them as reserved for host). We have in effect broken previous behaviour in the name of solving a bug that didn't necessarily have to be fixed yet. > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > host as NUMA instances with hugepages. The non-NUMA instances don't > > run through any of the code above, meaning they're still not > > pagesize aware > > That is an other issue. We report to the resource tracker all the > physical memory (small pages + hugepages allocated). The difficulty is > that we can't just change the virt driver to report only small > pages. Some instances wont be able to get scheduled. We should > basically change the resource tracker so it can take into account the > different kind of page memory. Agreed (likely via move tracking of this resource to placement, I assume). It's a longer term fix though. > But it's not really an issue since instances that use "NUMA features" > (in Nova world) should be isolated to an aggregate and not be mixed > with no-NUMA instances. The reason is simple no-NUMA instances do not > have boundaries and break rules of NUMA instances. Again, we have to be careful not to mix up NUMA and CPU pinning. It's perfectly fine to have NUMA without CPU pinning, though not the other way around. For example: $ openstack flavor set --property hw:numa_nodes=2 FLAVOR >From what I can tell, there are three reasons that an instance will have a NUMA topology: the user explicitly requested one, the user requested CPU pinning and got one implicitly, or the user requested a specific pagesize and, again, got one implicitly. We handle the latter two with the advice given below, but I don't think anyone has ever said we must separate instances that had a user-specified NUMA topology from those that had no NUMA topology. If we're going down this path, we need clear docs. Stephen > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). > > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). 
I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we did actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in a > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > issue now. > > > > As such, my question is this: should be look at fixing issue (1) and > > documenting issue (2), or should we revert the thing wholesale until we > > work on a solution that could e.g. let us track hugepages via placement > > and resolve issue (2) too. > > > > Thoughts? > > Stephen > > From marcin.juszkiewicz at linaro.org Tue Jan 8 11:00:48 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 8 Jan 2019 12:00:48 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: W dniu 08.01.2019 o 11:08, Jan Vondra pisze: > Dear Kolla team, > > during project for one of our customers we have upgraded debian part > of kolla project using a queens debian repositories > (http://stretch-queens.debian.net/debian stretch-queens-backports) and > we would like to share this work with community. Thanks for doing that. Is there an option to provide arm64 packages next time? > I would like to ask what's the proper process of contributing since > the patches affects both kolla and kolla-ansible repositories. Send patches for review [1] and then we can discuss about changing them. Remember that we target Stein now. 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > Also any other comments regarding debian in kolla would be appriciated. Love to see someone else caring about Debian in Kolla. I took it over two years ago, revived and moved to 'stretch'. But skipped support for binary packages as there were no up-to-date packages available. In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' as it will enter final freeze. Had some discussion with Debian OpenStack team about providing preliminary Stein packages so support for 'binary' type of images could be possible. From dev.faz at gmail.com Tue Jan 8 11:32:33 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 12:32:33 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Message-ID: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Hi, are you able to connect to the amphora via ssh? Could you paste your octavia.log and the log of the amphora somewhere? Fabian Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > PS > on the amphore instance there is nothng on port 9443 > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > ha scritto: > > Hi, > > in which project should octavia start its amphora instances? > > In this project you should create a suitable sg. > >   Fabian > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > Hello, I do not have an octavia project but only a service project. 
> > Octavia user belongs to admin and service project :-( > > Documentation  does not seem clear about it > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > >> ha scritto: > > > >     Hi, > > > >     did you create the security group in the octavia project? > > > >     Can you see the sg if you login with the octavia credentials? > > > > > >        Fabian > > > >     Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > >      > Hello everyone, > >      > I installed octavia with centos 7 queens. > >      > When I crreate a load balancer the amphora instance is not > created > >      > because nova conductor cannot find the security group > specified in > >      > octavia.conf. > >      > I am sure the security group id is correct but the nova > condictor > >     reports: > >      > > >      > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > >      > [req-75df2561-4bc3-4bde-86d0-40469058250c > >      > 62ed0b7f336b479ebda6f8587c4dd608 > 2a33760772ab4b0381a27735443ec4bd - > >      > default default] [instance: > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > >     Error > >      > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > (most > >     recent > >      > call last):\n', u'  File > >      > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > >     1828, > >      > in _do_build_and_run_instance\n    filter_properties, > >     request_spec)\n', > >      > u'  File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > >      > line 2108, in _build_and_run_instance\n > >     instance_uuid=instance.uuid, > >      > reason=six.text_type(e))\n', u'RescheduledException: Build of > >     instance > >      > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > Security > >     group > >      > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > >      > > >      > Please, what is wrong ? > >      > > >      > Regards > >      > Ignazio > > > From ellorent at redhat.com Tue Jan 8 11:48:56 2019 From: ellorent at redhat.com (Felix Enrique Llorente Pastora) Date: Tue, 8 Jan 2019 12:48:56 +0100 Subject: Make tripleo-ci-fedora-28-standalone voting Message-ID: Hi All, The hibrid job to test fedora28 host and centos7 containers is working now and running tempest correctly (well it miss junit xml generation, but it's a matter of updating one RPM), so maybe is time for the job to be voting. What do you think? BR -- Quique Llorente Openstack TripleO CI -------------- next part -------------- An HTML attachment was scrubbed... URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 11:50:27 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 12:50:27 +0100 Subject: [nova] Mempage fun In-Reply-To: <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> Message-ID: <20190108115027.GA7825@canonical> On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > We've been looking at a patch that landed some months ago and have > > > spotted some issues: > > > > > > https://review.openstack.org/#/c/532168 > > > > > > In summary, that patch is intended to make the memory check for > > > instances memory pagesize aware. 
The logic it introduces looks > > > something like this: > > > > > > If the instance requests a specific pagesize > > > (#1) Check if each host cell can provide enough memory of the > > > pagesize requested for each instance cell > > > Otherwise > > > If the host has hugepages > > > (#2) Check if each host cell can provide enough memory of the > > > smallest pagesize available on the host for each instance cell > > > Otherwise > > > (#3) Check if each host cell can provide enough memory for > > > each instance cell, ignoring pagesizes > > > > > > This also has the side-effect of allowing instances with hugepages and > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > host, because the latter will now be aware of hugepages and won't > > > consume them. However, there are a couple of issues with this: > > > > > > 1. It breaks overcommit for instances without pagesize request > > > running on hosts with different pagesizes. This is because we don't > > > allow overcommit for hugepages, but case (#2) above means we are now > > > reusing the same functions previously used for actual hugepage > > > checks to check for regular 4k pages > > > > I think that we should not accept any overcommit. Only instances with > > an InstanceNUMATopology associated pass to this part of check. Such > > instances want to use features like guest NUMA topology so their > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > for performance reason and to avoid any cross memory latency. > > This issue with this is that we had previously designed everything *to* > allow overcommit: > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 This code never worked Stephen, that instead of to please unit tests related. I would not recommend to use it as a reference. > The only time this doesn't apply is if CPU pinning is also in action > (remembering that CPU pinning and NUMA topologies are tightly bound and > CPU pinning implies a NUMA topology, much to Jay's consternation). As > noted below, our previous advice was not to mix hugepage instances and > non-hugepage instances, meaning hosts handling non-hugepage instances > should not have hugepages (or should mark the memory consumed by them > as reserved for host). We have in effect broken previous behaviour in > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > run through any of the code above, meaning they're still not > > > pagesize aware > > > > That is an other issue. We report to the resource tracker all the > > physical memory (small pages + hugepages allocated). The difficulty is > > that we can't just change the virt driver to report only small > > pages. Some instances wont be able to get scheduled. We should > > basically change the resource tracker so it can take into account the > > different kind of page memory. > > Agreed (likely via move tracking of this resource to placement, I > assume). It's a longer term fix though. > > > But it's not really an issue since instances that use "NUMA features" > > (in Nova world) should be isolated to an aggregate and not be mixed > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > have boundaries and break rules of NUMA instances. > > Again, we have to be careful not to mix up NUMA and CPU pinning. 
It's > perfectly fine to have NUMA without CPU pinning, though not the other > way around. For example: > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > >From what I can tell, there are three reasons that an instance will > have a NUMA topology: the user explicitly requested one, the user > requested CPU pinning and got one implicitly, or the user requested a > specific pagesize and, again, got one implicitly. We handle the latter > two with the advice given below, but I don't think anyone has ever said > we must separate instances that had a user-specified NUMA topology from > those that had no NUMA topology. If we're going down this path, we need > clear docs. The implementation is pretty old and it was a first design from scratch, all the situations have not been take into account or been documented. If we want create specific behaviors we are going to add more complexity on something which is already, and which is not completely stable, as an example the patch you have mentioned which has been merged last release. I agree documenting is probably where we should go; don't try to mix instances with InstanceNUMATopology and without, Nova uses a different way to compute their resources, like don't try to overcommit such instances. We basically recommend to use aggregate for pinning, realtime, hugepages, so it looks reasonable to add guest NUMA topology to that list. > Stephen > > > > We could probably fix issue (1) by modifying those hugepage functions > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > We can mitigate issue (2) by advising operators to split hosts into > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > addition, we did actually called that out in the original spec: > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > question why the patch is necessary/acceptable for NUMA instances. For > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > issue now. > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > documenting issue (2), or should we revert the thing wholesale until we > > > work on a solution that could e.g. let us track hugepages via placement > > > and resolve issue (2) too. > > > > > > Thoughts? > > > Stephen > > > > From ignaziocassano at gmail.com Tue Jan 8 12:05:55 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:05:55 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: Yes, I can connect to amphora instance for a short time because it is removed automatically. For the amphora instance which log do you need? For octavia worker log is enough? Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann ha scritto: > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? 
> > Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > in which project should octavia start its amphora instances? > > > > In this project you should create a suitable sg. > > > > Fabian > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > Hello, I do not have an octavia project but only a service > project. > > > Octavia user belongs to admin and service project :-( > > > Documentation does not seem clear about it > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > did you create the security group in the octavia project? > > > > > > Can you see the sg if you login with the octavia credentials? > > > > > > > > > Fabian > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > Hello everyone, > > > > I installed octavia with centos 7 queens. > > > > When I crreate a load balancer the amphora instance is not > > created > > > > because nova conductor cannot find the security group > > specified in > > > > octavia.conf. > > > > I am sure the security group id is correct but the nova > > condictor > > > reports: > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > 2a33760772ab4b0381a27735443ec4bd - > > > > default default] [instance: > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > Error > > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > > (most > > > recent > > > > call last):\n', u' File > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > > 1828, > > > > in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', > > > > u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > line 2108, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, > > > > reason=six.text_type(e))\n', u'RescheduledException: Build > of > > > instance > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > > Security > > > group > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > Please, what is wrong ? > > > > > > > > Regards > > > > Ignazio > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 12:06:54 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 13:06:54 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Well, more logs are always better ;) Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > Yes, I can connect to amphora instance for a short time because it is > removed automatically. > For the amphora instance which log do you need? > For octavia worker log is enough? > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > ha scritto: > > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? 
> >   Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > >> ha scritto: > > > >     Hi, > > > >     in which project should octavia start its amphora instances? > > > >     In this project you should create a suitable sg. > > > >        Fabian > > > >     Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > >      > Hello, I do not have an octavia project but only a service > project. > >      > Octavia user belongs to admin and service project :-( > >      > Documentation  does not seem clear about it > >      > > >      > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > >      > > > > >      > >>> ha scritto: > >      > > >      >     Hi, > >      > > >      >     did you create the security group in the octavia project? > >      > > >      >     Can you see the sg if you login with the octavia > credentials? > >      > > >      > > >      >        Fabian > >      > > >      >     Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > >      >      > Hello everyone, > >      >      > I installed octavia with centos 7 queens. > >      >      > When I crreate a load balancer the amphora instance > is not > >     created > >      >      > because nova conductor cannot find the security group > >     specified in > >      >      > octavia.conf. > >      >      > I am sure the security group id is correct but the nova > >     condictor > >      >     reports: > >      >      > > >      >      > 2019-01-08 09:06:06.803 11872 ERROR > nova.scheduler.utils > >      >      > [req-75df2561-4bc3-4bde-86d0-40469058250c > >      >      > 62ed0b7f336b479ebda6f8587c4dd608 > >     2a33760772ab4b0381a27735443ec4bd - > >      >      > default default] [instance: > >     83f2fd75-8069-47a5-9572-8949ec9b5cee] > >      >     Error > >      >      > from last host: tst2-kvm02 (node tst2-kvm02): > [u'Traceback > >     (most > >      >     recent > >      >      > call last):\n', u'  File > >      >      > > >     "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > >      >     1828, > >      >      > in _do_build_and_run_instance\n    filter_properties, > >      >     request_spec)\n', > >      >      > u'  File > >     "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > >      >      > line 2108, in _build_and_run_instance\n > >      >     instance_uuid=instance.uuid, > >      >      > reason=six.text_type(e))\n', > u'RescheduledException: Build of > >      >     instance > >      >      > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > >     Security > >      >     group > >      >      > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > >      >      > > >      >      > Please, what is wrong ? > >      >      > > >      >      > Regards > >      >      > Ignazio > >      > > > > From ignaziocassano at gmail.com Tue Jan 8 12:34:59 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:34:59 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Hi, attached here there are logs. As you can see the amphora messages reports that amphora-agent service fails. 
Thanks a lot for your help Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: worker.log Type: text/x-log Size: 2718 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: messages-amphora-instace.log Type: text/x-log Size: 1077 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: housekeeping.log Type: text/x-log Size: 1146 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: api.log Type: text/x-log Size: 3921 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: healt-manager.log Type: text/x-log Size: 1205 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jan 8 12:42:59 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:42:59 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Doing a journactl -u on amphora instance it gives: gen 08 12:40:12 amphora-1e35a2d5-c3ab-4016-baf6-e6bf1dd061ac.novalocal amphora-agent[2884]: 2019-01-08 12:40:12.013 2884 ERROR octavia raise ValueError('certfile "%s" does not exist' % conf.certfile) gen 08 12:40:12 amphora-1e35a2d5-c3ab-4016-baf6-e6bf1dd061ac.novalocal amphora-agent[2884]: 2019-01-08 12:40:12.013 2884 ERROR octavia ValueError: certfile "/etc/octavia/certs/client.pem" does not exist Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. 
> > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 12:50:48 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 13:50:48 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Probably I must set client_ca.pem in octavia.conf end not client.pem in section haproxy_amphora Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? 
> > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 13:06:36 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 14:06:36 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: I am becoming crazy. Amphora agent in amphora instance search /etc/octavia/certs/client.pem but in /etc/octavia/certs there is client_ca.pem :-( Probably must I modify the amphora_agent section ? Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. 
> > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. > > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 13:14:47 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 14:14:47 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> Message-ID: OK I solved the certificate issue modifying the right section . Now amphora agent starts in the instance but controller canno reach it on 9443 port. I think this is a firewall problem on our network. I am going to check Il giorno mar 8 gen 2019 alle ore 12:32 Fabian Zimmermann ha scritto: > Hi, > > are you able to connect to the amphora via ssh? > > Could you paste your octavia.log and the log of the amphora somewhere? > > Fabian > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > PS > > on the amphore instance there is nothng on port 9443 > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > ha scritto: > > > > Hi, > > > > in which project should octavia start its amphora instances? > > > > In this project you should create a suitable sg. > > > > Fabian > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > Hello, I do not have an octavia project but only a service > project. 
> > > Octavia user belongs to admin and service project :-( > > > Documentation does not seem clear about it > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > did you create the security group in the octavia project? > > > > > > Can you see the sg if you login with the octavia credentials? > > > > > > > > > Fabian > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > Hello everyone, > > > > I installed octavia with centos 7 queens. > > > > When I crreate a load balancer the amphora instance is not > > created > > > > because nova conductor cannot find the security group > > specified in > > > > octavia.conf. > > > > I am sure the security group id is correct but the nova > > condictor > > > reports: > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > 2a33760772ab4b0381a27735443ec4bd - > > > > default default] [instance: > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > Error > > > > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback > > (most > > > recent > > > > call last):\n', u' File > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line > > > 1828, > > > > in _do_build_and_run_instance\n filter_properties, > > > request_spec)\n', > > > > u' File > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > line 2108, in _build_and_run_instance\n > > > instance_uuid=instance.uuid, > > > > reason=six.text_type(e))\n', u'RescheduledException: Build > of > > > instance > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: > > Security > > > group > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > Please, what is wrong ? > > > > > > > > Regards > > > > Ignazio > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 8 13:39:16 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Jan 2019 13:39:16 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108100631.GA4852@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> Message-ID: On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. 
It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > I think that we should not accept any overcommit. Only instances with > an InstanceNUMATopology associated pass to this part of check. Such > instances want to use features like guest NUMA topology so their > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > for performance reason and to avoid any cross memory latency. That is not necessarily correct. If I request CPU pinning, that does not imply that I do not want the ability to oversubscribe memory; that is an artifact of how we chose to implement pinning in the libvirt driver. For the case of CPU pinning specifically, I have always felt it is wrong that we create a NUMA topology for the guest implicitly. In the case of hw:numa_nodes=1, in the absence of any other extra spec or image metadata, I also do not think it is correct to disable oversubscription retroactively after supporting it for several years. Requesting a NUMA topology outside of explicitly requesting huge pages should never have disabled oversubscription, and changing that behaviour should have required both a microversion and a nova spec. https://review.openstack.org/#/c/532168 was simply a bug fix and therefore should not have changed the meaning of requesting a NUMA topology. > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > host as NUMA instances with hugepages. The non-NUMA instances don't > > run through any of the code above, meaning they're still not > > pagesize aware > > That is an other issue. We report to the resource tracker all the > physical memory (small pages + hugepages allocated). The difficulty is > that we can't just change the virt driver to report only small > pages. Some instances wont be able to get scheduled. We should > basically change the resource tracker so it can take into account the > different kind of page memory. > > But it's not really an issue since instances that use "NUMA features" > (in Nova world) should be isolated to an aggregate and not be mixed > with no-NUMA instances. The reason is simple no-NUMA instances do not > have boundaries and break rules of NUMA instances. It is true that today we should partition the deployment, using host aggregates, into hosts for NUMA instances and hosts for non-NUMA instances. The reason issue 2 was raised is that the commit message implied that the patch addressed mixing NUMA and non-NUMA guests on the same host: "Also when no pagesize is requested we should consider to compute memory usage based on small pages since the amount of physical memory available may also include some large pages." However, the logic in the patch does not actually get triggered when the guest does not have a NUMA topology, so it does not actually consider the total number of small pages in that case. This was linked to a downstream bugzilla you filed, https://bugzilla.redhat.com/show_bug.cgi?id=1625119, and another for OSP 10, https://bugzilla.redhat.com/show_bug.cgi?id=1519540, which has three customer cases associated with it as downstream bugs.
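(For reference, the three-branch check being debated in this thread can be sketched roughly as follows. This is an illustration only, not the actual nova.virt.hardware code; the HostCell/InstanceCell shapes and field names are assumptions made for the example.)

    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass
    class HostCell:
        # free memory per page size, both in KiB (illustrative layout)
        free_kb_by_pagesize: Dict[int, int]
        total_memory_kb: int

    @dataclass
    class InstanceCell:
        memory_kb: int
        pagesize_kb: Optional[int] = None  # explicit hw:mem_page_size, if any

    def cell_fits(host: HostCell, inst: InstanceCell, ram_ratio: float = 1.0) -> bool:
        # (#1) explicit pagesize requested: exact page size, no overcommit
        if inst.pagesize_kb is not None:
            return inst.memory_kb <= host.free_kb_by_pagesize.get(inst.pagesize_kb, 0)
        # (#2) no pagesize requested but the host has hugepages: check against
        # the smallest page size, reusing the hugepage path, which is why the
        # ram_allocation_ratio overcommit stops applying to regular 4k pages
        if any(size > 4 for size in host.free_kb_by_pagesize):
            smallest = min(host.free_kb_by_pagesize)
            return inst.memory_kb <= host.free_kb_by_pagesize[smallest]
        # (#3) no hugepages on the host: plain check, overcommit allowed
        return inst.memory_kb <= host.total_memory_kb * ram_ratio

(The contentious branch is (#2): as soon as the host exposes any hugepages, guests without an explicit pagesize are checked with the hugepage-style accounting and lose the overcommit they previously had.)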
On closer inspection, the patch does not address the downstream bug at all: the bug expressly states that nova does not consider small pages when mem_page_size is not set, but since we do not execute this code for non-NUMA guests, we do not actually resolve the issue. Considering the customer cases associated with this, specifically the three the downstream bug claims to resolve: 1.) a scheduler race where two pinned instances get scheduled to the same set of resources (this can only be fixed with placement); 2.) mixing hugepage and non-hugepage guests resulted in OOM events; 3.) instances with a NUMA topology no longer respect the RAM allocation ratio. The third customer issue was directly caused by backporting this patch. The second issue would be resolved by using host aggregates to segregate hugepage hosts from non-NUMA hosts, and the first cannot be addressed without preemptively claiming CPUs/hugepages in the scheduler/placement. > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). > > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we did actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in a > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > issue now. > > > > As such, my question is this: should be look at fixing issue (1) and > > documenting issue (2), or should we revert the thing wholesale until we > > work on a solution that could e.g. let us track hugepages via placement > > and resolve issue (2) too. > > > > Thoughts? > > Stephen > > From fungi at yuggoth.org Tue Jan 8 14:00:26 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 8 Jan 2019 14:00:26 +0000 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: <20190108140026.p4462df5otnyizm2@yuggoth.org> On 2019-01-08 12:00:48 +0100 (+0100), Marcin Juszkiewicz wrote: [...] > Send patches for review [1] and then we can discuss about changing them. > Remember that we target Stein now. > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing [...] These days it's probably better to recommend https://docs.openstack.org/contributors/ since I expect we're about ready to retire that old wiki page. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jan 8 14:03:08 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 15:03:08 +0100 Subject: openstack queens octavia security group not found In-Reply-To: <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Hello, I solved firewall issues. Now controllers can access amphora instance on 9443 port but worker.log reports: Could not connect to instance. Retrying.: SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",) :-( Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann ha scritto: > Well, more logs are always better ;) > > Am 08.01.19 um 13:05 schrieb Ignazio Cassano: > > Yes, I can connect to amphora instance for a short time because it is > > removed automatically. > > For the amphora instance which log do you need? > > For octavia worker log is enough? > > > > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > ha scritto: > > > > Hi, > > > > are you able to connect to the amphora via ssh? > > > > Could you paste your octavia.log and the log of the amphora > somewhere? > > > > Fabian > > > > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: > > > PS > > > on the amphore instance there is nothng on port 9443 > > > > > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann > > > > > >> ha scritto: > > > > > > Hi, > > > > > > in which project should octavia start its amphora instances? > > > > > > In this project you should create a suitable sg. > > > > > > Fabian > > > > > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: > > > > Hello, I do not have an octavia project but only a service > > project. > > > > Octavia user belongs to admin and service project :-( > > > > Documentation does not seem clear about it > > > > > > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann > > > > > > > > > > > > >>> ha scritto: > > > > > > > > Hi, > > > > > > > > did you create the security group in the octavia > project? > > > > > > > > Can you see the sg if you login with the octavia > > credentials? > > > > > > > > > > > > Fabian > > > > > > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > > > > > Hello everyone, > > > > > I installed octavia with centos 7 queens. > > > > > When I crreate a load balancer the amphora instance > > is not > > > created > > > > > because nova conductor cannot find the security > group > > > specified in > > > > > octavia.conf. 
> > > > > I am sure the security group id is correct but the > nova > > > condictor > > > > reports: > > > > > > > > > > 2019-01-08 09:06:06.803 11872 ERROR > > nova.scheduler.utils > > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c > > > > > 62ed0b7f336b479ebda6f8587c4dd608 > > > 2a33760772ab4b0381a27735443ec4bd - > > > > > default default] [instance: > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] > > > > Error > > > > > from last host: tst2-kvm02 (node tst2-kvm02): > > [u'Traceback > > > (most > > > > recent > > > > > call last):\n', u' File > > > > > > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line > > > > 1828, > > > > > in _do_build_and_run_instance\n > filter_properties, > > > > request_spec)\n', > > > > > u' File > > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > > > > > line 2108, in _build_and_run_instance\n > > > > instance_uuid=instance.uuid, > > > > > reason=six.text_type(e))\n', > > u'RescheduledException: Build of > > > > instance > > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was > re-scheduled: > > > Security > > > > group > > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > > > > > > > > > Please, what is wrong ? > > > > > > > > > > Regards > > > > > Ignazio > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sahid.ferdjaoui at canonical.com Tue Jan 8 14:17:55 2019 From: sahid.ferdjaoui at canonical.com (Sahid Orentino Ferdjaoui) Date: Tue, 8 Jan 2019 15:17:55 +0100 Subject: [nova] Mempage fun In-Reply-To: <20190108115027.GA7825@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> Message-ID: <20190108141755.GA9289@canonical> On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > We've been looking at a patch that landed some months ago and have > > > > spotted some issues: > > > > > > > > https://review.openstack.org/#/c/532168 > > > > > > > > In summary, that patch is intended to make the memory check for > > > > instances memory pagesize aware. The logic it introduces looks > > > > something like this: > > > > > > > > If the instance requests a specific pagesize > > > > (#1) Check if each host cell can provide enough memory of the > > > > pagesize requested for each instance cell > > > > Otherwise > > > > If the host has hugepages > > > > (#2) Check if each host cell can provide enough memory of the > > > > smallest pagesize available on the host for each instance cell > > > > Otherwise > > > > (#3) Check if each host cell can provide enough memory for > > > > each instance cell, ignoring pagesizes > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > host, because the latter will now be aware of hugepages and won't > > > > consume them. However, there are a couple of issues with this: > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > running on hosts with different pagesizes. 
This is because we don't > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > reusing the same functions previously used for actual hugepage > > > > checks to check for regular 4k pages > > > > > > I think that we should not accept any overcommit. Only instances with > > > an InstanceNUMATopology associated pass to this part of check. Such > > > instances want to use features like guest NUMA topology so their > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > for performance reason and to avoid any cross memory latency. > > > > This issue with this is that we had previously designed everything *to* > > allow overcommit: > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > This code never worked Stephen, that instead of to please unit tests > related. I would not recommend to use it as a reference. > > > The only time this doesn't apply is if CPU pinning is also in action > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > CPU pinning implies a NUMA topology, much to Jay's consternation). As > > noted below, our previous advice was not to mix hugepage instances and > > non-hugepage instances, meaning hosts handling non-hugepage instances > > should not have hugepages (or should mark the memory consumed by them > > as reserved for host). We have in effect broken previous behaviour in > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > run through any of the code above, meaning they're still not > > > > pagesize aware > > > > > > That is an other issue. We report to the resource tracker all the > > > physical memory (small pages + hugepages allocated). The difficulty is > > > that we can't just change the virt driver to report only small > > > pages. Some instances wont be able to get scheduled. We should > > > basically change the resource tracker so it can take into account the > > > different kind of page memory. > > > > Agreed (likely via move tracking of this resource to placement, I > > assume). It's a longer term fix though. > > > > > But it's not really an issue since instances that use "NUMA features" > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > > have boundaries and break rules of NUMA instances. > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > perfectly fine to have NUMA without CPU pinning, though not the other > > way around. For example: > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > >From what I can tell, there are three reasons that an instance will > > have a NUMA topology: the user explicitly requested one, the user > > requested CPU pinning and got one implicitly, or the user requested a > > specific pagesize and, again, got one implicitly. We handle the latter > > two with the advice given below, but I don't think anyone has ever said > > we must separate instances that had a user-specified NUMA topology from > > those that had no NUMA topology. If we're going down this path, we need > > clear docs. Now I remember why we can't support it. When defining guest NUMA topology (hw:numa_node) the memory is mapped to the assigned host NUMA nodes meaning that the guest memory can't swap out. 
If a non-NUMA instance starts using memory from host NUMA nodes used by a guest with NUMA it can result that the guest with NUMA run out of memory and be killed. > > The implementation is pretty old and it was a first design from > scratch, all the situations have not been take into account or been > documented. If we want create specific behaviors we are going to add > more complexity on something which is already, and which is not > completely stable, as an example the patch you have mentioned which > has been merged last release. > > I agree documenting is probably where we should go; don't try to mix > instances with InstanceNUMATopology and without, Nova uses a different > way to compute their resources, like don't try to overcommit such > instances. > > We basically recommend to use aggregate for pinning, realtime, > hugepages, so it looks reasonable to add guest NUMA topology to that > list. > > > Stephen > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > > We can mitigate issue (2) by advising operators to split hosts into > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > > addition, we did actually called that out in the original spec: > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > issue now. > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > work on a solution that could e.g. let us track hugepages via placement > > > > and resolve issue (2) too. > > > > > > > > Thoughts? > > > > Stephen > > > > > > From ignaziocassano at gmail.com Tue Jan 8 14:17:57 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 15:17:57 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> <8ff51976-10df-c8fa-01ae-76b18991c1c9@gmail.com> <7ae6dd0d-f0e2-4932-0d11-c00d2f2a872a@gmail.com> Message-ID: Solved hanshake error commenting in amphora_agent section the following lines: #agent_server_ca = /etc/octavia/certs/ca_01.pem #agent_server_cert = /etc/octavia/certs/client_ca.pem Il giorno mar 8 gen 2019 alle ore 15:03 Ignazio Cassano < ignaziocassano at gmail.com> ha scritto: > Hello, > I solved firewall issues. > Now controllers can access amphora instance on 9443 port but worker.log > reports: > Could not connect to instance. 
Retrying.: SSLError: ("bad handshake: > SysCallError(-1, 'Unexpected EOF')",) > :-( > > > Il giorno mar 8 gen 2019 alle ore 13:06 Fabian Zimmermann < > dev.faz at gmail.com> ha scritto: > >> Well, more logs are always better ;) >> >> Am 08.01.19 um 13:05 schrieb Ignazio Cassano: >> > Yes, I can connect to amphora instance for a short time because it is >> > removed automatically. >> > For the amphora instance which log do you need? >> > For octavia worker log is enough? >> > >> > Il giorno Mar 8 Gen 2019 12:32 Fabian Zimmermann > > > ha scritto: >> > >> > Hi, >> > >> > are you able to connect to the amphora via ssh? >> > >> > Could you paste your octavia.log and the log of the amphora >> somewhere? >> > >> > Fabian >> > >> > Am 08.01.19 um 11:24 schrieb Ignazio Cassano: >> > > PS >> > > on the amphore instance there is nothng on port 9443 >> > > >> > > Il giorno mar 8 gen 2019 alle ore 10:44 Fabian Zimmermann >> > > >> > >> ha scritto: >> > > >> > > Hi, >> > > >> > > in which project should octavia start its amphora instances? >> > > >> > > In this project you should create a suitable sg. >> > > >> > > Fabian >> > > >> > > Am 08.01.19 um 10:39 schrieb Ignazio Cassano: >> > > > Hello, I do not have an octavia project but only a service >> > project. >> > > > Octavia user belongs to admin and service project :-( >> > > > Documentation does not seem clear about it >> > > > >> > > > Il giorno mar 8 gen 2019 alle ore 10:30 Fabian Zimmermann >> > > > >> > > >> > > >> > >>> ha scritto: >> > > > >> > > > Hi, >> > > > >> > > > did you create the security group in the octavia >> project? >> > > > >> > > > Can you see the sg if you login with the octavia >> > credentials? >> > > > >> > > > >> > > > Fabian >> > > > >> > > > Am 08.01.19 um 09:11 schrieb Ignazio Cassano: >> > > > > Hello everyone, >> > > > > I installed octavia with centos 7 queens. >> > > > > When I crreate a load balancer the amphora instance >> > is not >> > > created >> > > > > because nova conductor cannot find the security >> group >> > > specified in >> > > > > octavia.conf. >> > > > > I am sure the security group id is correct but the >> nova >> > > condictor >> > > > reports: >> > > > > >> > > > > 2019-01-08 09:06:06.803 11872 ERROR >> > nova.scheduler.utils >> > > > > [req-75df2561-4bc3-4bde-86d0-40469058250c >> > > > > 62ed0b7f336b479ebda6f8587c4dd608 >> > > 2a33760772ab4b0381a27735443ec4bd - >> > > > > default default] [instance: >> > > 83f2fd75-8069-47a5-9572-8949ec9b5cee] >> > > > Error >> > > > > from last host: tst2-kvm02 (node tst2-kvm02): >> > [u'Traceback >> > > (most >> > > > recent >> > > > > call last):\n', u' File >> > > > > >> > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", >> line >> > > > 1828, >> > > > > in _do_build_and_run_instance\n >> filter_properties, >> > > > request_spec)\n', >> > > > > u' File >> > > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", >> > > > > line 2108, in _build_and_run_instance\n >> > > > instance_uuid=instance.uuid, >> > > > > reason=six.text_type(e))\n', >> > u'RescheduledException: Build of >> > > > instance >> > > > > 83f2fd75-8069-47a5-9572-8949ec9b5cee was >> re-scheduled: >> > > Security >> > > > group >> > > > > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] >> > > > > >> > > > > Please, what is wrong ? >> > > > > >> > > > > Regards >> > > > > Ignazio >> > > > >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alfredo.deluca at gmail.com Tue Jan 8 15:55:27 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 8 Jan 2019 16:55:27 +0100 Subject: [ansible-openstack] Magnum/Heat k8s failed Message-ID: Hi all. I have installed an ansible-openstack AIO. Now I'd like to create a k8s cluster, but heat always gives me the error below: 2019-01-08 15:35:03Z [alf-k8s-nerea4mr3b2c.kube_masters]: *CREATE_IN_PROGRESS state changed* 2019-01-08 15:35:24Z [alf-k8s-nerea4mr3b2c.kube_masters]: *CREATE_FAILED AuthorizationFailure: resources.kube_masters.resources[0].resources.master_wait_handle: Authorization failed.* 2019-01-08 15:35:24Z [alf-k8s-nerea4mr3b2c]: CREATE_FAILED Resource CREATE failed: AuthorizationFailure: resources.kube_masters.resources[0].resources.master_wait_handle: Authorization failed. Any idea what to check? Cheers -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue Jan 8 16:31:16 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 8 Jan 2019 08:31:16 -0800 Subject: [Kolla] Queens for debian images In-Reply-To: <20190108140026.p4462df5otnyizm2@yuggoth.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> Message-ID: Another useful link, which Zane put together a while back but which is more up to date/complete than the wiki, is the Reviewing the OpenStack Way guide[1]. -Kendall (diablo_rojo) [1] https://docs.openstack.org/project-team-guide/review-the-openstack-way.html On Tue, Jan 8, 2019 at 6:01 AM Jeremy Stanley wrote: > On 2019-01-08 12:00:48 +0100 (+0100), Marcin Juszkiewicz wrote: > [...] > > Send patches for review [1] and then we can discuss about changing them. > > Remember that we target Stein now. > > > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > [...] > > These days it's probably better to recommend > https://docs.openstack.org/contributors/ since I expect we're about > ready to retire that old wiki page. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sokoban at foxmail.com Tue Jan 8 06:47:38 2019 From: sokoban at foxmail.com (=?gb18030?B?eG11Zml2ZUBxcS5jb20=?=) Date: Tue, 8 Jan 2019 14:47:38 +0800 Subject: Ironic ibmc driver for Huawei server Message-ID: Hi Julia, According to the comment of the story: 1. The spec for the Huawei ibmc driver has been posted here: https://storyboard.openstack.org/#!/story/2004635 , and is waiting for review. 2. About the third-party CI part, we provide mocked unit tests for our driver's code. We are not sure what third-party CI is meant to cover in this case. What else should we do? Thanks Qianbiao.NG -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Jan 8 09:30:44 2019 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 8 Jan 2019 10:30:44 +0100 Subject: openstack queens octavia security group not found In-Reply-To: References: Message-ID: <2f3881c8-b085-97a4-cc4c-6085203ca53e@gmail.com> Hi, did you create the security group in the octavia project? Can you see the sg if you login with the octavia credentials? Fabian Am 08.01.19 um 09:11 schrieb Ignazio Cassano: > Hello everyone, > I installed octavia with centos 7 queens. > When I crreate a load balancer the amphora instance is not created > because nova conductor cannot find the security group specified in > octavia.conf.
> I am sure the security group id is correct but the nova condictor reports: > > 2019-01-08 09:06:06.803 11872 ERROR nova.scheduler.utils > [req-75df2561-4bc3-4bde-86d0-40469058250c > 62ed0b7f336b479ebda6f8587c4dd608 2a33760772ab4b0381a27735443ec4bd - > default default] [instance: 83f2fd75-8069-47a5-9572-8949ec9b5cee] Error > from last host: tst2-kvm02 (node tst2-kvm02): [u'Traceback (most recent > call last):\n', u'  File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1828, > in _do_build_and_run_instance\n    filter_properties, request_spec)\n', > u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", > line 2108, in _build_and_run_instance\n    instance_uuid=instance.uuid, > reason=six.text_type(e))\n', u'RescheduledException: Build of instance > 83f2fd75-8069-47a5-9572-8949ec9b5cee was re-scheduled: Security group > fdd1ab71-bcd2-4b65-b5f2-f4c110b65602 not found.\n'] > > Please, what is wrong ? > > Regards > Ignazio From smooney at redhat.com Tue Jan 8 16:48:36 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 08 Jan 2019 16:48:36 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108141755.GA9289@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> <20190108141755.GA9289@canonical> Message-ID: <913e061bf714036ff26bfc268822054ec9878ede.camel@redhat.com> On Tue, 2019-01-08 at 15:17 +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > > We've been looking at a patch that landed some months ago and have > > > > > spotted some issues: > > > > > > > > > > https://review.openstack.org/#/c/532168 > > > > > > > > > > In summary, that patch is intended to make the memory check for > > > > > instances memory pagesize aware. The logic it introduces looks > > > > > something like this: > > > > > > > > > > If the instance requests a specific pagesize > > > > > (#1) Check if each host cell can provide enough memory of the > > > > > pagesize requested for each instance cell > > > > > Otherwise > > > > > If the host has hugepages > > > > > (#2) Check if each host cell can provide enough memory of the > > > > > smallest pagesize available on the host for each instance cell > > > > > Otherwise > > > > > (#3) Check if each host cell can provide enough memory for > > > > > each instance cell, ignoring pagesizes > > > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > > host, because the latter will now be aware of hugepages and won't > > > > > consume them. However, there are a couple of issues with this: > > > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > > running on hosts with different pagesizes. This is because we don't > > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > > reusing the same functions previously used for actual hugepage > > > > > checks to check for regular 4k pages > > > > > > > > I think that we should not accept any overcommit. 
Only instances with > > > > an InstanceNUMATopology associated pass to this part of check. Such > > > > instances want to use features like guest NUMA topology so their > > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > > for performance reason and to avoid any cross memory latency. > > > > > > This issue with this is that we had previously designed everything *to* > > > allow overcommit: > > > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > > > This code never worked Stephen, that instead of to please unit tests > > related. I would not recommend to use it as a reference. > > > > > The only time this doesn't apply is if CPU pinning is also in action > > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > > CPU pinning implies a NUMA topology, much to Jay's consternation). As > > > noted below, our previous advice was not to mix hugepage instances and > > > non-hugepage instances, meaning hosts handling non-hugepage instances > > > should not have hugepages (or should mark the memory consumed by them > > > as reserved for host). We have in effect broken previous behaviour in > > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > > run through any of the code above, meaning they're still not > > > > > pagesize aware > > > > > > > > That is an other issue. We report to the resource tracker all the > > > > physical memory (small pages + hugepages allocated). The difficulty is > > > > that we can't just change the virt driver to report only small > > > > pages. Some instances wont be able to get scheduled. We should > > > > basically change the resource tracker so it can take into account the > > > > different kind of page memory. > > > > > > Agreed (likely via move tracking of this resource to placement, I > > > assume). It's a longer term fix though. > > > > > > > But it's not really an issue since instances that use "NUMA features" > > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > > > have boundaries and break rules of NUMA instances. > > > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > > perfectly fine to have NUMA without CPU pinning, though not the other > > > way around. For example: > > > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > > > > From what I can tell, there are three reasons that an instance will > > > > > > have a NUMA topology: the user explicitly requested one, the user > > > requested CPU pinning and got one implicitly, or the user requested a > > > specific pagesize and, again, got one implicitly. We handle the latter > > > two with the advice given below, but I don't think anyone has ever said > > > we must separate instances that had a user-specified NUMA topology from > > > those that had no NUMA topology. If we're going down this path, we need > > > clear docs. > > Now I remember why we can't support it. When defining guest NUMA > topology (hw:numa_node) the memory is mapped to the assigned host NUMA > nodes meaning that the guest memory can't swap out. the guest memory should still be able to swap out. we do not memlock the pages when we set hw:numa_nodes we only do that for realtime instances. 
It's done implicitly for hugepages, but if you taskset/memtune cores/RAM to a host NUMA node, that does not prevent the kernel from paging that memory out to swap space. locked defaults to false in the memory backing https://github.com/openstack/nova/blob/88951ca98e1b286b58aa1ad94f9af40b8260c01f/nova/virt/libvirt/config.py#L2053 and we only set it to true here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4745-L4749 if we take the wantsrealtime branch. > If a non-NUMA > instance starts using memory from host NUMA nodes used by a guest with > NUMA it can result that the guest with NUMA run out of memory and be > killed. That is unrelated to this. It happens because the OOM killer works per NUMA node, the mempressure value for VMs tends to be higher than for other processes, and the kernel will prefer to kill them over other options. An instance live migration, or any other process that adds extra memory pressure, can trigger the same effect. Part of the issue is that the host reserved memory config option is not per NUMA node, but the OOM killer in the kernel runs per NUMA node. > > > > > The implementation is pretty old and it was a first design from > > scratch, all the situations have not been take into account or been > > documented. If we want create specific behaviors we are going to add > > more complexity on something which is already, and which is not > > completely stable, as an example the patch you have mentioned which > > has been merged last release. > > > > I agree documenting is probably where we should go; don't try to mix > > instances with InstanceNUMATopology and without, Nova uses a different > > way to compute their resources, like don't try to overcommit such > > instances. > > > > We basically recommend to use aggregate for pinning, realtime, > > hugepages, so it looks reasonable to add guest NUMA topology to that > > list. > > > > > Stephen > > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > > > We can mitigate issue (2) by advising operators to split hosts into > > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > > > addition, we did actually called that out in the original spec: > > > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > > issue now. > > > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > > work on a solution that could e.g. let us track hugepages via placement > > > > > and resolve issue (2) too. > > > > > > > > > > Thoughts?
> > > > > Stephen > > > > > > > From johnsomor at gmail.com Tue Jan 8 17:00:24 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 8 Jan 2019 09:00:24 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Yes, we do not allow eventlet in Octavia. It leads to a number of conflicts and problems with the overall code base, including the use of taskflow. Is there a reason we need to use the os-ken BGP code as opposed to the exabgp option that was being used before? I remember we looked at those two options back when the other team was developing the l3 option, but I don't remember all of the details of why exabgp was selected. Michael On Mon, Jan 7, 2019 at 1:18 AM Jeff Yang wrote: > > Hi Michael, > I found that you forbid import eventlet in octavia.[1] > I guess the eventlet has a conflict with gunicorn, is that? > But, I need to import eventlet for os-ken that used to implement bgp speaker.[2] > I am studying eventlet and gunicorn deeply. Have you some suggestions to resolve this conflict? > > [1] https://review.openstack.org/#/c/462334/ > [2] https://review.openstack.org/#/c/628915/ > > Michael Johnson 于2019年1月5日周六 上午8:02写道: >> >> Hi Jeff, >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. >> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. From johnsomor at gmail.com Tue Jan 8 17:05:52 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 8 Jan 2019 09:05:52 -0800 Subject: Queens octavia error In-Reply-To: References: Message-ID: Hi Ignazio, Please use the [octavia] tag in the subject line as this will alert the octavia team to your message. As the message says, this is a nova failure: {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} I recommend you check the nova logs to identify the root issue in nova. If this is related to the security group issue you mentioned on IRC, make sure you create the security group for the Octavia controllers under the account you are running the controllers under. This is the account you specified in your octavia.conf file under the "[service_auth]" section. It is likely you are creating the security group under a different project than your controllers are configured to use. 
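For example, something along these lines will create the group in the correct project (the openrc file name is just a placeholder for credentials matching your "[service_auth]" settings, and the group name is only an example):

# load the credentials the Octavia controllers actually use
$ source octavia-openrc.sh

# confirm which project those credentials map to
$ openstack token issue -c project_id

# create the security group in that project and put its ID in octavia.conf
$ openstack security group create lb-mgmt-sec-grp
$ openstack security group show lb-mgmt-sec-grp -c id -c project_id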
Michael On Mon, Jan 7, 2019 at 7:25 AM Ignazio Cassano wrote: > > Hello All, > I installed octavia on queens with centos 7, but when I create a load balance with the command > openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I got some errors in octavia worker.log: > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server failures[0].reraise() > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in reraise > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server six.reraise(*self._exc_info) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result = task.execute(**arguments) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 192, in execute > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server raise exceptions.ComputeBuildException(fault=fault) > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server ComputeBuildException: Failed to build compute instance due to: {u'message': u'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', u'created': u'2019-01-07T15:15:59Z'} > > Anyone could help me, please ? > > Regards > Ignazio From juliaashleykreger at gmail.com Tue Jan 8 17:10:22 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Jan 2019 09:10:22 -0800 Subject: [ironic] Mid-cycle call times In-Reply-To: References: Message-ID: Greetings everyone! It seems we have coalesced around January 21st and 22nd. I have posted a poll[1] with time windows in two hour blocks so we can reach a consensus on when we should meet. Please vote for your available time windows so we can find the best overlap for everyone. Additionally, if there are any topics or items that you feel would be a good use of the time, please feel free to add them to the planning etherpad[2]. Thanks everyone! -Julia [1]: https://doodle.com/poll/i2awf3zvztncixpg [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle On Wed, Jan 2, 2019 at 1:44 PM Julia Kreger wrote: > > Greetings everyone, > > During our ironic team meeting in December, we discussed if we should go ahead and have a "mid-cycle" call in order to try sync up on where we are at during this cycle, and the next steps for us to take as a team. > > With that said, I have created a doodle poll[1] in an attempt to identify some days that might work. Largely the days available on the poll are geared around my availability this month. > > Ideally, I would like to find three days where we can schedule some 2-4 hour blocks of time. I've gone ahead and started an etherpad[2] to get us started on brainstorming. Once we have some ideas, we will be able to form a schedule and attempt to identify the amount of time required. 
> > -Julia > > [1]: https://doodle.com/poll/uqwywaxuxsiu7zde > [2]: https://etherpad.openstack.org/p/ironic-stein-midcycle From flux.adam at gmail.com Tue Jan 8 17:13:09 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Tue, 8 Jan 2019 09:13:09 -0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Jeff, Eventlet cannot be included in the main Octavia controller side code or added to our main requirements.txt, but that technically doesn't apply to the pre-built amphora images. They use a different requirements system and since it won't actually be used by any of the real Octavia services (os-ken is just some other software that will run independently inside the amphora, right?) it should actually be fine. In fact, I would assume if it is a requirement of os-ken, you wouldn't have to explicitly list it as a requirement anywhere, as it should be pulled into the system by installing that package. At least, that is my early morning take on this, as I haven't really looked too closely at os-ken yet. --Adam Harwell (rm_work) On Mon, Jan 7, 2019, 01:25 Jeff Yang wrote: > Hi Michael, > I found that you forbid import eventlet in octavia.[1] > I guess the eventlet has a conflict with gunicorn, is that? > But, I need to import eventlet for os-ken that used to implement bgp > speaker.[2] > I am studying eventlet and gunicorn deeply. Have you some suggestions > to resolve this conflict? > > [1] https://review.openstack.org/#/c/462334/ > [2] https://review.openstack.org/#/c/628915/ > > Michael Johnson 于2019年1月5日周六 上午8:02写道: > >> Hi Jeff, > > >> >> Unfortunately the team that was working on that code had stopped due >> to internal reasons. >> >> I hope to make the reference active/active blueprint a priority again >> during the Train cycle. Following that I may be able to look at the L3 >> distributor option, but I cannot commit to that at this time. >> >> If you are interesting in picking up that work, please let me know and >> we can sync up on that status of the WIP patches, etc. >> >> Michael >> >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang >> wrote: >> > >> > Dear Octavia team: >> > The email aims to ask the development progress about >> l3-active-active blueprint. I >> > noticed that the work in this area has been stagnant for eight months. >> > https://review.openstack.org/#/q/l3-active-active >> > I want to know the community's next work plan in this regard. >> > Thanks. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ed at leafe.com Tue Jan 8 17:26:11 2019 From: ed at leafe.com (Ed Leafe) Date: Tue, 8 Jan 2019 11:26:11 -0600 Subject: [Cyborg] IRC meeting In-Reply-To: References: Message-ID: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> On Jan 8, 2019, at 12:31 AM, Li Liu wrote: > > The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) I believe you meant *Wednesday* at 0300 UTC, correct? -- Ed Leafe From brenski at mirantis.com Tue Jan 8 18:21:32 2019 From: brenski at mirantis.com (Boris Renski) Date: Tue, 8 Jan 2019 10:21:32 -0800 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift Message-ID: Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). 
Brief summary of updates: - We have new look and feel at stackalytics.com - We did away with DriverLog and Member Directory , which were not very actively used or maintained. Those are still available via direct links, but not in the menu on the top - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible via top menu. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Tue Jan 8 18:22:24 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:22:24 +0000 Subject: [dev] 'sqlalchemy.exc.NoSuchTableError: migration_tmp' errors due to SQLite 3.26.0 Message-ID: <088ba53338bf68edbd3742c7e145ccf7605df615.camel@redhat.com> Just to note that I'm currently unable to run nova unit tests locally on Fedora 29 without downgrading my sqlite package. The error I'm seeing is: sqlalchemy.exc.NoSuchTableError: migration_tmp The root cause appears to be a change in 3.26.0 which is breaking sqlalchemy-migrate, as noted here: https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1807262 Corey Bryant has proposed the patch for this patch, linked below, which should resolve this. Any chance an sqlalchemy-migrate core could look at this before I reach the point of not being able to downgrade my sqlite package and run nova unit tests? :) https://review.openstack.org/#/c/623564/5 Stephen From juliaashleykreger at gmail.com Tue Jan 8 18:26:00 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 8 Jan 2019 10:26:00 -0800 Subject: Ironic ibmc driver for Huawei server In-Reply-To: References: Message-ID: Greetings Qianbiao.NG, Welcome to Ironic! The purpose and requirement of Third Party CI is to test drivers are in working order with the current state of the code in Ironic and help prevent the community from accidentally breaking an in-tree vendor driver. Vendors do this by providing one or more physical systems in a pool of hardware that is managed by a Zuul v3 or Jenkins installation which installs ironic (typically in a virtual machine), and configures it to perform a deployment upon the physical bare metal node. Upon failure or successful completion of the test, the results are posted back to OpenStack Gerrit. Ultimately this helps provide the community and the vendor with a level of assurance in what is released by the ironic community. The cinder project has a similar policy and I'll email you directly with the contacts at Huawei that work with the Cinder community, as they would be familiar with many of the aspects of operating third party CI. You can find additional information here on the requirement and the reasoning behind it: https://specs.openstack.org/openstack/ironic-specs/specs/approved/third-party-ci.html We may also be able to put you in touch with some vendors that have recently worked on implementing third-party CI. I'm presently inquiring with others if that will be possible. If you are able to join Internet Relay Chat, our IRC channel (#openstack-ironic) has several individual who have experience setting up and maintaining third-party CI for ironic. Thanks, -Julia On Tue, Jan 8, 2019 at 8:54 AM xmufive at qq.com wrote: > > Hi julia, > > According to the comment of story, > 1. The spec for huawei ibmc drvier has been post here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. > 2. 
About the third-party CI part, we provide mocked unittests for our driver's code. Not sure what third-party CI works for in this case. What else we should do? > > Thanks > Qianbiao.NG From sfinucan at redhat.com Tue Jan 8 18:29:03 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:29:03 +0000 Subject: [nova] Mempage fun In-Reply-To: <20190108141755.GA9289@canonical> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <20190108100631.GA4852@canonical> <8512066637a045690c037deeecff20845efdadc9.camel@redhat.com> <20190108115027.GA7825@canonical> <20190108141755.GA9289@canonical> Message-ID: On Tue, 2019-01-08 at 15:17 +0100, Sahid Orentino Ferdjaoui wrote: > On Tue, Jan 08, 2019 at 12:50:27PM +0100, Sahid Orentino Ferdjaoui wrote: > > On Tue, Jan 08, 2019 at 10:47:47AM +0000, Stephen Finucane wrote: > > > On Tue, 2019-01-08 at 11:06 +0100, Sahid Orentino Ferdjaoui wrote: > > > > On Mon, Jan 07, 2019 at 05:32:32PM +0000, Stephen Finucane wrote: > > > > > We've been looking at a patch that landed some months ago and have > > > > > spotted some issues: > > > > > > > > > > https://review.openstack.org/#/c/532168 > > > > > > > > > > In summary, that patch is intended to make the memory check for > > > > > instances memory pagesize aware. The logic it introduces looks > > > > > something like this: > > > > > > > > > > If the instance requests a specific pagesize > > > > > (#1) Check if each host cell can provide enough memory of the > > > > > pagesize requested for each instance cell > > > > > Otherwise > > > > > If the host has hugepages > > > > > (#2) Check if each host cell can provide enough memory of the > > > > > smallest pagesize available on the host for each instance cell > > > > > Otherwise > > > > > (#3) Check if each host cell can provide enough memory for > > > > > each instance cell, ignoring pagesizes > > > > > > > > > > This also has the side-effect of allowing instances with hugepages and > > > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > > > host, because the latter will now be aware of hugepages and won't > > > > > consume them. However, there are a couple of issues with this: > > > > > > > > > > 1. It breaks overcommit for instances without pagesize request > > > > > running on hosts with different pagesizes. This is because we don't > > > > > allow overcommit for hugepages, but case (#2) above means we are now > > > > > reusing the same functions previously used for actual hugepage > > > > > checks to check for regular 4k pages > > > > > > > > I think that we should not accept any overcommit. Only instances with > > > > an InstanceNUMATopology associated pass to this part of check. Such > > > > instances want to use features like guest NUMA topology so their > > > > memory mapped on host NUMA nodes or CPU pinning. Both cases are used > > > > for performance reason and to avoid any cross memory latency. > > > > > > This issue with this is that we had previously designed everything *to* > > > allow overcommit: > > > > > > https://github.com/openstack/nova/blob/18.0.0/nova/virt/hardware.py#L1047-L1065 > > > > This code never worked Stephen, that instead of to please unit tests > > related. I would not recommend to use it as a reference. > > > > > The only time this doesn't apply is if CPU pinning is also in action > > > (remembering that CPU pinning and NUMA topologies are tightly bound and > > > CPU pinning implies a NUMA topology, much to Jay's consternation). 
As > > > noted below, our previous advice was not to mix hugepage instances and > > > non-hugepage instances, meaning hosts handling non-hugepage instances > > > should not have hugepages (or should mark the memory consumed by them > > > as reserved for host). We have in effect broken previous behaviour in > > > the name of solving a bug that didn't necessarily have to be fixed yet. > > > > > > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > > > run through any of the code above, meaning they're still not > > > > > pagesize aware > > > > > > > > That is an other issue. We report to the resource tracker all the > > > > physical memory (small pages + hugepages allocated). The difficulty is > > > > that we can't just change the virt driver to report only small > > > > pages. Some instances wont be able to get scheduled. We should > > > > basically change the resource tracker so it can take into account the > > > > different kind of page memory. > > > > > > Agreed (likely via move tracking of this resource to placement, I > > > assume). It's a longer term fix though. > > > > > > > But it's not really an issue since instances that use "NUMA features" > > > > (in Nova world) should be isolated to an aggregate and not be mixed > > > > with no-NUMA instances. The reason is simple no-NUMA instances do not > > > > have boundaries and break rules of NUMA instances. > > > > > > Again, we have to be careful not to mix up NUMA and CPU pinning. It's > > > perfectly fine to have NUMA without CPU pinning, though not the other > > > way around. For example: > > > > > > $ openstack flavor set --property hw:numa_nodes=2 FLAVOR > > > > > > > From what I can tell, there are three reasons that an instance will > > > have a NUMA topology: the user explicitly requested one, the user > > > requested CPU pinning and got one implicitly, or the user requested a > > > specific pagesize and, again, got one implicitly. We handle the latter > > > two with the advice given below, but I don't think anyone has ever said > > > we must separate instances that had a user-specified NUMA topology from > > > those that had no NUMA topology. If we're going down this path, we need > > > clear docs. > > Now I remember why we can't support it. When defining guest NUMA > topology (hw:numa_node) the memory is mapped to the assigned host NUMA > nodes meaning that the guest memory can't swap out. If a non-NUMA > instance starts using memory from host NUMA nodes used by a guest with > NUMA it can result that the guest with NUMA run out of memory and be > killed. Based on my minimal test, it seems to work just fine? https://bugs.launchpad.net/nova/+bug/1810977 The instances boot with the patch reverted. Is there something I've missed? > > The implementation is pretty old and it was a first design from > > scratch, all the situations have not been take into account or been > > documented. If we want create specific behaviors we are going to add > > more complexity on something which is already, and which is not > > completely stable, as an example the patch you have mentioned which > > has been merged last release. > > > > I agree documenting is probably where we should go; don't try to mix > > instances with InstanceNUMATopology and without, Nova uses a different > > way to compute their resources, like don't try to overcommit such > > instances. 
> > > > We basically recommend to use aggregate for pinning, realtime, > > hugepages, so it looks reasonable to add guest NUMA topology to that > > list. > > > > > Stephen > > > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > > > We can mitigate issue (2) by advising operators to split hosts into > > > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > > > addition, we did actually called that out in the original spec: > > > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > > > question why the patch is necessary/acceptable for NUMA instances. For > > > > > what it's worth, a longer fix would be to start tracking hugepages in a > > > > > non-NUMA aware way too but that's a lot more work and doesn't fix the > > > > > issue now. > > > > > > > > > > As such, my question is this: should be look at fixing issue (1) and > > > > > documenting issue (2), or should we revert the thing wholesale until we > > > > > work on a solution that could e.g. let us track hugepages via placement > > > > > and resolve issue (2) too. > > > > > > > > > > Thoughts? > > > > > Stephen > > > > > From sfinucan at redhat.com Tue Jan 8 18:38:49 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 08 Jan 2019 18:38:49 +0000 Subject: [nova] Mempage fun In-Reply-To: <1546937673.17763.2@smtp.office365.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> Message-ID: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> On Tue, 2019-01-08 at 08:54 +0000, Balázs Gibizer wrote: > On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > > We've been looking at a patch that landed some months ago and have > > spotted some issues: > > > > https://review.openstack.org/#/c/532168 > > > > In summary, that patch is intended to make the memory check for > > instances memory pagesize aware. The logic it introduces looks > > something like this: > > > > If the instance requests a specific pagesize > > (#1) Check if each host cell can provide enough memory of the > > pagesize requested for each instance cell > > Otherwise > > If the host has hugepages > > (#2) Check if each host cell can provide enough memory of the > > smallest pagesize available on the host for each instance cell > > Otherwise > > (#3) Check if each host cell can provide enough memory for > > each instance cell, ignoring pagesizes > > > > This also has the side-effect of allowing instances with hugepages and > > instances with a NUMA topology but no hugepages to co-exist on the same > > host, because the latter will now be aware of hugepages and won't > > consume them. However, there are a couple of issues with this: > > > > 1. It breaks overcommit for instances without pagesize request > > running on hosts with different pagesizes. This is because we don't > > allow overcommit for hugepages, but case (#2) above means we are now > > reusing the same functions previously used for actual hugepage > > checks to check for regular 4k pages > > 2. 
It doesn't fix the issue when non-NUMA instances exist on the same > > host as NUMA instances with hugepages. The non-NUMA instances don't > > run through any of the code above, meaning they're still not > > pagesize aware > > > > We could probably fix issue (1) by modifying those hugepage functions > > we're using to allow overcommit via a flag that we pass for case (#2). > > We can mitigate issue (2) by advising operators to split hosts into > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > I think this may be the case in some docs (sean-k-mooney said Intel > > used to do this. I don't know about Red Hat's docs or upstream). In > > addition, we did actually called that out in the original spec: > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > However, if we're doing that for non-NUMA instances, one would have to > > question why the patch is necessary/acceptable for NUMA instances. For > > what it's worth, a longer fix would be to start tracking hugepages in > > a non-NUMA aware way too but that's a lot more work and doesn't fix the > > issue now. > > > > As such, my question is this: should be look at fixing issue (1) and > > documenting issue (2), or should we revert the thing wholesale until > > we work on a solution that could e.g. let us track hugepages via > > placement and resolve issue (2) too. > > If you feel that fixing (1) is pretty simple then I suggest to do that > and document the limitation of (2) while we think about a proper > solution. > > gibi I have (1) fixed here: https://review.openstack.org/#/c/629281/ That said, I'm not sure if it's the best thing to do. From what I'm hearing, it seems the advice we should be giving is to not mix instances with/without NUMA topologies, with/without hugepages and with/without CPU pinning. We've only documented the latter, as discussed on this related bug by cfriesen: https://bugs.launchpad.net/nova/+bug/1792985 Given that we should be advising folks not to mix these (something I wasn't aware of until now), what does the original patch actually give us? If you're not mixing instances with/without hugepages, then the only use case that would fix is booting an instance with a NUMA topology but no hugepages on a host that had hugepages (because the instance would be limited to CPUs and memory from one NUMA nodes, but it's conceivable all available memory could be on another NUMA node). That seems like a very esoteric use case that might be better solved by perhaps making the reserved memory configuration option optionally NUMA specific. This would allow us to mark this hugepage memory, which is clearly not intended for consumption by nova (remember: this host only handles non-hugepage instances), as reserved on a per-node basis. I'm not sure how we would map this to placement, though I'm sure it could be figured out. 
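In the meantime, the aggregate-based split we keep referring to is something operators can already do with the AggregateInstanceExtraSpecsFilter enabled; a rough sketch, with aggregate names and the metadata key invented purely for illustration:

# hosts reserved for hugepage/NUMA guests
$ openstack aggregate create --property hugepages=true hugepage-hosts
$ openstack aggregate add host hugepage-hosts compute0

# hosts for everything else
$ openstack aggregate create --property hugepages=false shared-hosts
$ openstack aggregate add host shared-hosts compute1

# steer flavors to the matching aggregate via extra specs
$ openstack flavor set \
    --property aggregate_instance_extra_specs:hugepages=true \
    --property hw:mem_page_size=large hugepage-flavor
$ openstack flavor set \
    --property aggregate_instance_extra_specs:hugepages=false small-page-flavor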
jaypipes is going to have so much fun mapping all this in placement :D Stephen From openstack at nemebean.com Tue Jan 8 19:04:58 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 13:04:58 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> Message-ID: <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> Further update: I dusted off my gdb skills and attached it to the privsep process to try to get more details about exactly what is crashing. It looks like the segfault happens on this line: https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 which is h->cb = cb; h being the conntrack handle and cb being the callback function. This makes me think the problem isn't the callback itself (even if we assigned a bogus pointer, which we didn't, it shouldn't cause a segfault unless you try to dereference it) but in the handle we pass in. Trying to look at h->cb results in: (gdb) print h->cb Cannot access memory at address 0x800f228 Interestingly, h itself is fine: (gdb) print h $3 = (struct nfct_handle *) 0x800f1e0 It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. -Ben On 1/7/19 12:11 PM, Ben Nemec wrote: > Renamed the thread to be more descriptive. > > Just to update the list on this, it looks like the problem is a segfault > when the netlink_lib module makes a C call. Digging into that code a > bit, it appears there is a callback being used[1]. I've seen some > comments that when you use a callback with a Python thread, the thread > needs to be registered somehow, but this is all uncharted territory for > me. Suggestions gratefully accepted. :-) > > 1: > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 > > > On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >> Hi, >> >> I just found that functional tests in Neutron are failing since today >> or maybe yesterday. See [1] >> I was able to reproduce it locally and it looks that it happens with >> oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >> >> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> From lbragstad at gmail.com Tue Jan 8 19:18:38 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 8 Jan 2019 13:18:38 -0600 Subject: [dev][keystone][nova][oslo] unified limits + oslo.limit interface questions Message-ID: Hi all, Before the holidays there was a bunch of discussion around unified limits and getting that integrated into nova. One of the last hurdles is smoothing out the interface between nova and the oslo.limit library, which John and Jay were helping out with a bunch. There are a couple of WIP patches proposed that attempt to work through this [0][1][2]. Now that people are starting to recover from the holidays, I wanted to start a thread on what remains for this work. Specifically, what can we do to air out the remaining concerns so that we can release a useable version of oslo.limit for services to consume. Thoughts? 
[0] https://review.openstack.org/#/c/615180/ John's WIP'd integration patch [1] https://review.openstack.org/#/c/602201/ nova specification [2] https://review.openstack.org/#/c/596520/21 XiYuan's patch to sort out the interface from oslo.limit -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Jan 8 19:23:12 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 8 Jan 2019 20:23:12 +0100 Subject: Queens octavia error In-Reply-To: References: Message-ID: Hello Michael, thanks for your suggestion. I solved my issue. There was some wrong configuration in octavia.conf file. Security group must belong to the octavia project. Regards Ignazio Il giorno Mar 8 Gen 2019 18:06 Michael Johnson ha scritto: > Hi Ignazio, > > Please use the [octavia] tag in the subject line as this will alert > the octavia team to your message. > > As the message says, this is a nova failure: > > {u'message': u'Exceeded maximum number of retries. Exhausted all hosts > available for retrying build failures for instance > 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' > File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", > line 581, in build_instances\n raise > exception.MaxRetriesExceeded(reason=msg)\n', u'created': > u'2019-01-07T15:15:59Z'} > > I recommend you check the nova logs to identify the root issue in nova. > > If this is related to the security group issue you mentioned on IRC, > make sure you create the security group for the Octavia controllers > under the account you are running the controllers under. This is the > account you specified in your octavia.conf file under the > "[service_auth]" section. It is likely you are creating the security > group under a different project than your controllers are configured > to use. > > Michael > > On Mon, Jan 7, 2019 at 7:25 AM Ignazio Cassano > wrote: > > > > Hello All, > > I installed octavia on queens with centos 7, but when I create a load > balance with the command > > openstack loadbalancer create --name lb1 --vip-subnet-id admin-subnet I > got some errors in octavia worker.log: > > > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > failures[0].reraise() > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/taskflow/types/failure.py", line 343, in > reraise > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > six.reraise(*self._exc_info) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/taskflow/engines/action_engine/executor.py", > line 53, in _execute_task > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server result > = task.execute(**arguments) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/octavia/controller/worker/tasks/compute_tasks.py", > line 192, in execute > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server raise > exceptions.ComputeBuildException(fault=fault) > > 2019-01-07 16:16:05.050 85077 ERROR oslo_messaging.rpc.server > ComputeBuildException: Failed to build compute instance due to: > {u'message': u'Exceeded maximum number of retries. 
Exhausted all hosts > available for retrying build failures for instance > 5abc100b-5dc8-43f5-9e1c-e6afea0242d9.', u'code': 500, u'details': u' File > "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 581, in > build_instances\n raise exception.MaxRetriesExceeded(reason=msg)\n', > u'created': u'2019-01-07T15:15:59Z'} > > > > Anyone could help me, please ? > > > > Regards > > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Jan 8 19:26:44 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 8 Jan 2019 13:26:44 -0600 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: <20190108192643.GA25045@sm-workstation> On Tue, Jan 08, 2019 at 10:21:32AM -0800, Boris Renski wrote: > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - > > We have new look and feel at stackalytics.com > - > > We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the menu on the top > - > > BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible via top menu. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback. > > -Boris Really looks nice - thanks Boris! Sean From stig.openstack at telfer.org Tue Jan 8 19:44:21 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 8 Jan 2019 19:44:21 +0000 Subject: [scientific-sig] IRC meeting 2100UTC: Lustre, conferences, CFPs Message-ID: Hi All - We have a Scientific SIG meeting later today at 2100 UTC (about an hour’s time). Everyone is welcome. Today I’d like to restart the efforts for better integration of Lustre with OpenStack, and to canvas for people with use cases for this. Plus start the year with the conference calendar. We meet at 2100 UTC in IRC channel #openstack-meeting. The agenda and full details are here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_8th_2019 Cheers Stig From msm at redhat.com Tue Jan 8 19:51:34 2019 From: msm at redhat.com (Michael McCune) Date: Tue, 8 Jan 2019 14:51:34 -0500 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: <20190108192643.GA25045@sm-workstation> References: <20190108192643.GA25045@sm-workstation> Message-ID: On Tue, Jan 8, 2019 at 2:29 PM Sean McGinnis wrote: > Really looks nice - thanks Boris! ++, really responsive too. thanks for the update =) peace o/ From skaplons at redhat.com Tue Jan 8 20:22:27 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 8 Jan 2019 21:22:27 +0100 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> Message-ID: <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> Hi Ben, I was also looking at it today. I’m totally not an C and Oslo.privsep expert but I think that there is some new process spawned here. I put pdb before line https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 where this issue happen. 
Then, with "ps aux” I saw: vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 /opt/stack/neutron/.tox/dsvm-functional/bin/python /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper --config-file neutron/tests/etc/neutron.conf --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep --color=auto privsep But then when I continue run test, and it segfaulted, in journal log I have: Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] Please check pics of those processes. First one (when test was „paused” with pdb) has 18368 and later segfault has 18369. I don’t know if You saw my today’s comment in launchpad. I was trying to change method used to start PrivsepDaemon from Method.ROOTWRAP to Method.FORK (in https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) and run test as root, then tests were passed. — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Ben Nemec w dniu 08.01.2019, o godz. 20:04: > > Further update: I dusted off my gdb skills and attached it to the privsep process to try to get more details about exactly what is crashing. It looks like the segfault happens on this line: > > https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 > > which is > > h->cb = cb; > > h being the conntrack handle and cb being the callback function. > > This makes me think the problem isn't the callback itself (even if we assigned a bogus pointer, which we didn't, it shouldn't cause a segfault unless you try to dereference it) but in the handle we pass in. Trying to look at h->cb results in: > > (gdb) print h->cb > Cannot access memory at address 0x800f228 > > Interestingly, h itself is fine: > > (gdb) print h > $3 = (struct nfct_handle *) 0x800f1e0 > > It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 > > So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. > > -Ben > > On 1/7/19 12:11 PM, Ben Nemec wrote: >> Renamed the thread to be more descriptive. >> Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) >> 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>> Hi, >>> >>> I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] >>> I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. 
>>> >>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>> >>> — >>> Slawek Kaplonski >>> Senior software engineer >>> Red Hat >>> From liliueecg at gmail.com Tue Jan 8 21:15:37 2019 From: liliueecg at gmail.com (Li Liu) Date: Tue, 8 Jan 2019 16:15:37 -0500 Subject: [Cyborg] IRC meeting In-Reply-To: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> References: <7B6CC0C8-82BA-410E-823B-357F08213734@leafe.com> Message-ID: Thank you Ed for the correction. It is *Wednesday* at 0300 UTC :P Regards Li Liu On Tue, Jan 8, 2019 at 12:26 PM Ed Leafe wrote: > On Jan 8, 2019, at 12:31 AM, Li Liu wrote: > > > > The IRC meeting will be held Tuesday at 0300 UTC, which is 10:00 pm > est(Tuesday) / 7:00 pm pst(Tuesday) /11 am Beijing time (Wednesday) > > I believe you meant *Wednesday* at 0300 UTC, correct? > > > -- Ed Leafe > > > > > > -- Thank you Regards Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue Jan 8 22:00:14 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 8 Jan 2019 16:00:14 -0600 Subject: [cinder] Proposing new Core Members ... Message-ID: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> Team, I would like propose two people who have been taking a more active role in Cinder reviews as Core Team Members: First, Rajat Dhasmana who has been active in doing reviews the last couple of releases (http://www.stackalytics.com/?module=cinder-group&user_id=whoami-rajat). He has also made efforts to join our PTG and Forum sessions remotely, has helped to stay on top of bugs and has submitted a number of fixes recently.  I feel he would be a great addition to our team. Also, I would like to propose Yikun Jiang as a core member.  He had big shoes to fill, back-filling TommyLike Hu and he has stood to the challenge.  Continuing to implement the features that TommyLike had in progress and taking an active role as a reviewer: (http://www.stackalytics.com/?module=cinder-group&user_id=yikunkero) I think that both Rajat and Yikun will be welcome additions to help replace the cores that have recently been removed. If there is no disagreement I plan to add both people to the core reviewer list in a week. Thanks! Jay Bryant (jungleboyj) From fungi at yuggoth.org Tue Jan 8 22:05:23 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 8 Jan 2019 22:05:23 +0000 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: <20190108220522.iczyv2yz5rfg4qci@yuggoth.org> On 2019-01-08 10:21:32 -0800 (-0800), Boris Renski wrote: > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). [...] > Happy to hear comments or feedback. Looks slick! When you say "based on" I guess you mean "forked from?" I don't see those modifications in the repository at https://git.openstack.org/cgit/openstack/stackalytics nor proposed to it through https://review.openstack.org/ so presumably the source code now lives elsewhere. Is Stackalytics still open source, or has it become proprietary? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Tue Jan 8 22:30:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 16:30:03 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> Message-ID: <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: > Hi Ben, > > I was also looking at it today. I’m totally not an C and Oslo.privsep expert but I think that there is some new process spawned here. > I put pdb before line https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 where this issue happen. Then, with "ps aux” I saw: > > vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep > root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 /opt/stack/neutron/.tox/dsvm-functional/bin/python /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper --config-file neutron/tests/etc/neutron.conf --privsep_context neutron.privileged.default --privsep_sock_path /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock > vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep --color=auto privsep > > But then when I continue run test, and it segfaulted, in journal log I have: > > Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] > > Please check pics of those processes. First one (when test was „paused” with pdb) has 18368 and later segfault has 18369. privsep-helper does fork, so I _think_ that's normal. https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > I don’t know if You saw my today’s comment in launchpad. I was trying to change method used to start PrivsepDaemon from Method.ROOTWRAP to Method.FORK (in https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) and run test as root, then tests were passed. Yeah, I saw that, but I don't understand it. :-/ The daemon should end up running with the same capabilities in either case. By the time it starts making the C calls the environment should be identical, regardless of which method was used to start the process. > > — > Slawek Kaplonski > Senior software engineer > Red Hat > >> Wiadomość napisana przez Ben Nemec w dniu 08.01.2019, o godz. 20:04: >> >> Further update: I dusted off my gdb skills and attached it to the privsep process to try to get more details about exactly what is crashing. It looks like the segfault happens on this line: >> >> https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 >> >> which is >> >> h->cb = cb; >> >> h being the conntrack handle and cb being the callback function. >> >> This makes me think the problem isn't the callback itself (even if we assigned a bogus pointer, which we didn't, it shouldn't cause a segfault unless you try to dereference it) but in the handle we pass in. 
Trying to look at h->cb results in: >> >> (gdb) print h->cb >> Cannot access memory at address 0x800f228 >> >> Interestingly, h itself is fine: >> >> (gdb) print h >> $3 = (struct nfct_handle *) 0x800f1e0 >> >> It doesn't _look_ to me like the handle should be crossing any thread boundaries or anything, so I'm not sure why it would be a problem. It gets created in the same privileged function that ultimately registers the callback: https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 >> >> So still not sure what's going on, but I thought I'd share what I've found before I stop to eat something. >> >> -Ben >> >> On 1/7/19 12:11 PM, Ben Nemec wrote: >>> Renamed the thread to be more descriptive. >>> Just to update the list on this, it looks like the problem is a segfault when the netlink_lib module makes a C call. Digging into that code a bit, it appears there is a callback being used[1]. I've seen some comments that when you use a callback with a Python thread, the thread needs to be registered somehow, but this is all uncharted territory for me. Suggestions gratefully accepted. :-) >>> 1: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>>> Hi, >>>> >>>> I just found that functional tests in Neutron are failing since today or maybe yesterday. See [1] >>>> I was able to reproduce it locally and it looks that it happens with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >>>> >>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>>> >>>> — >>>> Slawek Kaplonski >>>> Senior software engineer >>>> Red Hat >>>> > From sean.mcginnis at gmx.com Tue Jan 8 22:35:36 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 8 Jan 2019 16:35:36 -0600 Subject: [cinder] Proposing new Core Members ... In-Reply-To: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> Message-ID: <20190108223535.GA29520@sm-workstation> On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: > Team, > > I would like propose two people who have been taking a more active role in > Cinder reviews as Core Team Members: > > > I think that both Rajat and Yikun will be welcome additions to help replace > the cores that have recently been removed. > +1 from me. Both have been doing a good job giving constructive feedback on reviews and have been spending some time reviewing code other than their own direct interests, so I think they would be welcome additions. Sean From openstack at nemebean.com Wed Jan 9 00:30:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 8 Jan 2019 18:30:19 -0600 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> Message-ID: <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> I think I've got it. At least in my local tests, the handle pointer being passed from C -> Python -> C was getting truncated at the Python step because we didn't properly define the type. If the address assigned was larger than would fit in a standard int then we passed what amounted to a bogus pointer back to the C code, which caused the segfault. 
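For anyone who hasn't hit this particular ctypes trap before, here is a minimal sketch of the failure mode, using libc's malloc purely as a stand-in for nfct_open() (none of the conntrack calls are needed to show it):

    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.malloc.argtypes = [ctypes.c_size_t]

    # restype defaults to c_int, so a 64-bit address coming back from C
    # can be silently truncated to 32 bits on the Python side. This
    # allocation is deliberately leaked; handing the truncated value
    # back to C (e.g. to free()) is exactly the bogus-pointer crash.
    maybe_truncated = libc.malloc(16)

    # Declaring the real types keeps the full address in both directions,
    # which is the same idea as properly defining the handle type in
    # netlink_lib.
    libc.malloc.restype = ctypes.c_void_p
    libc.free.argtypes = [ctypes.c_void_p]
    handle = libc.malloc(16)
    libc.free(handle)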
I have no idea why privsep threading would have exposed this, other than maybe running in threads affected the address space somehow? In any case, https://review.openstack.org/629335 has got these functional tests working for me locally in oslo.privsep 1.31.0. It would be great if somebody could try them out and verify that I didn't just find a solution that somehow only works on my system. :-) -Ben On 1/8/19 4:30 PM, Ben Nemec wrote: > > > On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: >> Hi Ben, >> >> I was also looking at it today. I’m totally not an C and Oslo.privsep >> expert but I think that there is some new process spawned here. >> I put pdb before line >> https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 >> where this issue happen. Then, with "ps aux” I saw: >> >> vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep >> root     18368  0.1  0.5 185752 33544 pts/1    Sl+  13:24   0:00 >> /opt/stack/neutron/.tox/dsvm-functional/bin/python >> /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper >> --config-file neutron/tests/etc/neutron.conf --privsep_context >> neutron.privileged.default --privsep_sock_path >> /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock >> vagrant  18555  0.0  0.0  14512  1092 pts/2    S+   13:25   0:00 grep >> --color=auto privsep >> >> But then when I continue run test, and it segfaulted, in journal log I >> have: >> >> Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] >> segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 >> in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] >> >> Please check pics of those processes. First one (when test was >> „paused” with pdb) has 18368 and later segfault has 18369. > > privsep-helper does fork, so I _think_ that's normal. > > https://github.com/openstack/oslo.privsep/blob/ecb1870c29b760f09fb933fc8ebb3eac29ffd03e/oslo_privsep/daemon.py#L539 > > >> >> I don’t know if You saw my today’s comment in launchpad. I was trying >> to change method used to start PrivsepDaemon from Method.ROOTWRAP to >> Method.FORK (in >> https://github.com/openstack/oslo.privsep/blob/master/oslo_privsep/priv_context.py#L218) >> and run test as root, then tests were passed. > > Yeah, I saw that, but I don't understand it. :-/ > > The daemon should end up running with the same capabilities in either > case. By the time it starts making the C calls the environment should be > identical, regardless of which method was used to start the process. > >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >>> Wiadomość napisana przez Ben Nemec w dniu >>> 08.01.2019, o godz. 20:04: >>> >>> Further update: I dusted off my gdb skills and attached it to the >>> privsep process to try to get more details about exactly what is >>> crashing. It looks like the segfault happens on this line: >>> >>> https://git.netfilter.org/libnetfilter_conntrack/tree/src/conntrack/api.c#n239 >>> >>> >>> which is >>> >>> h->cb = cb; >>> >>> h being the conntrack handle and cb being the callback function. >>> >>> This makes me think the problem isn't the callback itself (even if we >>> assigned a bogus pointer, which we didn't, it shouldn't cause a >>> segfault unless you try to dereference it) but in the handle we pass >>> in. 
Trying to look at h->cb results in: >>> >>> (gdb) print h->cb >>> Cannot access memory at address 0x800f228 >>> >>> Interestingly, h itself is fine: >>> >>> (gdb) print h >>> $3 = (struct nfct_handle *) 0x800f1e0 >>> >>> It doesn't _look_ to me like the handle should be crossing any thread >>> boundaries or anything, so I'm not sure why it would be a problem. It >>> gets created in the same privileged function that ultimately >>> registers the callback: >>> https://github.com/openstack/neutron/blob/aa8a6ea848aae6882abb631b7089836dee8f4008/neutron/privileged/agent/linux/netlink_lib.py#L246 >>> >>> >>> So still not sure what's going on, but I thought I'd share what I've >>> found before I stop to eat something. >>> >>> -Ben >>> >>> On 1/7/19 12:11 PM, Ben Nemec wrote: >>>> Renamed the thread to be more descriptive. >>>> Just to update the list on this, it looks like the problem is a >>>> segfault when the netlink_lib module makes a C call. Digging into >>>> that code a bit, it appears there is a callback being used[1]. I've >>>> seen some comments that when you use a callback with a Python >>>> thread, the thread needs to be registered somehow, but this is all >>>> uncharted territory for me. Suggestions gratefully accepted. :-) >>>> 1: >>>> https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L136 >>>> On 1/4/19 7:28 AM, Slawomir Kaplonski wrote: >>>>> Hi, >>>>> >>>>> I just found that functional tests in Neutron are failing since >>>>> today or maybe yesterday. See [1] >>>>> I was able to reproduce it locally and it looks that it happens >>>>> with oslo.privsep==1.31. With oslo.privsep==1.30.1 tests are fine. >>>>> >>>>> [1] https://bugs.launchpad.net/neutron/+bug/1810518 >>>>> >>>>> — >>>>> Slawek Kaplonski >>>>> Senior software engineer >>>>> Red Hat >>>>> >> > From iwienand at redhat.com Wed Jan 9 06:11:09 2019 From: iwienand at redhat.com (Ian Wienand) Date: Wed, 9 Jan 2019 17:11:09 +1100 Subject: [infra] NetworkManager on infra Fedora 29 and CentOS nodes Message-ID: <20190109061109.GA24618@fedora19.localdomain> Hello, Just a heads-up; with Fedora 29 the legacy networking setup was moved into a separate, not-installed-by-default network-scripts package. This has prompted us to finally move to managing interfaces on our Fedora and CentOS CI hosts with NetworkManager (see [1]) Support for this is enabled with features added in glean 1.13.0 and diskimage-builder 1.19.0. The newly created Fedora 29 nodes [2] will have it enabled, and [3] will switch CentOS nodes shortly. This is tested by our nodepool jobs which build images, upload them into devstack and boot them, and then check the networking [4]. I don't really expect any problems, but be aware NetworkManager packages will appear on the CentOS 7 and Fedora base images with these changes. 
Thanks -i [1] https://bugzilla.redhat.com/show_bug.cgi?id=1643763#c2 [2] https://review.openstack.org/618672 [3] https://review.openstack.org/619960 [4] https://review.openstack.org/618671 From smooney at redhat.com Wed Jan 9 06:11:54 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 09 Jan 2019 06:11:54 +0000 Subject: [nova] Mempage fun In-Reply-To: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> Message-ID: On Tue, 2019-01-08 at 18:38 +0000, Stephen Finucane wrote: > On Tue, 2019-01-08 at 08:54 +0000, Balázs Gibizer wrote: > > On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane wrote: > > > We've been looking at a patch that landed some months ago and have > > > spotted some issues: > > > > > > https://review.openstack.org/#/c/532168 > > > > > > In summary, that patch is intended to make the memory check for > > > instances memory pagesize aware. The logic it introduces looks > > > something like this: > > > > > > If the instance requests a specific pagesize > > > (#1) Check if each host cell can provide enough memory of the > > > pagesize requested for each instance cell > > > Otherwise > > > If the host has hugepages > > > (#2) Check if each host cell can provide enough memory of the > > > smallest pagesize available on the host for each instance cell > > > Otherwise > > > (#3) Check if each host cell can provide enough memory for > > > each instance cell, ignoring pagesizes > > > > > > This also has the side-effect of allowing instances with hugepages and > > > instances with a NUMA topology but no hugepages to co-exist on the same > > > host, because the latter will now be aware of hugepages and won't > > > consume them. However, there are a couple of issues with this: > > > > > > 1. It breaks overcommit for instances without pagesize request > > > running on hosts with different pagesizes. This is because we don't > > > allow overcommit for hugepages, but case (#2) above means we are now > > > reusing the same functions previously used for actual hugepage > > > checks to check for regular 4k pages > > > 2. It doesn't fix the issue when non-NUMA instances exist on the same > > > host as NUMA instances with hugepages. The non-NUMA instances don't > > > run through any of the code above, meaning they're still not > > > pagesize aware > > > > > > We could probably fix issue (1) by modifying those hugepage functions > > > we're using to allow overcommit via a flag that we pass for case (#2). > > > We can mitigate issue (2) by advising operators to split hosts into > > > aggregates for 'hw:mem_page_size' set or unset (in addition to > > > 'hw:cpu_policy' set to dedicated or shared/unset). I need to check but > > > I think this may be the case in some docs (sean-k-mooney said Intel > > > used to do this. I don't know about Red Hat's docs or upstream). In > > > addition, we did actually called that out in the original spec: > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact > > > > > > However, if we're doing that for non-NUMA instances, one would have to > > > question why the patch is necessary/acceptable for NUMA instances. For > > > what it's worth, a longer fix would be to start tracking hugepages in > > > a non-NUMA aware way too but that's a lot more work and doesn't fix the > > > issue now. 
> > > > > > As such, my question is this: should be look at fixing issue (1) and > > > documenting issue (2), or should we revert the thing wholesale until > > > we work on a solution that could e.g. let us track hugepages via > > > placement and resolve issue (2) too. > > > > If you feel that fixing (1) is pretty simple then I suggest to do that > > and document the limitation of (2) while we think about a proper > > solution. > > > > gibi > > I have (1) fixed here: > > https://review.openstack.org/#/c/629281/ > > That said, I'm not sure if it's the best thing to do. From what I'm > hearing, it seems the advice we should be giving is to not mix > instances with/without NUMA topologies, with/without hugepages and it should be with and without hw:mem_page_size. guest with that set should not be mixed with guests without that set on the same host. and with shiad patch and your patch this now become safe if the guest without hw:mem_page_size has a numa topology. mixing hugepage and non hugepage guests is fine provided the non hugepage guest has an implcit or expcit numa toplogy such as a guest that is useing cpu pinning. > with/without CPU pinning. We've only documented the latter, as > discussed on this related bug by cfriesen: > > https://bugs.launchpad.net/nova/+bug/1792985 > > Given that we should be advising folks not to mix these (something I > wasn't aware of until now), what does the original patch actually give > us? If you're not mixing instances with/without hugepages, then the > only use case that would fix is booting an instance with a NUMA > topology but no hugepages on a host that had hugepages (because the > instance would be limited to CPUs and memory from one NUMA nodes, but > it's conceivable all available memory could be on another NUMA node). > That seems like a very esoteric use case that might be better solved by this is not that esoteric. one simple example is an operator has configred some number of hugepges on the hypervior and want to run pinnined instance some of which have hugepages and somme that dont. this works fine today however oversubsciption of memory in the non hugepage case is broken as per the bug. > perhaps making the reserved memory configuration option optionally NUMA > specific. well i have been asking for that for 2-3 releases. i would like to do that independenly of this issue and i think it will be a requirement if we ever model mempages per numa node in placement. > This would allow us to mark this hugepage memory, which is > clearly not intended for consumption by nova (remember: this host only > handles non-hugepage instances) again it is safe to mix hugepage instance with non hugepages instance if hw:mem_page_size is set in the non hugepage case. but with your senario in mind we can already resrve the hugepage memory for the host use by setting reserved_huge_pages in the default section of the nova.conf https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_huge_pages > , as reserved on a per-node basis. I'm > not sure how we would map this to placement, though I'm sure it could > be figured out. that is simple. the placement inventory would just have the reserved value set to the value for the reserved_huge_pages config option. > > jaypipes is going to have so much fun mapping all this in placement :D we have disscued this at lenght before so placement can already model this quite well if nova created the RPs and inventories for mempages. 
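to make that concrete, the knobs in question look something like this (the node numbers, page sizes and counts below are only examples, not a recommendation):

    [DEFAULT]
    # keep 64 x 2M pages on NUMA node 0 and one 1G page on node 1 out of
    # the pool nova hands to guests
    reserved_huge_pages = node:0,size:2048,count:64
    reserved_huge_pages = node:1,size:1GB,count:1

and on the flavor side, giving the non-hugepage numa guests an explicit small page size so they get tracked against the same per-node page inventory:

    openstack flavor set m1.numa --property hw:mem_page_size=small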
the main question is can we stop modeling memory_mb inventories in the root compute node RP entirely. i personcally would like to make all instances numa affined by default. e.g. we woudl start treading all instances as if hw:numa_nodes=1 was set and preferabley hw:mem_page_size=small. this would signifcantly simplfy our lives in placement but it has a down side that if you want to create really large instance they must be multi numa. e.g. if the guest will be larger then will fit in a singel host numa node it must have have hw:numa_nodes>1 to be schduled. the simple fact is that such an instance is already spanning host numa nodes and but we are not tell ing the guest that. by actully telling the geust it has multiple numa nodes it will imporve the guest perfromance but its a behavior change that not everyone will like. Our current practics or tracking memory and cpus both per numa node and per host is tech debt that we need to clean up at some point or live with the fact that numa will never be modeled in placement. we already have numa afinity for vswitch, pci/sriov devices and we will/should have it for vgpus and pmem in the future. long term i think we would only track things per numa node but i know sylvain has a detailed spec on this which has more context the we can resonably discuss here. > > Stephen > > From alfredo.deluca at gmail.com Wed Jan 9 07:26:32 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 08:26:32 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. I downloaded your magnum.conf but it\s not that different from mine. Not sure why but the cluster build seems to run....along with heat but I get that error mentioned earlier. Cheers On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano wrote: > Hi Alfredo, > attached here there is my magnum.conf for queens release > As you can see my heat sections are empty > When you create your cluster, I suggest to check heat logs e magnum logs > for verifyng what is wrong > Ignazio > > > > Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> so. Creating a stack either manually or dashboard works fine. The problem >> seems to be when I create a cluster (kubernetes/swarm) that I got that >> error. >> Maybe the magnum conf it's not properly setup? >> In the heat section of the magnum.conf I have only >> *[heat_client]* >> *region_name = RegionOne* >> *endpoint_type = internalURL* >> >> Cheers >> >> >> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Yes. Next step is to check with ansible. >>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1 . how did you run the last heat template? By dashboard ? >>>> 2. Using openstack command you can check if ansible configured heat >>>> user/domain correctly >>>> >>>> >>>> It seems a problem related to >>>> heat user rights? >>>> >>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. 
>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. the problem is >>>>> that I installed openstack with ansible-openstack.... so I can't change >>>>> anything unless I re run everything. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check heat user and domani are c onfigured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I ll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> HI IGNAZIO >>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Anycase during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>> one.... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>> Any clue? 
>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhengzhenyulixi at gmail.com Wed Jan 9 07:39:49 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Wed, 9 Jan 2019 15:39:49 +0800 Subject: [Nova] Suggestion needed for detach-boot-volume design In-Reply-To: References: <0ef8b4b4-4a02-3f31-efcd-9baa1268822a@gmail.com> Message-ID: Thanks all for the feedback, I have update the spec to be more clear about the scope: https://review.openstack.org/#/c/619161/ On Mon, Jan 7, 2019 at 4:37 PM Zhenyu Zheng wrote: > Thanks alot for the replies, lets wait for some more comments, and I will > update the follow-up spec about this within two days. > > On Sat, Jan 5, 2019 at 7:37 AM melanie witt wrote: > >> On Fri, 4 Jan 2019 09:50:46 -0600, Matt Riedemann >> wrote: >> > On 1/2/2019 2:57 AM, Zhenyu Zheng wrote: >> >> I've been working on detach-boot-volume[1] in Stein, we got the initial >> >> design merged and while implementing we have meet some new problems and >> >> now I'm amending the spec to cover these new problems[2]. >> > >> > [2] is https://review.openstack.org/#/c/619161/ >> > >> >> >> >> The thing I want to discuss for wider opinion is that in the initial >> >> design, we planned to support detach root volume for only STOPPED and >> >> SHELVED/SHELVE_OFFLOADED instances. But then we found out that we >> >> allowed to detach volumes for RESIZED/PAUSED/SOFT_DELETED instances as >> >> well. Should we allow detaching root volume for instances in these >> >> status too? Cases like RESIZE could be complicated for the revert >> resize >> >> action, and it also seems unnecesary. >> > >> > The full set of allowed states for attaching and detaching are here: >> > >> > >> https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4187 >> > >> > >> https://github.com/openstack/nova/blob/8ef3d253a/nova/compute/api.py#L4297 >> > >> > Concerning those other states: >> > >> > RESIZED: There might be a case for attaching/detaching volumes based on >> > flavor during a resize, but I'm not sure about the root volume in that >> > case (that really sounds more like rebuild with a new image to me, which >> > is a different blueprint). I'm also not sure how much people know about >> > the ability to do this or what the behavior is on revert if you have >> > changed the volumes while the server is resized. If we consider that >> > when a user reverts a resize, they want to go back to the way things >> > were for the root disk image, then I would think we should not allow >> > changing out the root volume while resized. >> >> Yeah, if someone attaches/detaches a regular volume while the instance >> is in VERIFY_RESIZE state and then reverts the resize, I assume we >> probably don't attempt to change or restore anything with the volume >> attachments to put them back to how they were attached before the >> resize. But as you point out, the situation does seem different >> regarding a root volume. If a user changes that while in VERIFY_RESIZE >> and reverts the resize, and we leave the root volume alone, then they >> end up with a different root disk image than they had before the resize. >> Which seems weird. 
>> >> I agree it seems better not to allow this for now and come back to it >> later if people start asking for it. >> >> > PAUSED: First, I'm not sure how much anyone uses the pause API (or >> > suspend for that matter) although most of the virt drivers implement it. >> > At one point you could attach volumes to suspended servers as well, but >> > because libvirt didn't support it that was removed from the API (yay for >> > non-discoverable backend-specific API behavior changes): >> > >> > https://review.openstack.org/#/c/83505/ >> > >> > Anyway, swapping the root volume on a paused instance seems dangerous to >> > me, so until someone really has a good use case for it, then I think we >> > should avoid that one as well. >> > >> > SOFT_DELETED: I really don't understand the use case for >> > attaching/detaching volumes to/from a (soft) deleted server. If the >> > server is deleted and only hanging around because it hasn't been >> > reclaimed yet, there are really no guarantees that this would work, so >> > again, I would just skip this one for the root volume changes. If the >> > user really wants to play with the volumes attached to a soft deleted >> > server, they should restore it first. >> > >> > So in summary, I think we should just not support any of those other >> > states for attach/detach root volumes and only focus on stopped or >> > shelved instances. >> >> Again, agree, I think we should just not allow the other states for the >> initial implementation and revisit later if it turns out people need >> these. >> >> -melanie >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 07:47:34 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 08:47:34 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hello Alfredo, I think I could connect on IRC #openstack-containers where magnum experts could help you. Any case the error you reported in previous emails seems to be related to heat wait conditions. Ignazio Il giorno mer 9 gen 2019 alle ore 08:26 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Ignazio. I downloaded your magnum.conf but it\s not that different from > mine. Not sure why but the cluster build seems to run....along with heat > but I get that error mentioned earlier. > > Cheers > > > On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano > wrote: > >> Hi Alfredo, >> attached here there is my magnum.conf for queens release >> As you can see my heat sections are empty >> When you create your cluster, I suggest to check heat logs e magnum logs >> for verifyng what is wrong >> Ignazio >> >> >> >> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> so. Creating a stack either manually or dashboard works fine. The >>> problem seems to be when I create a cluster (kubernetes/swarm) that I got >>> that error. >>> Maybe the magnum conf it's not properly setup? >>> In the heat section of the magnum.conf I have only >>> *[heat_client]* >>> *region_name = RegionOne* >>> *endpoint_type = internalURL* >>> >>> Cheers >>> >>> >>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> Yes. Next step is to check with ansible. >>>> I do think it's some rights somewhere... >>>> I'll check later. Thanks >>>> >>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < >>>> ignaziocassano at gmail.com wrote: >>>> >>>>> Alfredo, >>>>> 1 . 
how did you run the last heat template? By dashboard ? >>>>> 2. Using openstack command you can check if ansible configured heat >>>>> user/domain correctly >>>>> >>>>> >>>>> It seems a problem related to >>>>> heat user rights? >>>>> >>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>>> killed by signal 15 >>>>>> which is last log from a few days ago. >>>>>> >>>>>> While the journal of the heat engine says >>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>>> heat-engine service. >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>>> SAWarning: Unicode type received non-unicode bind param value >>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>>> occurrences) >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> (util.ellipses_string(value),)) >>>>>> >>>>>> >>>>>> I also checked the configuration and it seems to be ok. the problem >>>>>> is that I installed openstack with ansible-openstack.... so I can't change >>>>>> anything unless I re run everything. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Check heat user and domani are c onfigured like at the following: >>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>>> >>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>>> >>>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>>> >>>>>>>>> I ll try asap. Thanks >>>>>>>>> >>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>> >>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>>> heat is working fine? >>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>> >>>>>>>>>>> HI IGNAZIO >>>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>>> creating the master. >>>>>>>>>>> >>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>>> >>>>>>>>>>>> Anycase during deployment you can connect with ssh to the >>>>>>>>>>>> master and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>>> Ignazio >>>>>>>>>>>> >>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>> >>>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my >>>>>>>>>>>>>> answer could help you.... 
>>>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>> >>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 08:08:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 09:08:37 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Alfredo, you could make another test searching on internet as simple heat stack example with wait conditions inside for checking if heat wait conditions work fine. Cheers -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 08:21:00 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 09:21:00 +0100 Subject: queens [magnum] patches Message-ID: Hello, last week I talked on #openstack-containers IRC about important patches for magnum reported here: https://review.openstack.org/#/c/577477/ I'd like to know when the above will be backported on queens and if centos7 and ubuntu packages will be upgraded with them. Any roadmap ? I would go on with magnum testing on queens because I am going to upgrade from ocata to pike and from pike to queens. At this time I have aproduction environment on ocata and a testing environment on queens. Best Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjf1970231893 at gmail.com Wed Jan 9 08:51:48 2019 From: yjf1970231893 at gmail.com (Jeff Yang) Date: Wed, 9 Jan 2019 16:51:48 +0800 Subject: [octavia] What is our next plan about 'l3-active-active' blueprint? In-Reply-To: References: Message-ID: Hi, Michael & Adam: I only need to confirm the eventlet has no conflict with amphora-agent. Because I just need to use eventlet in amphora-agent. Michael: 1、The os-ken is managed by OpenStack Community now, and neutron-dynamic-routing's default driver also is os-ken. I think we should be consistent with the community for later maintenance. 2、For the current application scenario, exabgp is a bit too heavy. If use exabgp we need to manage an extra service and need to write adaption code for different Linux distributions. 3、We can more accurately get the bgp speaker and bgp peer's status and statistics by use Os-Ken's functions, for example, peer_down_handler, peer_up_handler, neighbor_state_get. I didn't find similar function in Exabgp. 4、Personally, I am more familiar with os-ken. Adam: Os-Ken is a python library, it implemented bgp protocol. 
Os-ken manages the BGP speaker by starting a green thread, so I need to use eventlet in the amphora-agent code. One extra illustration: last week I found that eventlet's monkey_patch made gunicorn stop working properly, but I have now resolved the problem. We must pass `os=False` when we call eventlet.monkey_patch; if not, the gunicorn master process will never exit. Michael Johnson 于2019年1月9日周三 上午1:00写道: > Yes, we do not allow eventlet in Octavia. It leads to a number of > conflicts and problems with the overall code base, including the use > of taskflow. > Is there a reason we need to use the os-ken BGP code as opposed to the > exabgp option that was being used before? > I remember we looked at those two options back when the other team was > developing the l3 option, but I don't remember all of the details of > why exabgp was selected. > > Michael > > On Mon, Jan 7, 2019 at 1:18 AM Jeff Yang wrote: > > > > Hi Michael, > > I found that you forbid import eventlet in octavia.[1] > > I guess the eventlet has a conflict with gunicorn, is that? > > But, I need to import eventlet for os-ken that used to implement bgp > speaker.[2] > > I am studying eventlet and gunicorn deeply. Have you some > suggestions to resolve this conflict? > > > > [1] https://review.openstack.org/#/c/462334/ > > [2] https://review.openstack.org/#/c/628915/ > > > > Michael Johnson 于2019年1月5日周六 上午8:02写道: > >> > >> Hi Jeff, > >> > >> Unfortunately the team that was working on that code had stopped due > >> to internal reasons. > >> > >> I hope to make the reference active/active blueprint a priority again > >> during the Train cycle. Following that I may be able to look at the L3 > >> distributor option, but I cannot commit to that at this time. > >> > >> If you are interesting in picking up that work, please let me know and > >> we can sync up on that status of the WIP patches, etc. > >> > >> Michael > >> > >> On Thu, Jan 3, 2019 at 11:19 PM Jeff Yang > wrote: > >> > > >> > Dear Octavia team: > >> > The email aims to ask the development progress about > l3-active-active blueprint. I > >> > noticed that the work in this area has been stagnant for eight months. > >> > https://review.openstack.org/#/q/l3-active-active > >> > I want to know the community's next work plan in this regard. > >> > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Jan 9 08:52:29 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 09 Jan 2019 08:52:29 +0000 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: can we tag this conversation with [heat][magnum] in the subject by the way. i keep clicking on it to get the context and realising i can help. On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: > Alfredo, you could make another test searching on internet as simple heat stack example with wait conditions inside > for checking if heat wait conditions work fine. > Cheers > From marcin.juszkiewicz at linaro.org Wed Jan 9 08:57:06 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Wed, 9 Jan 2019 09:57:06 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: <20190108140026.p4462df5otnyizm2@yuggoth.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> Message-ID: <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> W dniu 08.01.2019 o 15:00, Jeremy Stanley pisze: >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > [...]
> > These days it's probably better to recommend > https://docs.openstack.org/contributors/ since I expect we're about > ready to retire that old wiki page. Then I hope that someone will take care of SEO and redirects. Link I gave was first link from "openstack contributing" google search. From alfredo.deluca at gmail.com Wed Jan 9 09:40:41 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 10:40:41 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Hi Sean. Thanks for that. Do you have any idea about this error? @Ignazio Cassano I will try also on IRC and I am looking on internet a lot. Next step also I will try to create a simple stack to see if it works fine. Cheers On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: > can we tag this conversation with [heat][magnum] in the subject by the way. > i keep clicking on it to get the context and realising i can help. > > On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: > > Alfredo, you could make another test searching on internet as simple > heat stack example with wait conditions inside > > for checking if heat wait conditions work fine. > > Cheers > > > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 10:08:08 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 11:08:08 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Yes, I presume something goes wrong in heat wait condition authorization how reported by Alfredo. So I suggested to try a simple heat stack with a wait condition. Cheers Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Sean. Thanks for that. > Do you have any idea about this error? @Ignazio Cassano > I will try also on IRC and I am looking on > internet a lot. > Next step also I will try to create a simple stack to see if it works > fine. > Cheers > > On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: > >> can we tag this conversation with [heat][magnum] in the subject by the >> way. >> i keep clicking on it to get the context and realising i can help. >> >> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >> > Alfredo, you could make another test searching on internet as simple >> heat stack example with wait conditions inside >> > for checking if heat wait conditions work fine. >> > Cheers >> > >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 10:21:01 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 11:21:01 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: thanks Ignazio. Do you have a quick example for that? pls... Cheers On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano wrote: > Yes, I presume something goes wrong in heat wait condition authorization > how reported by Alfredo. > So I suggested to try a simple heat stack with a wait condition. > Cheers > > Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> Hi Sean. Thanks for that. >> Do you have any idea about this error? @Ignazio Cassano >> I will try also on IRC and I am looking on >> internet a lot. >> Next step also I will try to create a simple stack to see if it works >> fine. 
>> Cheers >> >> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >> >>> can we tag this conversation with [heat][magnum] in the subject by the >>> way. >>> i keep clicking on it to get the context and realising i can help. >>> >>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>> > Alfredo, you could make another test searching on internet as simple >>> heat stack example with wait conditions inside >>> > for checking if heat wait conditions work fine. >>> > Cheers >>> > >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 10:25:45 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 11:25:45 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Alfredo, attached herer there is an example: substitute the image with your image, flavor with your flavor, key_name with your key and network with your network Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > thanks Ignazio. Do you have a quick example for that? pls... > > Cheers > > > On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano > wrote: > >> Yes, I presume something goes wrong in heat wait condition authorization >> how reported by Alfredo. >> So I suggested to try a simple heat stack with a wait condition. >> Cheers >> >> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> Hi Sean. Thanks for that. >>> Do you have any idea about this error? @Ignazio Cassano >>> I will try also on IRC and I am looking on >>> internet a lot. >>> Next step also I will try to create a simple stack to see if it works >>> fine. >>> Cheers >>> >>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>> >>>> can we tag this conversation with [heat][magnum] in the subject by the >>>> way. >>>> i keep clicking on it to get the context and realising i can help. >>>> >>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>> > Alfredo, you could make another test searching on internet as simple >>>> heat stack example with wait conditions inside >>>> > for checking if heat wait conditions work fine. >>>> > Cheers >>>> > >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: wait.yml Type: application/x-yaml Size: 2529 bytes Desc: not available URL: From balazs.gibizer at ericsson.com Wed Jan 9 10:30:57 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Wed, 9 Jan 2019 10:30:57 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1546865551.29530.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> Message-ID: <1547029853.1128.0@smtp.office365.com> On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer wrote: > > >> But, let's chat more about it via a hangout the week after next >> (week >> of January 14 when Matt is back), as suggested in #openstack-nova >> today. We'll be able to have a high-bandwidth discussion then and >> agree on a decision on how to move forward with this. > > Thank you all for the discussion. 
I agree to have a real-time > discussion about the way forward. > > Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a > hangouts[2]? > > I see the following topics we need to discuss: > * backward compatibility with already existing SRIOV ports having min > bandwidth > * introducing microversion(s) for this feature in Nova > * allowing partial support for this feature in Nova in Stein (E.g.: > only server create/delete but no migrate support). > * step-by-step verification of the really long commit chain in Nova > > I will post a summar of each issue to the ML during this week. Hi, As I promised here is an etherpad[1] for the hangouts discussion, with a sort summary for the topic I think we need to discuss. Feel free to comment in there or add new topics you feel important. [1] https://etherpad.openstack.org/p/bandwidth-way-forward Cheers, gibi > > From alfredo.deluca at gmail.com Wed Jan 9 10:43:26 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 11:43:26 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: thanks Ignazio. Appreciated On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano wrote: > Alfredo, attached herer there is an example: > > substitute the image with your image, flavor with your flavor, key_name > with your key and network with your network > > Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> thanks Ignazio. Do you have a quick example for that? pls... >> >> Cheers >> >> >> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano >> wrote: >> >>> Yes, I presume something goes wrong in heat wait condition authorization >>> how reported by Alfredo. >>> So I suggested to try a simple heat stack with a wait condition. >>> Cheers >>> >>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> Hi Sean. Thanks for that. >>>> Do you have any idea about this error? @Ignazio Cassano >>>> I will try also on IRC and I am looking on >>>> internet a lot. >>>> Next step also I will try to create a simple stack to see if it works >>>> fine. >>>> Cheers >>>> >>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>>> >>>>> can we tag this conversation with [heat][magnum] in the subject by the >>>>> way. >>>>> i keep clicking on it to get the context and realising i can help. >>>>> >>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>> > Alfredo, you could make another test searching on internet as simple >>>>> heat stack example with wait conditions inside >>>>> > for checking if heat wait conditions work fine. >>>>> > Cheers >>>>> > >>>>> >>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan.vondra at ultimum.io Wed Jan 9 11:28:55 2019 From: jan.vondra at ultimum.io (Jan Vondra) Date: Wed, 9 Jan 2019 12:28:55 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz napsal: > > W dniu 08.01.2019 o 11:08, Jan Vondra pisze: > > Dear Kolla team, > > > > during project for one of our customers we have upgraded debian part > > of kolla project using a queens debian repositories > > (http://stretch-queens.debian.net/debian stretch-queens-backports) and > > we would like to share this work with community. > > Thanks for doing that. Is there an option to provide arm64 packages next > time? 
> It's more of a question for OpenStack Debian Team - namely Thomas Goirand who creates this repo. > > I would like to ask what's the proper process of contributing since > > the patches affects both kolla and kolla-ansible repositories. > > Send patches for review [1] and then we can discuss about changing them. > Remember that we target Stein now. > > 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > Thank you for help (and others with updated links) - I will upload patches today or tomorrow. > > Also any other comments regarding debian in kolla would be appriciated. > > Love to see someone else caring about Debian in Kolla. I took it over > two years ago, revived and moved to 'stretch'. But skipped support for > binary packages as there were no up-to-date packages available. > > In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' > as it will enter final freeze. Had some discussion with Debian OpenStack > team about providing preliminary Stein packages so support for 'binary' > type of images could be possible. I suppose that switch from Queens to Stein would be quite easy since all packages in Queens in Debian are Python 3 only so the most of the work has been already done. From alfredo.deluca at gmail.com Wed Jan 9 11:32:59 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 12:32:59 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: ...more found in logs 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] Domain admin client authentication failed: Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-28f9873a-5627-4f2e-9e19-da6c63753383) So what authentication it need? On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca wrote: > hi Ignazio. > the wait condition failed too on your simple stack. > Any other idea where to look at? > > Cheers > > [image: image.png] > > > > On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca > wrote: > >> thanks Ignazio. Appreciated >> >> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano >> wrote: >> >>> Alfredo, attached herer there is an example: >>> >>> substitute the image with your image, flavor with your flavor, key_name >>> with your key and network with your network >>> >>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> thanks Ignazio. Do you have a quick example for that? pls... >>>> >>>> Cheers >>>> >>>> >>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Yes, I presume something goes wrong in heat wait condition >>>>> authorization how reported by Alfredo. >>>>> So I suggested to try a simple heat stack with a wait condition. >>>>> Cheers >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> Hi Sean. Thanks for that. >>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>> I will try also on IRC and I am looking >>>>>> on internet a lot. >>>>>> Next step also I will try to create a simple stack to see if it works >>>>>> fine. >>>>>> Cheers >>>>>> >>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>> wrote: >>>>>> >>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>> the way. >>>>>>> i keep clicking on it to get the context and realising i can help. 
>>>>>>> >>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>> simple heat stack example with wait conditions inside >>>>>>> > for checking if heat wait conditions work fine. >>>>>>> > Cheers >>>>>>> > >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 12:18:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 13:18:37 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: My suggestion was only to verify that the problem is not magnum but heat/keystone. Heat stacks generated by magnum use wait conditions. I am not so expert of heat, but I am sure it uses keystone. So you have some issues or in keystone or in heat configuration. Are you able to create instances/volumes on your openstack ? If yes keystone can be ok. Probably Sean could help !!! Cheers Ignazio Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > ...more found in logs > > 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient > [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] > Domain admin client authentication failed: Unauthorized: The request you > have made requires authentication. (HTTP 401) (Request-ID: > req-28f9873a-5627-4f2e-9e19-da6c63753383) > > So what authentication it need? > > > On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca > wrote: > >> hi Ignazio. >> the wait condition failed too on your simple stack. >> Any other idea where to look at? >> >> Cheers >> >> [image: image.png] >> >> >> >> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca >> wrote: >> >>> thanks Ignazio. Appreciated >>> >>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Alfredo, attached herer there is an example: >>>> >>>> substitute the image with your image, flavor with your flavor, key_name >>>> with your key and network with your network >>>> >>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>> >>>>> Cheers >>>>> >>>>> >>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>> authorization how reported by Alfredo. >>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>> Cheers >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Sean. Thanks for that. >>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>> I will try also on IRC and I am looking >>>>>>> on internet a lot. >>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>> works fine. >>>>>>> Cheers >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>> wrote: >>>>>>> >>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>> the way. >>>>>>>> i keep clicking on it to get the context and realising i can help. 
>>>>>>>> >>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>> simple heat stack example with wait conditions inside >>>>>>>> > for checking if heat wait conditions work fine. >>>>>>>> > Cheers >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 12:23:37 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 13:23:37 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Alfredo, try to source you openstack environment variable file: source admin-openrc Then try to execute the following commands heat stack-list openstack stack list Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > ...more found in logs > > 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient > [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] > Domain admin client authentication failed: Unauthorized: The request you > have made requires authentication. (HTTP 401) (Request-ID: > req-28f9873a-5627-4f2e-9e19-da6c63753383) > > So what authentication it need? > > > On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca > wrote: > >> hi Ignazio. >> the wait condition failed too on your simple stack. >> Any other idea where to look at? >> >> Cheers >> >> [image: image.png] >> >> >> >> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca >> wrote: >> >>> thanks Ignazio. Appreciated >>> >>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Alfredo, attached herer there is an example: >>>> >>>> substitute the image with your image, flavor with your flavor, key_name >>>> with your key and network with your network >>>> >>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>> >>>>> Cheers >>>>> >>>>> >>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>> authorization how reported by Alfredo. >>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>> Cheers >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Sean. Thanks for that. >>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>> I will try also on IRC and I am looking >>>>>>> on internet a lot. >>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>> works fine. >>>>>>> Cheers >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>> wrote: >>>>>>> >>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>> the way. >>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>> >>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>> simple heat stack example with wait conditions inside >>>>>>>> > for checking if heat wait conditions work fine. 
>>>>>>>> > Cheers >>>>>>>> > >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Wed Jan 9 13:04:56 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Wed, 9 Jan 2019 14:04:56 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: W dniu 09.01.2019 o 12:28, Jan Vondra pisze: > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > napsal: >> >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: >>> Dear Kolla team, >>> >>> during project for one of our customers we have upgraded debian part >>> of kolla project using a queens debian repositories >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and >>> we would like to share this work with community. >> >> Thanks for doing that. Is there an option to provide arm64 packages next >> time? > It's more of a question for OpenStack Debian Team - namely Thomas > Goirand who creates this repo. 'Buster' has Rocky now. I was told by Thomas that Stein will follow. >>> I would like to ask what's the proper process of contributing since >>> the patches affects both kolla and kolla-ansible repositories. >> >> Send patches for review [1] and then we can discuss about changing them. >> Remember that we target Stein now. >> >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing >> > > Thank you for help (and others with updated links) - I will upload > patches today or tomorrow. Thanks. Will review. >>> Also any other comments regarding debian in kolla would be appriciated. >> >> Love to see someone else caring about Debian in Kolla. I took it over >> two years ago, revived and moved to 'stretch'. But skipped support for >> binary packages as there were no up-to-date packages available. >> >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' >> as it will enter final freeze. Had some discussion with Debian OpenStack >> team about providing preliminary Stein packages so support for 'binary' >> type of images could be possible. > > I suppose that switch from Queens to Stein would be quite easy since > all packages in Queens in Debian are Python 3 only so the most of the > work has been already done. https://review.openstack.org/#/c/625298/ does most of Python 3 packages bring up for Ubuntu with Stein UCA and for Debian 'buster' (which itself is not yet [1] supported in Kolla). 1. https://review.openstack.org/#/c/612681/ From zigo at debian.org Wed Jan 9 13:05:32 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 9 Jan 2019 14:05:32 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: References: Message-ID: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> Hi, On 1/9/19 12:28 PM, Jan Vondra wrote: > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > napsal: >> >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: >>> Dear Kolla team, >>> >>> during project for one of our customers we have upgraded debian part >>> of kolla project using a queens debian repositories >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and >>> we would like to share this work with community. >> >> Thanks for doing that. Is there an option to provide arm64 packages next >> time? >> > > It's more of a question for OpenStack Debian Team - namely Thomas > Goirand who creates this repo. 
We do not produce arm64 backports for Stretch, because I don't have access to an arm64 instance to build packages. It would also be a lot of manual work, and I'm not sure I would have the time for it. However, I could help you setting that up, it's not very hard. Also, Debian official (ie: Sid, Buster) has arm64 packages, and it will be in Buster. >>> I would like to ask what's the proper process of contributing since >>> the patches affects both kolla and kolla-ansible repositories. >> >> Send patches for review [1] and then we can discuss about changing them. >> Remember that we target Stein now. >> >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing >> > > Thank you for help (and others with updated links) - I will upload > patches today or tomorrow. If you want to contribute to the Debian packaging, it's done in the Gitlab instance of Debian: https://salsa.debian.org/openstack-team Contributors are very much welcome! >>> Also any other comments regarding debian in kolla would be appriciated. >> >> Love to see someone else caring about Debian in Kolla. I took it over >> two years ago, revived and moved to 'stretch'. But skipped support for >> binary packages as there were no up-to-date packages available. >> >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' >> as it will enter final freeze. Had some discussion with Debian OpenStack >> team about providing preliminary Stein packages so support for 'binary' >> type of images could be possible. > > I suppose that switch from Queens to Stein would be quite easy since > all packages in Queens in Debian are Python 3 only so the most of the > work has been already done. Why not switching to Rocky right now, which has been the most tested? Cheers, Thomas Goirand (zigo) From brenski at mirantis.com Tue Jan 8 17:10:56 2019 From: brenski at mirantis.com (Boris Renski) Date: Tue, 8 Jan 2019 09:10:56 -0800 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift Message-ID: Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: - We have new look and feel at stackalytics.com - We did away with DriverLog and Member Directory , which were not very actively used or maintained. Those are still available via direct links, but not in the men on the top - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 11:10:33 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 12:10:33 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: hi Ignazio. the wait condition failed too on your simple stack. Any other idea where to look at? Cheers [image: image.png] On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca wrote: > thanks Ignazio. Appreciated > > On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano > wrote: > >> Alfredo, attached herer there is an example: >> >> substitute the image with your image, flavor with your flavor, key_name >> with your key and network with your network >> >> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> thanks Ignazio. 
Do you have a quick example for that? pls... >>> >>> Cheers >>> >>> >>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Yes, I presume something goes wrong in heat wait condition >>>> authorization how reported by Alfredo. >>>> So I suggested to try a simple heat stack with a wait condition. >>>> Cheers >>>> >>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Sean. Thanks for that. >>>>> Do you have any idea about this error? @Ignazio Cassano >>>>> I will try also on IRC and I am looking >>>>> on internet a lot. >>>>> Next step also I will try to create a simple stack to see if it works >>>>> fine. >>>>> Cheers >>>>> >>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney wrote: >>>>> >>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>> the way. >>>>>> i keep clicking on it to get the context and realising i can help. >>>>>> >>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>> > Alfredo, you could make another test searching on internet as >>>>>> simple heat stack example with wait conditions inside >>>>>> > for checking if heat wait conditions work fine. >>>>>> > Cheers >>>>>> > >>>>>> >>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 40109 bytes Desc: not available URL: From jan.vondra at ultimum.io Wed Jan 9 13:28:38 2019 From: jan.vondra at ultimum.io (Jan Vondra) Date: Wed, 9 Jan 2019 14:28:38 +0100 Subject: [Kolla] Queens for debian images In-Reply-To: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> References: <576004de-5ad4-6b51-31da-d7173df41b47@debian.org> Message-ID: st 9. 1. 2019 v 14:08 odesílatel Thomas Goirand napsal: > > Hi, > > On 1/9/19 12:28 PM, Jan Vondra wrote: > > út 8. 1. 2019 v 12:00 odesílatel Marcin Juszkiewicz > > napsal: > >> > >> W dniu 08.01.2019 o 11:08, Jan Vondra pisze: > >>> Dear Kolla team, > >>> > >>> during project for one of our customers we have upgraded debian part > >>> of kolla project using a queens debian repositories > >>> (http://stretch-queens.debian.net/debian stretch-queens-backports) and > >>> we would like to share this work with community. > >> > >> Thanks for doing that. Is there an option to provide arm64 packages next > >> time? > >> > > > > It's more of a question for OpenStack Debian Team - namely Thomas > > Goirand who creates this repo. > > We do not produce arm64 backports for Stretch, because I don't have > access to an arm64 instance to build packages. It would also be a lot of > manual work, and I'm not sure I would have the time for it. However, I > could help you setting that up, it's not very hard. Also, Debian > official (ie: Sid, Buster) has arm64 packages, and it will be in Buster. > > >>> I would like to ask what's the proper process of contributing since > >>> the patches affects both kolla and kolla-ansible repositories. > >> > >> Send patches for review [1] and then we can discuss about changing them. > >> Remember that we target Stein now. > >> > >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > >> > > > > Thank you for help (and others with updated links) - I will upload > > patches today or tomorrow. 
> > If you want to contribute to the Debian packaging, it's done in the > Gitlab instance of Debian: > > https://salsa.debian.org/openstack-team > > Contributors are very much welcome! > Well, it's Michals (kevko) responsibility in our company :) > >>> Also any other comments regarding debian in kolla would be appriciated. > >> > >> Love to see someone else caring about Debian in Kolla. I took it over > >> two years ago, revived and moved to 'stretch'. But skipped support for > >> binary packages as there were no up-to-date packages available. > >> > >> In next 2-4 months I plan to migrate Kolla 'master' to Debian 'buster' > >> as it will enter final freeze. Had some discussion with Debian OpenStack > >> team about providing preliminary Stein packages so support for 'binary' > >> type of images could be possible. > > > > I suppose that switch from Queens to Stein would be quite easy since > > all packages in Queens in Debian are Python 3 only so the most of the > > work has been already done. > > Why not switching to Rocky right now, which has been the most tested? The main point is that we have Queens deployed for customers and we have to support it. Still there is planned upgrade to Rocky in few months and probably to Stein sometime... Jan Vondra From alfredo.deluca at gmail.com Wed Jan 9 13:31:46 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 14:31:46 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. the CLI on stack and so on work just fine. root at aio1:~# os stack list +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ | ID | Stack Name | Project | Stack Status | Creation Time | Updated Time | +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ | 7a9a37d4-e2a6-4187-beae-5c5e03d66839 | freddy | f19b0141f7b240ed85f3cb02703a86a5 | CREATE_FAILED | 2019-01-09T11:49:02Z | None | +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ On Wed, Jan 9, 2019 at 1:23 PM Ignazio Cassano wrote: > Alfredo, try to source you openstack environment variable file: > > source admin-openrc > > Then try to execute the following commands > > heat stack-list > > openstack stack list > > > Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> ...more found in logs >> >> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >> Domain admin client authentication failed: Unauthorized: The request you >> have made requires authentication. (HTTP 401) (Request-ID: >> req-28f9873a-5627-4f2e-9e19-da6c63753383) >> >> So what authentication it need? >> >> >> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca >> wrote: >> >>> hi Ignazio. >>> the wait condition failed too on your simple stack. >>> Any other idea where to look at? >>> >>> Cheers >>> >>> [image: image.png] >>> >>> >>> >>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> thanks Ignazio. 
Appreciated >>>> >>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Alfredo, attached herer there is an example: >>>>> >>>>> substitute the image with your image, flavor with your flavor, >>>>> key_name with your key and network with your network >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>> >>>>>> Cheers >>>>>> >>>>>> >>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>> authorization how reported by Alfredo. >>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>> Cheers >>>>>>> >>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Sean. Thanks for that. >>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>> I will try also on IRC and I am >>>>>>>> looking on internet a lot. >>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>> works fine. >>>>>>>> Cheers >>>>>>>> >>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>> wrote: >>>>>>>> >>>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>>> the way. >>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>> >>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>> > for checking if heat wait conditions work fine. >>>>>>>>> > Cheers >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Alfredo* >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Wed Jan 9 13:32:46 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Wed, 9 Jan 2019 14:32:46 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: I can create instances and volumes with no issue. it seems to be maybe keystone I think..... On Wed, Jan 9, 2019 at 1:18 PM Ignazio Cassano wrote: > My suggestion was only to verify that the problem is not magnum but > heat/keystone. > Heat stacks generated by magnum use wait conditions. > I am not so expert of heat, but I am sure it uses keystone. > So you have some issues or in keystone or in heat configuration. > Are you able to create instances/volumes on your openstack ? > If yes keystone can be ok. > > Probably Sean could help !!! > > Cheers > Ignazio > > > Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> ...more found in logs >> >> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >> Domain admin client authentication failed: Unauthorized: The request you >> have made requires authentication. (HTTP 401) (Request-ID: >> req-28f9873a-5627-4f2e-9e19-da6c63753383) >> >> So what authentication it need? >> >> >> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca >> wrote: >> >>> hi Ignazio. 
>>> the wait condition failed too on your simple stack. >>> Any other idea where to look at? >>> >>> Cheers >>> >>> [image: image.png] >>> >>> >>> >>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> thanks Ignazio. Appreciated >>>> >>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Alfredo, attached herer there is an example: >>>>> >>>>> substitute the image with your image, flavor with your flavor, >>>>> key_name with your key and network with your network >>>>> >>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>> >>>>>> Cheers >>>>>> >>>>>> >>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>> authorization how reported by Alfredo. >>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>> Cheers >>>>>>> >>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Sean. Thanks for that. >>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>> I will try also on IRC and I am >>>>>>>> looking on internet a lot. >>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>> works fine. >>>>>>>> Cheers >>>>>>>> >>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>> wrote: >>>>>>>> >>>>>>>>> can we tag this conversation with [heat][magnum] in the subject by >>>>>>>>> the way. >>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>> >>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>> > for checking if heat wait conditions work fine. >>>>>>>>> > Cheers >>>>>>>>> > >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Alfredo* >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ifatafekn at gmail.com Wed Jan 9 13:37:24 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Wed, 9 Jan 2019 15:37:24 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core Message-ID: Hi, I would like to nominate Ivan Kolodyazhny for Vitrage core. Ivan has been contributing to Vitrage for a while now. He has focused on upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, and during this time gained a lot of knowledge and experience with Vitrage code base. I believe he would make a great addition to our team. Thanks, Ifat. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed Jan 9 13:37:32 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 Jan 2019 14:37:32 +0100 Subject: [heat][magnum] openstack stack fails In-Reply-To: References: Message-ID: I think you need to verify your /etc/heat/heat.conf with the documentation related to openstack version you are using, Simple heat stack without wait conditions works fine ??? 
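If a plain stack works but the wait condition does not, the first thing I would compare against the install guide is the stack domain / trustee part of /etc/heat/heat.conf. A minimal sketch (the option names are the standard heat ones, but the values below are placeholders and depend on how openstack-ansible configured your deployment):

    [DEFAULT]
    stack_domain_admin = heat_domain_admin
    stack_domain_admin_password = HEAT_DOMAIN_ADMIN_PASS
    stack_user_domain_name = heat

    [trustee]
    auth_type = password
    auth_url = http://YOUR_KEYSTONE:5000/v3
    username = heat
    password = HEAT_PASS
    user_domain_name = Default

The "Domain admin client authentication failed" error usually means the stack_domain_admin user or its password does not match what actually exists in keystone, so verifying that user with the openstack CLI is a quick check.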
Il giorno mer 9 gen 2019 alle ore 14:31 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Ignazio. > the CLI on stack and so on work just fine. > root at aio1:~# os stack list > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > | ID | Stack Name | Project > | Stack Status | Creation Time | Updated Time | > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > | 7a9a37d4-e2a6-4187-beae-5c5e03d66839 | freddy | > f19b0141f7b240ed85f3cb02703a86a5 | CREATE_FAILED | 2019-01-09T11:49:02Z | > None | > > +--------------------------------------+------------+----------------------------------+---------------+----------------------+--------------+ > > > On Wed, Jan 9, 2019 at 1:23 PM Ignazio Cassano > wrote: > >> Alfredo, try to source you openstack environment variable file: >> >> source admin-openrc >> >> Then try to execute the following commands >> >> heat stack-list >> >> openstack stack list >> >> >> Il giorno mer 9 gen 2019 alle ore 12:33 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> ...more found in logs >>> >>> 2019-01-09 12:12:59.879 178 ERROR heat.engine.clients.keystoneclient >>> [req-3e2f3b5c-bd4c-4394-8e33-de1299169900 admin admin - default default] >>> Domain admin client authentication failed: Unauthorized: The request you >>> have made requires authentication. (HTTP 401) (Request-ID: >>> req-28f9873a-5627-4f2e-9e19-da6c63753383) >>> >>> So what authentication it need? >>> >>> >>> On Wed, Jan 9, 2019 at 12:10 PM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> hi Ignazio. >>>> the wait condition failed too on your simple stack. >>>> Any other idea where to look at? >>>> >>>> Cheers >>>> >>>> [image: image.png] >>>> >>>> >>>> >>>> On Wed, Jan 9, 2019 at 11:43 AM Alfredo De Luca < >>>> alfredo.deluca at gmail.com> wrote: >>>> >>>>> thanks Ignazio. Appreciated >>>>> >>>>> On Wed, Jan 9, 2019 at 11:25 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Alfredo, attached herer there is an example: >>>>>> >>>>>> substitute the image with your image, flavor with your flavor, >>>>>> key_name with your key and network with your network >>>>>> >>>>>> Il giorno mer 9 gen 2019 alle ore 11:21 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> thanks Ignazio. Do you have a quick example for that? pls... >>>>>>> >>>>>>> Cheers >>>>>>> >>>>>>> >>>>>>> On Wed, Jan 9, 2019 at 11:08 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Yes, I presume something goes wrong in heat wait condition >>>>>>>> authorization how reported by Alfredo. >>>>>>>> So I suggested to try a simple heat stack with a wait condition. >>>>>>>> Cheers >>>>>>>> >>>>>>>> Il giorno mer 9 gen 2019 alle ore 10:40 Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Sean. Thanks for that. >>>>>>>>> Do you have any idea about this error? @Ignazio Cassano >>>>>>>>> I will try also on IRC and I am >>>>>>>>> looking on internet a lot. >>>>>>>>> Next step also I will try to create a simple stack to see if it >>>>>>>>> works fine. >>>>>>>>> Cheers >>>>>>>>> >>>>>>>>> On Wed, Jan 9, 2019 at 9:52 AM Sean Mooney >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> can we tag this conversation with [heat][magnum] in the subject >>>>>>>>>> by the way. 
>>>>>>>>>> i keep clicking on it to get the context and realising i can help. >>>>>>>>>> >>>>>>>>>> On Wed, 2019-01-09 at 09:08 +0100, Ignazio Cassano wrote: >>>>>>>>>> > Alfredo, you could make another test searching on internet as >>>>>>>>>> simple heat stack example with wait conditions inside >>>>>>>>>> > for checking if heat wait conditions work fine. >>>>>>>>>> > Cheers >>>>>>>>>> > >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> *Alfredo* >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eyalb1 at gmail.com Wed Jan 9 13:49:51 2019 From: eyalb1 at gmail.com (Eyal B) Date: Wed, 9 Jan 2019 15:49:51 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core In-Reply-To: References: Message-ID: +1 On Wed, Jan 9, 2019, 15:42 Ifat Afek Hi, > > > I would like to nominate Ivan Kolodyazhny for Vitrage core. > > Ivan has been contributing to Vitrage for a while now. He has focused on > upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, > and during this time gained a lot of knowledge and experience with Vitrage > code base. I believe he would make a great addition to our team. > > > Thanks, > > Ifat. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed Jan 9 13:52:41 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 9 Jan 2019 13:52:41 +0000 Subject: [Kolla] Queens for debian images In-Reply-To: <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> References: <20190108140026.p4462df5otnyizm2@yuggoth.org> <6e3c0328-c544-a4dd-32e5-d7e45193a4a7@linaro.org> Message-ID: <20190109135241.a6mfpgupedylgfws@yuggoth.org> On 2019-01-09 09:57:06 +0100 (+0100), Marcin Juszkiewicz wrote: > W dniu 08.01.2019 o 15:00, Jeremy Stanley pisze: > >> 1. https://wiki.openstack.org/wiki/How_To_Contribute#Reviewing > > [...] > > > > These days it's probably better to recommend > > https://docs.openstack.org/contributors/ since I expect we're about > > ready to retire that old wiki page. > > Then I hope that someone will take care of SEO and redirects. Link I > gave was first link from "openstack contributing" google search. Yes, I believe Kendall's plan there is to replace all (or most of) the content in that article with a link to the contributor guide. We just wanted to be sure everything it mentions is covered in that newer and more durable document before replacing it. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Wed Jan 9 14:21:46 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 09:21:46 -0500 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: Ignazio Cassano writes: > Hello, > last week I talked on #openstack-containers IRC about important patches for > magnum reported here: > https://review.openstack.org/#/c/577477/ > > I'd like to know when the above will be backported on queens and if centos7 > and ubuntu packages > will be upgraded with them. > Any roadmap ? > I would go on with magnum testing on queens because I am going to upgrade > from ocata to pike and from pike to queens. 
> > At this time I have aproduction environment on ocata and a testing > environment on queens. > > Best Regards > Ignazio You can submit those backports yourself, either through the gerrit web UI or by manually creating the patches locally using git commands. There are more details on processes and tools for doing this in the stable maintenance section of the project team guide [1]. As far as when those changes might end up in packages, the community doesn't really have much insight into (or influence over) what stable patches are pulled down by the distributors or how they schedule their updates and releases. So I recommend talking to the folks who prepare the distribution(s) you're interested in, after the backport patches are approved. [1] https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes -- Doug From juliaashleykreger at gmail.com Wed Jan 9 14:47:58 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 9 Jan 2019 06:47:58 -0800 Subject: Ironic ibmc driver for Huawei server In-Reply-To: References: Message-ID: Ironic does not have a deadline for merging specs. We will generally avoid landing large features the closer we get to the end of the cycle. If third party CI is up before the end of the cycle, I suspect it would just be a matter of iterating the driver code through review. You may wish to propose it sooner rather than later, and we can begin to give you feedback from there. -Julia On Tue, Jan 8, 2019 at 11:21 PM xmufive at qq.com wrote: > > Hi Julia, > > When is the deadline of approving specs, I am afraid that huawei ibmc spec will be put off util next release. > > Thanks > Qianbiao NG > > > ------------------ 原始邮件 ------------------ > 发件人: "Julia Kreger"; > 发送时间: 2019年1月9日(星期三) 凌晨2:26 > 收件人: "xmufive at qq.com"; > 抄送: "openstack-discuss"; > 主题: Re: Ironic ibmc driver for Huawei server > > Greetings Qianbiao.NG, > > Welcome to Ironic! > > The purpose and requirement of Third Party CI is to test drivers are > in working order with the current state of the code in Ironic and help > prevent the community from accidentally breaking an in-tree vendor > driver. Vendors do this by providing one or more physical systems in a > pool of hardware that is managed by a Zuul v3 or Jenkins installation > which installs ironic (typically in a virtual machine), and configures > it to perform a deployment upon the physical bare metal node. Upon > failure or successful completion of the test, the results are posted > back to OpenStack Gerrit. > > Ultimately this helps provide the community and the vendor with a > level of assurance in what is released by the ironic community. The > cinder project has a similar policy and I'll email you directly with > the contacts at Huawei that work with the Cinder community, as they > would be familiar with many of the aspects of operating third party > CI. > > You can find additional information here on the requirement and the > reasoning behind it: > > https://specs.openstack.org/openstack/ironic-specs/specs/approved/third-party-ci.html > > We may also be able to put you in touch with some vendors that have > recently worked on implementing third-party CI. I'm presently > inquiring with others if that will be possible. If you are able to > join Internet Relay Chat, our IRC channel (#openstack-ironic) has > several individual who have experience setting up and maintaining > third-party CI for ironic. 
> > Thanks, > > -Julia > > On Tue, Jan 8, 2019 at 8:54 AM xmufive at qq.com wrote: > > > > Hi julia, > > > > According to the comment of story, > > 1. The spec for huawei ibmc drvier has been post here: https://storyboard.openstack.org/#!/story/2004635 , waiting for review. > > 2. About the third-party CI part, we provide mocked unittests for our driver's code. Not sure what third-party CI works for in this case. What else we should do? > > > > Thanks > > Qianbiao.NG From dh3 at sanger.ac.uk Wed Jan 9 15:13:29 2019 From: dh3 at sanger.ac.uk (Dave Holland) Date: Wed, 9 Jan 2019 15:13:29 +0000 Subject: [cinder] volume encryption performance impact Message-ID: <20190109151329.GA7953@sanger.ac.uk> Hello, I've just started investigating Cinder volume encryption using Queens (RHOSP13) with a Ceph/RBD backend and the performance overhead is... surprising. Some naive bonnie++ numbers, comparing a plain vs encrypted volume: plain: write 1400MB/s, read 390MB/s encrypted: write 81MB/s, read 83MB/s The encryption was configured with: openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 Does anyone have a similar setup, and can share their performance figures, or give me an idea of what percentage performance impact I should expect? Alternatively: is AES256 overkill, or, where should I start looking for a misconfiguration or bottleneck? Thanks in advance. Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From Arkady.Kanevsky at dell.com Wed Jan 9 15:20:15 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 9 Jan 2019 15:20:15 +0000 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: References: Message-ID: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> Thanks Boris. Do we still use DriverLog for marketplace driver status updates? Thanks, Arkady From: Boris Renski Sent: Tuesday, January 8, 2019 11:11 AM To: openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; David Stoltenberg Subject: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: * We have new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the men on the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Wed Jan 9 15:53:04 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Wed, 9 Jan 2019 17:53:04 +0200 Subject: [cinder] Proposing new Core Members ... 
In-Reply-To: <20190108223535.GA29520@sm-workstation> References: <7f844f7b-d78e-ca33-b2bb-0244d4f1e3d7@gmail.com> <20190108223535.GA29520@sm-workstation> Message-ID: +1! Welcome to the team, guys! Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Wed, Jan 9, 2019 at 12:36 AM Sean McGinnis wrote: > On Tue, Jan 08, 2019 at 04:00:14PM -0600, Jay Bryant wrote: > > Team, > > > > I would like propose two people who have been taking a more active role > in > > Cinder reviews as Core Team Members: > > > > > > > > I think that both Rajat and Yikun will be welcome additions to help > replace > > the cores that have recently been removed. > > > > +1 from me. Both have been doing a good job giving constructive feedback on > reviews and have been spending some time reviewing code other than their > own > direct interests, so I think they would be welcome additions. > > Sean > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Wed Jan 9 16:02:57 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 9 Jan 2019 09:02:57 -0700 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: <20190109061109.GA24618@fedora19.localdomain> References: <20190109061109.GA24618@fedora19.localdomain> Message-ID: On Tue, Jan 8, 2019 at 11:15 PM Ian Wienand wrote: > > Hello, > > Just a heads-up; with Fedora 29 the legacy networking setup was moved > into a separate, not-installed-by-default network-scripts package. > This has prompted us to finally move to managing interfaces on our > Fedora and CentOS CI hosts with NetworkManager (see [1]) > > Support for this is enabled with features added in glean 1.13.0 and > diskimage-builder 1.19.0. > > The newly created Fedora 29 nodes [2] will have it enabled, and [3] > will switch CentOS nodes shortly. This is tested by our nodepool jobs > which build images, upload them into devstack and boot them, and then > check the networking [4]. > Don't suppose we could try this with tripleo jobs prior to cutting them all over could we? We don't use NetworkManager and infact os-net-config doesn't currently support NetworkManager. I don't think it'll cause problems, but I'd like to have some test prior to cutting them all over. Thanks, -Alex > I don't really expect any problems, but be aware NetworkManager > packages will appear on the CentOS 7 and Fedora base images with these > changes. > > Thanks > > -i > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1643763#c2 > [2] https://review.openstack.org/618672 > [3] https://review.openstack.org/619960 > [4] https://review.openstack.org/618671 > From mthode at mthode.org Wed Jan 9 16:54:35 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 9 Jan 2019 10:54:35 -0600 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: <20190109165435.jpmcgxmktabjplps@mthode.org> On 19-01-09 15:13:29, Dave Holland wrote: > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. 
Some naive bonnie++ numbers, comparing a plain vs encrypted > volume: > > plain: write 1400MB/s, read 390MB/s > encrypted: write 81MB/s, read 83MB/s > > The encryption was configured with: > > openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 > > Does anyone have a similar setup, and can share their performance > figures, or give me an idea of what percentage performance impact I > should expect? Alternatively: is AES256 overkill, or, where should I > start looking for a misconfiguration or bottleneck? > I haven't tested yet, but that doesn't sound right, it sounds like it's not using aes-ni (or tha amd equiv). 256 may be higher than is needed (256 aes has some attacks that 128 does not iirc as well) but should drop perf that much unless it's dropping back to sofware. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From balazs.gibizer at ericsson.com Wed Jan 9 16:56:03 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Wed, 9 Jan 2019 16:56:03 +0000 Subject: [nova] review guide for the bandwidth patches In-Reply-To: <1547029853.1128.0@smtp.office365.com> References: <1545231821.28650.2@smtp.office365.com> <1545316423.10879.3@smtp.office365.com> <1545992000.14055.0@smtp.office365.com> <4dee6ea6-0fe6-77c8-45d1-14638638f3ee@gmail.com> <1546865551.29530.0@smtp.office365.com> <1547029853.1128.0@smtp.office365.com> Message-ID: <1547052955.1128.1@smtp.office365.com> On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer wrote: > > > On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer > wrote: >> >> >>> But, let's chat more about it via a hangout the week after next >>> (week >>> of January 14 when Matt is back), as suggested in #openstack-nova >>> today. We'll be able to have a high-bandwidth discussion then and >>> agree on a decision on how to move forward with this. >> >> Thank you all for the discussion. I agree to have a real-time >> discussion about the way forward. >> >> Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a >> hangouts[2]? > It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for the team. So I'm moving the call there. Cheers, gibi [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000 From mark at stackhpc.com Wed Jan 9 17:08:47 2019 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 9 Jan 2019 17:08:47 +0000 Subject: [kayobe] IRC meetings In-Reply-To: References: Message-ID: Thanks to everyone who replied. There was a tie, so let's go with every other Monday at 14:00 UTC. First meeting will be Monday 21st January in #openstack-kayobe. Mark On Thu, 29 Nov 2018 at 13:07, Mark Goddard wrote: > Hi, > > The community has requested that we start holding regular IRC meetings for > kayobe, and I agree. I suggest we start with meeting every other week, in > the #openstack-kayobe channel. > > I've created a Doodle poll [1] with each hour between 2pm and 6pm UTC > available every weekday. Please respond on the poll with your availability. > > Thanks, > Mark > > [1] https://doodle.com/poll/6di3pddsahg6h66k > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Arne.Wiebalck at cern.ch Wed Jan 9 17:45:25 2019 From: Arne.Wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 9 Jan 2019 17:45:25 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: Hi Dave, With the same key length and backend, we’ve done some quick checks at the time, but did not notice any significant performance impact (beyond a slight CPU increase). We did not test beyond the QoS limits we apply, though. Cheers, Arne > On 9 Jan 2019, at 16:13, Dave Holland wrote: > > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. Some naive bonnie++ numbers, comparing a plain vs encrypted > volume: > > plain: write 1400MB/s, read 390MB/s > encrypted: write 81MB/s, read 83MB/s > > The encryption was configured with: > > openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 > > Does anyone have a similar setup, and can share their performance > figures, or give me an idea of what percentage performance impact I > should expect? Alternatively: is AES256 overkill, or, where should I > start looking for a misconfiguration or bottleneck? > > Thanks in advance. > > Dave > -- > ** Dave Holland ** Systems Support -- Informatics Systems Group ** > ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** > > > -- > The Wellcome Sanger Institute is operated by Genome Research > Limited, a charity registered in England with number 1021457 and a > company registered in England with number 2742969, whose registered > office is 215 Euston Road, London, NW1 2BE. > -- Arne Wiebalck CERN IT From pierre at stackhpc.com Wed Jan 9 18:21:00 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 9 Jan 2019 18:21:00 +0000 Subject: [blazar] Nominating Tetsuro Nakamura for blazar-core Message-ID: Hello, I would like to nominate Tetsuro Nakamura for membership in the blazar-core team. Tetsuro started contributing to Blazar last summer. He has been contributing great code for integrating Blazar with placement and participating actively in the project. He is also providing good feedback to the rest of the contributors via code review, including on code not related to placement. He would make a great addition to the core team. Unless there are objections, I will add him to the core team in a week's time. Pierre From tbechtold at suse.com Wed Jan 9 19:40:12 2019 From: tbechtold at suse.com (Thomas Bechtold) Date: Wed, 9 Jan 2019 20:40:12 +0100 Subject: [rpm-packaging] Proposing new core member Message-ID: Hi, I would like to nominate Colleen Murphy for rpm-packaging core. Colleen has be active in doing very valuable reviews since some time so I feel she would be a great addition to the team. Please give your +1/-1 in the next days. Cheers, Tom From doug at doughellmann.com Wed Jan 9 19:57:09 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 14:57:09 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: Doug Hellmann writes: > Doug Hellmann writes: > >> Today devstack requires each project to explicitly indicate that it can >> be installed under python 3, even when devstack itself is running with >> python 3 enabled. 
>> >> As part of the python3-first goal, I have proposed a change to devstack >> to modify that behavior [1]. With the change in place, when devstack >> runs with python3 enabled all services are installed under python 3, >> unless explicitly listed as not supporting python 3. >> >> If your project has a devstack plugin or runs integration or functional >> test jobs that use devstack, please test your project with the patch >> (you can submit a trivial change to your project and use Depends-On to >> pull in the devstack change). >> >> [1] https://review.openstack.org/#/c/622415/ >> -- >> Doug >> > > We have had a few +1 votes on the patch above with comments that > indicate at least a couple of projects have taken the time to test and > verify that things won't break for them with the change. > > Are we ready to proceed with merging the change? > > -- > Doug > The patch mentioned above that changes the default version of Python in devstack to 3 by default has merged. If this triggers a failure in your devstack-based jobs, you can use the disable_python3_package function to add your package to the list *not* installed using Python 3 until the fixes are available. -- Doug From skaplons at redhat.com Wed Jan 9 20:22:26 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 9 Jan 2019 21:22:26 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: <1A260E8F-8EC1-4701-BC10-759457F174DA@redhat.com> Hi, Just to be sure, does it mean that we don’t need to add USE_PYTHON3=True to run job on python3, right? — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Doug Hellmann w dniu 09.01.2019, o godz. 20:57: > > Doug Hellmann writes: > >> Doug Hellmann writes: >> >>> Today devstack requires each project to explicitly indicate that it can >>> be installed under python 3, even when devstack itself is running with >>> python 3 enabled. >>> >>> As part of the python3-first goal, I have proposed a change to devstack >>> to modify that behavior [1]. With the change in place, when devstack >>> runs with python3 enabled all services are installed under python 3, >>> unless explicitly listed as not supporting python 3. >>> >>> If your project has a devstack plugin or runs integration or functional >>> test jobs that use devstack, please test your project with the patch >>> (you can submit a trivial change to your project and use Depends-On to >>> pull in the devstack change). >>> >>> [1] https://review.openstack.org/#/c/622415/ >>> -- >>> Doug >>> >> >> We have had a few +1 votes on the patch above with comments that >> indicate at least a couple of projects have taken the time to test and >> verify that things won't break for them with the change. >> >> Are we ready to proceed with merging the change? >> >> -- >> Doug >> > > The patch mentioned above that changes the default version of Python in > devstack to 3 by default has merged. If this triggers a failure in your > devstack-based jobs, you can use the disable_python3_package function to > add your package to the list *not* installed using Python 3 until the > fixes are available. 
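A minimal local.conf sketch of that opt-out, assuming the helper can be
called from the localrc section like the other enable/disable helpers,
and with "myservice" as a placeholder for the not-yet-ported package:

    [[local|localrc]]
    USE_PYTHON3=True
    # keep the not-yet-ported package on python 2 until its fixes land
    disable_python3_package myservice
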
> > -- > Doug > From ltoscano at redhat.com Wed Jan 9 20:45:47 2019 From: ltoscano at redhat.com (Luigi Toscano) Date: Wed, 09 Jan 2019 21:45:47 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: Message-ID: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: > Doug Hellmann writes: > > Doug Hellmann writes: > >> Today devstack requires each project to explicitly indicate that it can > >> be installed under python 3, even when devstack itself is running with > >> python 3 enabled. > >> > >> As part of the python3-first goal, I have proposed a change to devstack > >> to modify that behavior [1]. With the change in place, when devstack > >> runs with python3 enabled all services are installed under python 3, > >> unless explicitly listed as not supporting python 3. > >> > >> If your project has a devstack plugin or runs integration or functional > >> test jobs that use devstack, please test your project with the patch > >> (you can submit a trivial change to your project and use Depends-On to > >> pull in the devstack change). > >> > >> [1] https://review.openstack.org/#/c/622415/ > > > > We have had a few +1 votes on the patch above with comments that > > indicate at least a couple of projects have taken the time to test and > > verify that things won't break for them with the change. > > > > Are we ready to proceed with merging the change? > > The patch mentioned above that changes the default version of Python in > devstack to 3 by default has merged. If this triggers a failure in your > devstack-based jobs, you can use the disable_python3_package function to > add your package to the list *not* installed using Python 3 until the > fixes are available. Isn't the purpose of the patch to make sure that all services are installed using Python 3 when Python 3 is enabled, but that we still need to set USE_PYTHON3=True? Ciao -- Luigi From doug at doughellmann.com Wed Jan 9 20:55:29 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 09 Jan 2019 15:55:29 -0500 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> References: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> Message-ID: Luigi Toscano writes: > On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: >> Doug Hellmann writes: >> > Doug Hellmann writes: >> >> Today devstack requires each project to explicitly indicate that it can >> >> be installed under python 3, even when devstack itself is running with >> >> python 3 enabled. >> >> >> >> As part of the python3-first goal, I have proposed a change to devstack >> >> to modify that behavior [1]. With the change in place, when devstack >> >> runs with python3 enabled all services are installed under python 3, >> >> unless explicitly listed as not supporting python 3. >> >> >> >> If your project has a devstack plugin or runs integration or functional >> >> test jobs that use devstack, please test your project with the patch >> >> (you can submit a trivial change to your project and use Depends-On to >> >> pull in the devstack change). >> >> >> >> [1] https://review.openstack.org/#/c/622415/ >> > >> > We have had a few +1 votes on the patch above with comments that >> > indicate at least a couple of projects have taken the time to test and >> > verify that things won't break for them with the change. 
>> > >> > Are we ready to proceed with merging the change? >> >> The patch mentioned above that changes the default version of Python in >> devstack to 3 by default has merged. If this triggers a failure in your >> devstack-based jobs, you can use the disable_python3_package function to >> add your package to the list *not* installed using Python 3 until the >> fixes are available. > > Isn't the purpose of the patch to make sure that all services are installed > using Python 3 when Python 3 is enabled, but that we still need to set > USE_PYTHON3=True? > > Ciao > -- > Luigi > > > It is still necessary to set USE_PYTHON3=True in the job. The logic enabled by that flag used to *also* require each service to individually enable python 3 testing. Now that is no longer true, and services must explicitly *disable* python 3 if it is not supported. The USE_PYTHON3 flag allows us to have 2 jobs, with devstack running under python 2 and python 3. When we drop python 2 support, we can drop the USE_PYTHON3 flag from devstack and always run under python 3. (We could do that before we drop support for 2, but we would have to modify a lot of job configurations and I'm not sure it buys us much given the amount of effort involved there.) -- Doug From skaplons at redhat.com Wed Jan 9 21:11:29 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 9 Jan 2019 22:11:29 +0100 Subject: [dev][goal][python3][qa][devstack][ptl] changing devstack's python 3 behavior In-Reply-To: References: <1880193.M4ppHeEpuz@whitebase.usersys.redhat.com> Message-ID: <9DF55F17-EE56-4237-874E-70DE05A1722A@redhat.com> Hi, Thx for clarification Doug. — Slawek Kaplonski Senior software engineer Red Hat > Wiadomość napisana przez Doug Hellmann w dniu 09.01.2019, o godz. 21:55: > > Luigi Toscano writes: > >> On Wednesday, 9 January 2019 20:57:09 CET Doug Hellmann wrote: >>> Doug Hellmann writes: >>>> Doug Hellmann writes: >>>>> Today devstack requires each project to explicitly indicate that it can >>>>> be installed under python 3, even when devstack itself is running with >>>>> python 3 enabled. >>>>> >>>>> As part of the python3-first goal, I have proposed a change to devstack >>>>> to modify that behavior [1]. With the change in place, when devstack >>>>> runs with python3 enabled all services are installed under python 3, >>>>> unless explicitly listed as not supporting python 3. >>>>> >>>>> If your project has a devstack plugin or runs integration or functional >>>>> test jobs that use devstack, please test your project with the patch >>>>> (you can submit a trivial change to your project and use Depends-On to >>>>> pull in the devstack change). >>>>> >>>>> [1] https://review.openstack.org/#/c/622415/ >>>> >>>> We have had a few +1 votes on the patch above with comments that >>>> indicate at least a couple of projects have taken the time to test and >>>> verify that things won't break for them with the change. >>>> >>>> Are we ready to proceed with merging the change? >>> >>> The patch mentioned above that changes the default version of Python in >>> devstack to 3 by default has merged. If this triggers a failure in your >>> devstack-based jobs, you can use the disable_python3_package function to >>> add your package to the list *not* installed using Python 3 until the >>> fixes are available. >> >> Isn't the purpose of the patch to make sure that all services are installed >> using Python 3 when Python 3 is enabled, but that we still need to set >> USE_PYTHON3=True? 
>> >> Ciao >> -- >> Luigi >> >> >> > > It is still necessary to set USE_PYTHON3=True in the job. The logic > enabled by that flag used to *also* require each service to individually > enable python 3 testing. Now that is no longer true, and services must > explicitly *disable* python 3 if it is not supported. > > The USE_PYTHON3 flag allows us to have 2 jobs, with devstack running > under python 2 and python 3. When we drop python 2 support, we can drop > the USE_PYTHON3 flag from devstack and always run under python 3. (We > could do that before we drop support for 2, but we would have to modify > a lot of job configurations and I'm not sure it buys us much given the > amount of effort involved there.) > > -- > Doug > From dirk at dmllr.de Wed Jan 9 22:03:41 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 9 Jan 2019 23:03:41 +0100 Subject: [rpm-packaging] Proposing new core member In-Reply-To: References: Message-ID: Am Mi., 9. Jan. 2019 um 20:41 Uhr schrieb Thomas Bechtold : > Please give your +1/-1 in the next days. +1, well said, happy to have her increase our (too) small core reviewer team! Greetings, Dirk From miguel at mlavalle.com Wed Jan 9 22:07:50 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Wed, 9 Jan 2019 16:07:50 -0600 Subject: [openstack-dev] [neutron] Cancelling the L3 meeting on January 10th Message-ID: Dear Neutron team, Several members of the L3 sub-team are attending internal company meetings, have medical appointments or have other personal commitments at the time of the weekly meeting on January 10th. As a consequence, we are cancelling this meeting and will resume on the 17th. Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Wed Jan 9 23:11:55 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 9 Jan 2019 17:11:55 -0600 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> Message-ID: On 12/20/18 4:41 AM, Herve Beraud wrote: > > > Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong > > a écrit : > > Hi Ben, > > I am apology that in last month we do not have much time maintaining > the code. > > > but if no one's going to use it then I'd rather cut our > > losses than continue pouring time into it. > > I agree, we will wait for the community to decide the need for the > feature. > In the near future, we do not have ability to maintain the code. If > anyone > has interest to continue maintaining the patch, we will support with > document, > reviewing... in our possibility. > > > I can help you to maintain the code if needed. > > Personaly I doesn't need this feature so I agree Ben and Doug point of view. > > We need to measure how many this feature is useful and if it make sense > to support and maintain more code in the future related to this feature > without any usages behind that. We discussed this again in the Oslo meeting this week, and to share with the wider audience here's what I propose: Since the team that initially proposed the feature and that we expected to help maintain it are no longer able to do so, and it's not clear to the Oslo team that there is sufficient demand for a rather complex feature like this, I suggest that we either WIP or abandon the current patch series. Gerrit never forgets, so if at some point there are contributors (new or old) who have a vested interest in the feature we can always resurrect it. 
If you have any thoughts about this plan please let me know. Otherwise I will act on it sometime in the near-ish future. In the meantime, if anyone is desperate for Oslo work to do here are a few things that have been lingering on my todo list: * We have a unit test in oslo.utils (test_excutils) that is still using mox. That needs to be migrated to mock. * oslo.cookiecutter has a number of things that are out of date (doc layout, lack of reno, coverage job). Since it's unlikely we've reached peak Oslo library we should update that so there aren't a bunch of post-creation changes needed like there were with oslo.upgradecheck (and I'm guessing oslo.limit). * The config validator still needs support for dynamic groups, if oslo.config is your thing. * There are 326 bugs open across Oslo projects. Help wanted. :-) Thanks. -Ben From iwienand at redhat.com Wed Jan 9 23:26:24 2019 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 10 Jan 2019 10:26:24 +1100 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: References: <20190109061109.GA24618@fedora19.localdomain> Message-ID: <20190109232624.GB24618@fedora19.localdomain> On Wed, Jan 09, 2019 at 09:02:57AM -0700, Alex Schultz wrote: > Don't suppose we could try this with tripleo jobs prior to cutting > them all over could we? We don't use NetworkManager and infact > os-net-config doesn't currently support NetworkManager. I don't think > it'll cause problems, but I'd like to have some test prior to cutting > them all over. It is possible to stage this in by creating a new NetworkManager enabled node-type. I've proposed that in [1] but it's only useful if you want to then follow-up with setting up testing jobs to use the new node-type. We can then revert and apply the change to regular nodes. By just switching directly in [2], we can quite quickly revert if there should be an issue. We can immediately delete the new image, revert the config change and then worry about fixing it. Staging it is the conservative approach and more work all round but obviously safer; hoping for the best with the escape hatch is probably my preferred option given the low risk. I've WIP'd both reviews so just let us know in there your thoughts. Thanks, -i [1] https://review.openstack.org/629680 [2] https://review.openstack.org/619960 From aschultz at redhat.com Wed Jan 9 23:34:08 2019 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 9 Jan 2019 16:34:08 -0700 Subject: [tripleo] Re: [infra] NetworkManager on infra Fedora 29 and CentOS nodes In-Reply-To: <20190109232624.GB24618@fedora19.localdomain> References: <20190109061109.GA24618@fedora19.localdomain> <20190109232624.GB24618@fedora19.localdomain> Message-ID: On Wed, Jan 9, 2019 at 4:26 PM Ian Wienand wrote: > > On Wed, Jan 09, 2019 at 09:02:57AM -0700, Alex Schultz wrote: > > Don't suppose we could try this with tripleo jobs prior to cutting > > them all over could we? We don't use NetworkManager and infact > > os-net-config doesn't currently support NetworkManager. I don't think > > it'll cause problems, but I'd like to have some test prior to cutting > > them all over. > > It is possible to stage this in by creating a new NetworkManager > enabled node-type. I've proposed that in [1] but it's only useful if > you want to then follow-up with setting up testing jobs to use the new > node-type. We can then revert and apply the change to regular nodes. > > By just switching directly in [2], we can quite quickly revert if > there should be an issue. 
We can immediately delete the new image, > revert the config change and then worry about fixing it. > > Staging it is the conservative approach and more work all round but > obviously safer; hoping for the best with the escape hatch is probably > my preferred option given the low risk. I've WIP'd both reviews so > just let us know in there your thoughts. > For us to test I think we just need https://review.openstack.org/#/c/629685/ once the node pool change goes in. Then the jobs on that change will be the NetworkManager version. I would really prefer testing this way than possibly having to revert after breaking a bunch of in flight patches. I'll defer to others if they think it's OK to just land it and revert as needed. Thanks, -Alex > Thanks, > > -i > > [1] https://review.openstack.org/629680 > [2] https://review.openstack.org/619960 From iwienand at redhat.com Thu Jan 10 00:43:06 2019 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 10 Jan 2019 11:43:06 +1100 Subject: [infra] Updating fedora-latest nodeset to Fedora 29 Message-ID: <20190110004306.GA995@fedora19.localdomain> Hi, Just a heads up that we're soon switching "fedora-latest" nodes from Fedora 28 to Fedora 29 [1] (setting up this switch took a bit longer than usual, see [2]). Presumably if you're using "fedora-latest" you want the latest Fedora, so this should not be unexpected :) But this is the first time we're making this transition with the "-latest" nodeset, so please report any issues. Thanks, -i [1] https://review.openstack.org/618673 [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001530.html From dangtrinhnt at gmail.com Thu Jan 10 01:24:46 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 10 Jan 2019 10:24:46 +0900 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core Message-ID: Hello team, I would like to nominate Thuy Dang for Searchlight core. He has been leading the effort to clarify our vision and working on some blueprints to make Searchlight a multi-cloud application. I believe Thuy will be a great resource for our team. Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From tuantuluong at gmail.com Thu Jan 10 02:07:10 2019 From: tuantuluong at gmail.com (=?UTF-8?B?bMawxqFuZyBo4buvdSB0deG6pW4=?=) Date: Thu, 10 Jan 2019 10:07:10 +0800 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core In-Reply-To: References: Message-ID: +1 from me :) On Thursday, January 10, 2019, Trinh Nguyen wrote: > Hello team, > > I would like to nominate Thuy Dang for > Searchlight core. He has been leading the effort to clarify our vision and > working on some blueprints to make Searchlight a multi-cloud application. I > believe Thuy will be a great resource for our team. > > Bests, > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From phuongnh at vn.fujitsu.com Thu Jan 10 03:20:16 2019 From: phuongnh at vn.fujitsu.com (Nguyen Hung, Phuong) Date: Thu, 10 Jan 2019 03:20:16 +0000 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> Message-ID: <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Hi Ben, > I suggest that we either WIP or abandon the current > patch series. ... > If you have any thoughts about this plan please let me know. Otherwise I > will act on it sometime in the near-ish future. 
Thanks for your consideration. I am agree with you, please help me to abandon them because I am not privileged with those patches. Regards, Phuong. -----Original Message----- From: Ben Nemec [mailto:openstack at nemebean.com] Sent: Thursday, January 10, 2019 6:12 AM To: Herve Beraud; Nguyen, Hung Phuong Cc: openstack-discuss at lists.openstack.org Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams On 12/20/18 4:41 AM, Herve Beraud wrote: > > > Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong > > a écrit : > > Hi Ben, > > I am apology that in last month we do not have much time maintaining > the code. > > > but if no one's going to use it then I'd rather cut our > > losses than continue pouring time into it. > > I agree, we will wait for the community to decide the need for the > feature. > In the near future, we do not have ability to maintain the code. If > anyone > has interest to continue maintaining the patch, we will support with > document, > reviewing... in our possibility. > > > I can help you to maintain the code if needed. > > Personaly I doesn't need this feature so I agree Ben and Doug point of view. > > We need to measure how many this feature is useful and if it make sense > to support and maintain more code in the future related to this feature > without any usages behind that. We discussed this again in the Oslo meeting this week, and to share with the wider audience here's what I propose: Since the team that initially proposed the feature and that we expected to help maintain it are no longer able to do so, and it's not clear to the Oslo team that there is sufficient demand for a rather complex feature like this, I suggest that we either WIP or abandon the current patch series. Gerrit never forgets, so if at some point there are contributors (new or old) who have a vested interest in the feature we can always resurrect it. If you have any thoughts about this plan please let me know. Otherwise I will act on it sometime in the near-ish future. In the meantime, if anyone is desperate for Oslo work to do here are a few things that have been lingering on my todo list: * We have a unit test in oslo.utils (test_excutils) that is still using mox. That needs to be migrated to mock. * oslo.cookiecutter has a number of things that are out of date (doc layout, lack of reno, coverage job). Since it's unlikely we've reached peak Oslo library we should update that so there aren't a bunch of post-creation changes needed like there were with oslo.upgradecheck (and I'm guessing oslo.limit). * The config validator still needs support for dynamic groups, if oslo.config is your thing. * There are 326 bugs open across Oslo projects. Help wanted. :-) Thanks. -Ben From melwittt at gmail.com Thu Jan 10 03:51:24 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 9 Jan 2019 19:51:24 -0800 Subject: [nova][dev] spec freeze is today/tomorrow Jan 10 Message-ID: <810a0ef0-a943-bc03-7c24-17ceaa6bd241@gmail.com> Hey all, Spec freeze is today/tomorrow, depending on your time zone. We've been tracking specs that are close to approval here: https://etherpad.openstack.org/p/nova-stein-blueprint-spec-freeze Thanks everyone for jumping in and getting so much review done ahead of s-2! Please take one more look and let's get those final approvals done before spec freeze at EOD Jan 10. 
Best, -melanie From singh.surya64mnnit at gmail.com Thu Jan 10 04:45:49 2019 From: singh.surya64mnnit at gmail.com (Surya Singh) Date: Thu, 10 Jan 2019 10:15:49 +0530 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hi Boris Great to see new facelift of Stackalytics. Its really good. I have a query regarding contributors name is not listed as per company affiliation. Before facelift to stackalytics it was showing correct whether i have entry in https://github.com/openstack/stackalytics/blob/master/etc/default_data.json or not. Though now i have pushed the patch for same https://review.openstack.org/629150, but another thing is one of my colleague Vishal Manchanda name is also showing as independent contributor rather than NEC contributor. While his name entry already in etc/default_data.json. Would be great if you check the same. --- Thanks Surya On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote: > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - > > We have new look and feel at stackalytics.com > - > > We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the menu on the top > - > > BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible via top menu. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback. > > -Boris > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Jan 10 06:17:46 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 15:17:46 +0900 Subject: Review-Priority for Project Repos In-Reply-To: <20190103135155.GC27473@sm-workstation> References: <20190103135155.GC27473@sm-workstation> Message-ID: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > Dear All, > > > > There are many occasion when we want to priorities some of the patches > > whether it is related to unblock the gates or blocking the non freeze > > patches during RC. > > > > So adding the Review-Priority will allow more precise dashboard. As > > Designate and Cinder projects already experiencing this[1][2] and after > > discussion with Jeremy brought this to ML to interact with these team > > before landing [3], as there is possibility that reapply the priority vote > > following any substantive updates to change could make it more cumbersome > > than it is worth. > > With Cinder this is fairly new, but I think it is working well so far. The > oddity we've run into, that I think you're referring to here, is how those > votes carry forward with updates. > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when This idea looks great and helpful especially for blockers and cycle priority patches to get regular review bandwidth from Core or Active members of that project. IMO only +ve votes are more appropriate for this label. -1 is little confusing for many reasons like what is the difference between Review-Priority -1 and Code-Review -2 ? Review-Priority -1 means, it is less priority than 0/not labelled (explicitly setting any patch very less priority). 
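For context, a label like this lives in the project's Gerrit ACL config
(project.config); a rough sketch of a definition with those three values,
purely illustrative and not Cinder's actual settings, would be:

    [label "Review-Priority"]
        function = NoBlock
        value = -1 Procedural Hold
        value = 0 No Priority
        value = +1 Review Priority
        value = +2 Gate Blocker
        copyMinScore = true
        copyMaxScore = true
        copyAllScoresOnTrivialRebase = true

The copy* options decide which votes carry over to a new patch set: with
copyMinScore/copyMaxScore only the lowest (-1) and the highest (+2) votes
are kept, which matches the "sticky" behaviour mentioned in this thread.
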
After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural or technical reason. But that can be done by -2 on Code-Review label. Keeping Review-Priority label only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes priority for review but having multiple +ve vote set as per project need/interest is all fine. -gmann > a patchset is updates, the -1 and +2 carry forward. But for some reason we > can't get the +1 to be sticky. > > So far, that's just a slight inconvenience. It would be great if we can figure > out a way to have them all be sticky, but if we need to live with reapplying +1 > votes, that's manageable to me. > > The one thing I have been slightly concerned with is the process around using > these priority votes. It hasn't been an issue, but I could see a scenario where > one core (in Cinder we have it set up so all cores can use the priority voting) > has set something like a procedural -1, then been pulled away or is absent for > an extended period. Like a Workflow -2, another core cannot override that vote. > So until that person is back to remove the -1, that patch would not be able to > be merged. > > Granted, we've lived with this with Workflow -2's for years and it's never been > a major issue, but I think as far as centralizing control, it may make sense to > have a separate smaller group (just the PTL, or PTL and a few "deputies") that > are able to set priorities on patches just to make sure the folks setting it > are the ones that are actively tracking what the priorities are for the > project. > > Anyway, my 2 cents. I can imagine this would work really well for some teams, > less well for others. So if you think it can help you manage your project > priorities, I would recommend giving it a shot and seeing how it goes. You can > always drop it if it ends up not being effective or causing issues. > > Sean > > From muroi.masahito at lab.ntt.co.jp Thu Jan 10 06:40:26 2019 From: muroi.masahito at lab.ntt.co.jp (Masahito MUROI) Date: Thu, 10 Jan 2019 15:40:26 +0900 Subject: [blazar] Nominating Tetsuro Nakamura for blazar-core In-Reply-To: References: Message-ID: <6357fd01-edee-c287-0269-1f8c51386471@lab.ntt.co.jp> +1 Tetsuro is doing a great contributing to Blazar :) best regards, Masahito On 2019/01/10 3:21, Pierre Riteau wrote: > Hello, > > I would like to nominate Tetsuro Nakamura for membership in the > blazar-core team. > > Tetsuro started contributing to Blazar last summer. He has been > contributing great code for integrating Blazar with placement and > participating actively in the project. He is also providing good > feedback to the rest of the contributors via code review, including on > code not related to placement. He would make a great addition to the > core team. > > Unless there are objections, I will add him to the core team in a week's time. > > Pierre > > > From lujinluo at gmail.com Thu Jan 10 06:47:54 2019 From: lujinluo at gmail.com (Lujin Luo) Date: Wed, 9 Jan 2019 22:47:54 -0800 Subject: [neutron] [upgrade] No meeting on Jan. 10th Message-ID: Hi everyone, Due to some personal reasons, I cannot hold the meeting tomorrow. Let's resume next week. 
Thanks, Lujin From gmann at ghanshyammann.com Thu Jan 10 06:48:47 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 15:48:47 +0900 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> Message-ID: <16836853a7f.f7a92ce550692.4268288898180642317@ghanshyammann.com> ---- On Wed, 02 Jan 2019 20:18:40 +0900 Dmitry Tantsur wrote ---- > Hi all and happy new year :) > > As you know, tempest plugins are branchless, so the CI of ironic-tempest-plugin > has to run tests on all supported branches. Currently it amounts to 16 (!) > voting devstack jobs. With each of them have some small probability of a random > failure, it is impossible to land anything without at least one recheck, usually > more. > > The bad news is, we only run master API tests job, and these tests are changed > more often that the other. We already had a minor stable branch breakage because > of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And I've just > spotted a missing master multinode job, which is defined but does not run for > some reason :( > > Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > > 1. Do not run CI jobs at all for unsupported branches and branches in extended > maintenance. For Ocata this has already been done in [2]. +1. We have the same policy in Tempest also[1]. You mean not to run CI for unsupported/EM branches on the master testing right? CI on Unsupported/EM branch can be run until they all are passing or EM maintainers want to run them. > > 2. Make jobs running with N-3 (currently Pike) and older non-voting (and thus > remove them from the gate queue). I have a gut feeling that a change that breaks > N-3 is very likely to break N-2 (currently Queens) as well, so it's enough to > have N-2 voting. IMO, running all supported stable branches as voting make sense than running oldest one(N-3 as you mentioned) as n-v. That way, tempest-plugins will be successfully maintained to run on N-3 otherwise it is likely to be broken for that branch especially in case of feature discovery based tests. > > 3. Make the discovery and the multinode jobs from all stable branches > non-voting. These jobs cover the tests that get changed very infrequently (if > ever). These are also the jobs with the highest random failure rate. > > 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > proposed above). > > This should leave us with 20 jobs, but with only 11 of them voting. Which is > still a lot, but probably manageable. > > The corresponding change is [3], please comment here or there. 
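On the mechanics, turning a job non-voting is a small tweak in the
plugin's .zuul.yaml; a sketch (the job name below is a placeholder, not
one of the real ironic-tempest-plugin jobs):

    - project:
        check:
          jobs:
            - ironic-tempest-plugin-example-stable-job:
                voting: false

Dropping the same job from the gate pipeline's job list is what removes
it from the gate queue.
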
> > Dmitry > > [1] https://review.openstack.org/622177 > [2] https://review.openstack.org/621537 > [3] https://review.openstack.org/627955 > > [1] https://docs.openstack.org/tempest/latest/stable_branch_support_policy.html -gmann From gmann at ghanshyammann.com Thu Jan 10 07:16:28 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 16:16:28 +0900 Subject: [ironic] [qa] ironic-tempest-plugin CI bloat In-Reply-To: References: <7755f816-ead1-79d7-f9f0-33bc7e553221@redhat.com> <1546453449.3633235.1623759896.26639384@webmail.messagingengine.com> Message-ID: <168369e9149.f7aa3daf51154.4025722229812805971@ghanshyammann.com> ---- On Thu, 03 Jan 2019 03:39:00 +0900 Dmitry Tantsur wrote ---- > On 1/2/19 7:24 PM, Clark Boylan wrote: > > On Wed, Jan 2, 2019, at 3:18 AM, Dmitry Tantsur wrote: > >> Hi all and happy new year :) > >> > >> As you know, tempest plugins are branchless, so the CI of ironic- > >> tempest-plugin > >> has to run tests on all supported branches. Currently it amounts to 16 > >> (!) > >> voting devstack jobs. With each of them have some small probability of a > >> random > >> failure, it is impossible to land anything without at least one recheck, > >> usually > >> more. > >> > >> The bad news is, we only run master API tests job, and these tests are > >> changed > >> more often that the other. We already had a minor stable branch breakage > >> because > >> of it [1]. We need to run 3 more jobs: for Pike, Queens and Rocky. And > >> I've just > >> spotted a missing master multinode job, which is defined but does not > >> run for > >> some reason :( Yeah, that is because ironic multinode's parent job "tempest-multinode-full" is restricted to run only on master. It was done that way until we had all multinode zuulv3 things backported till pike which is completed already. I am making this job for pike onwards [1] so that multinode job can be run on stable branches also. > >> > >> Here is my proposal to deal with gate bloat on ironic-tempest-plugin: > >> > >> 1. Do not run CI jobs at all for unsupported branches and branches in extended > >> maintenance. For Ocata this has already been done in [2]. > >> > >> 2. Make jobs running with N-3 (currently Pike) and older non-voting (and > >> thus > >> remove them from the gate queue). I have a gut feeling that a change > >> that breaks > >> N-3 is very likely to break N-2 (currently Queens) as well, so it's > >> enough to > >> have N-2 voting. > >> > >> 3. Make the discovery and the multinode jobs from all stable branches > >> non-voting. These jobs cover the tests that get changed very infrequently (if > >> ever). These are also the jobs with the highest random failure rate. > > > > Has any work been done to investigate why these jobs fail? And if not maybe we should stop running the jobs entirely. Non voting jobs that aren't reliable will just get ignored. > > From my experience it's PXE failing or just generic timeout on slow nodes. Note > that they still don't fail too often, it's their total number that makes it > problematic. When you have 20 jobs each failing with, say, 5% rate it's just 35% > chance of passing (unless I cannot do math). > > But to answer your question, yes, we do put work in that. We just never got to > 0% of random failures. While making the multinode job running for stable branches, I got the consistent failure on multinode job for pike, queens which run fine on Rocky. Failure are on migration tests due to hostname mismatch. 
I have not debugged the failure yet but we will be making multinode runnable on stable branches also. [1] https://review.openstack.org/#/c/610938/ [2] https://review.openstack.org/#/q/topic:tempest-multinode-slow-stable+(status:open+OR+status:merged) -gmann > > > > >> > >> 4. Add the API tests, voting for Queens to master, non-voting for Pike (as > >> proposed above). > >> > >> This should leave us with 20 jobs, but with only 11 of them voting. Which is > >> still a lot, but probably manageable. > >> > >> The corresponding change is [3], please comment here or there. > >> > >> Dmitry > >> > >> [1] https://review.openstack.org/622177 > >> [2] https://review.openstack.org/621537 > >> [3] https://review.openstack.org/627955 > >> > > > > > From gmann at ghanshyammann.com Thu Jan 10 07:33:40 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 10 Jan 2019 16:33:40 +0900 Subject: [sahara][qa][api-sig]Support for Sahara APIv2 in tempest tests, unversioned endpoints In-Reply-To: References: <1818981.9ErCeWV4fL@whitebase.usersys.redhat.com> Message-ID: <16836ae526c.f599c98651471.2262588979872936985@ghanshyammann.com> ---- On Thu, 03 Jan 2019 07:29:27 +0900 Jeremy Freudberg wrote ---- > Hey Luigi. > > I poked around in Tempest and saw these code bits: > https://github.com/openstack/tempest/blob/master/tempest/lib/common/rest_client.py#L210 > https://github.com/openstack/tempest/blob/f9650269a32800fdcb873ff63f366b7bc914b3d7/tempest/lib/auth.py#L53 > > Here's a patch which takes advantage of those bits to append the > version to the unversioned base URL: > https://review.openstack.org/#/c/628056/ > > Hope it works without regression (I'm a bit worried since Tempest does > its own URL mangling rather than nicely use keystoneauth...) Yeah, that is the code where service client can tell which version of API tests needs to use. This was kind of hack we do in Tempest for different versioned API with same endpoint of service Other way to test different API version in Tempest is via catalog_type. You can define the different jobs running the same test for different versioned endpoints defined in tempest's config options catalog_type. But if you want to run the both versions test in same job then api_version is the way to do that. > > On Wed, Jan 2, 2019 at 5:19 AM Luigi Toscano wrote: > > > > Hi all, > > > > I'm working on adding support for APIv2 to the Sahara tempest plugin. > > > > If I get it correctly, there are two main steps > > > > 1) Make sure that that tempest client works with APIv2 (and don't regress with > > APIv1.1). > > > > This mainly mean implementing the tempest client for Sahara APIv2, which > > should not be too complicated. > > > > On the other hand, we hit an issue with the v1.1 client in an APIv2 > > environment. > > A change associated with API v2 is usage of an unversioned endpoint for the > > deployment (see https://review.openstack.org/#/c/622330/ , without the /v1,1/$ > > (tenant_id) suffix) which should magically work with both API variants, but it > > seems that the current tempest client fails in this case: > > > > http://logs.openstack.org/30/622330/1/check/sahara-tests-tempest/7e02114/job-output.txt.gz#_2018-12-05_21_20_23_535544 > > > > Does anyone know if this is an issue with the code of the tempest tests (which > > should maybe have some logic to build the expected endpoint when it's > > unversioned, like saharaclient does) or somewhere else? > > > > > > 2) fix the tests to support APIv2. > > > > Should I duplicate the tests for APIv1.1 and APIv2? 
Other projects which > > supports different APIs seems to do this. > > But can I freely move the existing tests under a subdirectory > > (sahara_tempest_plugins/tests/api/ -> sahara_tempest_plugins/tests/api/v1/), > > or are there any compatibility concerns? Are the test ID enough to ensure that > > everything works as before? It depends on compatibility and state of version v1.1 and v2. If both are supposed to be compatible at least feature wise then you should not duplicate the test instead run the same set of tests against both version either in same job or in different job. We do that for nova, cinder, image etc where we run same set of the test against 1. compute v2.0 and v2.1, 2. volume v2 and v3. We have done that testing those in different jobs with defining different catalog_type(version endpoints). Duplicating the tests has two drawbacks, 1. maintenance 2. easy to loose the coverage against specific version. -gmann > > > > And what about CLI tests currently under sahara_tempest_plugin/tests/cli/ ? > > They supports both API versions through a configuration flag. Should they be > > duplicated as well? > > > > > > Ciao > > (and happy new year if you have a new one in your calendar!) > > -- > > Luigi > > > > > > > > From ghcks1000 at gmail.com Thu Jan 10 07:51:25 2019 From: ghcks1000 at gmail.com (=?utf-8?Q?=EC=9D=B4=ED=98=B8=EC=B0=AC?=) Date: Thu, 10 Jan 2019 16:51:25 +0900 Subject: [dev][Tacker] Implementing Multisite VNFFG Message-ID: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Dear Tacker folks,   Hello, I'm interested in implementing multisite VNFFG in Tacker project.   As far as I know, current single Tacker controller can manage multiple Openstack sites (Multisite VIM), but it can create VNFFG in only singlesite, so it can't create VNFFG across multisite. I think if multisite VNFFG is possible, tacker can have more flexibility in managing VNF and VNFFG.   In the current tacker, networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct service chain. So, I think extending current port chaining in singleiste to multisite can be one solution.   Is there development process about multisite VNFFG in tacker project? Otherwise, I wonder that tacker is interested in this feature. I want to develop this feature for Tacker project if I can.   Yours sincerely, Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jan 10 08:28:45 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 Jan 2019 09:28:45 +0100 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: Hello Doug, sorry but I am not so expert of gerrit and how community process for patching works. I saw the https://review.openstack.org/#/c/577477/ page but I cannot understand if those patches are approved and backported on stable queens. Please, help me to understand.... For example: I cloned the stable/queens magnum branch, the file magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh is different from the same file I downloaded from cherry-picks, so I presume the patch is not merged in the branch yet. I presume the link you sent me ( https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes) is for developers....that's right ? 
Thanks ans sorry for my poor skill Ignazio Il giorno mer 9 gen 2019 alle ore 15:20 Doug Hellmann ha scritto: > Ignazio Cassano writes: > > > Hello, > > last week I talked on #openstack-containers IRC about important patches > for > > magnum reported here: > > https://review.openstack.org/#/c/577477/ > > > > I'd like to know when the above will be backported on queens and if > centos7 > > and ubuntu packages > > will be upgraded with them. > > Any roadmap ? > > I would go on with magnum testing on queens because I am going to upgrade > > from ocata to pike and from pike to queens. > > > > At this time I have aproduction environment on ocata and a testing > > environment on queens. > > > > Best Regards > > Ignazio > > You can submit those backports yourself, either through the gerrit web > UI or by manually creating the patches locally using git commands. There > are more details on processes and tools for doing this in the stable > maintenance section of the project team guide [1]. > > As far as when those changes might end up in packages, the community > doesn't really have much insight into (or influence over) what stable > patches are pulled down by the distributors or how they schedule their > updates and releases. So I recommend talking to the folks who prepare > the distribution(s) you're interested in, after the backport patches are > approved. > > [1] > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > > -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From artem.goncharov at gmail.com Thu Jan 10 08:45:29 2019 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Thu, 10 Jan 2019 09:45:29 +0100 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hi, I can repeat the issue - stackalytics stopped showing my affiliation correctly (user: gtema, entry in default_data.json is present) Regards, Artem On Thu, Jan 10, 2019 at 5:48 AM Surya Singh wrote: > Hi Boris > > Great to see new facelift of Stackalytics. Its really good. > > I have a query regarding contributors name is not listed as per company > affiliation. > Before facelift to stackalytics it was showing correct whether i have > entry in > https://github.com/openstack/stackalytics/blob/master/etc/default_data.json > or not. > Though now i have pushed the patch for same > https://review.openstack.org/629150, but another thing is one of my > colleague Vishal Manchanda name is also showing as independent contributor > rather than NEC contributor. While his name entry already in > etc/default_data.json. > > Would be great if you check the same. > > --- > Thanks > Surya > > > On Tue, Jan 8, 2019 at 11:57 PM Boris Renski wrote: > >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on stackalytics openstack project). Brief >> summary of updates: >> >> - >> >> We have new look and feel at stackalytics.com >> - >> >> We did away with DriverLog >> and Member Directory , which >> were not very actively used or maintained. Those are still available via >> direct links, but not in the menu on the top >> - >> >> BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated >> project commits via a separate subsection accessible via top menu. Before >> this was all bunched up in Project Type -> Complimentary >> >> Happy to hear comments or feedback. >> >> -Boris >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ifatafekn at gmail.com Thu Jan 10 09:30:43 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Thu, 10 Jan 2019 11:30:43 +0200 Subject: [vitrage] Nominating Ivan Kolodyazhny for Vitrage core In-Reply-To: References: Message-ID: Ivan, welcome to the team :-) On Wed, Jan 9, 2019 at 3:50 PM Eyal B wrote: > +1 > > On Wed, Jan 9, 2019, 15:42 Ifat Afek >> Hi, >> >> >> I would like to nominate Ivan Kolodyazhny for Vitrage core. >> >> Ivan has been contributing to Vitrage for a while now. He has focused on >> upgrade support, vitrage-dashboard and vitrage-tempest-plugin enhancements, >> and during this time gained a lot of knowledge and experience with Vitrage >> code base. I believe he would make a great addition to our team. >> >> >> Thanks, >> >> Ifat. >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jpena at redhat.com Thu Jan 10 10:10:31 2019 From: jpena at redhat.com (Javier Pena) Date: Thu, 10 Jan 2019 05:10:31 -0500 (EST) Subject: [rpm-packaging] Proposing new core member In-Reply-To: References: Message-ID: <1523156859.67815456.1547115031073.JavaMail.zimbra@redhat.com> ----- Original Message ----- > Am Mi., 9. Jan. 2019 um 20:41 Uhr schrieb Thomas Bechtold > : > > > Please give your +1/-1 in the next days. > > +1, well said, happy to have her increase our (too) small core reviewer team! > +1, welcome to the core team! Regards, Javier > Greetings, > Dirk > > From jakub.sliva at ultimum.io Thu Jan 10 11:27:06 2019 From: jakub.sliva at ultimum.io (=?UTF-8?B?SmFrdWIgU2zDrXZh?=) Date: Thu, 10 Jan 2019 12:27:06 +0100 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation Message-ID: Hello, our company created a little plugin to Horizon and we would like to share it with the community in a bit more official way. So I created change request (https://review.openstack.org/#/c/619235/) in order to create official repository under project Telemetry. However, PTL recommended me to put this new repository under OpenStack without any project - i.e. make it unofficial. I have also discussed this with Horizon team during their meeting (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) and now I am bit stuck because I do not know how to proceed next. Could you, please, advise me? All opinions are appreciated. Jakub Sliva Ultimum Technologies s.r.o. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic http://ultimum.io From paul.bourke at oracle.com Thu Jan 10 11:38:34 2019 From: paul.bourke at oracle.com (Paul Bourke) Date: Thu, 10 Jan 2019 11:38:34 +0000 Subject: [kolla] Stepping down from core Message-ID: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Hi all, Due to a change of direction for me I'll be stepping down from the Kolla core group. It's been a blast, thanks to everyone I've worked/interacted with over the past few years. Thanks in particular to Eduardo who's done a stellar job of PTL since taking the reins. I hope we'll cross paths again in the future :) All the best! -Paul From dabarren at gmail.com Thu Jan 10 12:07:47 2019 From: dabarren at gmail.com (Eduardo Gonzalez) Date: Thu, 10 Jan 2019 13:07:47 +0100 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: Hi Paul, So sad see you leaving the team, your work over this years has been critical to make kolla as great as it is. Thank you for this amazing work. 
Wishing you the best on your new projects, and I hope our paths cross again some day. Feel free to join the team any time if your job responsibilities allow it, or if you have the time as an independent contributor ;) Again, thank you for your work, and I wish you the best in the future. Regards El jue., 10 ene. 2019 a las 12:43, Paul Bourke () escribió: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) > > All the best! > -Paul > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Thu Jan 10 12:21:56 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 10 Jan 2019 13:21:56 +0100 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: W dniu 10.01.2019 o 12:38, Paul Bourke pisze: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) Sad to see you leaving but such is life. Have fun with whatever else you will be doing. And thanks for all that help I got from you during my Kolla work. From fungi at yuggoth.org Thu Jan 10 12:42:15 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 10 Jan 2019 12:42:15 +0000 Subject: queens [magnum] patches In-Reply-To: References: Message-ID: <20190110124214.bdxiry37z7oymenx@yuggoth.org> On 2019-01-10 09:28:45 +0100 (+0100), Ignazio Cassano wrote: > Hello Doug, sorry but I am not so expert of gerrit and how community > process for patching works. The Code and Documentation volume of the OpenStack Contributor Guide has chapters on the Git and Gerrit workflows our community uses: https://docs.openstack.org/contributors/code-and-documentation/ > I saw the https://review.openstack.org/#/c/577477/ page but I cannot > understand if those patches are approved and backported on stable queens. > Please, help me to understand.... Typically, we propose backports under a common Change-Id to the master branch change. Here you can see that backports to stable/rocky and stable/queens were proposed Monday by Bharat Kunwar: https://review.openstack.org/#/q/Ife5558f1db4e581b64cc4a8ffead151f7b405702 The stable/rocky backport is well on its way to approval; it's passing CI jobs (the Verified +1 from Zuul) and already has one of the customary two stable branch core reviews (the Code-Review +2 vote from Spyros Trigazis), so I expect it will be approved soon. > For example: I cloned the stable/queens magnum branch, the file > magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh > is different from the same file I downloaded from cherry-picks, so > I presume the patch is not merged in the branch yet. The stable/queens backport, on the other hand, looks like it still needs some work, as evidenced by the Verified -1 vote from Zuul. It's currently failing CI jobs openstack-tox-pep8 (coding style validation) and magnum-functional-k8s (a Kubernetes functional testsuite for Magnum).
The names of those jobs in the Gerrit webUI lead to detailed build logs, which can be used to identify and iterate on solutions to get them passing for that change. > I presume the link you sent me ( > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > ) is for developers....that's right ? It's for anyone in the community who wants to help. "Developer" is just a reference to someone performing an activity, not a qualification. > Thanks ans sorry for my poor skill [...] Please don't apologize. Skills are just something we learn, nobody is born knowing any of this. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Jan 10 13:05:57 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 10 Jan 2019 13:05:57 +0000 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation In-Reply-To: References: Message-ID: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote: > our company created a little plugin to Horizon and we would like to > share it with the community in a bit more official way. So I created > change request (https://review.openstack.org/#/c/619235/) in order to > create official repository under project Telemetry. However, PTL > recommended me to put this new repository under OpenStack without any > project - i.e. make it unofficial. > > I have also discussed this with Horizon team during their meeting > (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) > and now I am bit stuck because I do not know how to proceed next. > Could you, please, advise me? It looks like much of this confusion stemmed from recommendation by project-config-core reviewers, unfortunately. We too often see people from official teams in OpenStack request new Git repositories for work their team will be performing, but who forget to also record them in the appropriate governance lists. As a result, if a proposed repository looks closely-related to the work of an existing team (in this case possibly either Horizon or Telemetry) we usually assume this was the case and recommend during the review process that they file a corresponding change to the OpenStack TC's governance repository. Given this is an independent group's work for which neither the Horizon nor Telemetry teams have expressed an interest in adopting responsibility, it's perfectly acceptable to have it operate as an unofficial project or to apply for status as another official project team within OpenStack. The main differences between the two options are that contributors to official OpenStack project teams gain the ability to vote in Technical Committee elections, their repositories can publish documentation on the https://docs.openstack.org/ Web site, they're able to reserve space for team-specific discussions and working sessions at OSF Project Teams Gathering meetings (such as the one coming up in Denver immediately following the Open Infrastructure Summit)... but official project teams are also expected to hold team lead elections twice a year, participate in OpenStack release processes, follow up on implementing cycle goals, and otherwise meet the requirements laid out in our https://governance.openstack.org/tc/reference/new-projects-requirements.html document. 
-- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From doug at doughellmann.com Thu Jan 10 13:28:26 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 08:28:26 -0500 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Message-ID: "Nguyen Hung, Phuong" writes: > Hi Ben, > >> I suggest that we either WIP or abandon the current >> patch series. > ... >> If you have any thoughts about this plan please let me know. Otherwise I >> will act on it sometime in the near-ish future. > > Thanks for your consideration. I am agree with you, please help me to abandon them because I am not privileged with those patches. > > Regards, > Phuong. +1 for abandoning them, at least for now. As Ben points out, gerrit will still have copies. Doug > > -----Original Message----- > From: Ben Nemec [mailto:openstack at nemebean.com] > Sent: Thursday, January 10, 2019 6:12 AM > To: Herve Beraud; Nguyen, Hung Phuong > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams > > > > On 12/20/18 4:41 AM, Herve Beraud wrote: >> >> >> Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong >> > a écrit : >> >> Hi Ben, >> >> I am apology that in last month we do not have much time maintaining >> the code. >> >> > but if no one's going to use it then I'd rather cut our >> > losses than continue pouring time into it. >> >> I agree, we will wait for the community to decide the need for the >> feature. >> In the near future, we do not have ability to maintain the code. If >> anyone >> has interest to continue maintaining the patch, we will support with >> document, >> reviewing... in our possibility. >> >> >> I can help you to maintain the code if needed. >> >> Personaly I doesn't need this feature so I agree Ben and Doug point of view. >> >> We need to measure how many this feature is useful and if it make sense >> to support and maintain more code in the future related to this feature >> without any usages behind that. > > We discussed this again in the Oslo meeting this week, and to share with > the wider audience here's what I propose: > > Since the team that initially proposed the feature and that we expected > to help maintain it are no longer able to do so, and it's not clear to > the Oslo team that there is sufficient demand for a rather complex > feature like this, I suggest that we either WIP or abandon the current > patch series. Gerrit never forgets, so if at some point there are > contributors (new or old) who have a vested interest in the feature we > can always resurrect it. > > If you have any thoughts about this plan please let me know. Otherwise I > will act on it sometime in the near-ish future. > > In the meantime, if anyone is desperate for Oslo work to do here are a > few things that have been lingering on my todo list: > > * We have a unit test in oslo.utils (test_excutils) that is still using > mox. That needs to be migrated to mock. > * oslo.cookiecutter has a number of things that are out of date (doc > layout, lack of reno, coverage job). 
Since it's unlikely we've reached > peak Oslo library we should update that so there aren't a bunch of > post-creation changes needed like there were with oslo.upgradecheck (and > I'm guessing oslo.limit). > * The config validator still needs support for dynamic groups, if > oslo.config is your thing. > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > Thanks. > > -Ben > -- Doug From ignaziocassano at gmail.com Thu Jan 10 13:31:44 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 Jan 2019 14:31:44 +0100 Subject: queens [magnum] patches In-Reply-To: <20190110124214.bdxiry37z7oymenx@yuggoth.org> References: <20190110124214.bdxiry37z7oymenx@yuggoth.org> Message-ID: Many thanks, Jeremy. Ignazio Il giorno gio 10 gen 2019 alle ore 13:45 Jeremy Stanley ha scritto: > On 2019-01-10 09:28:45 +0100 (+0100), Ignazio Cassano wrote: > > Hello Doug, sorry but I am not so expert of gerrit and how community > > process for patching works. > > The Code and Documentation volume of the OpenStack Contributor Guide > has chapters on the Git and Gerrit workflows our community uses: > > https://docs.openstack.org/contributors/code-and-documentation/ > > > I saw the https://review.openstack.org/#/c/577477/ page but I cannot > > understand if those patches are approved and backported on stable queens. > > Please, help me to understand.... > > Typically, we propose backports under a common Change-Id to the > master branch change. Here you can see that backports to > stable/rocky and stable/queens were proposed Monday by Bharat Kunwar: > > https://review.openstack.org/#/q/Ife5558f1db4e581b64cc4a8ffead151f7b405702 > > The stable/queens backport is well on its way to approval; it's passing > CI jobs (the Verified +1 from Zuul) and already has one of the > customary two stable branch core reviews (the Code-Review +2 vote > from Spyros Trigazis), so I expect it's well on its way to approval. > > > For example: I cloned the stable/queens magnum branch, the file > > > magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh > > is different from the same file I downloaded from cherry-picks, so > > I presume the patch is not merged in the branch yet. > > The stable/queens backport looks like it still needs some work, as > evidenced by the Verified -1 vote from Zuul. It's currently failing > CI jobs openstack-tox-pep8 (coding style validation) and > magnum-functional-k8s (a Kubernetes functional testsuite for > Magnum). The names of those jobs in the Gerrit webUI lead to > detailed build logs, which can be used to identify and iterate on > solutions to get them passing for that change. > > > I presume the link you sent me ( > > > https://docs.openstack.org/project-team-guide/stable-branches.html#proposing-fixes > > ) is for developers....that's right ? > > It's for anyone in the community who wants to help. "Developer" is > just a reference to someone performing an activity, not a > qualification. > > > Thanks ans sorry for my poor skill > [...] > > Please don't apologize. Skills are just something we learn, nobody > is born knowing any of this. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From doug at doughellmann.com Thu Jan 10 13:32:59 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 08:32:59 -0500 Subject: Review-Priority for Project Repos In-Reply-To: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> Message-ID: Ghanshyam Mann writes: > ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > > Dear All, > > > > > > There are many occasion when we want to priorities some of the patches > > > whether it is related to unblock the gates or blocking the non freeze > > > patches during RC. > > > > > > So adding the Review-Priority will allow more precise dashboard. As > > > Designate and Cinder projects already experiencing this[1][2] and after > > > discussion with Jeremy brought this to ML to interact with these team > > > before landing [3], as there is possibility that reapply the priority vote > > > following any substantive updates to change could make it more cumbersome > > > than it is worth. > > > > With Cinder this is fairly new, but I think it is working well so far. The > > oddity we've run into, that I think you're referring to here, is how those > > votes carry forward with updates. > > > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > > This idea looks great and helpful especially for blockers and cycle priority patches to get regular > review bandwidth from Core or Active members of that project. > > IMO only +ve votes are more appropriate for this label. -1 is little confusing for many reasons like > what is the difference between Review-Priority -1 and Code-Review -2 ? Review-Priority -1 means, > it is less priority than 0/not labelled (explicitly setting any patch very less priority). > > After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural > or technical reason. But that can be done by -2 on Code-Review label. Keeping Review-Priority label > only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. > Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes > priority for review but having multiple +ve vote set as per project need/interest is all fine. > > -gmann Given the complexity of our review process already, if this new aspect is going to spread it would be really nice if we could try to agree on a standard way to apply it. Not only would that let someone build a dashboard for cross-project priorities, but it would mean contributors wouldn't need to learn different rules for interacting with each of our teams. Doug From ildiko.vancsa at gmail.com Thu Jan 10 13:41:26 2019 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Thu, 10 Jan 2019 14:41:26 +0100 Subject: [edge] Use cases mapping to MVP architectures - FEEDBACK NEEDED Message-ID: Hi, We are reaching out to you about the use cases for edge cloud infrastructure that the Edge Computing Group is working on to collect. They are recorded in our wiki [1] and they describe high level scenarios when an edge cloud infrastructure would be needed. During the second Denver PTG discussions we drafted two MVP architectures what we could build from the current functionality of OpenStack with some slight modifications [2]. These are based on the work of James and his team from Oath. 
We differentiate between a distributed [3] and a centralized [4] control plane architecture scenario. In one of the Berlin Forum sessions we were asked to map the MVP architecture scenarios to the use cases, so I made an initial mapping and now we are looking for feedback. This mapping only means that the listed use case can be implemented using the MVP architecture scenarios. It should be noted that none of the MVP architecture scenarios provides a solution for edge cloud infrastructure upgrade or centralized management. Please comment on the wiki or in a reply to this mail in case you have questions or disagree with the initial mapping we put together. Please let us know if you have any questions. Here are the use cases and the mapped architecture scenarios:

Mobile service provider 5G/4G virtual RAN deployment and Edge Cloud B2B2X [5]
    Both distributed [3] and centralized [4]
Universal customer premise equipment (uCPE) for Enterprise Network Services [6]
    Both distributed [3] and centralized [4]
Unmanned Aircraft Systems (Drones) [7]
    None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event
Cloud Storage Gateway - Storage at the Edge [8]
    None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event
Open Caching - stream/store data at the edge [9]
    Both distributed [3] and centralized [4]
Smart City as Software-Defined closed-loop system [10]
    The use case is not complete enough to figure out
Augmented Reality -- Sony Gaming Network [11]
    None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event
Analytics/control at the edge [12]
    The use case is not complete enough to figure out
Manage retail chains - chick-fil-a [13]
    The use case is not complete enough to figure out
    At this moment chick-fil-a uses a different Kubernetes cluster in every edge location and they manage them using Git [14]
Smart Home [15]
    None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event
Data Collection - Smart cooler/cold chain tracking [16]
    None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event
VPN Gateway Service Delivery [17]
    The use case is not complete enough to figure out

[1]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases
[2]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures
[3]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Distributed_Control_Plane_Scenario
[4]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Centralized_Control_Plane_Scenario
[5]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Mobile_service_provider_5G.2F4G_virtual_RAN_deployment_and_Edge_Cloud_B2B2X.
[6]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Universal_customer_premise_equipment_.28uCPE.29_for_Enterprise_Network_Services [7]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Unmanned_Aircraft_Systems_.28Drones.29 [8]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Cloud_Storage_Gateway_-_Storage_at_the_Edge [9]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Open_Caching_-_stream.2Fstore_data_at_the_edge [10]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_City_as_Software-Defined_closed-loop_system [11]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Augmented_Reality_--_Sony_Gaming_Network [12]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Analytics.2Fcontrol_at_the_edge [13]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Manage_retail_chains_-_chick-fil-a [14]: https://schd.ws/hosted_files/kccna18/34/GitOps.pdf [15]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_Home [16]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Data_Collection_-_Smart_cooler.2Fcold_chain_tracking [17]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#VPN_Gateway_Service_Delivery Thanks and Best Regards, Gergely and Ildikó From doug at doughellmann.com Thu Jan 10 13:47:53 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 10 Jan 2019 08:47:53 -0500 Subject: [tc][telemetry][horizon] ceilometer-dashboard repository creation In-Reply-To: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> References: <20190110130557.q3fgchx3uot6aupj@yuggoth.org> Message-ID: Jeremy Stanley writes: > On 2019-01-10 12:27:06 +0100 (+0100), Jakub Slíva wrote: >> our company created a little plugin to Horizon and we would like to >> share it with the community in a bit more official way. So I created >> change request (https://review.openstack.org/#/c/619235/) in order to >> create official repository under project Telemetry. However, PTL >> recommended me to put this new repository under OpenStack without any >> project - i.e. make it unofficial. >> >> I have also discussed this with Horizon team during their meeting >> (http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-01-09-15.02.log.html#l-31) >> and now I am bit stuck because I do not know how to proceed next. >> Could you, please, advise me? > > It looks like much of this confusion stemmed from recommendation by > project-config-core reviewers, unfortunately. We too often see > people from official teams in OpenStack request new Git repositories > for work their team will be performing, but who forget to also > record them in the appropriate governance lists. As a result, if a > proposed repository looks closely-related to the work of an existing > team (in this case possibly either Horizon or Telemetry) we usually > assume this was the case and recommend during the review process > that they file a corresponding change to the OpenStack TC's > governance repository. Given this is an independent group's work for > which neither the Horizon nor Telemetry teams have expressed an > interest in adopting responsibility, it's perfectly acceptable to > have it operate as an unofficial project or to apply for status as > another official project team within OpenStack. 
> > The main differences between the two options are that contributors > to official OpenStack project teams gain the ability to vote in > Technical Committee elections, their repositories can publish > documentation on the https://docs.openstack.org/ Web site, they're > able to reserve space for team-specific discussions and working > sessions at OSF Project Teams Gathering meetings (such as the one > coming up in Denver immediately following the Open Infrastructure > Summit)... but official project teams are also expected to hold team > lead elections twice a year, participate in OpenStack release > processes, follow up on implementing cycle goals, and otherwise meet > the requirements laid out in our > https://governance.openstack.org/tc/reference/new-projects-requirements.html > document. > -- > Jeremy Stanley Jakub, thank you for starting this thread. As you can see from Jeremy's response, you have a couple of options. You had previously told me you wanted the repository to be "official", and since the existing teams do not want to manage it I think that it is likely that you will want to create a new team for it. However, since that path does introduce some obligations, before you go ahead it would be good to understand what benefits you are seeking by joining an official team. Can you fill in some background for us, so we can offer the best guidance? -- Doug From lyarwood at redhat.com Thu Jan 10 13:56:05 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Thu, 10 Jan 2019 13:56:05 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190109151329.GA7953@sanger.ac.uk> References: <20190109151329.GA7953@sanger.ac.uk> Message-ID: <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> On 09-01-19 15:13:29, Dave Holland wrote: > Hello, > > I've just started investigating Cinder volume encryption using Queens > (RHOSP13) with a Ceph/RBD backend and the performance overhead is... > surprising. Some naive bonnie++ numbers, comparing a plain vs encrypted > volume: > > plain: write 1400MB/s, read 390MB/s > encrypted: write 81MB/s, read 83MB/s > > The encryption was configured with: > > openstack volume type create --encryption-provider nova.volume.encryptors.luks.LuksEncryptor --encryption-cipher aes-xts-plain64 --encryption-key-size 256 --encryption-control-location front-end LuksEncryptor-Template-256 > > Does anyone have a similar setup, and can share their performance > figures, or give me an idea of what percentage performance impact I > should expect? Alternatively: is AES256 overkill, or, where should I > start looking for a misconfiguration or bottleneck? What's the underlying version of QEMU being used here? FWIW I can't recall seeing any performance issues when working on and verifying this downstream with QEMU 2.10. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: From cdent+os at anticdent.org Thu Jan 10 14:14:38 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 10 Jan 2019 14:14:38 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC Message-ID: Recently Thierry, with the help of other TC members, wrote down the perceived role of the TC [1]. This was inspired by the work on the "Vision for OpenStack Clouds" [2]. 
If we think we should have that document to help validate and direct our software development, we should have something similar to validate governance. Now we need to make sure the document reflects not just how things are but also how they should be. We (the TC) would like feedback from the community on the following general questions (upon which you should feel free to expand as necessary). * Does the document accurately reflect what you see the TC doing? * What's in the list that shouldn't be? * What's not in the list that should be? * Should something that is listed be done more or less? Discussions like these are sometimes perceived as pointless navel gazing. That's a fair complaint when they result in nothing changing (if it should). In this case however, it is fair to say that the composition of the OpenStack community is changing and we _may_ need some adjustments in governance to effectively adapt. We can't know if any changes should be big or little until we talk about them. We have several weeks before the next TC election, so now seems an appropriate time. Note that the TC was chartered with a mission [3]: The Technical Committee (“TC”) is tasked with providing the technical leadership for OpenStack as a whole (all official projects, as defined below). It enforces OpenStack ideals (Openness, Transparency, Commonality, Integration, Quality…), decides on issues affecting multiple projects, forms an ultimate appeals board for technical decisions, and generally has technical oversight over all of OpenStack. Thanks for your participation and help. [1] https://governance.openstack.org/tc/reference/role-of-the-tc.html [2] https://governance.openstack.org/tc/reference/technical-vision.html [3] https://governance.openstack.org/tc/reference/charter.html#mission -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From dangtrinhnt at gmail.com Thu Jan 10 14:41:27 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 10 Jan 2019 23:41:27 +0900 Subject: [Searchlight] We reached Stein-2 milestone Message-ID: Hi team, Just so you know, we reached Stein-2 milestone and were able to release Searchlight yesterday :) Yay! I put a document here [1] to summarize what we covered in this release. Hope that it will get you excited and understand our vision. [1] https://www.dangtrinh.com/2019/01/searchlight-at-stein-2-r-14-r-13.html Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Thu Jan 10 15:15:24 2019 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 10 Jan 2019 16:15:24 +0100 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Message-ID: Make sense so +1 Le jeu. 10 janv. 2019 14:27, Doug Hellmann a écrit : > "Nguyen Hung, Phuong" writes: > > > Hi Ben, > > > >> I suggest that we either WIP or abandon the current > >> patch series. > > ... > >> If you have any thoughts about this plan please let me know. Otherwise > I > >> will act on it sometime in the near-ish future. > > > > Thanks for your consideration. I am agree with you, please help me to > abandon them because I am not privileged with those patches. > > > > Regards, > > Phuong. > > +1 for abandoning them, at least for now. As Ben points out, gerrit will > still have copies. 
> > Doug > > > > > -----Original Message----- > > From: Ben Nemec [mailto:openstack at nemebean.com] > > Sent: Thursday, January 10, 2019 6:12 AM > > To: Herve Beraud; Nguyen, Hung Phuong > > Cc: openstack-discuss at lists.openstack.org > > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for upgrade > - coordinate teams > > > > > > > > On 12/20/18 4:41 AM, Herve Beraud wrote: > >> > >> > >> Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong > >> > a écrit : > >> > >> Hi Ben, > >> > >> I am apology that in last month we do not have much time maintaining > >> the code. > >> > >> > but if no one's going to use it then I'd rather cut our > >> > losses than continue pouring time into it. > >> > >> I agree, we will wait for the community to decide the need for the > >> feature. > >> In the near future, we do not have ability to maintain the code. If > >> anyone > >> has interest to continue maintaining the patch, we will support with > >> document, > >> reviewing... in our possibility. > >> > >> > >> I can help you to maintain the code if needed. > >> > >> Personaly I doesn't need this feature so I agree Ben and Doug point of > view. > >> > >> We need to measure how many this feature is useful and if it make sense > >> to support and maintain more code in the future related to this feature > >> without any usages behind that. > > > > We discussed this again in the Oslo meeting this week, and to share with > > the wider audience here's what I propose: > > > > Since the team that initially proposed the feature and that we expected > > to help maintain it are no longer able to do so, and it's not clear to > > the Oslo team that there is sufficient demand for a rather complex > > feature like this, I suggest that we either WIP or abandon the current > > patch series. Gerrit never forgets, so if at some point there are > > contributors (new or old) who have a vested interest in the feature we > > can always resurrect it. > > > > If you have any thoughts about this plan please let me know. Otherwise I > > will act on it sometime in the near-ish future. > > > > In the meantime, if anyone is desperate for Oslo work to do here are a > > few things that have been lingering on my todo list: > > > > * We have a unit test in oslo.utils (test_excutils) that is still using > > mox. That needs to be migrated to mock. > > * oslo.cookiecutter has a number of things that are out of date (doc > > layout, lack of reno, coverage job). Since it's unlikely we've reached > > peak Oslo library we should update that so there aren't a bunch of > > post-creation changes needed like there were with oslo.upgradecheck (and > > I'm guessing oslo.limit). > > * The config validator still needs support for dynamic groups, if > > oslo.config is your thing. > > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > > > Thanks. > > > > -Ben > > > > -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From msm at redhat.com Thu Jan 10 15:24:53 2019 From: msm at redhat.com (Michael McCune) Date: Thu, 10 Jan 2019 10:24:53 -0500 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: thanks for posting this Chris, i have one minor question On Thu, Jan 10, 2019 at 9:17 AM Chris Dent wrote: > Now we need to make sure the document reflects not just how things > are but also how they should be. We (the TC) would like feedback > from the community on the following general questions (upon which > you should feel free to expand as necessary). 
where is the best venue for providing feedback? i see these documents are published, should we start threads on the ml (or use this one), or make issues somewhere? peace o/ From cdent+os at anticdent.org Thu Jan 10 15:27:08 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 10 Jan 2019 15:27:08 +0000 (GMT) Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: On Thu, 10 Jan 2019, Michael McCune wrote: > On Thu, Jan 10, 2019 at 9:17 AM Chris Dent wrote: >> Now we need to make sure the document reflects not just how things >> are but also how they should be. We (the TC) would like feedback >> from the community on the following general questions (upon which >> you should feel free to expand as necessary). > > where is the best venue for providing feedback? > > i see these documents are published, should we start threads on the ml > (or use this one), or make issues somewhere? Sorry that I wasn't clear about that. Here, on this thread, would be a great place to start. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From bdobreli at redhat.com Thu Jan 10 15:34:45 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Thu, 10 Jan 2019 16:34:45 +0100 Subject: [Edge-computing] Use cases mapping to MVP architectures - FEEDBACK NEEDED In-Reply-To: References: Message-ID: <133ad3bc-bc57-b627-7b43-94f8ac846746@redhat.com> On 10.01.2019 14:43, Ildiko Vancsa wrote: > Hi, > > We are reaching out to you about the use cases for edge cloud infrastructure that the Edge Computing Group is working on to collect. They are recorded in our wiki [1] and they describe high level scenarios when an edge cloud infrastructure would be needed. Hello. Verifying the mappings created for the "Elementary operations on a site" [18] feature against the distributed glance specification [19], I can see a vital feature is missing for "Advanced operations on a site", like creating an image locally, when the parent control plane is not available. And consequences coming off that, like availability of create snapshots for Nova as well. All that boils down to a) better identifying the underlying requirement/limitations for CRUD operations available for middle edge sites in the Distributed Control Plane case. And b) the requirement of data replication and conflicts resolving tooling, which comes out, if we assume we want all CRUDs being always available for middle edge sites disregard of the parent edge's control plane state. So that is the missing and important thing to have socialised and noted for the mappings. [18] https://wiki.openstack.org/wiki/MappingOfUseCasesFeaturesRequirementsAndUserStories#Elementary_operations_on_one_site [19] https://review.openstack.org/619638 > > During the second Denver PTG discussions we drafted two MVP architectures what we could build from the current functionality of OpenStack with some slight modifications [2]. These are based on the work of James and his team from Oath. We differentiate between a distributed [3] and a centralized [4] control plane architecture scenarios. > > In one of the Berlin Forum sessions we were asked to map the MVP architecture scenarios to the use cases so I made an initial mapping and now we are looking for feedback. > > This mapping only means, that the listed use case can be implemented using the MVP architecture scenarios. It should be noted, that none of the MVP architecture scenarios provide solution for edge cloud infrastructure upgrade or centralized management. 
> > Please comment on the wiki or in a reply to this mail in case you have questions or disagree with the initial mapping we put together. > > Please let us know if you have any questions. > > > Here is the use cases and the mapped architecture scenarios: > > Mobile service provider 5G/4G virtual RAN deployment and Edge Cloud B2B2X [5] > Both distributed [3] and centralized [4] > Universal customer premise equipment (uCPE) for Enterprise Network Services[6] > Both distributed [3] and centralized [4] > Unmanned Aircraft Systems (Drones) [7] > None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event > Cloud Storage Gateway - Storage at the Edge [8] > None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event > Open Caching - stream/store data at the edge [9] > Both distributed [3] and centralized [4] > Smart City as Software-Defined closed-loop system [10] > The use case is not complete enough to figure out > Augmented Reality -- Sony Gaming Network [11] > None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event > Analytics/control at the edge [12] > The use case is not complete enough to figure out > Manage retail chains - chick-fil-a [13] > The use case is not complete enough to figure out > At this moment chick-fil-a uses a different Kubernetes cluster in every edge location and they manage them using Git [14] > Smart Home [15] > None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event > Data Collection - Smart cooler/cold chain tracking [16] > None - assuming that this Use Case requires a Small Edge instance which can work in case of a network partitioning event > VPN Gateway Service Delivery [17] > The use case is not complete enough to figure out > > [1]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases > [2]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures > [3]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Distributed_Control_Plane_Scenario > [4]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Centralized_Control_Plane_Scenario > [5]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Mobile_service_provider_5G.2F4G_virtual_RAN_deployment_and_Edge_Cloud_B2B2X. 
> [6]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Universal_customer_premise_equipment_.28uCPE.29_for_Enterprise_Network_Services > [7]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Unmanned_Aircraft_Systems_.28Drones.29 > [8]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Cloud_Storage_Gateway_-_Storage_at_the_Edge > [9]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Open_Caching_-_stream.2Fstore_data_at_the_edge > [10]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_City_as_Software-Defined_closed-loop_system > [11]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Augmented_Reality_--_Sony_Gaming_Network > [12]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Analytics.2Fcontrol_at_the_edge > [13]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Manage_retail_chains_-_chick-fil-a > [14]: https://schd.ws/hosted_files/kccna18/34/GitOps.pdf > [15]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Smart_Home > [16]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#Data_Collection_-_Smart_cooler.2Fcold_chain_tracking > [17]: https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases#VPN_Gateway_Service_Delivery > > > Thanks and Best Regards, > Gergely and Ildikó > > > > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing > -- Best regards, Bogdan Dobrelya, Irc #bogdando From rob at cleansafecloud.com Thu Jan 10 15:49:14 2019 From: rob at cleansafecloud.com (Robert Donovan) Date: Thu, 10 Jan 2019 15:49:14 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation Message-ID: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> Hello Nova folks, I spoke to some of you very briefly about this in Berlin (thanks again for your time), and we were resigned to turning off SMT to fully protect against future CPU cache side-channel attacks as I know many others have done. However, we have stubbornly done a bit of last-resort research and testing into using vCPU pinning on a per-tenant basis as an alternative and I’d like to lay it out in more detail for you to make sure there are no legs in the idea before abandoning it completely. The idea is to use libvirt’s vcpupin ability to ensure that two different tenants never share the same physical CPU core, so they cannot theoretically steal each other’s data via an L1 or L2 cache side-channel. The pinning would be optimised to make use of as many logical cores as possible for any given tenant. We would also isolate other key system processes to a separate range of physical cores. After discussions in Berlin, we ran some tests with live migration, as this is key to our maintenance activities and would be a show-stopped if it didn’t work. We found that removing any pinning restrictions immediately prior to migration resulted in them being completely reset on the target host, which could then be optimised accordingly post-migration. Unfortunately, there would be a small window of time where we couldn’t prevent tenants from sharing a physical core on the target host after a migration, but we think this is an acceptable risk given the nature of these attacks. 
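To make the mechanism concrete before going further: the kind of per-vCPU pinning in question is what libvirt exposes through vcpupin, roughly as below (a sketch only - the domain names and core numbers are illustrative, not taken from a real deployment):

    virsh vcpupin instance-000000a1 0 2 --live    # tenant A guest, vCPU 0 -> pCPU 2
    virsh vcpupin instance-000000a1 1 26 --live   # tenant A guest, vCPU 1 -> pCPU 26 (SMT sibling of 2)
    virsh vcpupin instance-000000b7 0 3 --live    # tenant B guest, vCPU 0 -> pCPU 3
    virsh vcpupin instance-000000b7 1 27 --live   # tenant B guest, vCPU 1 -> pCPU 27 (SMT sibling of 3)

The same placement can be made persistent through <cputune>/<vcpupin> elements in the domain XML.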
Obviously, this approach may not be appropriate in many circumstances, such as if you have many tenants who just run single VMs with one vCPU, or if over-allocation is in use. We have also only looked at KVM and libvirt. I would love to know what people think of this approach however. Are there any other clear issues that you can think of which we may not have considered? If it seems like a reasonable idea, is it something that could fit into Nova and, if so, where in the architecture is the best place for it to sit? I know you can currently specify per-instance CPU pinning via flavor parameters, so a similar approach could be taken for this strategy. Alternatively, we can look at implementing it as an external plugin of some kind for use by those with a similar setup. Many thanks, Rob From msm at redhat.com Thu Jan 10 15:49:21 2019 From: msm at redhat.com (Michael McCune) Date: Thu, 10 Jan 2019 10:49:21 -0500 Subject: [tc] [all] Please help verify the role of the TC In-Reply-To: References: Message-ID: On Thu, Jan 10, 2019 at 10:31 AM Chris Dent wrote: > Sorry that I wasn't clear about that. Here, on this thread, would be > a great place to start. no problem, thanks for the clarification =) peace o/ From jaypipes at gmail.com Thu Jan 10 16:05:43 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Thu, 10 Jan 2019 11:05:43 -0500 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> Message-ID: <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> On 01/10/2019 10:49 AM, Robert Donovan wrote: > Hello Nova folks, > > I spoke to some of you very briefly about this in Berlin (thanks again for your time), and we were resigned to turning off SMT to fully protect against future CPU cache side-channel attacks as I know many others have done. However, we have stubbornly done a bit of last-resort research and testing into using vCPU pinning on a per-tenant basis as an alternative and I’d like to lay it out in more detail for you to make sure there are no legs in the idea before abandoning it completely. > > The idea is to use libvirt’s vcpupin ability to ensure that two different tenants never share the same physical CPU core, so they cannot theoretically steal each other’s data via an L1 or L2 cache side-channel. The pinning would be optimised to make use of as many logical cores as possible for any given tenant. We would also isolate other key system processes to a separate range of physical cores. After discussions in Berlin, we ran some tests with live migration, as this is key to our maintenance activities and would be a show-stopped if it didn’t work. We found that removing any pinning restrictions immediately prior to migration resulted in them being completely reset on the target host, which could then be optimised accordingly post-migration. Unfortunately, there would be a small window of time where we couldn’t prevent tenants from sharing a physical core on the target host after a migration, but we think this is an acceptable risk given the nature of these attacks. > > Obviously, this approach may not be appropriate in many circumstances, such as if you have many tenants who just run single VMs with one vCPU, or if over-allocation is in use. We have also only looked at KVM and libvirt. I would love to know what people think of this approach however. 
Are there any other clear issues that you can think of which we may not have considered? If it seems like a reasonable idea, is it something that could fit into Nova and, if so, where in the architecture is the best place for it to sit? I know you can currently specify per-instance CPU pinning via flavor parameters, so a similar approach could be taken for this strategy. Alternatively, we can look at implementing it as an external plugin of some kind for use by those with a similar setup. IMHO, if you're going to go through all the hassle of pinning guest vCPU threads to distinct logical host processors, you might as well just use dedicated CPU resources for everything. As you mention above, you can't have overcommit anyway if you're concerned about this problem. Once you have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU resources to a dedicated host CPU -> guest CPU situation so you might as well just use CPU pinning and deal with all the headaches that brings with it. Best, jay From chris.friesen at windriver.com Thu Jan 10 16:08:05 2019 From: chris.friesen at windriver.com (Chris Friesen) Date: Thu, 10 Jan 2019 10:08:05 -0600 Subject: [nova] Mempage fun In-Reply-To: <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> References: <5227a54413a3ac699f820c4103049db53e1fd66c.camel@redhat.com> <1546937673.17763.2@smtp.office365.com> <55a61624deac4452f49343c73df22639de35f34f.camel@redhat.com> Message-ID: <328b78c1-5993-aef1-b279-fb04677b6e98@windriver.com> On 1/8/2019 12:38 PM, Stephen Finucane wrote: > I have (1) fixed here: > > https://review.openstack.org/#/c/629281/ > > That said, I'm not sure if it's the best thing to do. From what I'm > hearing, it seems the advice we should be giving is to not mix > instances with/without NUMA topologies, with/without hugepages and > with/without CPU pinning. We've only documented the latter, as > discussed on this related bug by cfriesen: > > https://bugs.launchpad.net/nova/+bug/1792985 > > Given that we should be advising folks not to mix these (something I > wasn't aware of until now), what does the original patch actually give > us? I think we should look at it from the other direction...what is the ultimate *desired* behaviour? Personally, I'm coming at it from a "small-cloud" perspective where we may only have one or two compute nodes. As such, the host-aggregate solution doesn't really work. I would like to be able to run cpu-pinned and cpu-shared instances on the same node. I would like to run small-page (with overcommit) and huge-page (without overcommit) instances on the same node. I would like to run cpu-shared/small-page instances (which float over the whole host) on the same host as a cpu-pinned/small-page instance (which is pinned to specific NUMA nodes). We have a warning in the docs currently that is specifically for separating CPU-pinned and CPU-shared instances, but we also have a spec that plans to specifically support that case. The way the code is currently written we also need to separate NUMA-affined small-page instances from non-NUMA-affined small-page instances, but I think that's a bug, not a sensible design. 
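To make those combinations concrete, the flavors involved would look something like the following (the names are made up and the commands are only an illustration; the extra specs are the standard hw:* keys):

    openstack flavor create pinned.small --vcpus 2 --ram 2048 --disk 10
    openstack flavor set pinned.small --property hw:cpu_policy=dedicated

    openstack flavor create shared.small --vcpus 2 --ram 2048 --disk 10
    # no extra specs: floating (shared) CPUs and small pages are the defaults

    openstack flavor create shared.huge --vcpus 2 --ram 2048 --disk 10
    openstack flavor set shared.huge --property hw:mem_page_size=large
    # asking for huge pages implicitly gives the guest a NUMA topology

The question is which of those can safely land on the same compute node.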
Chris From grant at absolutedevops.io Thu Jan 10 13:16:58 2019 From: grant at absolutedevops.io (Grant Morley) Date: Thu, 10 Jan 2019 13:16:58 +0000 Subject: Issues setting up a SolidFire node with Cinder Message-ID: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> Hi all, We are in the process of trying to add a SolidFire storage solution to our existing OpenStack setup and seem to have hit a snag with cinder / iscsi. We are trying to create a bootable volume to allow us to launch an instance from it, but we are getting some errors in our cinder-volumes containers that seem to suggest they can't connect to iscsi although the volume seems to create fine on the SolidFire node. The command we are running is: openstack volume create --image $image-id --size 20 --bootable --type solidfire sf-volume-v12 The volume seems to create on SolidFire but I then see these errors in the "cinder-volume.log" https://pastebin.com/LyjLUhfk The volume containers can talk to the iscsi VIP on the SolidFire so I am a bit stuck and wondered if anyone had come across any issues before? Kind Regards, -- Grant Morley Cloud Lead Absolute DevOps Ltd Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP www.absolutedevops.io grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Jan 10 17:12:41 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 11:12:41 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> Message-ID: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> On 1/10/19 12:17 AM, Ghanshyam Mann wrote: > ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- > > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: > > > Dear All, > > > > > > There are many occasion when we want to priorities some of the patches > > > whether it is related to unblock the gates or blocking the non freeze > > > patches during RC. > > > > > > So adding the Review-Priority will allow more precise dashboard. As > > > Designate and Cinder projects already experiencing this[1][2] and after > > > discussion with Jeremy brought this to ML to interact with these team > > > before landing [3], as there is possibility that reapply the priority vote > > > following any substantive updates to change could make it more cumbersome > > > than it is worth. > > > > With Cinder this is fairly new, but I think it is working well so far. The > > oddity we've run into, that I think you're referring to here, is how those > > votes carry forward with updates. > > > > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when > > This idea looks great and helpful especially for blockers and cycle priority patches to get regular > review bandwidth from Core or Active members of that project. > > IMO only +ve votes are more appropriate for this label. -1 is little confusing for many reasons like > what is the difference between Review-Priority -1 and Code-Review -2 ? Review-Priority -1 means, > it is less priority than 0/not labelled (explicitly setting any patch very less priority). > > After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural > or technical reason. But that can be done by -2 on Code-Review label. 
Keeping Review-Priority label > only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. > Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes > priority for review but having multiple +ve vote set as per project need/interest is all fine. I don't know if this was the reasoning behind Cinder's system, but I know some people object to procedural -2 because it's a big hammer to essentially say "not right now". It overloads the meaning of the vote in a potentially confusing way that requires explanation every time it's used. At least I hope procedural -2's always include a comment. Whether adding a whole new vote type is a meaningful improvement is another question, but if we're adding the type anyway for prioritization it might make sense to use it to replace procedural -2. Especially if we could make it so any core can change it (apparently not right now), whereas -2 requires the original core to come back and remove it. From singh.surya64mnnit at gmail.com Thu Jan 10 17:31:52 2019 From: singh.surya64mnnit at gmail.com (Surya Singh) Date: Thu, 10 Jan 2019 23:01:52 +0530 Subject: [kolla] Stepping down from core In-Reply-To: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> References: <5329f6c9-5bc9-1b20-5531-cfab2b58108b@oracle.com> Message-ID: Hi Paul, Thanks a lot for your long term contribution to make Kolla great project. Sad to see you stepping down. Hope to see you around. All the very best for your new project. --- Thanks Surya On Thu, Jan 10, 2019 at 5:13 PM Paul Bourke wrote: > Hi all, > > Due to a change of direction for me I'll be stepping down from the Kolla > core group. It's been a blast, thanks to everyone I've worked/interacted > with over the past few years. Thanks in particular to Eduardo who's done > a stellar job of PTL since taking the reins. I hope we'll cross paths > again in the future :) > > All the best! > -Paul > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Thu Jan 10 17:34:41 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 10 Jan 2019 11:34:41 -0600 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> Message-ID: <5C378231.8010603@openstack.org> > Arkady.Kanevsky at dell.com > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > We do still use DriverLog for the Marketplace drivers listing. We have a cronjob set up to ingest nightly from Stackalytics. We also have the ability to CRUD the listings in the Foundation website CMS. That said, as Boris mentioned, the list is really not used much and I know there is a lot of out of date info there. We're planning to move the marketplace list to yaml in a public repo, similar to what we did for OpenStack Map [1]. Cheers, Jimmy [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). 
Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Boris Renski > January 8, 2019 at 11:10 AM > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Jan 10 17:36:09 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 11:36:09 -0600 Subject: [oslo][migrator] RFE Configuration mapping tool for upgrade - coordinate teams In-Reply-To: References: <4c6d85cc-a566-f981-433e-992a7433a236@nemebean.com> <15e6dee690c342de97c2686771dae2c8@G07SGEXCMSGPS05.g07.fujitsu.local> Message-ID: Thanks for the quick feedback everyone. I've abandoned the patch series, although I did pull out one change that seemed to be a valid bugfix independent of the migrator work: https://review.openstack.org/#/c/607690/ On 1/10/19 9:15 AM, Herve Beraud wrote: > Make sense so +1 > > Le jeu. 10 janv. 2019 14:27, Doug Hellmann > a écrit : > > "Nguyen Hung, Phuong" > writes: > > > Hi Ben, > > > >> I suggest that we either WIP or abandon the current > >> patch series. > > ... > >> If you have any thoughts about this plan please let me know. > Otherwise I > >> will act on it sometime in the near-ish future. > > > > Thanks for your consideration. I am agree with you, please help > me to abandon them because I am not privileged with those patches. > > > > Regards, > > Phuong. > > +1 for abandoning them, at least for now. As Ben points out, gerrit will > still have copies. > > Doug > > > > > -----Original Message----- > > From: Ben Nemec [mailto:openstack at nemebean.com ] > > Sent: Thursday, January 10, 2019 6:12 AM > > To: Herve Beraud; Nguyen, Hung Phuong > > Cc: openstack-discuss at lists.openstack.org > > > Subject: Re: [oslo][migrator] RFE Configuration mapping tool for > upgrade - coordinate teams > > > > > > > > On 12/20/18 4:41 AM, Herve Beraud wrote: > >> > >> > >> Le jeu. 20 déc. 2018 à 09:26, Nguyen Hung, Phuong > >> > >> a écrit : > >> > >>     Hi Ben, > >> > >>     I am apology that in last month we do not have much time maintaining > >>     the code. > >> > >>      > but if no one's going to use it then I'd rather cut our > >>      > losses than continue pouring time into it. > >> > >>     I agree, we will wait for the community to decide the need for the > >>     feature. > >>     In the near future, we do not have ability to maintain the code. If > >>     anyone > >>     has interest to continue maintaining the patch, we will support with > >>     document, > >>     reviewing... in our possibility. > >> > >> > >> I can help you to maintain the code if needed. > >> > >> Personaly I doesn't need this feature so I agree Ben and Doug point of view. > >> > >> We need to measure how many this feature is useful and if it make sense > >> to support and maintain more code in the future related to this feature > >> without any usages behind that. 
> > > > We discussed this again in the Oslo meeting this week, and to > share with > > the wider audience here's what I propose: > > > > Since the team that initially proposed the feature and that we > expected > > to help maintain it are no longer able to do so, and it's not > clear to > > the Oslo team that there is sufficient demand for a rather complex > > feature like this, I suggest that we either WIP or abandon the > current > > patch series. Gerrit never forgets, so if at some point there are > > contributors (new or old) who have a vested interest in the > feature we > > can always resurrect it. > > > > If you have any thoughts about this plan please let me know. > Otherwise I > > will act on it sometime in the near-ish future. > > > > In the meantime, if anyone is desperate for Oslo work to do here > are a > > few things that have been lingering on my todo list: > > > > * We have a unit test in oslo.utils (test_excutils) that is still > using > > mox. That needs to be migrated to mock. > > * oslo.cookiecutter has a number of things that are out of date (doc > > layout, lack of reno, coverage job). Since it's unlikely we've > reached > > peak Oslo library we should update that so there aren't a bunch of > > post-creation changes needed like there were with > oslo.upgradecheck (and > > I'm guessing oslo.limit). > > * The config validator still needs support for dynamic groups, if > > oslo.config is your thing. > > * There are 326 bugs open across Oslo projects. Help wanted. :-) > > > > Thanks. > > > > -Ben > > > > -- > Doug > From Arkady.Kanevsky at dell.com Thu Jan 10 17:38:23 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Thu, 10 Jan 2019 17:38:23 +0000 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <5C378231.8010603@openstack.org> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> Message-ID: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Thanks Jimmy. Since I am responsible for updating marketplace per release I just need to know what mechanism to use and which file I need to patch. Thanks, Arkady From: Jimmy McArthur Sent: Thursday, January 10, 2019 11:35 AM To: openstack-dev at lists.openstack.org; openstack-discuss at lists.openstack.org Subject: Re: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Arkady.Kanevsky at dell.com January 9, 2019 at 9:20 AM Thanks Boris. Do we still use DriverLog for marketplace driver status updates? We do still use DriverLog for the Marketplace drivers listing. We have a cronjob set up to ingest nightly from Stackalytics. We also have the ability to CRUD the listings in the Foundation website CMS. That said, as Boris mentioned, the list is really not used much and I know there is a lot of out of date info there. We're planning to move the marketplace list to yaml in a public repo, similar to what we did for OpenStack Map [1]. Cheers, Jimmy [1] https://git.openstack.org/cgit/openstack/openstack-map/ Thanks, Arkady From: Boris Renski Sent: Tuesday, January 8, 2019 11:11 AM To: openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; David Stoltenberg Subject: [openstack-dev] [stackalytics] Stackalytics Facelift [EXTERNAL EMAIL] Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). 
Brief summary of updates: * We have new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the men on the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris Boris Renski January 8, 2019 at 11:10 AM Folks, Happy New Year! We wanted to start the year by giving a facelift to stackalytics.com (based on stackalytics openstack project). Brief summary of updates: * We have new look and feel at stackalytics.com * We did away with DriverLog and Member Directory, which were not very actively used or maintained. Those are still available via direct links, but not in the men on the top * BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated project commits via a separate subsection accessible at the top nav. Before this was all bunched up in Project Type -> Complimentary Happy to hear comments or feedback or answer questions. -Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Thu Jan 10 17:56:54 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 10 Jan 2019 17:56:54 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> Message-ID: <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote: > On 01/10/2019 10:49 AM, Robert Donovan wrote: > > Hello Nova folks, > > > > I spoke to some of you very briefly about this in Berlin (thanks > > again for your time), and we were resigned to turning off SMT to > > fully protect against future CPU cache side-channel attacks as I > > know many others have done. However, we have stubbornly done a bit > > of last-resort research and testing into using vCPU pinning on a > > per-tenant basis as an alternative and I’d like to lay it out in > > more detail for you to make sure there are no legs in the idea > > before abandoning it completely. > > > > The idea is to use libvirt’s vcpupin ability to ensure that two > > different tenants never share the same physical CPU core, so they > > cannot theoretically steal each other’s data via an L1 or L2 cache > > side-channel. The pinning would be optimised to make use of as many > > logical cores as possible for any given tenant. We would also > > isolate other key system processes to a separate range of physical > > cores. After discussions in Berlin, we ran some tests with live > > migration, as this is key to our maintenance activities and would > > be a show-stopped if it didn’t work. We found that removing any > > pinning restrictions immediately prior to migration resulted in > > them being completely reset on the target host, which could then be > > optimised accordingly post-migration. Unfortunately, there would be > > a small window of time where we couldn’t prevent tenants from > > sharing a physical core on the target host after a migration, but > > we think this is an acceptable risk given the nature of these > > attacks. 
> > > > Obviously, this approach may not be appropriate in many > > circumstances, such as if you have many tenants who just run single > > VMs with one vCPU, or if over-allocation is in use. We have also > > only looked at KVM and libvirt. I would love to know what people > > think of this approach however. Are there any other clear issues > > that you can think of which we may not have considered? If it seems > > like a reasonable idea, is it something that could fit into Nova > > and, if so, where in the architecture is the best place for it to > > sit? I know you can currently specify per-instance CPU pinning via > > flavor parameters, so a similar approach could be taken for this > > strategy. Alternatively, we can look at implementing it as an > > external plugin of some kind for use by those with a similar setup. > > IMHO, if you're going to go through all the hassle of pinning guest vCPU > threads to distinct logical host processors, you might as well just use > dedicated CPU resources for everything. As you mention above, you can't > have overcommit anyway if you're concerned about this problem. Once you > have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU > resources to a dedicated host CPU -> guest CPU situation so you might as > well just use CPU pinning and deal with all the headaches that brings > with it. Indeed. My initial answer to this was "use CPU thread policies" (specifically, the 'require' policy) to ensure each instance owns its entire core, thinking you were using dedicated/pinned CPUs. For shared CPUs, I'm not sure how we could ever do something like you've proposed in a manner that would result in less than the ~20% or so performance degradation I usually see quoted when turning off SMT. Far too much second guessing of the expected performance requirements of the guest would be necessary. Stephen From kbcaulder at gmail.com Thu Jan 10 18:34:40 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Thu, 10 Jan 2019 10:34:40 -0800 Subject: [cinder] db sync error upgrading from pike to queens Message-ID: Hi, I am receiving the following error when performing an offline upgrade of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to openstack-cinder-1:12.0.3-1.el7. # cinder-manage db version 105 # cinder-manage --debug db sync Error during database migration: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes SET shared_targets=%(shared_targets)s'] [parameters: {'shared_targets': 1}] # cinder-manage db version 114 The db version does not upgrade to queens version 117. Any help would be appreciated. Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Thu Jan 10 19:01:27 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Thu, 10 Jan 2019 13:01:27 -0600 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Brandon, I am thinking you are hitting this bug: https://bugs.launchpad.net/cinder/+bug/1806156 I think you can work around it by retrying the migration with the volume service running.  You may, however, want to check with Iain MacDonnell as he has been looking at this for a while. Thanks! Jay On 1/10/2019 12:34 PM, Brandon Caulder wrote: > Hi, > > I am receiving the following error when performing an offline upgrade > of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > openstack-cinder-1:12.0.3-1.el7. 
> > # cinder-manage db version > 105 > > # cinder-manage --debug db sync > Error during database migration: (pymysql.err.OperationalError) (2013, > 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > SET shared_targets=%(shared_targets)s'] [parameters: > {'shared_targets': 1}] > > # cinder-manage db version > 114 > > The db version does not upgrade to queens version 117.  Any help would > be appreciated. > > Thank you From smooney at redhat.com Thu Jan 10 19:02:51 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 10 Jan 2019 19:02:51 +0000 Subject: [nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation In-Reply-To: <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> References: <22D88272-896F-43EF-88AA-15DA202C5465@cleansafecloud.com> <0b37748c-bbc4-e5cf-a434-6adcd0248b64@gmail.com> <1c2024d88b8c900edb2f063b4203da3d5cc76c11.camel@redhat.com> Message-ID: On Thu, 2019-01-10 at 17:56 +0000, Stephen Finucane wrote: > On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote: > > On 01/10/2019 10:49 AM, Robert Donovan wrote: > > > Hello Nova folks, > > > > > > I spoke to some of you very briefly about this in Berlin (thanks > > > again for your time), and we were resigned to turning off SMT to > > > fully protect against future CPU cache side-channel attacks as I > > > know many others have done. However, we have stubbornly done a bit > > > of last-resort research and testing into using vCPU pinning on a > > > per-tenant basis as an alternative and I’d like to lay it out in > > > more detail for you to make sure there are no legs in the idea > > > before abandoning it completely. > > > > > > The idea is to use libvirt’s vcpupin ability to ensure that two > > > different tenants never share the same physical CPU core, so they > > > cannot theoretically steal each other’s data via an L1 or L2 cache > > > side-channel. The pinning would be optimised to make use of as many > > > logical cores as possible for any given tenant. We would also > > > isolate other key system processes to a separate range of physical > > > cores. After discussions in Berlin, we ran some tests with live > > > migration, as this is key to our maintenance activities and would > > > be a show-stopped if it didn’t work. We found that removing any > > > pinning restrictions immediately prior to migration resulted in > > > them being completely reset on the target host, which could then be > > > optimised accordingly post-migration. Unfortunately, there would be > > > a small window of time where we couldn’t prevent tenants from > > > sharing a physical core on the target host after a migration, but > > > we think this is an acceptable risk given the nature of these > > > attacks. > > > > > > Obviously, this approach may not be appropriate in many > > > circumstances, such as if you have many tenants who just run single > > > VMs with one vCPU, or if over-allocation is in use. We have also > > > only looked at KVM and libvirt. I would love to know what people > > > think of this approach however. Are there any other clear issues > > > that you can think of which we may not have considered? If it seems > > > like a reasonable idea, is it something that could fit into Nova > > > and, if so, where in the architecture is the best place for it to > > > sit? I know you can currently specify per-instance CPU pinning via > > > flavor parameters, so a similar approach could be taken for this > > > strategy. 
Alternatively, we can look at implementing it as an > > > external plugin of some kind for use by those with a similar setup. > > > > IMHO, if you're going to go through all the hassle of pinning guest vCPU > > threads to distinct logical host processors, you might as well just use > > dedicated CPU resources for everything. As you mention above, you can't > > have overcommit anyway if you're concerned about this problem. Once you > > have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU > > resources to a dedicated host CPU -> guest CPU situation so you might as > > well just use CPU pinning and deal with all the headaches that brings > > with it. > > Indeed. My initial answer to this was "use CPU thread policies" > (specifically, the 'require' policy) to ensure each instance owns its > entire core, thinking you were using dedicated/pinned CPUs. The isolate policy should address this. The require policy would work for an even number of cores on a single NUMA node, but it does not address the case where you have multiple NUMA nodes, e.g. 14 cores spread across 2 NUMA nodes with require will leave one free HT sibling on each NUMA node when pinned, unless we have a check for that I missed. > For shared > CPUs, I'm not sure how we could ever do something like you've proposed > in a manner that would result in less than the ~20% or so performance > degradation I usually see quoted when turning off SMT. Far too much > second guessing of the expected performance requirements of the guest > would be necessary. For shared CPUs the assumption is that, as the guest cores are floating, your victim and payload VMs would not remain running on the same core/hyperthread for a protracted period of time. If both are actively using CPU cycles, the kernel scheduler will schedule them onto different threads/cores to allow them to execute without contention. Note that I'm not saying there is no risk, but tenant-aware scheduling for shared CPUs would effectively mean we have to stop supporting floating instances entirely and only allow oversubscription between VMs from the same tenant. That is unlikely to ever happen in a cloud environment, first because tenant VMs are typically not colocated on a single host, and second because it is not desirable in all environments. > Stephen > > From brenski at mirantis.com Thu Jan 10 19:10:42 2019 From: brenski at mirantis.com (Boris Renski) Date: Thu, 10 Jan 2019 11:10:42 -0800 Subject: [openstack-discuss] [stackalytics] Stackalytics facelift In-Reply-To: References: Message-ID: Hey guys! thanks for the heads up on this. Let us check and fix ASAP. On Thu, Jan 10, 2019 at 12:45 AM Artem Goncharov wrote: > Hi, > > I can repeat the issue - stackalytics stopped showing my affiliation > correctly (user: gtema, entry in default_data.json is present) > > Regards, > Artem > > On Thu, Jan 10, 2019 at 5:48 AM Surya Singh > wrote: > >> Hi Boris >> >> Great to see new facelift of Stackalytics. Its really good. >> >> I have a query regarding contributors name is not listed as per company >> affiliation. >> Before facelift to stackalytics it was showing correct whether i have >> entry in >> https://github.com/openstack/stackalytics/blob/master/etc/default_data.json >> or not. >> Though now i have pushed the patch for same >> https://review.openstack.org/629150, but another thing is one of my >> colleague Vishal Manchanda name is also showing as independent contributor >> rather than NEC contributor. While his name entry already in >> etc/default_data.json. 
>> >> Would be great if you check the same. >> >> --- >> Thanks >> Surya >> >> >> On Tue, Jan 8, 2019 at 11:57 PM Boris Renski >> wrote: >> >>> Folks, >>> >>> Happy New Year! We wanted to start the year by giving a facelift to >>> stackalytics.com (based on stackalytics openstack project). Brief >>> summary of updates: >>> >>> - >>> >>> We have new look and feel at stackalytics.com >>> - >>> >>> We did away with DriverLog >>> and Member Directory , which >>> were not very actively used or maintained. Those are still available via >>> direct links, but not in the menu on the top >>> - >>> >>> BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated >>> project commits via a separate subsection accessible via top menu. Before >>> this was all bunched up in Project Type -> Complimentary >>> >>> Happy to hear comments or feedback. >>> >>> -Boris >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From iain.macdonnell at oracle.com Thu Jan 10 19:11:36 2019 From: iain.macdonnell at oracle.com (iain MacDonnell) Date: Thu, 10 Jan 2019 11:11:36 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Different issue, I believe (DB sync vs. online migrations) - it just happens that both pertain to shared targets. Brandon, might you have a very large number of rows in your volumes table? Have you been purging soft-deleted rows? ~iain On 1/10/19 11:01 AM, Jay Bryant wrote: > Brandon, > > I am thinking you are hitting this bug: > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > I think you can work around it by retrying the migration with the volume > service running.  You may, however, want to check with Iain MacDonnell > as he has been looking at this for a while. > > Thanks! > Jay > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: >> Hi, >> >> I am receiving the following error when performing an offline upgrade >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to >> openstack-cinder-1:12.0.3-1.el7. >> >> # cinder-manage db version >> 105 >> >> # cinder-manage --debug db sync >> Error during database migration: (pymysql.err.OperationalError) (2013, >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes >> SET shared_targets=%(shared_targets)s'] [parameters: >> {'shared_targets': 1}] >> >> # cinder-manage db version >> 114 >> >> The db version does not upgrade to queens version 117.  Any help would >> be appreciated. >> >> Thank you > From sean.mcginnis at gmx.com Thu Jan 10 19:37:09 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Jan 2019 13:37:09 -0600 Subject: [release] Release countdown for week R-12, Jan 14-18 Message-ID: <20190110193709.GA14554@sm-workstation> Development Focus ----------------- Focus should be on wrapping up any design specs, then moving on to implementation as we head into the last stretch of Stein. General Information ------------------- Stein-2 is the membership freeze for deliverables to be included in the Stein coordinated release. We've reached out to a few folks, but if your project has any new deliverables that have not been released yet, please let us know ASAP if you hope to have them included in Stein. 
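For teams in that situation, the usual mechanism is a patch to the openstack/releases repository adding a deliverable file. A rough sketch is below (the project name, team, version, and hash are placeholders, and the exact fields should be checked against the releases repo documentation before submitting):

  # deliverables/stein/your-project.yaml -- illustrative only
  ---
  launchpad: your-project
  team: your-team
  release-model: cycle-with-intermediary
  releases:
    - version: 1.0.0
      projects:
        - repo: openstack/your-project
          hash: 0000000000000000000000000000000000000000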
Following the changes we had proposed at the beginning of the release cycle, the release team will be proposing releases for any libraries that have significant changes merged that have not been released. PTL's and release liaisons, please watch for these and give a +1 to acknowledge them. If there is some reason to hold off on a release, let us know that as well. A +1 would be appreciated, but if we do not hear anything at all by the end of the week, we will assume things are OK to proceed. Upcoming Deadlines & Dates -------------------------- Individual OpenStack Foundation Board election: Jan 14-18 Non-client library freeze: February 28 -- Sean McGinnis (smcginnis) From sean.mcginnis at gmx.com Thu Jan 10 19:42:28 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 10 Jan 2019 13:42:28 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> Message-ID: <20190110194227.GB14554@sm-workstation> > > I don't know if this was the reasoning behind Cinder's system, but I know > some people object to procedural -2 because it's a big hammer to essentially > say "not right now". It overloads the meaning of the vote in a potentially > confusing way that requires explanation every time it's used. At least I > hope procedural -2's always include a comment. > This was exactly the reasoning. -2 is overloaded, but its primary meaning was/is "we do not want this code change". It just happens that it was also a convenient way to say that with "right now" at the end. The Review-Priority -1 is a clear way to say whether something is held because it can't be merged right now due to procedural or process reasons, versus something that we just don't want at all. From openstack at nemebean.com Thu Jan 10 20:07:41 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 10 Jan 2019 14:07:41 -0600 Subject: [tripleo] OVB 2.0-dev branch merged to master Message-ID: In preparation for importing OVB to Gerrit, the 2.0-dev branch was merged back to master. If the 2.0 changes break you, please use the stable/1.0 branch instead, which was created from the last commit before 2.0-dev was merged. -Ben From brenski at mirantis.com Thu Jan 10 20:54:56 2019 From: brenski at mirantis.com (Boris Renski) Date: Thu, 10 Jan 2019 12:54:56 -0800 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Message-ID: I think it would make sense to move driverlog to a separate domain... something like driverlog.openstack.org or something On Thu, Jan 10, 2019 at 9:45 AM wrote: > Thanks Jimmy. > > Since I am responsible for updating marketplace per release I just need to > know what mechanism to use and which file I need to patch. > > Thanks, > > Arkady > > > > *From:* Jimmy McArthur > *Sent:* Thursday, January 10, 2019 11:35 AM > *To:* openstack-dev at lists.openstack.org; > openstack-discuss at lists.openstack.org > *Subject:* Re: [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > > > Arkady.Kanevsky at dell.com > > January 9, 2019 at 9:20 AM > > Thanks Boris. 
> > Do we still use DriverLog for marketplace driver status updates? > > We do still use DriverLog for the Marketplace drivers listing. We have a > cronjob set up to ingest nightly from Stackalytics. We also have the > ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I know > there is a lot of out of date info there. We're planning to move the > marketplace list to yaml in a public repo, similar to what we did for > OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; > David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > Folks, > > > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the men on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > Boris Renski > > January 8, 2019 at 11:10 AM > > Folks, > > > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the men on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenski at mirantis.com Thu Jan 10 20:54:56 2019 From: brenski at mirantis.com (Boris Renski) Date: Thu, 10 Jan 2019 12:54:56 -0800 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Message-ID: I think it would make sense to move driverlog to a separate domain... something like driverlog.openstack.org or something On Thu, Jan 10, 2019 at 9:45 AM wrote: > Thanks Jimmy. > > Since I am responsible for updating marketplace per release I just need to > know what mechanism to use and which file I need to patch. 
> > Thanks, > > Arkady > > > > *From:* Jimmy McArthur > *Sent:* Thursday, January 10, 2019 11:35 AM > *To:* openstack-dev at lists.openstack.org; > openstack-discuss at lists.openstack.org > *Subject:* Re: [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > > > Arkady.Kanevsky at dell.com > > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > > We do still use DriverLog for the Marketplace drivers listing. We have a > cronjob set up to ingest nightly from Stackalytics. We also have the > ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I know > there is a lot of out of date info there. We're planning to move the > marketplace list to yaml in a public repo, similar to what we did for > OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman Narkaytis; > David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > > > [EXTERNAL EMAIL] > > Folks, > > > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the men on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > Boris Renski > > January 8, 2019 at 11:10 AM > > Folks, > > > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics openstack project). Brief summary > of updates: > > - We have new look and feel at stackalytics.com > - We did away with DriverLog > and Member Directory , which > were not very actively used or maintained. Those are still available via > direct links, but not in the men on the top > - BIGGEST CHANGE: You can now track some of the CNCF and Unaffiliated > project commits via a separate subsection accessible at the top nav. Before > this was all bunched up in Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > > > -Boris > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kbcaulder at gmail.com Thu Jan 10 21:32:27 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Thu, 10 Jan 2019 13:32:27 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: Hi Iain, There are 424 rows in volumes which drops down to 185 after running cinder-manage db purge 1. Restarting the volume service after package upgrade and running sync again does not remediate the problem, although running db sync a second time does bump the version up to 117, the following appears in the volume.log... http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ Thanks On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell wrote: > > Different issue, I believe (DB sync vs. 
online migrations) - it just > happens that both pertain to shared targets. > > Brandon, might you have a very large number of rows in your volumes > table? Have you been purging soft-deleted rows? > > ~iain > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > Brandon, > > > > I am thinking you are hitting this bug: > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > I think you can work around it by retrying the migration with the volume > > service running. You may, however, want to check with Iain MacDonnell > > as he has been looking at this for a while. > > > > Thanks! > > Jay > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > >> Hi, > >> > >> I am receiving the following error when performing an offline upgrade > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > >> openstack-cinder-1:12.0.3-1.el7. > >> > >> # cinder-manage db version > >> 105 > >> > >> # cinder-manage --debug db sync > >> Error during database migration: (pymysql.err.OperationalError) (2013, > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > >> SET shared_targets=%(shared_targets)s'] [parameters: > >> {'shared_targets': 1}] > >> > >> # cinder-manage db version > >> 114 > >> > >> The db version does not upgrade to queens version 117. Any help would > >> be appreciated. > >> > >> Thank you > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Thu Jan 10 22:02:35 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 10 Jan 2019 22:02:35 +0000 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o Message-ID: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> For the Stein release there is now a new section for SIGs on the documentation home page: https://docs.openstack.org/stein/ Currently only the self-healing SIG has a link but if other SIGs have links to add, it won't feel so lonely ;-) From jimmy at openstack.org Thu Jan 10 17:42:40 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 10 Jan 2019 11:42:40 -0600 Subject: [openstack-dev] [stackalytics] Stackalytics Facelift In-Reply-To: <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> References: <45e9c80f282d4d2a880b279b990a964c@AUSX13MPS308.AMER.DELL.COM> <5C378231.8010603@openstack.org> <4b8edd5beecd4915b06278524482431e@AUSX13MPS308.AMER.DELL.COM> Message-ID: <5C378410.6050603@openstack.org> Absolutely. When we get there, I'll send an announcement to the MLs and ping you :) I don't currently have a timeline, but given the Stackalytics changes, this might speed it up a bit. > Arkady.Kanevsky at dell.com > January 10, 2019 at 11:38 AM > > Thanks Jimmy. > > Since I am responsible for updating marketplace per release I just > need to know what mechanism to use and which file I need to patch. > > Thanks, > > Arkady > > *From:*Jimmy McArthur > *Sent:* Thursday, January 10, 2019 11:35 AM > *To:* openstack-dev at lists.openstack.org; > openstack-discuss at lists.openstack.org > *Subject:* Re: [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > > > Arkady.Kanevsky at dell.com > > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? 
> > We do still use DriverLog for the Marketplace drivers listing. We > have a cronjob set up to ingest nightly from Stackalytics. We also > have the ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I > know there is a lot of out of date info there. We're planning to move > the marketplace list to yaml in a public repo, similar to what we did > for OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ > > Thanks, > > Arkady > > *From:*Boris Renski > > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org > ; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift > to stackalytics.com (based on > stackalytics openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member > Directory , which were > not very actively used or maintained. Those are still > available via direct links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection > accessible at the top nav. Before this was all bunched up in > Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Boris Renski > > January 8, 2019 at 11:10 AM > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift > to stackalytics.com (based on > stackalytics openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member > Directory , which were > not very actively used or maintained. Those are still > available via direct links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection > accessible at the top nav. Before this was all bunched up in > Project Type -> Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Jimmy McArthur > January 10, 2019 at 11:34 AM > >> Arkady.Kanevsky at dell.com >> January 9, 2019 at 9:20 AM >> >> Thanks Boris. >> >> Do we still use DriverLog for marketplace driver status updates? >> > We do still use DriverLog for the Marketplace drivers listing. We > have a cronjob set up to ingest nightly from Stackalytics. We also > have the ability to CRUD the listings in the Foundation website CMS. > > That said, as Boris mentioned, the list is really not used much and I > know there is a lot of out of date info there. We're planning to move > the marketplace list to yaml in a public repo, similar to what we did > for OpenStack Map [1]. > > Cheers, > Jimmy > > [1] https://git.openstack.org/cgit/openstack/openstack-map/ >> >> Thanks, >> >> Arkady >> >> *From:* Boris Renski >> *Sent:* Tuesday, January 8, 2019 11:11 AM >> *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman >> Narkaytis; David Stoltenberg >> *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift >> >> [EXTERNAL EMAIL] >> >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on stackalytics >> openstack project). 
Brief summary of updates: >> >> * We have new look and feel at stackalytics.com >> >> * We did away with DriverLog >> and Member Directory >> , which were not very >> actively used or maintained. Those are still available via direct >> links, but not in the men on the top >> * BIGGEST CHANGE: You can now track some of the CNCF and >> Unaffiliated project commits via a separate subsection accessible >> at the top nav. Before this was all bunched up in Project Type -> >> Complimentary >> >> Happy to hear comments or feedback or answer questions. >> >> -Boris >> >> Boris Renski >> January 8, 2019 at 11:10 AM >> Folks, >> >> Happy New Year! We wanted to start the year by giving a facelift to >> stackalytics.com (based on stackalytics >> openstack project). Brief summary of updates: >> >> * We have new look and feel at stackalytics.com >> >> * We did away with DriverLog >> and Member Directory >> , which were not very >> actively used or maintained. Those are still available via direct >> links, but not in the men on the top >> * BIGGEST CHANGE: You can now track some of the CNCF and >> Unaffiliated project commits via a separate subsection accessible >> at the top nav. Before this was all bunched up in Project Type -> >> Complimentary >> >> Happy to hear comments or feedback or answer questions. >> >> -Boris > > Arkady.Kanevsky at dell.com > January 9, 2019 at 9:20 AM > > Thanks Boris. > > Do we still use DriverLog for marketplace driver status updates? > > Thanks, > > Arkady > > *From:* Boris Renski > *Sent:* Tuesday, January 8, 2019 11:11 AM > *To:* openstack-dev at lists.openstack.org; Ilya Shakhat; Herman > Narkaytis; David Stoltenberg > *Subject:* [openstack-dev] [stackalytics] Stackalytics Facelift > > [EXTERNAL EMAIL] > > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris > > Boris Renski > January 8, 2019 at 11:10 AM > Folks, > > Happy New Year! We wanted to start the year by giving a facelift to > stackalytics.com (based on stackalytics > openstack project). Brief summary of updates: > > * We have new look and feel at stackalytics.com > > * We did away with DriverLog > and Member Directory > , which were not very > actively used or maintained. Those are still available via direct > links, but not in the men on the top > * BIGGEST CHANGE: You can now track some of the CNCF and > Unaffiliated project commits via a separate subsection accessible > at the top nav. Before this was all bunched up in Project Type -> > Complimentary > > Happy to hear comments or feedback or answer questions. > > -Boris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tony at bakeyournoodle.com Thu Jan 10 22:54:43 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Fri, 11 Jan 2019 09:54:43 +1100 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o In-Reply-To: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> References: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> Message-ID: <20190110225442.GI28232@thor.bakeyournoodle.com> On Thu, Jan 10, 2019 at 10:02:35PM +0000, Adam Spiers wrote: > For the Stein release there is now a new section for SIGs on the > documentation home page: > > https://docs.openstack.org/stein/ > > Currently only the self-healing SIG has a link but if other SIGs > have links to add, it won't feel so lonely ;-) Hi Adam, Silly question but how would I added the Extended Maintenance SIG there? We really only have https://docs.openstack.org/project-team-guide/stable-branches.html to link to but you;d feel less lonely ;P Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From smooney at redhat.com Thu Jan 10 23:03:06 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 10 Jan 2019 23:03:06 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <20190110194227.GB14554@sm-workstation> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> Message-ID: <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: > > > > I don't know if this was the reasoning behind Cinder's system, but I know > > some people object to procedural -2 because it's a big hammer to essentially > > say "not right now". It overloads the meaning of the vote in a potentially > > confusing way that requires explanation every time it's used. At least I > > hope procedural -2's always include a comment. > > > > This was exactly the reasoning. -2 is overloaded, but its primary meaning > was/is "we do not want this code change". It just happens that it was also a > convenient way to say that with "right now" at the end. 
> > The Review-Priority -1 is a clear way to say whether something is held because > it can't be merged right now due to procedural or process reasons, versus > something that we just don't want at all. for what its worth my understanding of why a procdural -2 is more correct is that this change cannot be merged because it has not met the procedual requirement to be considerd for this release. haveing received several over the years i have never seen it to carry any malaise or weight then the zuul pep8 job complianing about the line lenght of my code. with either a procedural -2 or a verify -1 from zuul my code is equally un mergeable. the prime example being a patch that requires a spec that has not been approved. while most cores will not approve chage when other cores have left a -1 mistakes happen and the -2 does emphasise the point that even if the code is perfect under the porject processes this change should not be acitvly reporposed until the issue raised by the -2 has been addressed. In the case of a procedual -2 that typically means the spec is merge or the master branch opens for the next cycle. i agree that procedural -2's can seam harsh at first glance but i have also never seen one left without a comment explaining why it was left. the issue with a procedural -1 is i can jsut resubmit the patch several times and it can get lost in the comments. we recently intoduced a new review priority lable if we really wanted to disabiguate form normal -2s then we coudl have an explcitly lable for it but i personally would prefer to keep procedural -2s. anyway that just my two cents. > > From aspiers at suse.com Thu Jan 10 23:50:11 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 10 Jan 2019 23:50:11 +0000 Subject: [meta-sig][docs] new section for SIG documentation on docs.o.o In-Reply-To: <20190110225442.GI28232@thor.bakeyournoodle.com> References: <20190110220235.2rggnmxwxqyn6lnz@pacific.linksys.moosehall> <20190110225442.GI28232@thor.bakeyournoodle.com> Message-ID: <20190110235010.3ozo6hgxbgrvoqxx@pacific.linksys.moosehall> Tony Breeds wrote: >On Thu, Jan 10, 2019 at 10:02:35PM +0000, Adam Spiers wrote: >>For the Stein release there is now a new section for SIGs on the >>documentation home page: >> >> https://docs.openstack.org/stein/ >> >>Currently only the self-healing SIG has a link but if other SIGs >>have links to add, it won't feel so lonely ;-) > >Hi Adam, > Silly question but how would I added the Extended Maintenance SIG >there? Yeah sorry, it was more silly that I didn't think to explain that :-) >We really only have >https://docs.openstack.org/project-team-guide/stable-branches.html to >link to but you;d feel less lonely ;P Indeed we would ;-) You can just submit a simple change to www/stein/index.html in openstack-manuals, and then run tox to check the render locally. Here's the self-healing SIG addition for you to copy from: https://review.openstack.org/#/c/628054/2/www/stein/index.html From jungleboyj at gmail.com Fri Jan 11 00:10:43 2019 From: jungleboyj at gmail.com (Jay S. Bryant) Date: Thu, 10 Jan 2019 18:10:43 -0600 Subject: Issues setting up a SolidFire node with Cinder In-Reply-To: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> Message-ID: <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> Grant, So, the copy is failing because it can't find the volume to copy the image into. I would check the host and container for any iSCSI errors as well as the backend.  
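A rough sketch of those checks, assuming a host using open-iscsi under systemd (service, log and file names vary by distro):

    $ sudo iscsiadm -m session -P 3                    # active sessions and the disks attached through them
    $ sudo journalctl -u iscsid --since "1 hour ago"   # recent initiator daemon errors
    $ sudo cat /etc/iscsi/initiatorname.iscsi          # initiator name the daemon should be presenting to the backend

If the sessions look healthy on the host but not inside the container, comparing the initiator name seen in each place is a quick next step.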
It appears that something is going wrong when attempting to temporarily attach the volume to write the image into it. Jay On 1/10/2019 7:16 AM, Grant Morley wrote: > > Hi all, > > We are in the process of trying to add a SolidFire storage solution to > our existing OpenStack setup and seem to have hit a snag with cinder / > iscsi. > > We are trying to create a bootable volume to allow us to launch an > instance from it, but we are getting some errors in our cinder-volumes > containers that seem to suggest they can't connect to iscsi although > the volume seems to create fine on the SolidFire node. > > The command we are running is: > > openstack volume create --image $image-id --size 20 --bootable --type > solidfire sf-volume-v12 > > The volume seems to create on SolidFire but I then see these errors in > the "cinder-volume.log" > > https://pastebin.com/LyjLUhfk > > The volume containers can talk to the iscsi VIP on the SolidFire so I > am a bit stuck and wondered if anyone had come across any issues before? > > Kind Regards, > > > -- > Grant Morley > Cloud Lead > Absolute DevOps Ltd > Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP > www.absolutedevops.io > grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... URL: From duc.openstack at gmail.com Fri Jan 11 00:56:31 2019 From: duc.openstack at gmail.com (Duc Truong) Date: Thu, 10 Jan 2019 16:56:31 -0800 Subject: [senlin] Meeting today at 0530 UTC Message-ID: Everyone, Our regular Senlin meetings are resuming today. This is an even week so the meeting will be happening on Friday at 530 UTC. Regards, Duc From ekcs.openstack at gmail.com Fri Jan 11 01:40:39 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Thu, 10 Jan 2019 17:40:39 -0800 Subject: [congress][infra] override-checkout problem Message-ID: The congress-tempest-plugin zuul jobs against stable branches appear to be working incorrectly. Tests that should fail on stable/rocky (and indeed fails when triggered by congress patch [1]) are passing when triggered by congress-tempest-plugin patch [2]. I'd assume it's some kind of zuul misconfiguration in congress-tempest-plugin [3], but I've so far failed to figure out what's wrong. Particularly strange is that the job-output appears to show it checking out the right thing [4]. Any thoughts or suggestions? Thanks so much! [1] https://review.openstack.org/#/c/629070/ http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/testr_results.html.gz The two failing z3 tests should indeed fail because the feature was not available in rocky. The tests were introduced because for some reason they pass in the job triggered by a patch in congress-tempest-plugin. [2] https://review.openstack.org/#/c/618951/ http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/testr_results.html.gz [3] https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yaml#L4 [4] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 shows congress is checked out to the correct commit at the top of the stable/rocky branch. From adriant at catalyst.net.nz Fri Jan 11 06:18:15 2019 From: adriant at catalyst.net.nz (Adrian Turjak) Date: Fri, 11 Jan 2019 19:18:15 +1300 Subject: [tc][all] Project deletion community goal for Train cycle Message-ID: Hello OpenStackers! 
As discussed at the Berlin Summit, one of the proposed community goals was project deletion and resource clean-up. Essentially the problem here is that for almost any company that is running OpenStack we run into the issue of how to delete a project and all the resources associated with that project. What we need is an OpenStack wide solution that every project supports which allows operators of OpenStack to delete everything related to a given project. Before we can choose this as a goal, we need to define what the actual proposed solution is, and what each service is either implementing or contributing to. I've started an Etherpad here: https://etherpad.openstack.org/p/community-goal-project-deletion Please add to it if I've missed anything about the problem description, or to flesh out the proposed solutions, but try to mostly keep any discussion here on the mailing list, so that the Etherpad can hopefully be more of a summary of where the discussions have led. This is mostly a starting point, and I expect there to be a lot of opinions and probably some push back from doing anything too big. That said, this is a major issue in OpenStack, and something we really do need because OpenStack is too big and too complicated for this not to exist in a smart cross-project manner. Let's solve this the best we can! Cheers, Adrian Turjak From eandersson at blizzard.com Fri Jan 11 08:13:41 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Fri, 11 Jan 2019 08:13:41 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com>, <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> Message-ID: <052175D3-F10F-4777-9C93-D2551F801720@blizzard.com> This has worked great for Designate as most of the reviewers have limited time, it has helped us focus on core issues and get critical patches out a lot faster than we otherwise would. Sent from my iPhone > On Jan 10, 2019, at 9:14 AM, Ben Nemec wrote: > > > >> On 1/10/19 12:17 AM, Ghanshyam Mann wrote: >> ---- On Thu, 03 Jan 2019 22:51:55 +0900 Sean McGinnis wrote ---- >> > On Fri, Dec 28, 2018 at 11:04:41AM +0530, Surya Singh wrote: >> > > Dear All, >> > > >> > > There are many occasion when we want to priorities some of the patches >> > > whether it is related to unblock the gates or blocking the non freeze >> > > patches during RC. >> > > >> > > So adding the Review-Priority will allow more precise dashboard. As >> > > Designate and Cinder projects already experiencing this[1][2] and after >> > > discussion with Jeremy brought this to ML to interact with these team >> > > before landing [3], as there is possibility that reapply the priority vote >> > > following any substantive updates to change could make it more cumbersome >> > > than it is worth. >> > >> > With Cinder this is fairly new, but I think it is working well so far. The >> > oddity we've run into, that I think you're referring to here, is how those >> > votes carry forward with updates. >> > >> > I set up Cinder with -1, +1, and +2 as possible priority votes. It appears when >> This idea looks great and helpful especially for blockers and cycle priority patches to get regular >> review bandwidth from Core or Active members of that project. >> IMO only +ve votes are more appropriate for this label. 
-1 is little confusing for many reasons like >> what is the difference between Review-Priority -1 and Code-Review -2 ? Review-Priority -1 means, >> it is less priority than 0/not labelled (explicitly setting any patch very less priority). >> After seeing Cinder dashboard, I got to know that -1 is used to block the changes due to procedural >> or technical reason. But that can be done by -2 on Code-Review label. Keeping Review-Priority label >> only for priority set makes it more clear which is nothing but allowing only +ve votes for this label. >> Personally, I prefer only a single vote set which can be +1 to convey that these are the set of changes >> priority for review but having multiple +ve vote set as per project need/interest is all fine. > > I don't know if this was the reasoning behind Cinder's system, but I know some people object to procedural -2 because it's a big hammer to essentially say "not right now". It overloads the meaning of the vote in a potentially confusing way that requires explanation every time it's used. At least I hope procedural -2's always include a comment. > > Whether adding a whole new vote type is a meaningful improvement is another question, but if we're adding the type anyway for prioritization it might make sense to use it to replace procedural -2. Especially if we could make it so any core can change it (apparently not right now), whereas -2 requires the original core to come back and remove it. > From dharmendra.kushwaha at india.nec.com Fri Jan 11 11:31:16 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Fri, 11 Jan 2019 11:31:16 +0000 Subject: [dev][Tacker] Implementing Multisite VNFFG In-Reply-To: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> References: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Message-ID: Dear Lee, Good point & Thanks for the proposal. Currently no ongoing activity on that. And That will be great help if you lead this feature enhancement. Feel free to join Tacker weekly meeting with some initial drafts. Thanks & Regards Dharmendra Kushwaha From: 이호찬 [mailto:ghcks1000 at gmail.com] Sent: 10 January 2019 13:21 To: openstack-discuss at lists.openstack.org Subject: [dev][Tacker] Implementing Multisite VNFFG Dear Tacker folks, Hello, I'm interested in implementing multisite VNFFG in Tacker project. As far as I know, current single Tacker controller can manage multiple Openstack sites (Multisite VIM), but it can create VNFFG in only singlesite, so it can't create VNFFG across multisite. I think if multisite VNFFG is possible, tacker can have more flexibility in managing VNF and VNFFG. In the current tacker, networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct service chain. So, I think extending current port chaining in singleiste to multisite can be one solution. Is there development process about multisite VNFFG in tacker project? Otherwise, I wonder that tacker is interested in this feature. I want to develop this feature for Tacker project if I can. Yours sincerely, Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Fri Jan 11 12:57:39 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 11 Jan 2019 21:57:39 +0900 Subject: [congress][infra] override-checkout problem In-Reply-To: References: Message-ID: <1683cfd4d5c.ec4bd8195854.3368279581961683040@ghanshyammann.com> Hi Eric, This seems the same issue happening on congress-tempest-plugin gate where 'congress-devstack-py35-api-mysql-queens' is failing [1]. python-congressclient was not able to install and openstack client trow error for congress command. The issue is stable branch jobs on congress-tempest-plugin does checkout the master version for all repo instead of what mentioned in override-checkout var. If you see congress's rocky patch, congress is checkout out with rocky version[2] but congress-tempest-plugin patch's rocky job checkout the master version of congress instead of rocky version [3]. That is why your test expectedly fail on congress patch but pass on congress-tempest-plugin. Root cause is that override-checkout var does not work on the legacy job (it is only zuulv3 job var, if I am not wrong), you need to use BRANCH_OVERRIDE for legacy jobs. Myself, amotoki and akhil was trying lot other workarounds to debug the root cause but at the end we just notice that congress jobs are legacy jobs and using override-checkout :). I have submitted the testing patch with BRANCH_OVERRIDE for congress-tempest-plugin queens job[4]. Which seems working fine, I can make those patches more formal for merge. Another thing I was discussing with Akhil that new tests of builins feature need another feature flag (different than congressz3.enabled) as that feature of z3 is in stein onwards only. [1] https://review.openstack.org/#/c/618951/ [2] http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/pip2-freeze.txt.gz [3] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/pip2-freeze.txt.gz [4] https://review.openstack.org/#/q/topic:fix-stable-branch-testing+(status:open+OR+status:merged) -gmann ---- On Fri, 11 Jan 2019 10:40:39 +0900 Eric K wrote ---- > The congress-tempest-plugin zuul jobs against stable branches appear > to be working incorrectly. Tests that should fail on stable/rocky (and > indeed fails when triggered by congress patch [1]) are passing when > triggered by congress-tempest-plugin patch [2]. > > I'd assume it's some kind of zuul misconfiguration in > congress-tempest-plugin [3], but I've so far failed to figure out > what's wrong. Particularly strange is that the job-output appears to > show it checking out the right thing [4]. > > Any thoughts or suggestions? Thanks so much! > > [1] > https://review.openstack.org/#/c/629070/ > http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87474d7/logs/testr_results.html.gz > The two failing z3 tests should indeed fail because the feature was > not available in rocky. The tests were introduced because for some > reason they pass in the job triggered by a patch in > congress-tempest-plugin. 
> > [2] > https://review.openstack.org/#/c/618951/ > http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/logs/testr_results.html.gz > > [3] https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yaml#L4 > > [4] http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-rocky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 > shows congress is checked out to the correct commit at the top of the > stable/rocky branch. > > From alfredo.deluca at gmail.com Fri Jan 11 14:01:10 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Fri, 11 Jan 2019 15:01:10 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hi Ignazio. So...on horizon I changed the project name from *admin* to *service* and that error disappeared even tho now I have a different erro with network..... is service the project where you run the vm on Magnum? Cheers On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano wrote: > Hi Alfredo, > attached here there is my magnum.conf for queens release > As you can see my heat sections are empty > When you create your cluster, I suggest to check heat logs e magnum logs > for verifyng what is wrong > Ignazio > > > > Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> so. Creating a stack either manually or dashboard works fine. The problem >> seems to be when I create a cluster (kubernetes/swarm) that I got that >> error. >> Maybe the magnum conf it's not properly setup? >> In the heat section of the magnum.conf I have only >> *[heat_client]* >> *region_name = RegionOne* >> *endpoint_type = internalURL* >> >> Cheers >> >> >> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Yes. Next step is to check with ansible. >>> I do think it's some rights somewhere... >>> I'll check later. Thanks >>> >>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano >> wrote: >>> >>>> Alfredo, >>>> 1 . how did you run the last heat template? By dashboard ? >>>> 2. Using openstack command you can check if ansible configured heat >>>> user/domain correctly >>>> >>>> >>>> It seems a problem related to >>>> heat user rights? >>>> >>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>> alfredo.deluca at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio. The engine log doesn 't say anything...except >>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>> killed by signal 15 >>>>> which is last log from a few days ago. >>>>> >>>>> While the journal of the heat engine says >>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>> heat-engine service. >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>> SAWarning: Unicode type received non-unicode bind param value >>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>> occurrences) >>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>> (util.ellipses_string(value),)) >>>>> >>>>> >>>>> I also checked the configuration and it seems to be ok. the problem is >>>>> that I installed openstack with ansible-openstack.... so I can't change >>>>> anything unless I re run everything. 
>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Check heat user and domani are c onfigured like at the following: >>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>> >>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>> >>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>> >>>>>>>> I ll try asap. Thanks >>>>>>>> >>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>> >>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>> heat is working fine? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> HI IGNAZIO >>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>> creating the master. >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Anycase during deployment you can connect with ssh to the master >>>>>>>>>>> and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>> >>>>>>>>>>>> Cheers >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my answer >>>>>>>>>>>>> could help you.... >>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi all. >>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>> one.... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Alfredo* >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> -- >>>>> *Alfredo* >>>>> >>>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From lajos.katona at ericsson.com Fri Jan 11 14:19:26 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Fri, 11 Jan 2019 14:19:26 +0000 Subject: [L2-Gateway] Message-ID: Hi, I have a question regarding networking-l2gw, specifically l2gw-connection. 
We have an issue where the hw switch configured by networking-l2gw is slow, so when the l2gw-connection is created the API returns successfully, but the dataplane configuration is not yet ready. Do you think that adding state field to the connection is feasible somehow? By checking the vtep schema (http://www.openvswitch.org/support/dist-docs/vtep.5.html) no such information is available on vtep level. Thanks in advance for the help. Regarads Lajos From rico.lin.guanyu at gmail.com Fri Jan 11 14:26:32 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 11 Jan 2019 22:26:32 +0800 Subject: [all] New Automatic SIG (continue discussion) Message-ID: Dear all To continue the discussion of whether we should have new SIG for autoscaling. I think we already got enough time for this ML [1], and it's time to jump to the next step. As we got a lot of positive feedbacks from ML [1], I think it's definitely considered an action to create a new SIG, do some init works, and finally Here are some things that we can start right now, to come out with the name of SIG, the definition and mission. Here's my draft plan: To create a SIG name `Automatic SIG`, with given initial mission to improve automatic scaling with (but not limited to) OpenStack. As we discussed in forum [2], to have scenario tests and documents will be considered as actions for the initial mission. I gonna assume we will start from scenarios which already provide some basic tests and documents which we can adapt very soon and use them to build a SIG environment. And the long-term mission of this SIG is to make sure we provide good documentation and test coverage for most automatic functionality. I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can provide more value if there are more needs in the future. Just like the example which Adam raised `self-optimizing` from people who are using watcher [3]. Let me know if you got any concerns about this name. And to clarify, there will definitely some cross SIG co-work between this new SIG and Self-Healing SIG (there're some common requirements even across self-healing and autoscaling features.). We also need to make sure we do not provide any duplicated work against self-healing SIG. As a start, let's only focus on autoscaling scenario, and make sure we're doing it right before we move to multiple cases. If no objection, I will create the new SIG before next weekend and plan a short schedule in Denver summit and PTG. [1] http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000284.html [2] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback [3] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000813.html -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas at lrasc.fr Fri Jan 11 14:28:12 2019 From: nicolas at lrasc.fr (nicolas at lrasc.fr) Date: Fri, 11 Jan 2019 15:28:12 +0100 Subject: [dev][Tacker] Implementing Multisite VNFFG In-Reply-To: References: <5c36f97e.1c69fb81.79c09.a033@mx.google.com> Message-ID: <5869a6ccf31f156b7e1dec1ef8969558@lrasc.fr> Hi all, First, I am not involved in tacker or in networking-sfc, so I can be wrong. Just to be sure by 'multiple VIM' you mean multi domains, multi autonomous systems, multi OpenStack/VNFinfra sites that are all different? When it comes to VNFFG over multiple VIM, I think a question is: what does the networking-sfc driver already support? Some other questions: 1. 
In a single VIM situation, does the networking-sfc driver support VNFFG (or port chaining) over multiple different IP subnets? 2. Does networking-sfc support both IPv4 and IPv6? 3. What routing/steering protocol (NSH, SRv6) does networking-sfc support? 4. How healthy (or up to date) is the development of networking-sfc? I think modifying tacker (or modifying any other VNF Orchestrator that is plugged to an OpenStack VIM with networking-sfc driver installed) alone is not enough. Maybe VNFFG over multiple VIM needs an SDN controller, maybe it needs new feature in the networking-sfc driver or new features in neutron... --- Nicolas On 2019-01-11 12:31, Dharmendra Kushwaha wrote: > Dear Lee, > > Good point & Thanks for the proposal. > > Currently no ongoing activity on that. And That will be great help if you lead this feature enhancement. > > Feel free to join Tacker weekly meeting with some initial drafts. > > Thanks & Regards > > Dharmendra Kushwaha > > FROM: 이호찬 [mailto:ghcks1000 at gmail.com] > SENT: 10 January 2019 13:21 > TO: openstack-discuss at lists.openstack.org > SUBJECT: [dev][Tacker] Implementing Multisite VNFFG > > Dear Tacker folks, > > Hello, I'm interested in implementing multisite VNFFG in Tacker project. > > As far as I know, current single Tacker controller can manage multiple Openstack sites (Multisite VIM), but it can create VNFFG in only singlesite, so it can't create VNFFG across multisite. I think if multisite VNFFG is possible, tacker can have more flexibility in managing VNF and VNFFG. > > In the current tacker, networking-sfc driver is used to support VNFFG, and networking-sfc uses port chaining to construct service chain. So, I think extending current port chaining in singleiste to multisite can be one solution. > > Is there development process about multisite VNFFG in tacker project? Otherwise, I wonder that tacker is interested in this feature. I want to develop this feature for Tacker project if I can. > > Yours sincerely, > > Hochan Lee. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dh3 at sanger.ac.uk Fri Jan 11 14:32:10 2019 From: dh3 at sanger.ac.uk (Dave Holland) Date: Fri, 11 Jan 2019 14:32:10 +0000 Subject: [cinder] volume encryption performance impact In-Reply-To: <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> References: <20190109151329.GA7953@sanger.ac.uk> <20190110135605.qd34tb54deh5zv6f@lyarwood.usersys.redhat.com> Message-ID: <20190111143210.GE7953@sanger.ac.uk> Thanks Lee, Arne, Thomas for replying. On Thu, Jan 10, 2019 at 01:56:05PM +0000, Lee Yarwood wrote: > What's the underlying version of QEMU being used here? It's qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64 > FWIW I can't recall seeing any performance issues when working on and > verifying this downstream with QEMU 2.10. I had wondered about https://bugzilla.redhat.com/1500334 (LUKS driver buffer size) which fits the symptoms, but the fix apparently went in to qemu-kvm-rhev-2.10.0-11.el7 so shouldn't be affecting us. I have a case open with RH Support now and I am keeping my fingers crossed. We will be redeploying this system again shortly with the latest Queens/RHOSP13 package versions, so should end up with qemu-kvm-rhev-2.12.0-18.el7_6.1.x86_64 and I will re-test then. 
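For the re-test, a like-for-like fio run from inside a guest against an encrypted and an unencrypted volume of the same type gives directly comparable numbers; a minimal sketch with placeholder device paths (note both runs overwrite the target device):

    $ sudo fio --name=luks --filename=/dev/vdb --rw=write --bs=1M --size=4G --direct=1 --ioengine=libaio --iodepth=16    # LUKS-encrypted volume
    $ sudo fio --name=plain --filename=/dev/vdc --rw=write --bs=1M --size=4G --direct=1 --ioengine=libaio --iodepth=16   # unencrypted volume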
Cheers, Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From pabelanger at redhat.com Fri Jan 11 14:46:23 2019 From: pabelanger at redhat.com (Paul Belanger) Date: Fri, 11 Jan 2019 09:46:23 -0500 Subject: [infra] Updating fedora-latest nodeset to Fedora 29 In-Reply-To: <20190110004306.GA995@fedora19.localdomain> References: <20190110004306.GA995@fedora19.localdomain> Message-ID: <20190111144623.GA29154@localhost.localdomain> On Thu, Jan 10, 2019 at 11:43:06AM +1100, Ian Wienand wrote: > Hi, > > Just a heads up that we're soon switching "fedora-latest" nodes from > Fedora 28 to Fedora 29 [1] (setting up this switch took a bit longer > than usual, see [2]). Presumably if you're using "fedora-latest" you > want the latest Fedora, so this should not be unexpected :) But this > is the first time we're making this transition with the "-latest" > nodeset, so please report any issues. > Great work, just looked at fedora-latest job for windmill and no failures. Thanks! Paul From ignaziocassano at gmail.com Fri Jan 11 14:51:53 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 11 Jan 2019 15:51:53 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: Hi Alfredo, I am using admin project. If your run the simple heat stack I sent you from service projects, it works ? Il giorno ven 11 gen 2019 alle ore 15:01 Alfredo De Luca < alfredo.deluca at gmail.com> ha scritto: > Hi Ignazio. So...on horizon I changed the project name from *admin* to > *service* and that error disappeared even tho now I have a different erro > with network..... > is service the project where you run the vm on Magnum? > > Cheers > > > > On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano > wrote: > >> Hi Alfredo, >> attached here there is my magnum.conf for queens release >> As you can see my heat sections are empty >> When you create your cluster, I suggest to check heat logs e magnum logs >> for verifyng what is wrong >> Ignazio >> >> >> >> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < >> alfredo.deluca at gmail.com> ha scritto: >> >>> so. Creating a stack either manually or dashboard works fine. The >>> problem seems to be when I create a cluster (kubernetes/swarm) that I got >>> that error. >>> Maybe the magnum conf it's not properly setup? >>> In the heat section of the magnum.conf I have only >>> *[heat_client]* >>> *region_name = RegionOne* >>> *endpoint_type = internalURL* >>> >>> Cheers >>> >>> >>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >>> alfredo.deluca at gmail.com> wrote: >>> >>>> Yes. Next step is to check with ansible. >>>> I do think it's some rights somewhere... >>>> I'll check later. Thanks >>>> >>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < >>>> ignaziocassano at gmail.com wrote: >>>> >>>>> Alfredo, >>>>> 1 . how did you run the last heat template? By dashboard ? >>>>> 2. Using openstack command you can check if ansible configured heat >>>>> user/domain correctly >>>>> >>>>> >>>>> It seems a problem related to >>>>> heat user rights? >>>>> >>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>>> alfredo.deluca at gmail.com> ha scritto: >>>>> >>>>>> Hi Ignazio. 
The engine log doesn 't say anything...except >>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child 4202 >>>>>> killed by signal 15 >>>>>> which is last log from a few days ago. >>>>>> >>>>>> While the journal of the heat engine says >>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>>> heat-engine service. >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>>> SAWarning: Unicode type received non-unicode bind param value >>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>>> occurrences) >>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>> (util.ellipses_string(value),)) >>>>>> >>>>>> >>>>>> I also checked the configuration and it seems to be ok. the problem >>>>>> is that I installed openstack with ansible-openstack.... so I can't change >>>>>> anything unless I re run everything. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Check heat user and domani are c onfigured like at the following: >>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>>> >>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>>> >>>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>>> >>>>>>>>> I ll try asap. Thanks >>>>>>>>> >>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>> >>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>>> heat is working fine? >>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>> >>>>>>>>>>> HI IGNAZIO >>>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>>> creating the master. >>>>>>>>>>> >>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>>> >>>>>>>>>>>> Anycase during deployment you can connect with ssh to the >>>>>>>>>>>> master and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>>> Ignazio >>>>>>>>>>>> >>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>> >>>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>>> >>>>>>>>>>>>> Cheers >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my >>>>>>>>>>>>>> answer could help you.... >>>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>> >>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi all. 
>>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>> >>>>>> -- >>>>>> *Alfredo* >>>>>> >>>>>> >>> >>> -- >>> *Alfredo* >>> >>> > > -- > *Alfredo* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dprince at redhat.com Fri Jan 11 15:04:06 2019 From: dprince at redhat.com (Dan Prince) Date: Fri, 11 Jan 2019 10:04:06 -0500 Subject: [TripleO] flattening breakages Message-ID: <299e2464dbbe0c73335aedf86f7f206bd5a58a3c.camel@redhat.com> I noticed a few breakages [1][2] today with the flattening effort in the codebase. Specifically we are missing some of the 'monitoring_subscription' sections in the flattened files. We apparently have no CI on these ATM so please be careful in reviewing patches in this regard until (and if) we can add CI on this feature. I fear this type of restructuring is going to break subtle things and highlight what we don't have CI on. Some of the 3rd party vendor integration worries me in that we've got no upstream way of testing this stuff ATM. [1] https://review.openstack.org/#/c/630280/ (Ironic) [2] https://review.openstack.org/#/c/630281/ (Aodh) From saphi070 at gmail.com Fri Jan 11 15:16:37 2019 From: saphi070 at gmail.com (Sa Pham) Date: Fri, 11 Jan 2019 22:16:37 +0700 Subject: [all] New Automatic SIG (continue discussion) In-Reply-To: References: Message-ID: +1 from me. On Fri, Jan 11, 2019 at 9:32 PM Rico Lin wrote: > Dear all > > To continue the discussion of whether we should have new SIG for > autoscaling. > > I think we already got enough time for this ML [1], and it's time to jump > to the next step. > As we got a lot of positive feedbacks from ML [1], I think it's definitely > considered an action to create a new SIG, do some init works, and finally > Here are some things that we can start right now, to come out with the > name of SIG, the definition and mission. > > Here's my draft plan: > To create a SIG name `Automatic SIG`, with given initial mission to improve > automatic scaling with (but not limited to) OpenStack. As we discussed in > forum [2], to have scenario tests and documents will be considered as > actions for the initial mission. I gonna assume we will start from > scenarios which already provide some basic tests and documents which we can > adapt very soon and use them to build a SIG environment. And the long-term > mission of this SIG is to make sure we provide good documentation and test > coverage for most automatic functionality. > > I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can > provide more value if there are more needs in the future. Just like the > example which Adam raised `self-optimizing` from people who are > using watcher [3]. > Let me know if you got any concerns about this name. 
> And to clarify, there will definitely some cross SIG co-work between this > new SIG and Self-Healing SIG (there're some common requirements even across > self-healing and autoscaling features.). We also need to make sure we do > not provide any duplicated work against self-healing SIG. > As a start, let's only focus on autoscaling scenario, and make sure we're > doing it right before we move to multiple cases. > > If no objection, I will create the new SIG before next weekend and plan a > short schedule in Denver summit and PTG. > > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000284.html > > [2] https://etherpad.openstack.org/p/autoscaling-integration-and-feedback > [3] > http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000813.html > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- Sa Pham Dang Cloud RnD Team - VCCloud Phone/Telegram: 0986.849.582 Skype: great_bn -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Fri Jan 11 15:16:50 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 11 Jan 2019 16:16:50 +0100 Subject: Issues setting up a SolidFire node with Cinder In-Reply-To: <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> Message-ID: <20190111151650.phmxm22rzigmmgo5@localhost> On 10/01, Jay S. Bryant wrote: > Grant, > > So, the copy is failing because it can't find the volume to copy the image > into. > > I would check the host and container for any iSCSI errors as well as the > backend.  It appears that something is going wrong when attempting to > temporarily attach the volume to write the image into it. > > Jay Hi, I've also seen this error when the initiator name in /etc/iscsi/initiatorname.iscsi inside the container does not match the one in use by the iscsid initiator daemon. This can happen because the initiator name was changed after the daemon started or because it is not shared between the container and the host. I've also seen this happen (thought this is not the case) on VM migrations when the driver has a bug and doesn't return the right connection information (returns the first one). I would recommend setting the log level to debug to see additional info from OS-Brick. I've debugged these type of issues many times, and if it's not a production env I usually go with: - Setting a breakpoint in the OS-Brick code: I stop at the right place and check the state of the system and how the volume has been exported and mapped in the backend. - Installing cinderlib on a virtualenv (with --system-site-packages) in the cinder node, then using cinderlib to create a volume and debug an attach operation same as in previous step, like this: * Prepare the env: $ virtualenv --system-site-packages venv $ source venv/bin/activate (venv) $ pip install cinderlib * Run python interpreter (venv) $ python * Initialize cinderlib to store volumes in ./cl3.sqlite >>> import cinderlib as cl >>> db_connection = 'sqlite:///cl3.sqlite' >>> persistence_config = {'storage': 'db', 'connection': db_connection} >>> cl.setup(persistence_config=persistence_config, disable_logs=False, debug=True) * Setup the backend. You'll have to use your own configuration here: >>> sf = cl.Backend( ... volume_backend_name='solidfire', ... volume_driver='cinder.volume.drivers.solidfire.SolidFireDriver', ... san_ip='192.168.1.4', ... san_login='admin', ... 
san_password='admin_password', ... sf_allow_template_caching=False) * Create a 1GB empty volume: >>> vol = sf.create_volume(1) * Debug the attachment: >>> import pdb >>> pdb.run('att = vol.attach()') - If it's a container I usually execute a bash terminal interactively and pip install cinderlib and do the debugging like in the step above. Cheers, Gorka. > > On 1/10/2019 7:16 AM, Grant Morley wrote: > > > > Hi all, > > > > We are in the process of trying to add a SolidFire storage solution to > > our existing OpenStack setup and seem to have hit a snag with cinder / > > iscsi. > > > > We are trying to create a bootable volume to allow us to launch an > > instance from it, but we are getting some errors in our cinder-volumes > > containers that seem to suggest they can't connect to iscsi although the > > volume seems to create fine on the SolidFire node. > > > > The command we are running is: > > > > openstack volume create --image $image-id --size 20 --bootable --type > > solidfire sf-volume-v12 > > > > The volume seems to create on SolidFire but I then see these errors in > > the "cinder-volume.log" > > > > https://pastebin.com/LyjLUhfk > > > > The volume containers can talk to the iscsi VIP on the SolidFire so I am > > a bit stuck and wondered if anyone had come across any issues before? > > > > Kind Regards, > > > > > > -- > > Grant Morley > > Cloud Lead > > Absolute DevOps Ltd > > Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP > > www.absolutedevops.io > > grant at absolutedevops.io 0845 874 0580 From geguileo at redhat.com Fri Jan 11 15:23:18 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 11 Jan 2019 16:23:18 +0100 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: Message-ID: <20190111152318.ztuwirfgypehdfp6@localhost> On 10/01, Brandon Caulder wrote: > Hi Iain, > > There are 424 rows in volumes which drops down to 185 after running > cinder-manage db purge 1. Restarting the volume service after package > upgrade and running sync again does not remediate the problem, although > running db sync a second time does bump the version up to 117, the > following appears in the volume.log... > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > Hi, If I understand correctly the steps were: - Run DB sync --> Fail - Run DB purge - Restart volume services - See the log error - Run DB sync --> version proceeds to 117 If that is the case, could you restart the services again now that the migration has been moved to version 117? If the cinder-volume service is able to restart please run the online data migrations with the service running. Cheers, Gorka. > Thanks > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell > wrote: > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > happens that both pertain to shared targets. > > > > Brandon, might you have a very large number of rows in your volumes > > table? Have you been purging soft-deleted rows? > > > > ~iain > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > Brandon, > > > > > > I am thinking you are hitting this bug: > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > I think you can work around it by retrying the migration with the volume > > > service running. 
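For reference, a rough sketch of that last step on the upgraded node (service names and batching flags vary, so check cinder-manage --help first):

    $ cinder-manage db version                   # should now report 117
    $ systemctl start openstack-cinder-volume    # RDO service name; adjust per distro
    $ cinder-manage db online_data_migrations    # repeat until it reports nothing left to migrate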
You may, however, want to check with Iain MacDonnell > > > as he has been looking at this for a while. > > > > > > Thanks! > > > Jay > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > >> Hi, > > >> > > >> I am receiving the following error when performing an offline upgrade > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > >> openstack-cinder-1:12.0.3-1.el7. > > >> > > >> # cinder-manage db version > > >> 105 > > >> > > >> # cinder-manage --debug db sync > > >> Error during database migration: (pymysql.err.OperationalError) (2013, > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > >> {'shared_targets': 1}] > > >> > > >> # cinder-manage db version > > >> 114 > > >> > > >> The db version does not upgrade to queens version 117. Any help would > > >> be appreciated. > > >> > > >> Thank you > > > > > > > From alfredo.deluca at gmail.com Fri Jan 11 15:38:55 2019 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Fri, 11 Jan 2019 16:38:55 +0100 Subject: openstack stack fails In-Reply-To: References: Message-ID: nope. I created another one and I got this error... Create_Failed: Resource CREATE failed: ValueError: resources.my_instance: nics are required after microversion 2.36 On Fri, Jan 11, 2019 at 3:52 PM Ignazio Cassano wrote: > Hi Alfredo, I am using admin project. > If your run the simple heat stack I sent you from service projects, it > works ? > > > Il giorno ven 11 gen 2019 alle ore 15:01 Alfredo De Luca < > alfredo.deluca at gmail.com> ha scritto: > >> Hi Ignazio. So...on horizon I changed the project name from *admin* to >> *service* and that error disappeared even tho now I have a different >> erro with network..... >> is service the project where you run the vm on Magnum? >> >> Cheers >> >> >> >> On Sun, Dec 30, 2018 at 8:43 AM Ignazio Cassano >> wrote: >> >>> Hi Alfredo, >>> attached here there is my magnum.conf for queens release >>> As you can see my heat sections are empty >>> When you create your cluster, I suggest to check heat logs e magnum >>> logs for verifyng what is wrong >>> Ignazio >>> >>> >>> >>> Il giorno dom 30 dic 2018 alle ore 01:31 Alfredo De Luca < >>> alfredo.deluca at gmail.com> ha scritto: >>> >>>> so. Creating a stack either manually or dashboard works fine. The >>>> problem seems to be when I create a cluster (kubernetes/swarm) that I got >>>> that error. >>>> Maybe the magnum conf it's not properly setup? >>>> In the heat section of the magnum.conf I have only >>>> *[heat_client]* >>>> *region_name = RegionOne* >>>> *endpoint_type = internalURL* >>>> >>>> Cheers >>>> >>>> >>>> On Fri, Dec 28, 2018 at 10:15 PM Alfredo De Luca < >>>> alfredo.deluca at gmail.com> wrote: >>>> >>>>> Yes. Next step is to check with ansible. >>>>> I do think it's some rights somewhere... >>>>> I'll check later. Thanks >>>>> >>>>> On Fri., 28 Dec. 2018, 7:39 pm Ignazio Cassano < >>>>> ignaziocassano at gmail.com wrote: >>>>> >>>>>> Alfredo, >>>>>> 1 . how did you run the last heat template? By dashboard ? >>>>>> 2. Using openstack command you can check if ansible configured heat >>>>>> user/domain correctly >>>>>> >>>>>> >>>>>> It seems a problem related to >>>>>> heat user rights? >>>>>> >>>>>> Il giorno Ven 28 Dic 2018 09:06 Alfredo De Luca < >>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio. 
The engine log doesn 't say anything...except >>>>>>> 2018-12-17 11:51:35.284 4064 INFO oslo_service.service [-] Child >>>>>>> 4202 killed by signal 15 >>>>>>> which is last log from a few days ago. >>>>>>> >>>>>>> While the journal of the heat engine says >>>>>>> Dec 28 06:36:29 aio1-heat-api-container-16f41ed7 systemd[1]: Started >>>>>>> heat-engine service. >>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>>> /openstack/venvs/heat-19.0.0.0b1/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:226: >>>>>>> SAWarning: Unicode type received non-unicode bind param value >>>>>>> 'data-processing-cluster'. (this warning may be suppressed after 10 >>>>>>> occurrences) >>>>>>> Dec 28 06:36:31 aio1-heat-api-container-16f41ed7 heat-engine[91]: >>>>>>> (util.ellipses_string(value),)) >>>>>>> >>>>>>> >>>>>>> I also checked the configuration and it seems to be ok. the problem >>>>>>> is that I installed openstack with ansible-openstack.... so I can't change >>>>>>> anything unless I re run everything. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Dec 28, 2018 at 8:57 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Check heat user and domani are c onfigured like at the following: >>>>>>>> https://docs.openstack.org/heat/rocky/install/install-rdo.html >>>>>>>> >>>>>>>> Il giorno Gio 27 Dic 2018 23:25 Alfredo De Luca < >>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Ignazio. I tried to spin up a stack but I got an error... >>>>>>>>> Authorization failed. Not sure why. I am a bit stuck >>>>>>>>> >>>>>>>>> On Sun., 23 Dec. 2018, 9:19 pm Alfredo De Luca < >>>>>>>>> alfredo.deluca at gmail.com wrote: >>>>>>>>> >>>>>>>>>> I ll try asap. Thanks >>>>>>>>>> >>>>>>>>>> On Sat., 22 Dec. 2018, 10:50 pm Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>> >>>>>>>>>>> Hi Alfredo, have you tried a simple heat template to verify if >>>>>>>>>>> heat is working fine? >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Il giorno Sab 22 Dic 2018 20:51 Alfredo De Luca < >>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>> >>>>>>>>>>>> HI IGNAZIO >>>>>>>>>>>> The problem is that doesn't go that far... It fails before even >>>>>>>>>>>> creating the master. >>>>>>>>>>>> >>>>>>>>>>>> On Sat., 22 Dec. 2018, 6:06 pm Ignazio Cassano < >>>>>>>>>>>> ignaziocassano at gmail.com wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Anycase during deployment you can connect with ssh to the >>>>>>>>>>>>> master and tail the /var/log/ cloud in it output for checking. >>>>>>>>>>>>> Ignazio >>>>>>>>>>>>> >>>>>>>>>>>>> Il giorno Sab 22 Dic 2018 17:18 Alfredo De Luca < >>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ciao Ignazio >>>>>>>>>>>>>> What do you mean with master? you mean k8s master? >>>>>>>>>>>>>> I guess everything is fine... but I'll double check. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Dec 22, 2018 at 9:30 AM Ignazio Cassano < >>>>>>>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Alfredo, I am working on queens and I am not sure my >>>>>>>>>>>>>>> answer could help you.... >>>>>>>>>>>>>>> Can your master speak with kyestone public endpoint port >>>>>>>>>>>>>>> (5000) ? >>>>>>>>>>>>>>> Ignazio >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Il giorno Ven 21 Dic 2018 16:20 Alfredo De Luca < >>>>>>>>>>>>>>> alfredo.deluca at gmail.com> ha scritto: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi all. 
>>>>>>>>>>>>>>>> I installed magnum on openstack and now, after a few issue >>>>>>>>>>>>>>>> with cinder type list error, it passed that issue but now I have another >>>>>>>>>>>>>>>> one.... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> AuthorizationFailure: >>>>>>>>>>>>>>>> resources.kube_masters.resources[0].resources.master_wait_handle: >>>>>>>>>>>>>>>> Authorization failed. >>>>>>>>>>>>>>>> Not sure what to do nor check >>>>>>>>>>>>>>>> Any clue? >>>>>>>>>>>>>>>> Cheers >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Alfredo* >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Alfredo* >>>>>>> >>>>>>> >>>> >>>> -- >>>> *Alfredo* >>>> >>>> >> >> -- >> *Alfredo* >> >> -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Jan 11 15:44:13 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 11 Jan 2019 15:44:13 +0000 (GMT) Subject: [placement] update 19-01 Message-ID: HTML: https://anticdent.org/placement-update-19-01.html Hello! Here's placement update 19-01. Not a ton to report this week, so this will mostly be updating the lists provided last week. # Most Important As mentioned last week, there will be a meeting next week to discuss what is left before we can pull the trigger on [deleting the placement code from nova](https://review.openstack.org/618215). Wednesday is looking like a good day, perhaps at 1700UTC, but we'll need to confirm that on Monday when more people are around. Feel free to respond on this thread if that won't work for you (and suggest an alternative). Since deleting the code is dependent on deployment tooling being able to handle extracted placement (and upgrades to it), reviewing that work is important (see below). # What's Changed * It was nova's spec freeze this week, so a lot of effort was spent getting some specs reviewed and merged. That's reflected in the shorter specs section, below. * Placement had a release and was published to [pypi](https://pypi.org/project/openstack-placement/). This was a good excuse to write (yet another) blog post on [how easy it is to play with](https://anticdent.org/placement-from-pypi.html). # Bugs * Placement related [bugs not yet in progress](https://goo.gl/TgiPXb): 14. -1. * [In progress placement bugs](https://goo.gl/vzGGDQ) 16. +1 # Specs With spec freeze this week, this will be the last time we'll see this section until near the end of this cycle. Only one of the specs listed last week merged (placement for counting quota). * Account for host agg allocation ratio in placement (Still in rocky/) * Add subtree filter for GET /resource_providers * Resource provider - request group mapping in allocation candidate * VMware: place instances on resource pool (still in rocky/) * Standardize CPU resource tracking * Allow overcommit of dedicated CPU (Has an alternative which changes allocations to a float) * Modelling passthrough devices for report to placement * Nova Cyborg interaction specification. * supporting virtual NVDIMM devices * Proposes NUMA topology with RPs * Count quota based on resource class * Adds spec for instance live resize * Provider config YAML file * Resource modeling in cyborg. 
* Support filtering of allocation_candidates by forbidden aggregates * support virtual persistent memory # Main Themes ## Making Nested Useful I've been saying for a few weeks that "progress continues on gpu-reshaping for libvirt and xen" but it looks like the work at: * is actually stalled. Anyone have some insight on the status of that work? Also making use of nested is bandwidth-resource-provider: * There's a [review guide](http://lists.openstack.org/pipermail/openstack-discuss/2018-December/001129.html) for those patches. Eric's in the process of doing lots of cleanups to how often the ProviderTree in the resource tracker is checked against placement, and a variety of other "let's make this more right" changes in the same neighborhood: * Stack at: ## Extraction Besides the meeting mentioned above, I've refactored the extraction etherpad to make a [new version](https://etherpad.openstack.org/p/placement-extract-stein-5) that has less noise in it so the required actions are a bit more clear. The tasks remain much the same as mentioned last week: the reshaper work mentioned above and the work to get deployment tools operating with an extracted placement: * [TripleO](https://review.openstack.org/#/q/topic:tripleo-placement-extraction) * [OpenStack Ansible](https://review.openstack.org/#/q/project:openstack/openstack-ansible-os_placement) * [Kolla and Kolla Ansible](https://review.openstack.org/#/q/topic:split-placement) Loci's change to have an extracted placement has merged. Kolla has a patch to [include the upgrade script](https://review.openstack.org/#/q/topic:upgrade-placement). It raises the question of how or if the `mysql-migrate-db.sh` should be distributed. Should it maybe end up in the pypi distribution? (The rest of this section is duplicated from last week.) Documentation tuneups: * Release-notes: This is blocked until we refactor the release notes to reflect _now_ better. * The main remaining task here is participating in [openstack-manuals](https://docs.openstack.org/doc-contrib-guide/doc-index.html), to that end: * A stack of changes to nova to remove placement from the install docs. * Install docs in placement. I wrote to the [mailing list](http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001379.html) asking for input on making sure these things are close to correct, especially with regard to distro-specific things like package names. * Change to openstack-manuals to assert that placement is publishing install docs. Depends on the above. * There is a patch to [delete placement](https://review.openstack.org/#/c/618215/) from nova that we've put an administrative -2 on while we determine where things are (see about the meeting above). * There's a pending patch to support [online data migrations](https://review.openstack.org/#/c/624942/). This is important to make sure that fixup commands like `create_incomplete_consumers` can be safely removed from nova and implemented in placement. # Other There are still 13 [open changes](https://review.openstack.org/#/q/project:openstack/placement+status:open) in placement itself. Most of the time critical work is happening elsewhere (notably the deployment tool changes listed above). Of those placement changes, the [database-related](https://review.openstack.org/#/q/owner:nakamura.tetsuro%2540lab.ntt.co.jp+status:open+project:openstack/placement) ones from Tetsuro are the most important. 
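For anyone who wants to kick the tires on that pypi release without a full deployment, the recipe is roughly the sketch below (the sqlite URL, port, and config path are only examples, and the exact wsgi entry point and env var handling may shift between releases):

# install the extracted service straight from pypi
pip install openstack-placement

# minimal config: just a database (sqlite is enough for experiments)
cat > /tmp/placement.conf <<EOF
[placement_database]
connection = sqlite:////tmp/placement.db
[api]
auth_strategy = noauth2
EOF

# create the schema, then serve the wsgi app on port 8000
placement-manage --config-file /tmp/placement.conf db sync
uwsgi --http :8000 --wsgi-file $(which placement-api) \
      --env OS_PLACEMENT_CONFIG_DIR=/tmp

After that, `curl http://localhost:8000/` should hand back the version document.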
Outside of placement: * Neutron minimum bandwidth implementation * zun: Use placement for unified resource management * WIP: add Placement aggregates tests (in tempest) * blazar: Consider the number of reservation inventory * Add placement client for basic GET operations (to tempest) # End If anyone has submitted, or is planning to, a proposal for summit that is placement-related, it would be great to hear about it. I had thought about doing a resilient placement in kubernetes with cockroachdb for the edge sort of thing, but then realized my motivations were suspect and I have enough to do otherwise. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From colleen at gazlene.net Fri Jan 11 15:44:39 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 11 Jan 2019 16:44:39 +0100 Subject: [dev][keystone] Keystone Team Update - Week of 7 January 2019 Message-ID: <1547221479.1146713.1631988432.52724E11@webmail.messagingengine.com> # Keystone Team Update - Week of 7 January 2019 Happy new year! We are ramping back up following the holidays. ## News ### Cross-Project Limits Followup We are trying to close in on a stable API for limits and want to restart the discussion on what the other projects need from it. Please chime in on the thread or the linked reviews[1]. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001518.html ## Open Specs Stein specs: https://bit.ly/2Pi6dGj Ongoing specs: https://bit.ly/2OyDLTh Spec freeze is today but we have two open specs for still open for Stein, we will need to decide whether to push them or grant exceptions for them, keeping in mind there is not much time left for implementation at this point. ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 27 changes this week. ## Changes that need Attention Search query: https://bit.ly/2RLApdA There are 103 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. Lance's patch bomb of doom still needs more review attention. ## Bugs Since the last report we opened 7 new bugs and closed 10. 
Bugs opened (7) Bug #1810485 (keystone:Medium) opened by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810485 Bug #1810983 (keystone:Medium) opened by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810983 Bug #1809779 (keystone:Undecided) opened by Yang Youseok https://bugs.launchpad.net/keystone/+bug/1809779 Bug #1810393 (keystone:Undecided) opened by wangxiyuan https://bugs.launchpad.net/keystone/+bug/1810393 Bug #1810278 (keystonemiddleware:Undecided) opened by Yang Youseok https://bugs.launchpad.net/keystonemiddleware/+bug/1810278 Bug #1810761 (keystonemiddleware:Undecided) opened by Hugo Kou https://bugs.launchpad.net/keystonemiddleware/+bug/1810761 Bug #1811351 (python-keystoneclient:Undecided) opened by Colleen Murphy https://bugs.launchpad.net/python-keystoneclient/+bug/1811351 Bugs closed (2) Bug #1809779 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1809779 Bug #1810761 (keystonemiddleware:Undecided) https://bugs.launchpad.net/keystonemiddleware/+bug/1810761 Bugs fixed (8) Bug #1805403 (keystone:Medium) fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1805403 Bug #1810485 (keystone:Medium) fixed by Guang Yee https://bugs.launchpad.net/keystone/+bug/1810485 Bug #1810983 (keystone:Medium) fixed by no one https://bugs.launchpad.net/keystone/+bug/1810983 Bug #1786594 (keystone:Low) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1786594 Bug #1793374 (keystone:Low) fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1793374 Bug #1810393 (keystone:Undecided) fixed by wangxiyuan https://bugs.launchpad.net/keystone/+bug/1810393 Bug #1809101 (keystonemiddleware:Undecided) fixed by leehom https://bugs.launchpad.net/keystonemiddleware/+bug/1809101 Bug #1807184 (oslo.policy:Medium) fixed by Brian Rosmaita https://bugs.launchpad.net/oslo.policy/+bug/1807184 ## Milestone Outlook https://releases.openstack.org/stein/schedule.html Spec freeze is today. The feature proposal freeze is at the end of this month, with feature freeze just five weeks after. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter Dashboard generated using gerrit-dash-creator and https://gist.github.com/lbragstad/9b0477289177743d1ebfc276d1697b67 From aspiers at suse.com Fri Jan 11 16:14:17 2019 From: aspiers at suse.com (Adam Spiers) Date: Fri, 11 Jan 2019 16:14:17 +0000 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: Message-ID: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> Rico Lin wrote: >Dear all > >To continue the discussion of whether we should have new SIG for >autoscaling. > >I think we already got enough time for this ML [1], and it's time to jump >to the next step. >As we got a lot of positive feedbacks from ML [1], I think it's definitely >considered an action to create a new SIG, do some init works, and finally >Here are some things that we can start right now, to come out with the name >of SIG, the definition and mission. > >Here's my draft plan: >To create a SIG name `Automatic SIG`, with given initial mission to improve >automatic scaling with (but not limited to) OpenStack. As we discussed in >forum [2], to have scenario tests and documents will be considered as >actions for the initial mission. I gonna assume we will start from >scenarios which already provide some basic tests and documents which we can >adapt very soon and use them to build a SIG environment. 
And the long-term >mission of this SIG is to make sure we provide good documentation and test >coverage for most automatic functionality. > >I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we can >provide more value if there are more needs in the future. Just like the >example which Adam raised `self-optimizing` from people who are >using watcher [3]. >Let me know if you got any concerns about this name. I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound quite right to me, because it's not clear what is being automated. For example from the outside people might think it was a SIG about CI, or about automated testing, or both - or even some kind of automatic creation of new SIGs ;-) Here are some alternative suggestions: - Optimization SIG - Self-optimization SIG - Auto-optimization SIG - Adaptive Cloud SIG - Self-adaption SIG - Auto-adaption SIG - Auto-configuration SIG although I'm not sure these are a huge improvement on "Autoscaling SIG" - maybe some are too broad, or too vague. It depends on how likely it is that the scope will go beyond just auto-scaling. Of course you could also just stick with the original idea of "Auto-scaling" :-) >And to clarify, there will definitely some cross SIG co-work between this >new SIG and Self-Healing SIG (there're some common requirements even across >self-healing and autoscaling features.). We also need to make sure we do >not provide any duplicated work against self-healing SIG. >As a start, let's only focus on autoscaling scenario, and make sure we're >doing it right before we move to multiple cases. Sounds good! >If no objection, I will create the new SIG before next weekend and plan a >short schedule in Denver summit and PTG. Thanks for driving this! From geguileo at redhat.com Fri Jan 11 16:16:45 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 11 Jan 2019 17:16:45 +0100 Subject: [Openstack-discuss][Cinder] Fail to mount the volume on the target node In-Reply-To: References: Message-ID: <20190111161645.6lljpxjf66jgsbby@localhost> On 31/12, Minjun Hong wrote: > Hi. > After I installed Cinder, I have had a problem which I cannot make > instances with volume storage. > I created an instance on Horizon and it always has failed. > Actually, I'm using Xen as the hypervisor and there was not any special log > about Nova. > But, in the Xen's log (/var/log/xen/bootloader.5.log), I saw the hypervisor > cannot find the volume which is provided by Cinder: > > Traceback (most recent call last): > > File "/usr/lib/xen/bin/pygrub", line 929, in > > raise RuntimeError, "Unable to find partition containing kernel" > > RuntimeError: Unable to find partition containing kernel > > > And, I also found something noticeable in the Cinder's log > ('/var/log/cinder/cinder-volume.log' on the Block storage node): > > 2018-12-31 04:08:11.189 12380 INFO cinder.volume.manager > > [req-93eb0ad3-6c6c-4842-851f-435e15d8639b bb1e571e4d64462bac80654b153a88c3 > > 96ad10a59d114042b8f1ee82c438649a - default default] Attaching volume > > 4c21b8f1-ff07-4916-9692-e74759635978 to instance > > bea7dca6-fb04-4791-bac9-3ad560280cc3 at mountpoint /dev/xvda on host None. > > > It seems that Cinder cannot receive information of the target node ('on > host None' above) so, I think it can cause the problem that Cinder fails to > provide the volume due to lack of the host information. > Since I could not find any other logs except that log, I need more hints. > Please give me some help > > Thanks! 
> Regard, Hi, The "on host None" message looks like Nova is either not sending the "host" key in the connector information or is sending it set to '' or None. You'd need to see the logs in DEBUG level to know which it is. And that is strange, because the "host" key is set by os-brick when Nova calls the "get_connector_properties": props['host'] = host if host else socket.gethostname() So even if Nova doesn't have the "host" config option set, os-brick should get the hostname of the node. But from Cinder's perspective I don't think that's necessarily a problem. How was the volume created? Because that looks like a problem with the contents of the volume, as it is not complaining about not being able to map/export it or attach it. Cheers, Gorka. From emilien at redhat.com Fri Jan 11 16:20:48 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 11 Jan 2019 11:20:48 -0500 Subject: [RHEL8-OSP15] Container Runtimes integration - Status report #7 Message-ID: Welcome to the seventh status report about the progress we make to Container Runtimes into Red Hat OpenStack Platform, version 15. You can read the previous report here: http://post-office.corp.redhat.com/archives/container-teams/2018-December/msg00090.html Our efforts are tracked here: https://trello.com/b/S8TmOU0u/tripleo-podman TL;DR =========================================== - Some OSP folks will meet in Brno next week, to work together on RHEL8/OSP15. See [1]. - We have replaced the Docker Healthchecks by SystemD timers when Podman is deployed. Now figuring out the next steps [2]. - Slow progress on the Python-based uploader (using tar-split + buildah), slowed by bugs. - We are waiting for podman 1.0 so we can build / test / ship it in TripleO CI. Context reminder =========================================== The OpenStack team is preparing the 15th version of Red Hat OpenStack Platform that will work on RHEL8. We are working together to support the future Container Runtimes which replace Docker. Done =========================================== - Implemented Podman healthchecks with SystemD timers: https://review.openstack.org/#/c/620372/ - Renamed SystemD services controlling Podman containers to not conflict with baremetal services https://review.openstack.org/#/c/623241/ - podman issues (reported by us) closed: - pull: error setting new rlimits: operation not permitted https://github.com/containers/libpod/issues/2123 - New podman version introduce new issue with selinux and relabelling: relabel failed "/run/netns": operation not supported https://github.com/containers/libpod/issues/2034 - container create failed: container_linux.go:336: starting container process caused "setup user: permission denied" https://github.com/containers/libpod/issues/1980 - "podman inspect --type image --format exists " reports a not-friendly error when image doesn't exist in local storage https://github.com/containers/libpod/issues/1845 - container create failed: container_linux.go:336: starting container process caused "process_linux.go:293: applying cgroup configuration for process caused open /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus: no such file or directory" https://github.com/containers/libpod/issues/1841 - paunch/runner: test if image exists before running inspect https://review.openstack.org/#/c/619313/ - Fixing a bunch of issues with docker-puppet.py to reduce chances of race conditions. - A lot of SElinux work, to make everything working in Enforced mode. 
- tar-split packaging is done, and will be consumed in TripleO for the python image uplaoded In progress =========================================== - Still investigating standard_init_linux.go:203: exec user process caused \"no such file or directory\" [5]. This one is nasty and painful. It involves concurrency and we are evaluating solutions, but we'll probably end up reduce the default multi-processing of podman commands from 6 to 3 by default. - Investigating ways to gate new versions of Podman + dependencies: https://review.rdoproject.org/r/#/c/17960/ - Investigating how to consume systemd timers in sensu (healtchecks) [2] - Investigating and prototyping a pattern to safely spawn a container from a container with systemd https://review.openstack.org/#/c/620062 - Investigating how we can prune Docker data when upgrading from Docker to Podman https://review.openstack.org/#/c/620405/ - Using the new "podman image exist" in Paunch https://review.openstack.org/#/c/619313/ - Still implementing a Python-based container uploader (using tar-split and buildah) - this method will be the default later: https://review.openstack.org/#/c/616018/ - Testing future Podman 1.0 in TripleO [3] - Help the Skydive team to migrate to Podman [4] Blocked =========================================== Podman 1.0 contains a lot of fixes that we need (from libpod and vendored as well). Any comment or feedback is welcome, thanks for reading! [1] https://docs.google.com/document/d/18-1M1eSnlls6j2Op2TxyvyuqoOksxmwHOhqaD6B8FQY/edit [2] https://trello.com/c/g6bi5DQF/4-healthchecks [3] https://trello.com/c/2tXNLJUN/58-test-podman-10 [4] https://trello.com/c/tW935FGe/56-migrate-ansible-skydive-to-podman [5] https://github.com/containers/libpod/issues/1844 -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Fri Jan 11 16:24:42 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 10:24:42 -0600 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> Message-ID: <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> On 1/11/19 10:14 AM, Adam Spiers wrote: > Rico Lin wrote: >> Dear all >> >> To continue the discussion of whether we should have new SIG for >> autoscaling. >> I think we already got enough time for this ML  [1], and it's time to >> jump to the next step. As we got a lot of positive feedbacks from ML >> [1], I think it's definitely considered an action to create a new SIG, >> do some init works, and finally Here are some things that we can start >> right now, to come out with the name of SIG, the definition and mission. >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given >> initial mission to improve automatic scaling with (but not limited to) >> OpenStack. As we discussed in forum [2], to have scenario tests and >> documents will be considered as actions for the initial mission. I >> gonna assume we will start from scenarios which already provide some >> basic tests and documents which we can adapt very soon and use them to >> build a SIG environment. And the long-term mission of this SIG is to >> make sure we provide good documentation and test coverage for most >> automatic functionality. >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we >> can provide more value if there are more needs in the future. 
Just >> like the example which Adam raised `self-optimizing` from people who >> are using watcher [3]. Let me know if you got any concerns about this >> name. > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > quite right to me, because it's not clear what is being automated. For > example from the outside people might think it was a SIG about CI, or > about automated testing, or both - or even some kind of automatic > creation of new SIGs ;-) > Here are some alternative suggestions: > - Optimization SIG > - Self-optimization SIG > - Auto-optimization SIG > - Adaptive Cloud SIG > - Self-adaption SIG > - Auto-adaption SIG > - Auto-configuration SIG > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > - maybe some are too broad, or too vague.  It depends on how likely it > is that the scope will go beyond just auto-scaling.  Of course you could > also just stick with the original idea of "Auto-scaling" :-) I'm inclined to argue that limiting the scope of this SIG is actually a feature, not a bug. Better to have a tightly focused SIG that has very specific, achievable goals than to try to boil the ocean by solving all of the auto* problems in OpenStack. We all know how "one SIG to rule them all" ends. ;-) >> And to clarify, there will definitely some cross SIG co-work between >> this new SIG and Self-Healing SIG (there're some common requirements >> even across self-healing and autoscaling features.). We also need to >> make sure we do not provide any duplicated work against self-healing >> SIG. As a start, let's only focus on autoscaling scenario, and make >> sure we're doing it right before we move to multiple cases. > > Sounds good! >> If no objection, I will create the new SIG before next weekend and >> plan a short schedule in Denver summit and PTG. > > Thanks for driving this! From kbcaulder at gmail.com Fri Jan 11 16:25:55 2019 From: kbcaulder at gmail.com (Brandon Caulder) Date: Fri, 11 Jan 2019 08:25:55 -0800 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: <20190111152318.ztuwirfgypehdfp6@localhost> References: <20190111152318.ztuwirfgypehdfp6@localhost> Message-ID: Hi, The steps were... - purge - shutdown cinder-scheduler, cinder-api - upgrade software - restart cinder-volume - sync (upgrade fails and stops at v114) - sync again (db upgrades to v117) - restart cinder-volume - stacktrace observed in volume.log Thanks On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor wrote: > On 10/01, Brandon Caulder wrote: > > Hi Iain, > > > > There are 424 rows in volumes which drops down to 185 after running > > cinder-manage db purge 1. Restarting the volume service after package > > upgrade and running sync again does not remediate the problem, although > > running db sync a second time does bump the version up to 117, the > > following appears in the volume.log... > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > Hi, > > If I understand correctly the steps were: > > - Run DB sync --> Fail > - Run DB purge > - Restart volume services > - See the log error > - Run DB sync --> version proceeds to 117 > > If that is the case, could you restart the services again now that the > migration has been moved to version 117? > > If the cinder-volume service is able to restart please run the online > data migrations with the service running. > > Cheers, > Gorka. 
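(For anyone else hitting this, the overall order being suggested in the thread amounts to roughly the sketch below -- the service names are the RDO ones from this report and the purge age is only an example, so adjust to your deployment:)

# offline part: API and scheduler down while packages are upgraded
systemctl stop openstack-cinder-api openstack-cinder-scheduler

# trim soft-deleted rows first so the shared_targets UPDATE has less to chew on
cinder-manage db purge 1

# keep cinder-volume running while the schema sync runs (retry if it stops early)
systemctl restart openstack-cinder-volume
cinder-manage --debug db sync        # should end up at queens version 117

# with the volume service up, finish with the online data migrations
cinder-manage db online_data_migrations

systemctl start openstack-cinder-api openstack-cinder-scheduler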
> > > > Thanks > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > iain.macdonnell at oracle.com> > > wrote: > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > > happens that both pertain to shared targets. > > > > > > Brandon, might you have a very large number of rows in your volumes > > > table? Have you been purging soft-deleted rows? > > > > > > ~iain > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > Brandon, > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > I think you can work around it by retrying the migration with the > volume > > > > service running. You may, however, want to check with Iain > MacDonnell > > > > as he has been looking at this for a while. > > > > > > > > Thanks! > > > > Jay > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > >> Hi, > > > >> > > > >> I am receiving the following error when performing an offline > upgrade > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > >> openstack-cinder-1:12.0.3-1.el7. > > > >> > > > >> # cinder-manage db version > > > >> 105 > > > >> > > > >> # cinder-manage --debug db sync > > > >> Error during database migration: (pymysql.err.OperationalError) > (2013, > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > volumes > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > >> {'shared_targets': 1}] > > > >> > > > >> # cinder-manage db version > > > >> 114 > > > >> > > > >> The db version does not upgrade to queens version 117. Any help > would > > > >> be appreciated. > > > >> > > > >> Thank you > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Fri Jan 11 16:26:08 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Fri, 11 Jan 2019 11:26:08 -0500 Subject: [ops] OpenStack Operators Meetup, March 2019 Message-ID: Dear All The OpenStack Ops Meetups team is pleased to announce preliminary details for the next Ops Meetup. The event will be held March 7th and 8th in Berlin, Germany and is being hosted by Deutsche Telekom(DT). We thank them for their kind offer to host this event. The exact venue has not yet been decided but DT has two similar facilities both reserved at present and they will be working out shortly which one works better for them. DT's proposal is here https://etherpad.openstack.org/p/ops-meetup-venue-discuss-1st-2019-berlin The meetups team will be sharing the planning docs for the technical agenda in the next few weeks. So far, there has been interest expressed in having a research track at this meetup alongside the general track. Please let us know ASAP if that is of interest. Looking forward to seeing operators in Berlin! Chris Morgan (on behalf of the ops meetups team) -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Fri Jan 11 16:32:21 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 11 Jan 2019 11:32:21 -0500 Subject: [RHEL8-OSP15] Container Runtimes integration - Status report #7 In-Reply-To: References: Message-ID: I didn't mean to send that on that list, but whatever. 
Nothing is confidential in that email. Except a few links that nobody cares. What I realized though is that I think it's time to communicate this effort in the public, which was impossible for me until now because RHEL8. For the next edition, I will send it to this list so anyone interested by podman can take a look. Also I'm available for any questions if needed. Thanks & sorry for noise. Emilien On Fri, Jan 11, 2019 at 11:20 AM Emilien Macchi wrote: > Welcome to the seventh status report about the progress we make to > Container Runtimes into Red Hat OpenStack Platform, version 15. > You can read the previous report here: > > http://post-office.corp.redhat.com/archives/container-teams/2018-December/msg00090.html > Our efforts are tracked here: https://trello.com/b/S8TmOU0u/tripleo-podman > > > TL;DR > =========================================== > - Some OSP folks will meet in Brno next week, to work together on > RHEL8/OSP15. See [1]. > - We have replaced the Docker Healthchecks by SystemD timers when Podman > is deployed. Now figuring out the next steps [2]. > - Slow progress on the Python-based uploader (using tar-split + buildah), > slowed by bugs. > - We are waiting for podman 1.0 so we can build / test / ship it in > TripleO CI. > > Context reminder > =========================================== > The OpenStack team is preparing the 15th version of Red Hat OpenStack > Platform that will work on RHEL8. > We are working together to support the future Container Runtimes which > replace Docker. > > Done > =========================================== > - Implemented Podman healthchecks with SystemD timers: > https://review.openstack.org/#/c/620372/ > - Renamed SystemD services controlling Podman containers to not conflict > with baremetal services https://review.openstack.org/#/c/623241/ > - podman issues (reported by us) closed: > - pull: error setting new rlimits: operation not permitted > https://github.com/containers/libpod/issues/2123 > - New podman version introduce new issue with selinux and relabelling: > relabel failed "/run/netns": operation not supported > https://github.com/containers/libpod/issues/2034 > - container create failed: container_linux.go:336: starting container > process caused "setup user: permission denied" > https://github.com/containers/libpod/issues/1980 > - "podman inspect --type image --format exists " reports a > not-friendly error when image doesn't exist in local storage > https://github.com/containers/libpod/issues/1845 > - container create failed: container_linux.go:336: starting container > process caused "process_linux.go:293: applying cgroup configuration for > process caused open /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus: no > such file or directory" https://github.com/containers/libpod/issues/1841 > - paunch/runner: test if image exists before running inspect > https://review.openstack.org/#/c/619313/ > - Fixing a bunch of issues with docker-puppet.py to reduce chances of race > conditions. > - A lot of SElinux work, to make everything working in Enforced mode. > - tar-split packaging is done, and will be consumed in TripleO for the > python image uplaoded > > In progress > =========================================== > - Still investigating standard_init_linux.go:203: exec user process caused > \"no such file or directory\" [5]. This one is nasty and painful. It > involves concurrency and we are evaluating solutions, but we'll probably > end up reduce the default multi-processing of podman commands from 6 to 3 > by default. 
> - Investigating ways to gate new versions of Podman + dependencies: > https://review.rdoproject.org/r/#/c/17960/ > - Investigating how to consume systemd timers in sensu (healtchecks) [2] > - Investigating and prototyping a pattern to safely spawn a container from > a container with systemd https://review.openstack.org/#/c/620062 > - Investigating how we can prune Docker data when upgrading from Docker to > Podman https://review.openstack.org/#/c/620405/ > - Using the new "podman image exist" in Paunch > https://review.openstack.org/#/c/619313/ > - Still implementing a Python-based container uploader (using tar-split > and buildah) - this method will be the default later: > https://review.openstack.org/#/c/616018/ > - Testing future Podman 1.0 in TripleO [3] > - Help the Skydive team to migrate to Podman [4] > > Blocked > =========================================== > Podman 1.0 contains a lot of fixes that we need (from libpod and vendored > as well). > > Any comment or feedback is welcome, thanks for reading! > > [1] > https://docs.google.com/document/d/18-1M1eSnlls6j2Op2TxyvyuqoOksxmwHOhqaD6B8FQY/edit > [2] https://trello.com/c/g6bi5DQF/4-healthchecks > [3] https://trello.com/c/2tXNLJUN/58-test-podman-10 > [4] https://trello.com/c/tW935FGe/56-migrate-ansible-skydive-to-podman > [5] https://github.com/containers/libpod/issues/1844 > -- > Emilien Macchi > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Fri Jan 11 16:40:29 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 11 Jan 2019 16:40:29 +0000 Subject: [nova] Retiring gantt, python-ganttclient projects Message-ID: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> Hey, These projects are mega old, don't appear to have been official projects, and should have been retired a long time ago. This is serves as a heads up on the off-chance someone has managed to do something with them. Stephen From jungleboyj at gmail.com Fri Jan 11 16:45:56 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 11 Jan 2019 10:45:56 -0600 Subject: Issues setting up a SolidFire node with Cinder In-Reply-To: <5738052c-10c5-10db-e63c-7aee351db87c@absolutedevops.io> References: <3cf42fec-b3c0-396e-3d85-2a396deb5df7@absolutedevops.io> <6f53c037-b03d-1550-3e7a-e42850d950ec@gmail.com> <2b527e54-1cfc-e2c3-31ea-b3d64225a9cb@gmail.com> <5738052c-10c5-10db-e63c-7aee351db87c@absolutedevops.io> Message-ID: Grant, Ah, if you are using a different VLAN for your storage traffic than that is likely the cause of the problem.  Good luck getting the networking issue resolved. Jay On 1/11/2019 9:12 AM, Grant Morley wrote: > > Jay, > > Thanks for that info. It appears that the cinder-volume service can > speak to the  SolidFire over the network but for some reason it can't > actually access it over iSCSI. I think it might be something to do > with how we are tagging / untagging VLANs. > > Thank you for your help, I think I am heading in the right direction now! > > Kind Regards, > > Grant > > On 11/01/2019 14:44, Jay Bryant wrote: >> >> Grant, >> >> Doing the boot from volume is actually quite different than attaching >> a volume to an instance. >> >> In the case that you are doing the boot from volume (assuming that >> your glance storage is not in the Solidfire) the volume is created >> and attached to where the cinder-volume service is running.  Then the >> image is written into the volume. 
>> >> Have you verified that the host and container that is running >> cinder-volume is able to access the Solidfire backend? >> >> Jay >> >> On 1/11/2019 4:45 AM, Grant Morley wrote: >>> >>> Hi Jay, >>> >>> Thanks for the tip there. I am still having some trouble with it >>> which is really annoying. The strange thing is, I can launch a >>> volume and attach it to an instance absolutely fine. The only issue >>> I am having is literally creating this bootable volume. >>> >>> I assume creating a volume and attaching it to an instance is >>> exactly the same as creating a bootable volume minus the Nova part? >>> >>> I would just expect nothing to work if nothing could speak to the >>> SolidFire. >>> >>> Would it make a difference if the current image that is being copied >>> over to the bootable volume is in a ceph cluster? I know glance >>> should deal with it but I am wondering if the copy of the image is >>> the actual issue? >>> >>> Thanks again, >>> >>> On 11/01/2019 00:10, Jay S. Bryant wrote: >>>> >>>> Grant, >>>> >>>> So, the copy is failing because it can't find the volume to copy >>>> the image into. >>>> >>>> I would check the host and container for any iSCSI errors as well >>>> as the backend.  It appears that something is going wrong when >>>> attempting to temporarily attach the volume to write the image into it. >>>> >>>> Jay >>>> >>>> On 1/10/2019 7:16 AM, Grant Morley wrote: >>>>> >>>>> Hi all, >>>>> >>>>> We are in the process of trying to add a SolidFire storage >>>>> solution to our existing OpenStack setup and seem to have hit a >>>>> snag with cinder / iscsi. >>>>> >>>>> We are trying to create a bootable volume to allow us to launch an >>>>> instance from it, but we are getting some errors in our >>>>> cinder-volumes containers that seem to suggest they can't connect >>>>> to iscsi although the volume seems to create fine on the SolidFire >>>>> node. >>>>> >>>>> The command we are running is: >>>>> >>>>> openstack volume create --image $image-id --size 20 --bootable >>>>> --type solidfire sf-volume-v12 >>>>> >>>>> The volume seems to create on SolidFire but I then see these >>>>> errors in the "cinder-volume.log" >>>>> >>>>> https://pastebin.com/LyjLUhfk >>>>> >>>>> The volume containers can talk to the iscsi VIP on the SolidFire >>>>> so I am a bit stuck and wondered if anyone had come across any >>>>> issues before? >>>>> >>>>> Kind Regards, >>>>> >>>>> >>>>> -- >>>>> Grant Morley >>>>> Cloud Lead >>>>> Absolute DevOps Ltd >>>>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP >>>>> www.absolutedevops.io >>>>> grant at absolutedevops.io 0845 874 >>>>> 0580 >>> -- >>> Grant Morley >>> Cloud Lead >>> Absolute DevOps Ltd >>> Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP >>> www.absolutedevops.io >>> grant at absolutedevops.io 0845 874 0580 > -- > Grant Morley > Cloud Lead > Absolute DevOps Ltd > Units H, J & K, Gateway 1000, Whittle Way, Stevenage, Herts, SG1 2FP > www.absolutedevops.io > grant at absolutedevops.io 0845 874 0580 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dirk at dmllr.de Fri Jan 11 17:11:03 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Fri, 11 Jan 2019 18:11:03 +0100 Subject: [self-healing-sig] best practices for haproxy health checking Message-ID: Hi, Does anyone have a good pointer for good healthchecks to be used by the frontend api haproxy loadbalancer? 
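To make "healthcheck" concrete: what I am after is something deeper than a plain HTTP probe of the API root, e.g. a small per-node check script in the spirit of the sketch below (ports, service names and timeouts are purely illustrative), exposed on a dedicated status port via xinetd, socket activation or a tiny wrapper daemon that haproxy then probes instead of the API itself:

#!/bin/sh
# illustrative deep check for one API node: mark the backend down if any
# of the local dependencies the API needs is clearly unhealthy
API=http://127.0.0.1:5000/v3            # e.g. keystone on this node

# 1. the API must answer a real versions request within a few seconds
curl -m 3 -fsS -o /dev/null "$API" || exit 1

# 2. local memcached must actually answer, not just accept connections
printf 'version\r\nquit\r\n' | timeout 2 nc 127.0.0.1 11211 | grep -q VERSION || exit 1

# 3. the database must be reachable from this node
timeout 2 mysqladmin --connect-timeout=2 ping >/dev/null 2>&1 || exit 1

exit 0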
in one case that I am looking at right now, the entry haproxy loadbalancer was not able to detect a particular backend being not responding to api requests, so it flipped up and down repeatedly, causing intermittend spurious 503 errors. The backend was able to respond to connections and to basic HTTP GET requests (e.g. / or even /v3 as path), but when it got a "real" query it hung. the reason for that was, as it turned out, the configured caching backend memcached on that machine being locked up (due to some other bug). I wonder if there is a better way to check if a backend is "working" and what the best practices around this are. A potential thought I had was to do the backend check via some other healthcheck specific port that runs a custom daemon that does more sophisticated checks like checking for system wide errors (like memcache, database, rabbitmq) being unavailable on that node, and hence not accepting any api traffic until that is being resolved. Any pointers to read upon / best practices appreciated. Thanks, Dirk From duc.openstack at gmail.com Fri Jan 11 17:14:03 2019 From: duc.openstack at gmail.com (Duc Truong) Date: Fri, 11 Jan 2019 09:14:03 -0800 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: +1 on limiting the scope to autoscaling at first. I prefer the name autoscaling since the mission is to improve automatic scaling. If the mission is changed later, we can change the name of the SIG to reflect that. On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: > > > > On 1/11/19 10:14 AM, Adam Spiers wrote: > > Rico Lin wrote: > >> Dear all > >> > >> To continue the discussion of whether we should have new SIG for > >> autoscaling. > >> I think we already got enough time for this ML [1], and it's time to > >> jump to the next step. As we got a lot of positive feedbacks from ML > >> [1], I think it's definitely considered an action to create a new SIG, > >> do some init works, and finally Here are some things that we can start > >> right now, to come out with the name of SIG, the definition and mission. > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given > >> initial mission to improve automatic scaling with (but not limited to) > >> OpenStack. As we discussed in forum [2], to have scenario tests and > >> documents will be considered as actions for the initial mission. I > >> gonna assume we will start from scenarios which already provide some > >> basic tests and documents which we can adapt very soon and use them to > >> build a SIG environment. And the long-term mission of this SIG is to > >> make sure we provide good documentation and test coverage for most > >> automatic functionality. > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we > >> can provide more value if there are more needs in the future. Just > >> like the example which Adam raised `self-optimizing` from people who > >> are using watcher [3]. Let me know if you got any concerns about this > >> name. > > > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > > quite right to me, because it's not clear what is being automated. 
For > > example from the outside people might think it was a SIG about CI, or > > about automated testing, or both - or even some kind of automatic > > creation of new SIGs ;-) > > Here are some alternative suggestions: > > - Optimization SIG > > - Self-optimization SIG > > - Auto-optimization SIG > > - Adaptive Cloud SIG > > - Self-adaption SIG > > - Auto-adaption SIG > > - Auto-configuration SIG > > > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > > - maybe some are too broad, or too vague. It depends on how likely it > > is that the scope will go beyond just auto-scaling. Of course you could > > also just stick with the original idea of "Auto-scaling" :-) > > I'm inclined to argue that limiting the scope of this SIG is actually a > feature, not a bug. Better to have a tightly focused SIG that has very > specific, achievable goals than to try to boil the ocean by solving all > of the auto* problems in OpenStack. We all know how "one SIG to rule > them all" ends. ;-) > > >> And to clarify, there will definitely some cross SIG co-work between > >> this new SIG and Self-Healing SIG (there're some common requirements > >> even across self-healing and autoscaling features.). We also need to > >> make sure we do not provide any duplicated work against self-healing > >> SIG. As a start, let's only focus on autoscaling scenario, and make > >> sure we're doing it right before we move to multiple cases. > > > > Sounds good! > >> If no objection, I will create the new SIG before next weekend and > >> plan a short schedule in Denver summit and PTG. > > > > Thanks for driving this! > From rfolco at redhat.com Fri Jan 11 17:27:01 2019 From: rfolco at redhat.com (Rafael Folco) Date: Fri, 11 Jan 2019 15:27:01 -0200 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 Message-ID: Greetings, The TripleO CI team has just completed Sprint 24 (Dec 20 thru Jan 09). The following is a summary of completed work during this sprint cycle: - Created Zuul configuration and changed repository scripts for the new Fedora 28 promotion pipeline, including container build jobs. - Replaced multinode scenarios(1-4) with standalone scenarios(1-4) jobs across TripleO projects. Also fixed missing services for standalone scenario jobs. A few changes are still “in-flight” and are close to merge. - Tempest is now successfully running on Fedora 28 standalone jobs. - Improved reproducer solution using upstream Zuul containers by moving code to an ansible role in rdo-infra/ansible-role-tripleo-ci-reproducer and automated the libvirt setup. - Created a new OVB workflow without te-broker, moved OVB repo from github to gerrit and did a PoC with new reproducer and OVB jobs. The tebroker is no longer part of the ovb workflow. The planned work for the next sprint [1] are: - Apply changes for Fedora 28 promotion pipeline in production environment to start collecting logs for container build job. - Complete transition from multinode scenarios (1-4) to standalone jobs across all TripleO projects. - Improve the new Zuul container reproducer by automating nodepool config for libvirt. - Enable CI on OVB under ovb’s new git repo in the openstack namespace. - Refactor the upstream zuul job configuration to consolidate file parameters into one repo openstack-infra/tripleo-ci. - Begin to move the RDO-Phase2 Baremetal jobs to upstream tripleo. The Ruck and Rover for this sprint are Arx Cruz (arxcruz) and Sorin Sbarnea (ssbarnea). 
Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Notes are recorded on etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-4 [2] https://review.rdoproject.org/etherpad/p/ruckrover-sprint25 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Fri Jan 11 17:31:34 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 11:31:34 -0600 Subject: [self-healing-sig] best practices for haproxy health checking In-Reply-To: References: Message-ID: On 1/11/19 11:11 AM, Dirk Müller wrote: > Hi, > > Does anyone have a good pointer for good healthchecks to be used by > the frontend api haproxy loadbalancer? > > in one case that I am looking at right now, the entry haproxy > loadbalancer was not able > to detect a particular backend being not responding to api requests, > so it flipped up and down repeatedly, causing intermittend spurious > 503 errors. > > The backend was able to respond to connections and to basic HTTP GET > requests (e.g. / or even /v3 as path), but when it got a "real" query > it hung. the reason for that was, as it turned out, > the configured caching backend memcached on that machine being locked > up (due to some other bug). > > I wonder if there is a better way to check if a backend is "working" > and what the best practices around this are. A potential thought I had > was to do the backend check via some other healthcheck specific port > that runs a custom daemon that does more sophisticated checks like > checking for system wide errors (like memcache, database, rabbitmq) > being unavailable on that node, and hence not accepting any api > traffic until that is being resolved. A very similar thing has been proposed: https://review.openstack.org/#/c/531456/ It also came up as a possible community goal for Train: http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000558.html But to my knowledge no one has stepped forward to drive the work. It seems to be something people generally agree we need, but nobody has time to do. :-( > > Any pointers to read upon / best practices appreciated. > > Thanks, > Dirk > From openstack at nemebean.com Fri Jan 11 17:34:10 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 11:34:10 -0600 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 In-Reply-To: References: Message-ID: On 1/11/19 11:27 AM, Rafael Folco wrote: > moved OVB repo from github to gerrit This is news to me. ;-) The work to do the gerrit import is still underway, but should be done soon: https://review.openstack.org/#/c/620613/ I have a bit more cleanup to do in the github repo and then we can proceed. -Ben From mrhillsman at gmail.com Fri Jan 11 17:56:16 2019 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Fri, 11 Jan 2019 11:56:16 -0600 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: +1 SIGs should have limited scope - shared interest in a particular area - even if that area is something broad like security the mission and work should be specific which could lead to working groups, additional SIGs, projects, etc so I want to be careful how I word it but yes limited scope is the ideal way to start a SIG imo. 
On Fri, Jan 11, 2019 at 11:14 AM Duc Truong wrote: > +1 on limiting the scope to autoscaling at first. I prefer the name > autoscaling since the mission is to improve automatic scaling. If the > mission is changed later, we can change the name of the SIG to reflect > that. > > On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: > > > > > > > > On 1/11/19 10:14 AM, Adam Spiers wrote: > > > Rico Lin wrote: > > >> Dear all > > >> > > >> To continue the discussion of whether we should have new SIG for > > >> autoscaling. > > >> I think we already got enough time for this ML [1], and it's time to > > >> jump to the next step. As we got a lot of positive feedbacks from ML > > >> [1], I think it's definitely considered an action to create a new SIG, > > >> do some init works, and finally Here are some things that we can start > > >> right now, to come out with the name of SIG, the definition and > mission. > > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given > > >> initial mission to improve automatic scaling with (but not limited to) > > >> OpenStack. As we discussed in forum [2], to have scenario tests and > > >> documents will be considered as actions for the initial mission. I > > >> gonna assume we will start from scenarios which already provide some > > >> basic tests and documents which we can adapt very soon and use them to > > >> build a SIG environment. And the long-term mission of this SIG is to > > >> make sure we provide good documentation and test coverage for most > > >> automatic functionality. > > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we > > >> can provide more value if there are more needs in the future. Just > > >> like the example which Adam raised `self-optimizing` from people who > > >> are using watcher [3]. Let me know if you got any concerns about this > > >> name. > > > > > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > > > quite right to me, because it's not clear what is being automated. For > > > example from the outside people might think it was a SIG about CI, or > > > about automated testing, or both - or even some kind of automatic > > > creation of new SIGs ;-) > > > Here are some alternative suggestions: > > > - Optimization SIG > > > - Self-optimization SIG > > > - Auto-optimization SIG > > > - Adaptive Cloud SIG > > > - Self-adaption SIG > > > - Auto-adaption SIG > > > - Auto-configuration SIG > > > > > > although I'm not sure these are a huge improvement on "Autoscaling SIG" > > > - maybe some are too broad, or too vague. It depends on how likely it > > > is that the scope will go beyond just auto-scaling. Of course you > could > > > also just stick with the original idea of "Auto-scaling" :-) > > > > I'm inclined to argue that limiting the scope of this SIG is actually a > > feature, not a bug. Better to have a tightly focused SIG that has very > > specific, achievable goals than to try to boil the ocean by solving all > > of the auto* problems in OpenStack. We all know how "one SIG to rule > > them all" ends. ;-) > > > > >> And to clarify, there will definitely some cross SIG co-work between > > >> this new SIG and Self-Healing SIG (there're some common requirements > > >> even across self-healing and autoscaling features.). We also need to > > >> make sure we do not provide any duplicated work against self-healing > > >> SIG. As a start, let's only focus on autoscaling scenario, and make > > >> sure we're doing it right before we move to multiple cases. > > > > > > Sounds good! 
> > >> If no objection, I will create the new SIG before next weekend and > > >> plan a short schedule in Denver summit and PTG. > > > > > > Thanks for driving this! > > > > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 11 18:26:43 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 11 Jan 2019 11:26:43 -0700 Subject: [openstack-dev][tripleo] TripleO CI Summary: Sprint 24 In-Reply-To: References: Message-ID: On Fri, Jan 11, 2019 at 10:38 AM Ben Nemec wrote: > > > On 1/11/19 11:27 AM, Rafael Folco wrote: > > moved OVB repo from github to gerrit > Same, that should be in plan for this sprint. Just a misunderstanding that I should have caught. My current understanding is that OVB is in the process of moving to the openstack namespace, and Sagi is prepping CI for it. Thanks Ben! > > This is news to me. ;-) > > The work to do the gerrit import is still underway, but should be done > soon: https://review.openstack.org/#/c/620613/ > > I have a bit more cleanup to do in the github repo and then we can proceed. > > -Ben > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Fri Jan 11 18:51:56 2019 From: aspiers at suse.com (Adam Spiers) Date: Fri, 11 Jan 2019 18:51:56 +0000 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> Message-ID: <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> Fine by me - sounds like we have a consensus for autoscaling then? Melvin Hillsman wrote: >+1 SIGs should have limited scope - shared interest in a particular area - >even if that area is something broad like security the mission and work >should be specific which could lead to working groups, additional SIGs, >projects, etc so I want to be careful how I word it but yes limited scope >is the ideal way to start a SIG imo. > >On Fri, Jan 11, 2019 at 11:14 AM Duc Truong wrote: > >> +1 on limiting the scope to autoscaling at first. I prefer the name >> autoscaling since the mission is to improve automatic scaling. If the >> mission is changed later, we can change the name of the SIG to reflect >> that. >> >> On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec wrote: >> > >> > >> > >> > On 1/11/19 10:14 AM, Adam Spiers wrote: >> > > Rico Lin wrote: >> > >> Dear all >> > >> >> > >> To continue the discussion of whether we should have new SIG for >> > >> autoscaling. >> > >> I think we already got enough time for this ML [1], and it's time to >> > >> jump to the next step. As we got a lot of positive feedbacks from ML >> > >> [1], I think it's definitely considered an action to create a new SIG, >> > >> do some init works, and finally Here are some things that we can start >> > >> right now, to come out with the name of SIG, the definition and >> mission. >> > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with given >> > >> initial mission to improve automatic scaling with (but not limited to) >> > >> OpenStack. As we discussed in forum [2], to have scenario tests and >> > >> documents will be considered as actions for the initial mission. I >> > >> gonna assume we will start from scenarios which already provide some >> > >> basic tests and documents which we can adapt very soon and use them to >> > >> build a SIG environment. 
And the long-term mission of this SIG is to >> > >> make sure we provide good documentation and test coverage for most >> > >> automatic functionality. >> > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make sure we >> > >> can provide more value if there are more needs in the future. Just >> > >> like the example which Adam raised `self-optimizing` from people who >> > >> are using watcher [3]. Let me know if you got any concerns about this >> > >> name. >> > > >> > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound >> > > quite right to me, because it's not clear what is being automated. For >> > > example from the outside people might think it was a SIG about CI, or >> > > about automated testing, or both - or even some kind of automatic >> > > creation of new SIGs ;-) >> > > Here are some alternative suggestions: >> > > - Optimization SIG >> > > - Self-optimization SIG >> > > - Auto-optimization SIG >> > > - Adaptive Cloud SIG >> > > - Self-adaption SIG >> > > - Auto-adaption SIG >> > > - Auto-configuration SIG >> > > >> > > although I'm not sure these are a huge improvement on "Autoscaling SIG" >> > > - maybe some are too broad, or too vague. It depends on how likely it >> > > is that the scope will go beyond just auto-scaling. Of course you >> could >> > > also just stick with the original idea of "Auto-scaling" :-) >> > >> > I'm inclined to argue that limiting the scope of this SIG is actually a >> > feature, not a bug. Better to have a tightly focused SIG that has very >> > specific, achievable goals than to try to boil the ocean by solving all >> > of the auto* problems in OpenStack. We all know how "one SIG to rule >> > them all" ends. ;-) >> > >> > >> And to clarify, there will definitely some cross SIG co-work between >> > >> this new SIG and Self-Healing SIG (there're some common requirements >> > >> even across self-healing and autoscaling features.). We also need to >> > >> make sure we do not provide any duplicated work against self-healing >> > >> SIG. As a start, let's only focus on autoscaling scenario, and make >> > >> sure we're doing it right before we move to multiple cases. >> > > >> > > Sounds good! >> > >> If no objection, I will create the new SIG before next weekend and >> > >> plan a short schedule in Denver summit and PTG. >> > > >> > > Thanks for driving this! >> > >> >> > >-- >Kind regards, > >Melvin Hillsman >mrhillsman at gmail.com >mobile: (832) 264-2646 From openstack at nemebean.com Fri Jan 11 21:39:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 11 Jan 2019 15:39:04 -0600 Subject: Review-Priority for Project Repos In-Reply-To: <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> Message-ID: <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> On 1/10/19 5:03 PM, Sean Mooney wrote: > On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: >>> >>> I don't know if this was the reasoning behind Cinder's system, but I know >>> some people object to procedural -2 because it's a big hammer to essentially >>> say "not right now". It overloads the meaning of the vote in a potentially >>> confusing way that requires explanation every time it's used. At least I >>> hope procedural -2's always include a comment. 
>>> >> >> This was exactly the reasoning. -2 is overloaded, but its primary meaning >> was/is "we do not want this code change". It just happens that it was also a >> convenient way to say that with "right now" at the end. >> >> The Review-Priority -1 is a clear way to say whether something is held because >> it can't be merged right now due to procedural or process reasons, versus >> something that we just don't want at all. > for what its worth my understanding of why a procdural -2 is more correct is that this change > cannot be merged because it has not met the procedual requirement to be considerd for this release. > haveing received several over the years i have never seen it to carry any malaise > or weight then the zuul pep8 job complianing about the line lenght of my code. > with either a procedural -2 or a verify -1 from zuul my code is equally un mergeable. > > the prime example being a patch that requires a spec that has not been approved. > while most cores will not approve chage when other cores have left a -1 mistakes happen > and the -2 does emphasise the point that even if the code is perfect under the porject > processes this change should not be acitvly reporposed until the issue raised by the -2 > has been addressed. In the case of a procedual -2 that typically means the spec is merge > or the master branch opens for the next cycle. > > i agree that procedural -2's can seam harsh at first glance but i have also never seen one > left without a comment explaining why it was left. the issue with a procedural -1 is i can > jsut resubmit the patch several times and it can get lost in the comments. I don't think that's a problem with this new field. It sounds like priority -1 carries over from PS to PS. > > we recently intoduced a new review priority lable > if we really wanted to disabiguate form normal -2s then we coudl have an explcitly lable for it > but i personally would prefer to keep procedural -2s. To be clear, I have both used and received procedural -2's as well and they don't particularly bother me, but I can see where if you were someone who was new to the community or just a part-time contributor not as familiar with our processes it might be an unpleasant experience to see that -2 show up. As I said, I don't know that I would advocate for this on the basis of replacing procedural -2 alone, but if we're adding the category anyway I mildly prefer using it for procedural blockers in the future. From smooney at redhat.com Fri Jan 11 22:09:44 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 11 Jan 2019 22:09:44 +0000 Subject: Review-Priority for Project Repos In-Reply-To: <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> References: <20190103135155.GC27473@sm-workstation> <1683668d5b6.b859b68d50455.3029454740955847790@ghanshyammann.com> <2845db76-3308-7e10-9d79-963fa8b20d01@nemebean.com> <20190110194227.GB14554@sm-workstation> <16ba68b1772befaf5d689ecfb8a7b60ad055bdeb.camel@redhat.com> <5db92af7-e533-2da2-9b32-49f195472837@nemebean.com> Message-ID: On Fri, 2019-01-11 at 15:39 -0600, Ben Nemec wrote: > > On 1/10/19 5:03 PM, Sean Mooney wrote: > > On Thu, 2019-01-10 at 13:42 -0600, Sean McGinnis wrote: > > > > > > > > I don't know if this was the reasoning behind Cinder's system, but I know > > > > some people object to procedural -2 because it's a big hammer to essentially > > > > say "not right now". It overloads the meaning of the vote in a potentially > > > > confusing way that requires explanation every time it's used. 
At least I > > > > hope procedural -2's always include a comment. > > > > > > > > > > This was exactly the reasoning. -2 is overloaded, but its primary meaning > > > was/is "we do not want this code change". It just happens that it was also a > > > convenient way to say that with "right now" at the end. > > > > > > The Review-Priority -1 is a clear way to say whether something is held because > > > it can't be merged right now due to procedural or process reasons, versus > > > something that we just don't want at all. > > > > for what its worth my understanding of why a procdural -2 is more correct is that this change > > cannot be merged because it has not met the procedual requirement to be considerd for this release. > > haveing received several over the years i have never seen it to carry any malaise > > or weight then the zuul pep8 job complianing about the line lenght of my code. > > with either a procedural -2 or a verify -1 from zuul my code is equally un mergeable. > > > > the prime example being a patch that requires a spec that has not been approved. > > while most cores will not approve chage when other cores have left a -1 mistakes happen > > and the -2 does emphasise the point that even if the code is perfect under the porject > > processes this change should not be acitvly reporposed until the issue raised by the -2 > > has been addressed. In the case of a procedual -2 that typically means the spec is merge > > or the master branch opens for the next cycle. > > > > i agree that procedural -2's can seam harsh at first glance but i have also never seen one > > left without a comment explaining why it was left. the issue with a procedural -1 is i can > > jsut resubmit the patch several times and it can get lost in the comments. > > I don't think that's a problem with this new field. It sounds like > priority -1 carries over from PS to PS. > > > > > we recently intoduced a new review priority lable > > if we really wanted to disabiguate form normal -2s then we coudl have an explcitly lable for it > > but i personally would prefer to keep procedural -2s. > > To be clear, I have both used and received procedural -2's as well and > they don't particularly bother me, but I can see where if you were > someone who was new to the community or just a part-time contributor not > as familiar with our processes it might be an unpleasant experience to > see that -2 show up. As I said, I don't know that I would advocate for > this on the basis of replacing procedural -2 alone, but if we're adding > the category anyway I mildly prefer using it for procedural blockers in > the future. I think I partially misunderstood the proposal: I had parsed it as replacing a procedural Code-Review -2 with a Code-Review -1, rather than with a Review-Priority -1. If all projects adopt Review-Priority going forward, that might make sense; for those that don't, I think a Code-Review -2 still makes sense. From rico.lin.guanyu at gmail.com Sat Jan 12 00:36:32 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Sat, 12 Jan 2019 08:36:32 +0800 Subject: [all][meta-sig] New Automatic SIG (continue discussion) In-Reply-To: <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> References: <20190111161417.aswwj5jmtasfabg6@pacific.linksys.moosehall> <2fb328bf-f0f8-8d7d-a0cd-672bb1fefaa8@nemebean.com> <20190111185156.fmpaplichmwpvk5u@pacific.linksys.moosehall> Message-ID: On Sat, Jan 12, 2019 at 2:59 AM Adam Spiers wrote: > Fine by me - sounds like we have a consensus for autoscaling then?
I think “Autoscaling SIG” gets the majority vote. Let’s give it few more days for people in different time zones. > > Melvin Hillsman wrote: > >+1 SIGs should have limited scope - shared interest in a particular area - > >even if that area is something broad like security the mission and work > >should be specific which could lead to working groups, additional SIGs, > >projects, etc so I want to be careful how I word it but yes limited scope > >is the ideal way to start a SIG imo. > > > >On Fri, Jan 11, 2019 at 11:14 AM Duc Truong > wrote: > > > >> +1 on limiting the scope to autoscaling at first. I prefer the name > >> autoscaling since the mission is to improve automatic scaling. If the > >> mission is changed later, we can change the name of the SIG to reflect > >> that. > >> > >> On Fri, Jan 11, 2019 at 8:24 AM Ben Nemec > wrote: > >> > > >> > > >> > > >> > On 1/11/19 10:14 AM, Adam Spiers wrote: > >> > > Rico Lin wrote: > >> > >> Dear all > >> > >> > >> > >> To continue the discussion of whether we should have new SIG for > >> > >> autoscaling. > >> > >> I think we already got enough time for this ML [1], and it's time > to > >> > >> jump to the next step. As we got a lot of positive feedbacks from > ML > >> > >> [1], I think it's definitely considered an action to create a new > SIG, > >> > >> do some init works, and finally Here are some things that we can > start > >> > >> right now, to come out with the name of SIG, the definition and > >> mission. > >> > >> Here's my draft plan: To create a SIG name `Automatic SIG`, with > given > >> > >> initial mission to improve automatic scaling with (but not limited > to) > >> > >> OpenStack. As we discussed in forum [2], to have scenario tests and > >> > >> documents will be considered as actions for the initial mission. I > >> > >> gonna assume we will start from scenarios which already provide > some > >> > >> basic tests and documents which we can adapt very soon and use > them to > >> > >> build a SIG environment. And the long-term mission of this SIG is > to > >> > >> make sure we provide good documentation and test coverage for most > >> > >> automatic functionality. > >> > >> I suggest `Automatic SIG` instead of `Autoscaling SIG` to make > sure we > >> > >> can provide more value if there are more needs in the future. Just > >> > >> like the example which Adam raised `self-optimizing` from people > who > >> > >> are using watcher [3]. Let me know if you got any concerns about > this > >> > >> name. > >> > > > >> > > I'm +1 for creating the SIG, although "Automatic SIG" doesn't sound > >> > > quite right to me, because it's not clear what is being automated. > For > >> > > example from the outside people might think it was a SIG about CI, > or > >> > > about automated testing, or both - or even some kind of automatic > >> > > creation of new SIGs ;-) > >> > > Here are some alternative suggestions: > >> > > - Optimization SIG > >> > > - Self-optimization SIG > >> > > - Auto-optimization SIG > >> > > - Adaptive Cloud SIG > >> > > - Self-adaption SIG > >> > > - Auto-adaption SIG > >> > > - Auto-configuration SIG > >> > > > >> > > although I'm not sure these are a huge improvement on "Autoscaling > SIG" > >> > > - maybe some are too broad, or too vague. It depends on how likely > it > >> > > is that the scope will go beyond just auto-scaling. 
Of course you > >> could > >> > > also just stick with the original idea of "Auto-scaling" :-) > >> > > >> > I'm inclined to argue that limiting the scope of this SIG is actually > a > >> > feature, not a bug. Better to have a tightly focused SIG that has very > >> > specific, achievable goals than to try to boil the ocean by solving > all > >> > of the auto* problems in OpenStack. We all know how "one SIG to rule > >> > them all" ends. ;-) > >> > > >> > >> And to clarify, there will definitely some cross SIG co-work > between > >> > >> this new SIG and Self-Healing SIG (there're some common > requirements > >> > >> even across self-healing and autoscaling features.). We also need > to > >> > >> make sure we do not provide any duplicated work against > self-healing > >> > >> SIG. As a start, let's only focus on autoscaling scenario, and make > >> > >> sure we're doing it right before we move to multiple cases. > >> > > > >> > > Sounds good! > >> > >> If no objection, I will create the new SIG before next weekend and > >> > >> plan a short schedule in Denver summit and PTG. > >> > > > >> > > Thanks for driving this! > >> > > >> > >> > > > >-- > >Kind regards, > > > >Melvin Hillsman > >mrhillsman at gmail.com > >mobile: (832) 264-2646 > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekcs.openstack at gmail.com Sat Jan 12 02:52:57 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Fri, 11 Jan 2019 18:52:57 -0800 Subject: [congress][infra] override-checkout problem Message-ID: Hi Ghanshyam, On 1/11/19, 4:57 AM, "Ghanshyam Mann" wrote: >Hi Eric, > >This seems the same issue happening on congress-tempest-plugin gate where >'congress-devstack-py35-api-mysql-queens' is failing [1]. >python-congressclient was >not able to install and openstack client trow error for congress command. > >The issue is stable branch jobs on congress-tempest-plugin does checkout >the master version for all repo >instead of what mentioned in override-checkout var. > >If you see congress's rocky patch, congress is checkout out with rocky >version[2] but >congress-tempest-plugin patch's rocky job checkout the master version of >congress instead of rocky version [3]. >That is why your test expectedly fail on congress patch but pass on >congress-tempest-plugin. > >Root cause is that override-checkout var does not work on the legacy job >(it is only zuulv3 job var, if I am not wrong), >you need to use BRANCH_OVERRIDE for legacy jobs. Myself, amotoki and >akhil was trying lot other workarounds >to debug the root cause but at the end we just notice that congress jobs >are legacy jobs and using override-checkout :). Gosh thanks so much for the investigation. Yes it's a legacy-dsvm job. So sorry for the run around! I'm thinking of taking the opportunity to migrate to devstack-tempest job. I've taken a first stab here: https://review.openstack.org/#/c/630414/ > >I have submitted the testing patch with BRANCH_OVERRIDE for >congress-tempest-plugin queens job[4]. >Which seems working fine, I can make those patches more formal for merge. And thanks so much for putting together those patches using BRANCH_OVERRIDE! Merging sounds good unless we can quickly migrate To non-legacy jobs. Realistically it'll probably end up take a while to get everything migrated and working. 
> > >Another thing I was discussing with Akhil that new tests of builins >feature need another feature flag >(different than congressz3.enabled) as that feature of z3 is in stein >onwards only. Yup. I was going to do that but wanted to first figure out why it wasn't failing on tempest plugin. I've now submitted a patch to do that. > > >[1] https://review.openstack.org/#/c/618951/ >[2] >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 >474d7/logs/pip2-freeze.txt.gz >[3] >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/logs/pip2-freeze.txt.gz >[4] >https://review.openstack.org/#/q/topic:fix-stable-branch-testing+(status:o >pen+OR+status:merged) > >-gmann > > ---- On Fri, 11 Jan 2019 10:40:39 +0900 Eric K > wrote ---- > > The congress-tempest-plugin zuul jobs against stable branches appear > > to be working incorrectly. Tests that should fail on stable/rocky (and > > indeed fails when triggered by congress patch [1]) are passing when > > triggered by congress-tempest-plugin patch [2]. > > > > I'd assume it's some kind of zuul misconfiguration in > > congress-tempest-plugin [3], but I've so far failed to figure out > > what's wrong. Particularly strange is that the job-output appears to > > show it checking out the right thing [4]. > > > > Any thoughts or suggestions? Thanks so much! > > > > [1] > > https://review.openstack.org/#/c/629070/ > > >http://logs.openstack.org/70/629070/4/check/congress-devstack-api-mysql/87 >474d7/logs/testr_results.html.gz > > The two failing z3 tests should indeed fail because the feature was > > not available in rocky. The tests were introduced because for some > > reason they pass in the job triggered by a patch in > > congress-tempest-plugin. > > > > [2] > > https://review.openstack.org/#/c/618951/ > > >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/logs/testr_results.html.gz > > > > [3] >https://github.com/openstack/congress-tempest-plugin/blob/master/.zuul.yam >l#L4 > > > > [4] >http://logs.openstack.org/51/618951/3/check/congress-devstack-api-mysql-ro >cky/23c0214/job-output.txt.gz#_2019-01-09_05_18_08_183562 > > shows congress is checked out to the correct commit at the top of the > > stable/rocky branch. > > > > > > > From raniaadouni at gmail.com Sun Jan 13 10:16:23 2019 From: raniaadouni at gmail.com (Rania Adouni) Date: Sun, 13 Jan 2019 11:16:23 +0100 Subject: [openstack-ZUN] Message-ID: hi everyone , I was trying to deploy wordpress -zun by using heat , this is the template I used "https://pastebin.com/0PGtWSVw" . now the stack create successfully the mysql image running but the wordpress image alwayes stopped and when I try to started and access to the container " openstack appcontainer exec --interactive rho-1-container apache2-foreground " i get this output : ********************************* connected to container "rho-1-container" type ~. to disconnect AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.16.0.3. Set the 'ServerName' directive globally to suppress this message AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.16.0.3. 
Set the 'ServerName' directive globally to suppress this message [Sun Jan 13 10:07:24.463058 2019] [mpm_prefork:notice] [pid 77] AH00163: Apache/2.4.25 (Debian) PHP/7.2.14 configured -- resuming normal operations [Sun Jan 13 10:07:24.463196 2019] [core:notice] [pid 77] AH00094: Command line: 'apache2 -D FOREGROUND' ***************************************************** and then the status of wordpress image back stopped !!!! the logs of wordpress image can be found here : https://pastebin.com/CitXk6zN thanks for any help !! -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Sun Jan 13 18:11:29 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Sun, 13 Jan 2019 12:11:29 -0600 Subject: [ironic][neutron] nf_conntrack_helper now disabled by default In-Reply-To: References: <1546880738.2949141.1627872736.6DF3C255@webmail.messagingengine.com> Message-ID: Hi Derek, Yes, these rules would need to be added inside the router namespace when it is created and it seems to me it is a workable solution. I will raise this work in the next L3 sub-team meeting, so we keep an eye on the patches / progress you make Regards Miguel On Mon, Jan 7, 2019 at 11:54 AM Derek Higgins wrote: > On Mon, 7 Jan 2019 at 17:08, Clark Boylan wrote: > > > > On Mon, Jan 7, 2019, at 8:48 AM, Julia Kreger wrote: > > > Thanks for bringing this up Derek! > > > Comments below. > > > > > > On Mon, Jan 7, 2019 at 8:30 AM Derek Higgins > wrote: > > > > > > > > Hi All, > > > > > > > > Shortly before the holidays CI jobs moved from xenial to bionic, for > > > > Ironic this meant a bunch failures[1], all have now been dealt with, > > > > with the exception of the UEFI job. It turns out that during this job > > > > our (virtual) baremetal nodes use tftp to download a ipxe image. In > > > > order to track these tftp connections we have been making use of the > > > > fact that nf_conntrack_helper has been enabled by default. In newer > > > > kernel versions[2] this is no longer the case and I'm now trying to > > > > figure out the best way to deal with the new behaviour. I've put > > > > together some possible solutions along with some details on why they > > > > are not ideal and would appreciate some opinions > > > > > > The git commit message suggests that users should explicitly put in > rules such > > > that the traffic is matched. I feel like the kernel change ends up > > > being a behavior > > > change in this case. > > > > > > I think the reasonable path forward is to have a configuration > > > parameter that the > > > l3 agent can use to determine to set the netfilter connection tracker > helper. > > > > > > Doing so, allows us to raise this behavior change to operators > minimizing the > > > need of them having to troubleshoot it in production, and gives them a > choice > > > in the direction that they wish to take. > > > > https://home.regit.org/netfilter-en/secure-use-of-helpers/ seems to > cover this. Basically you should explicitly enable specific helpers when > you need them rather than relying on the auto helper rules. > > Thanks, I forgot to point out the option of adding these rules, If I > understand it correctly they would need to be added inside the router > namespace when neutron creates it, somebody from neutron might be able > to indicate if this is a workable solution. > > > > > Maybe even avoid the configuration option entirely if ironic and neutron > can set the required helper for tftp when tftp is used? 
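(To make the suggestion above concrete: the explicit-helper approach from the secure-use-of-helpers article boils down to adding a CT rule in the raw table inside the router namespace, rather than relying on the old automatic helper assignment. What follows is only a rough sketch of that idea, not a reviewed Neutron change; the namespace name is made up and it assumes the nf_conntrack_tftp module is loaded on the node:)

    import subprocess

    def enable_tftp_helper(router_ns):
        # Explicitly assign the tftp conntrack helper for UDP/69 inside the
        # given router namespace, instead of depending on the removed
        # nf_conntrack_helper=1 auto-assignment behaviour.
        cmd = [
            "ip", "netns", "exec", router_ns,
            "iptables", "-t", "raw", "-A", "PREROUTING",
            "-p", "udp", "--dport", "69",
            "-j", "CT", "--helper", "tftp",
        ]
        subprocess.check_call(cmd)

    # Example (hypothetical namespace name):
    # enable_tftp_helper("qrouter-0f1d...")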
> > > > > > > > [trim] > > > > > > > [more trimming] > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Sun Jan 13 21:32:43 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sun, 13 Jan 2019 16:32:43 -0500 Subject: [openstack-ZUN] In-Reply-To: References: Message-ID: Hi Rania, It seems I can reproduce the error by using your template (with modification of the private/public network name). The problem is resolved after I switched to the "mysql:5.7" image: http://paste.openstack.org/compare/742277/742276/ . It might relate to this issue: https://github.com/docker-library/wordpress/issues/313 . If it still doesn't work after switching the image, give another try by opening the mysql port in the security groups. For example: https://github.com/hongbin/heat-templates/commit/848d4cce49e85e0fff4b06c35c71de43532389f2 . Let me know if it still doesn't work. Best regards, Hongbin On Sun, Jan 13, 2019 at 11:23 AM Rania Adouni wrote: > hi everyone , > > I was trying to deploy wordpress -zun by using heat , this is the template > I used "https://pastebin.com/0PGtWSVw" . > now the stack create successfully the mysql image running but the > wordpress image alwayes stopped and when I try to started and access to the > container " openstack appcontainer exec --interactive rho-1-container > apache2-foreground " > i get this output : > ********************************* > connected to container "rho-1-container" > type ~. to disconnect > AH00558: apache2: Could not reliably determine the server's fully > qualified domain name, using 172.16.0.3. Set the 'ServerName' directive > globally to suppress this message > AH00558: apache2: Could not reliably determine the server's fully > qualified domain name, using 172.16.0.3. Set the 'ServerName' directive > globally to suppress this message > [Sun Jan 13 10:07:24.463058 2019] [mpm_prefork:notice] [pid 77] AH00163: > Apache/2.4.25 (Debian) PHP/7.2.14 configured -- resuming normal operations > [Sun Jan 13 10:07:24.463196 2019] [core:notice] [pid 77] AH00094: Command > line: 'apache2 -D FOREGROUND' > ***************************************************** > and then the status of wordpress image back stopped !!!! > the logs of wordpress image can be found here : > https://pastebin.com/CitXk6zN > > thanks for any help !! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Sun Jan 13 22:45:00 2019 From: emilien at redhat.com (Emilien Macchi) Date: Sun, 13 Jan 2019 17:45:00 -0500 Subject: [tripleo] TripleO Stein milestone 2 released ! Message-ID: We just released Stein Milestone 2 for TripleO, thanks all for your work: https://launchpad.net/tripleo/+milestone/stein-2 If your blueprint is done, please mark it as "Implemented" Or move it to stein-3. Bugs in progress will be moved to stein-3 automatically. By the end of the week, I'll move them myself otherwise but please do it if you can. I'll provide interesting stats at the end of Stein, where we compare numbers of fixed bugs and implemented blueprints over the cycles. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dangtrinhnt at gmail.com Mon Jan 14 02:05:48 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 14 Jan 2019 11:05:48 +0900 Subject: [Searchlight] Nominating Thuy Dang for Searchlight core In-Reply-To: References: Message-ID: Hi, Welcome to the core team, Thuy Dang :) Bests, On Thu, Jan 10, 2019 at 11:07 AM lương hữu tuấn wrote: > +1 from me :) > > On Thursday, January 10, 2019, Trinh Nguyen wrote: >> Hello team, >> >> I would like to nominate Thuy Dang for >> Searchlight core. He has been leading the effort to clarify our vision and >> working on some blueprints to make Searchlight a multi-cloud application. I >> believe Thuy will be a great resource for our team. >> >> Bests, >> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jan 14 02:53:06 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 14 Jan 2019 11:53:06 +0900 Subject: [Searchlight] Team meeting cancelled today Message-ID: Hi team, I will be helping to coordinate an upstream training webinar at 1400 today [1], so I will have to cancel the team meeting. If you want to discuss something, please let me know; I will be on the IRC channel. [1] https://www.meetup.com/VietOpenStack/events/257860457/ Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From lajos.katona at ericsson.com Mon Jan 14 08:59:53 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 14 Jan 2019 08:59:53 +0000 Subject: [L2-Gateway] l2gw-connection status In-Reply-To: References: Message-ID: Hi, Sorry, my earlier mail was missing a subject. On 2019. 01. 11. 15:19, Lajos Katona wrote: > Hi, > > I have a question regarding networking-l2gw, specifically the l2gw-connection. > We have an issue where the hardware switch configured by networking-l2gw is > slow, so when the l2gw-connection is created the API returns > successfully, but the dataplane configuration is not yet ready. > Do you think that adding a state field to the connection is feasible somehow? > By checking the vtep schema > (http://www.openvswitch.org/support/dist-docs/vtep.5.html) no such > information is available at the vtep level. > > Thanks in advance for the help. > > Regards > Lajos From sbauza at redhat.com Mon Jan 14 11:19:46 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 14 Jan 2019 12:19:46 +0100 Subject: [nova] Retiring gantt, python-ganttclient projects In-Reply-To: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> References: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> Message-ID: On Fri, Jan 11, 2019 at 5:44 PM Stephen Finucane wrote: > Hey, > > These projects are mega old, don't appear to have been official > projects, and should have been retired a long time ago. This serves > as a heads up on the off-chance someone has managed to do something > with them. > > All good with me. It could even be confusing for people who want to know about placement and the scheduler. Do you need me to retire the repos? -Sylvain Stephen > > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sfinucan at redhat.com Mon Jan 14 11:52:58 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 14 Jan 2019 11:52:58 +0000 Subject: [nova] Retiring gantt, python-ganttclient projects In-Reply-To: References: <1fec3e43b5247493614fe3f3b175133408f960e2.camel@redhat.com> Message-ID: <69c83bb74b7341414156cfe48b7e64368d11b9bd.camel@redhat.com> On Mon, 2019-01-14 at 12:19 +0100, Sylvain Bauza wrote: > On Fri, Jan 11, 2019 at 5:44 PM Stephen Finucane > wrote: > > Hey, > > > > > > > > These projects are mega old, don't appear to have been official > > > > projects, and should have been retired a long time ago. This is > > serves > > > > as a heads up on the off-chance someone has managed to do something > > > > with them. > > > > > > All good with me. It even could be confusing for people want to know > about placement and scheduler. > Do you need me for retiring the repos ? > -Sylvain Indeed. Reviews are here: * https://review.openstack.org/630154 * https://review.openstack.org/630138 Looks like it has to be you or John Garbutt to push them and close them out, as you're the only still active cores I can see. Thanks, Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Mon Jan 14 12:34:19 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 14 Jan 2019 13:34:19 +0100 Subject: [cinder] db sync error upgrading from pike to queens In-Reply-To: References: <20190111152318.ztuwirfgypehdfp6@localhost> Message-ID: <20190114123419.mqblajjrvzduo4f6@localhost> On 11/01, Brandon Caulder wrote: > Hi, > > The steps were... > - purge > - shutdown cinder-scheduler, cinder-api > - upgrade software > - restart cinder-volume Hi, You should not restart cinder volume services before doing the DB sync, otherwise the Cinder service is likely to fail. > - sync (upgrade fails and stops at v114) > - sync again (db upgrades to v117) > - restart cinder-volume > - stacktrace observed in volume.log > At this point this could be a DB issue: https://bugs.mysql.com/bug.php?id=67926 https://jira.mariadb.org/browse/MDEV-10558 Cheers, Gorka. > Thanks > > On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor wrote: > > > On 10/01, Brandon Caulder wrote: > > > Hi Iain, > > > > > > There are 424 rows in volumes which drops down to 185 after running > > > cinder-manage db purge 1. Restarting the volume service after package > > > upgrade and running sync again does not remediate the problem, although > > > running db sync a second time does bump the version up to 117, the > > > following appears in the volume.log... > > > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/ > > > > > > > Hi, > > > > If I understand correctly the steps were: > > > > - Run DB sync --> Fail > > - Run DB purge > > - Restart volume services > > - See the log error > > - Run DB sync --> version proceeds to 117 > > > > If that is the case, could you restart the services again now that the > > migration has been moved to version 117? > > > > If the cinder-volume service is able to restart please run the online > > data migrations with the service running. > > > > Cheers, > > Gorka. > > > > > > > Thanks > > > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell < > > iain.macdonnell at oracle.com> > > > wrote: > > > > > > > > > > > Different issue, I believe (DB sync vs. online migrations) - it just > > > > happens that both pertain to shared targets. > > > > > > > > Brandon, might you have a very large number of rows in your volumes > > > > table? 
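(On the table-size point: the statement that dies here is effectively a single "UPDATE volumes SET shared_targets = 1" across every row, which is easy to lose to server-side timeouts once the table is large. Purely as an illustration of the difference, and emphatically not the actual Cinder migration code, a chunked version of that update would look something like the sketch below; the connection URL, WHERE clause and batch size are placeholders.)

    import sqlalchemy as sa

    # Placeholder URL; point this at the real cinder database.
    engine = sa.create_engine("mysql+pymysql://cinder:secret@controller/cinder")

    BATCH = 10000
    while True:
        with engine.begin() as conn:
            # Each statement touches at most BATCH rows, so no single UPDATE
            # runs long enough to hit connection or lock timeouts.
            updated = conn.execute(
                sa.text("UPDATE volumes SET shared_targets = 1 "
                        "WHERE shared_targets IS NULL LIMIT :batch"),
                {"batch": BATCH},
            ).rowcount
        if not updated:
            break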
Have you been purging soft-deleted rows? > > > > > > > > ~iain > > > > > > > > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote: > > > > > Brandon, > > > > > > > > > > I am thinking you are hitting this bug: > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_cinder_-2Bbug_1806156&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=FHjmiBaQPWLNzGreplNmZfCZ0MkpV5GLaqD2hcs5hwg&s=AvAoszuVyGkd2_1hyCnQjwGEw9dUNfEoqsUcxdHYZqU&e= > > > > > > > > > > > > > > > I think you can work around it by retrying the migration with the > > volume > > > > > service running. You may, however, want to check with Iain > > MacDonnell > > > > > as he has been looking at this for a while. > > > > > > > > > > Thanks! > > > > > Jay > > > > > > > > > > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote: > > > > >> Hi, > > > > >> > > > > >> I am receiving the following error when performing an offline > > upgrade > > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to > > > > >> openstack-cinder-1:12.0.3-1.el7. > > > > >> > > > > >> # cinder-manage db version > > > > >> 105 > > > > >> > > > > >> # cinder-manage --debug db sync > > > > >> Error during database migration: (pymysql.err.OperationalError) > > (2013, > > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE > > volumes > > > > >> SET shared_targets=%(shared_targets)s'] [parameters: > > > > >> {'shared_targets': 1}] > > > > >> > > > > >> # cinder-manage db version > > > > >> 114 > > > > >> > > > > >> The db version does not upgrade to queens version 117. Any help > > would > > > > >> be appreciated. > > > > >> > > > > >> Thank you > > > > > > > > > > > > > > > From amotoki at gmail.com Mon Jan 14 12:36:42 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Mon, 14 Jan 2019 21:36:42 +0900 Subject: [oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0 In-Reply-To: <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> References: <7f02504e-e6e1-85d6-d982-b1866d648c7f@nemebean.com> <614961bb-99d7-a4e9-5fcd-26ae6aa15648@nemebean.com> <181A642E-55B0-4559-8C33-C1CD4061B5BB@redhat.com> <07e3d4a1-fc12-be77-a1a2-6fe2f7b6bca2@nemebean.com> <3005d010-4e44-f06b-f521-1f4a41e3b174@nemebean.com> Message-ID: The similar failure happens in neutron-fwaas. This blocks several patches in neutron-fwaas including policy-in-code support. https://bugs.launchpad.net/neutron/+bug/1811506 Most failures are fixed by applying Ben's neutron fix https://review.openstack.org/#/c/629335/ [1], but we still have one failure in neutron_fwaas.tests.functional.privileged.test_utils.InNamespaceTest.test_in_namespace [2]. This failure is caused by oslo.privsep 1.31.0 too. This does not happen with 1.30.1. Any help would be appreciated. [1] neutron-fwaas change https://review.openstack.org/#/c/630451/ [2] http://logs.openstack.org/51/630451/2/check/legacy-neutron-fwaas-dsvm-functional/05b9131/logs/testr_results.html.gz -- Akihiro Motoki (irc: amotoki) 2019年1月9日(水) 9:32 Ben Nemec : > I think I've got it. At least in my local tests, the handle pointer > being passed from C -> Python -> C was getting truncated at the Python > step because we didn't properly define the type. If the address assigned > was larger than would fit in a standard int then we passed what amounted > to a bogus pointer back to the C code, which caused the segfault. 
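(To make the truncation Ben describes concrete: ctypes assumes every foreign function returns a C int unless told otherwise, so on a 64-bit platform a pointer-sized handle can come back with its upper bits cut off, and passing that value back into the library segfaults. The snippet below is a standalone illustration of that pattern using plain libc malloc/free, not the actual libnetfilter_conntrack bindings or the neutron fix:)

    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    # With the default restype (c_int), the pointer malloc() returns can be
    # truncated to 32 bits -- the "bogus pointer" failure mode above.
    # (Deliberately not freed, since the value may already be bogus.)
    maybe_truncated = libc.malloc(16)

    # Declaring pointer-sized types up front keeps the full 64-bit value
    # intact, so it is safe to hand back to C (here, to free()).
    libc.malloc.restype = ctypes.c_void_p
    libc.free.argtypes = [ctypes.c_void_p]

    ptr = libc.malloc(16)
    libc.free(ptr)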
> > I have no idea why privsep threading would have exposed this, other than > maybe running in threads affected the address space somehow? > > In any case, https://review.openstack.org/629335 has got these > functional tests working for me locally in oslo.privsep 1.31.0. It would > be great if somebody could try them out and verify that I didn't just > find a solution that somehow only works on my system. :-) > > -Ben > > On 1/8/19 4:30 PM, Ben Nemec wrote: > > > > > > On 1/8/19 2:22 PM, Slawomir Kaplonski wrote: > >> Hi Ben, > >> > >> I was also looking at it today. I’m totally not an C and Oslo.privsep > >> expert but I think that there is some new process spawned here. > >> I put pdb before line > >> > https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/netlink_lib.py#L191 > >> where this issue happen. Then, with "ps aux” I saw: > >> > >> vagrant at fullstack-ubuntu ~ $ ps aux | grep privsep > >> root 18368 0.1 0.5 185752 33544 pts/1 Sl+ 13:24 0:00 > >> /opt/stack/neutron/.tox/dsvm-functional/bin/python > >> /opt/stack/neutron/.tox/dsvm-functional/bin/privsep-helper > >> --config-file neutron/tests/etc/neutron.conf --privsep_context > >> neutron.privileged.default --privsep_sock_path > >> /tmp/tmpG5iqb9/tmp1dMGq0/privsep.sock > >> vagrant 18555 0.0 0.0 14512 1092 pts/2 S+ 13:25 0:00 grep > >> --color=auto privsep > >> > >> But then when I continue run test, and it segfaulted, in journal log I > >> have: > >> > >> Jan 08 13:25:29 fullstack-ubuntu kernel: privsep-helper[18369] > >> segfault at 140043e8 ip 00007f8e1800ef32 sp 00007f8e18a63320 error 4 > >> in libnetfilter_conntrack.so.3.5.0[7f8e18009000+1a000] > >> > >> Please check pic